* [PATCH 0/7] zram/xvmalloc: 64K page fixes and optimizations
@ 2011-01-26 17:21 Robert Jennings
  2011-01-26 17:22 ` [PATCH 1/7] zram/vmalloc: Correct tunings to enable use with 64K pages Robert Jennings
                   ` (6 more replies)
  0 siblings, 7 replies; 26+ messages in thread
From: Robert Jennings @ 2011-01-26 17:21 UTC (permalink / raw)
  To: Nitin Gupta
  Cc: Greg Kroah-Hartman, Robert Jennings, Pekka Enberg, devel, linux-kernel

The xvmalloc allocator is non-functional when running with a 64K page
size.  The first two patches fix 64K page related issues.

The remaining patches provide some small optimizations for zram and
xvmalloc.

Regards,
Robert Jennings


* [PATCH 1/7] zram/vmalloc: Correct tunings to enable use with 64K pages
  2011-01-26 17:21 [PATCH 0/7] zram/xvmalloc: 64K page fixes and optimizations Robert Jennings
@ 2011-01-26 17:22 ` Robert Jennings
  2011-01-26 17:48   ` Pekka Enberg
  2011-01-26 17:23 ` [PATCH 2/7] zram: Prevent overflow in logical block size Robert Jennings
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 26+ messages in thread
From: Robert Jennings @ 2011-01-26 17:22 UTC (permalink / raw)
  To: Nitin Gupta; +Cc: Greg Kroah-Hartman, Pekka Enberg, devel, linux-kernel

xvmalloc will not currently function with 64K pages.  Newly allocated
pages will be inserted at an offset beyond the end of the first-level
index.  This tuning is needed to properly size the allocator for 64K
pages.

The default shift of 3 (8 bytes between free lists) results in more
second-level lists than the 64-bit flbitmap in the xv_pool structure can
cover.  The shift must increase to 4 (16 bytes between second-level list
entries) so that the number of lists fits the first-level bitmap.

Here are a few statistics for structure sizes on 32- and 64-bit CPUs
with 4KB and 64KB page sizes.

bits_per_long              32        64        64
page_size               4,096     4,096    65,536
xv_align                    4         8         8
fl_delta                    3         3         4
num_free_lists            508       508     4,094
xv_pool size            4,144b    8,216b   66,040b
per object overhead        32        64        64
zram struct 0.5GB disk    512KB    1024KB      64KB

This patch maintains the current tunings for 4K pages, adds an optimal
sizing for 64K pages and adds a safe tuning for any other page sizes.
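
A rough user-space sketch of the arithmetic behind the table (it mirrors the
NUM_FREE_LISTS/XV_MAX_ALLOC_SIZE definitions rather than reproducing the
kernel header, and assumes BITS_PER_LONG is 64, as in the 64K columns above):

#include <stdio.h>

#define BITS_PER_LONG   64

static void show(unsigned long page_size, unsigned long xv_align,
                 unsigned int fl_delta_shift)
{
        unsigned long xv_max_alloc = page_size - xv_align;
        unsigned long fl_delta = 1UL << fl_delta_shift;
        /* NUM_FREE_LISTS as defined in xvmalloc_int.h */
        unsigned long num_free_lists = (xv_max_alloc - 32) / fl_delta + 1;
        /* one flbitmap bit covers BITS_PER_LONG second-level lists */
        unsigned long fl_bits = (num_free_lists + BITS_PER_LONG - 1) /
                                BITS_PER_LONG;

        printf("page %6lu, shift %u: %4lu lists, %3lu first-level bits (%s)\n",
               page_size, fl_delta_shift, num_free_lists, fl_bits,
               fl_bits <= BITS_PER_LONG ? "fits flbitmap" : "overflows flbitmap");
}

int main(void)
{
        show(4096, 8, 3);       /* 508 lists, 8 bits */
        show(65536, 8, 3);      /* 8188 lists, 128 bits: broken today */
        show(65536, 8, 4);      /* 4094 lists, 64 bits: this patch */
        return 0;
}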

Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
---
 drivers/staging/zram/xvmalloc_int.h |   18 ++++++++++++++++--
 1 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/zram/xvmalloc_int.h b/drivers/staging/zram/xvmalloc_int.h
index e23ed5c..051a49b 100644
--- a/drivers/staging/zram/xvmalloc_int.h
+++ b/drivers/staging/zram/xvmalloc_int.h
@@ -19,7 +19,11 @@
 /* User configurable params */
 
 /* Must be power of two */
+#ifdef CONFIG_64BIT
+#define XV_ALIGN_SHIFT 3
+#else
 #define XV_ALIGN_SHIFT	2
+#endif
 #define XV_ALIGN	(1 << XV_ALIGN_SHIFT)
 #define XV_ALIGN_MASK	(XV_ALIGN - 1)
 
@@ -27,8 +31,18 @@
 #define XV_MIN_ALLOC_SIZE	32
 #define XV_MAX_ALLOC_SIZE	(PAGE_SIZE - XV_ALIGN)
 
-/* Free lists are separated by FL_DELTA bytes */
-#define FL_DELTA_SHIFT	3
+/*
+ * Free lists are separated by FL_DELTA bytes
+ * This value is 3 for 4k pages and 4 for 64k pages, for any
+ * other page size, a conservative (PAGE_SHIFT - 9) is used.
+ */
+#if PAGE_SHIFT == 12
+#define FL_DELTA_SHIFT 3
+#elif PAGE_SHIFT == 16
+#define FL_DELTA_SHIFT 4
+#else
+#define FL_DELTA_SHIFT (PAGE_SHIFT - 9)
+#endif
 #define FL_DELTA	(1 << FL_DELTA_SHIFT)
 #define FL_DELTA_MASK	(FL_DELTA - 1)
 #define NUM_FREE_LISTS	((XV_MAX_ALLOC_SIZE - XV_MIN_ALLOC_SIZE) \
-- 
1.6.0.2



* [PATCH 2/7] zram: Prevent overflow in logical block size
  2011-01-26 17:21 [PATCH 0/7] zram/xvmalloc: 64K page fixes and optimizations Robert Jennings
  2011-01-26 17:22 ` [PATCH 1/7] zram/vmalloc: Correct tunings to enable use with 64K pages Robert Jennings
@ 2011-01-26 17:23 ` Robert Jennings
  2011-01-26 17:42   ` Pekka Enberg
  2011-01-26 17:23 ` [PATCH 3/7] zram: Speed insertion of new pages with cached idx Robert Jennings
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 26+ messages in thread
From: Robert Jennings @ 2011-01-26 17:23 UTC (permalink / raw)
  To: Nitin Gupta; +Cc: Greg Kroah-Hartman, Pekka Enberg, devel, linux-kernel

On a 64K page kernel, the value PAGE_SIZE passed to
blk_queue_logical_block_size would overflow the logical block size
argument (resulting in setting it to 0).

Take the minimum of PAGE_SIZE or 4096 and use this for the block device
logical block size.
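
A minimal user-space sketch of the overflow (not the driver code): the
logical block size argument is an unsigned short here, so a 64K PAGE_SIZE
wraps to zero on the way in, which is what the change below avoids.

#include <stdio.h>

int main(void)
{
        unsigned int page_size = 65536;                 /* 64K page kernel */
        unsigned short logical_block_size = page_size;  /* truncates */

        printf("passing %u ends up as %u\n",
               page_size, (unsigned int)logical_block_size);
        return 0;
}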

Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
---
 drivers/staging/zram/zram_drv.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
index d0e9e02..d5e0275 100644
--- a/drivers/staging/zram/zram_drv.c
+++ b/drivers/staging/zram/zram_drv.c
@@ -621,7 +621,8 @@ static int create_device(struct zram *zram, int device_id)
 	 * and n*PAGE_SIZED sized I/O requests.
 	 */
 	blk_queue_physical_block_size(zram->disk->queue, PAGE_SIZE);
-	blk_queue_logical_block_size(zram->disk->queue, PAGE_SIZE);
+	blk_queue_logical_block_size(zram->disk->queue,
+			(unsigned short) min_t(unsigned int, PAGE_SIZE, 4096));
 	blk_queue_io_min(zram->disk->queue, PAGE_SIZE);
 	blk_queue_io_opt(zram->disk->queue, PAGE_SIZE);
 
-- 
1.6.0.2



* [PATCH 3/7] zram: Speed insertion of new pages with cached idx
  2011-01-26 17:21 [PATCH 0/7] zram/xvmalloc: 64K page fixes and optimizations Robert Jennings
  2011-01-26 17:22 ` [PATCH 1/7] zram/vmalloc: Correct tunings to enable use with 64K pages Robert Jennings
  2011-01-26 17:23 ` [PATCH 2/7] zram: Prevent overflow in logical block size Robert Jennings
@ 2011-01-26 17:23 ` Robert Jennings
  2011-01-26 17:53   ` Pekka Enberg
  2011-01-26 17:25 ` [PATCH 4/7] zram/xvmalloc: free bit block insertion optimization Robert Jennings
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 26+ messages in thread
From: Robert Jennings @ 2011-01-26 17:23 UTC (permalink / raw)
  To: Nitin Gupta; +Cc: Greg Kroah-Hartman, Pekka Enberg, devel, linux-kernel

Calculate the first- and second-level indices for a new page when the pool
is initialized, rather than calculating them on each insertion.

Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
---
 drivers/staging/zram/xvmalloc.c     |   13 +++++++++++--
 drivers/staging/zram/xvmalloc_int.h |    4 ++++
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/zram/xvmalloc.c b/drivers/staging/zram/xvmalloc.c
index 3fdbb8a..a507f95 100644
--- a/drivers/staging/zram/xvmalloc.c
+++ b/drivers/staging/zram/xvmalloc.c
@@ -184,8 +184,13 @@ static void insert_block(struct xv_pool *pool, struct page *page, u32 offset,
 	u32 flindex, slindex;
 	struct block_header *nextblock;
 
-	slindex = get_index_for_insert(block->size);
-	flindex = slindex / BITS_PER_LONG;
+	if (block->size >= (PAGE_SIZE - XV_ALIGN)) {
+		slindex = pagesize_slindex;
+		flindex = pagesize_flindex;
+	} else {
+		slindex = get_index_for_insert(block->size);
+		flindex = slindex / BITS_PER_LONG;
+	}
 
 	block->link.prev_page = 0;
 	block->link.prev_offset = 0;
@@ -316,6 +321,10 @@ struct xv_pool *xv_create_pool(void)
 	if (!pool)
 		return NULL;
 
+	/* cache the first/second-level indices for PAGE_SIZE allocations */
+	pagesize_slindex = get_index_for_insert(PAGE_SIZE);
+	pagesize_flindex = pagesize_slindex / BITS_PER_LONG;
+
 	spin_lock_init(&pool->lock);
 
 	return pool;
diff --git a/drivers/staging/zram/xvmalloc_int.h b/drivers/staging/zram/xvmalloc_int.h
index 051a49b..68db384 100644
--- a/drivers/staging/zram/xvmalloc_int.h
+++ b/drivers/staging/zram/xvmalloc_int.h
@@ -52,6 +52,10 @@
 
 /* End of user params */
 
+/* Cache of the first/second-level indices for PAGE_SIZE allocations */
+static u32 pagesize_slindex;
+static u32 pagesize_flindex;
+
 enum blockflags {
 	BLOCK_FREE,
 	PREV_FREE,
-- 
1.6.0.2



* [PATCH 4/7] zram/xvmalloc: free bit block insertion optimization
  2011-01-26 17:21 [PATCH 0/7] zram/xvmalloc: 64K page fixes and optimizations Robert Jennings
                   ` (2 preceding siblings ...)
  2011-01-26 17:23 ` [PATCH 3/7] zram: Speed insertion of new pages with cached idx Robert Jennings
@ 2011-01-26 17:25 ` Robert Jennings
  2011-01-26 17:55   ` Pekka Enberg
  2011-01-26 18:44   ` Nitin Gupta
  2011-01-26 17:26 ` [PATCH 5/7] zram/xvmalloc: wrap debug code in #ifdef DEBUG Robert Jennings
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 26+ messages in thread
From: Robert Jennings @ 2011-01-26 17:25 UTC (permalink / raw)
  To: Nitin Gupta
  Cc: Greg Kroah-Hartman, Robert Jennings, Pekka Enberg, devel, linux-kernel

There is no need to set the bits in the first- and second-level indices
to indicate a free page when we know that a free page existed at this
level.

Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
---
 drivers/staging/zram/xvmalloc.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/staging/zram/xvmalloc.c b/drivers/staging/zram/xvmalloc.c
index a507f95..b3622f1 100644
--- a/drivers/staging/zram/xvmalloc.c
+++ b/drivers/staging/zram/xvmalloc.c
@@ -205,6 +205,8 @@ static void insert_block(struct xv_pool *pool, struct page *page, u32 offset,
 		nextblock->link.prev_page = page;
 		nextblock->link.prev_offset = offset;
 		put_ptr_atomic(nextblock, KM_USER1);
+		/* If there was a next page then the free bits are set. */
+		return;
 	}
 
 	__set_bit(slindex % BITS_PER_LONG, &pool->slbitmap[flindex]);
-- 
1.6.0.2



* [PATCH 5/7] zram/xvmalloc: wrap debug code in #ifdef DEBUG
  2011-01-26 17:21 [PATCH 0/7] zram/xvmalloc: 64K page fixes and optimizations Robert Jennings
                   ` (3 preceding siblings ...)
  2011-01-26 17:25 ` [PATCH 4/7] zram/xvmalloc: free bit block insertion optimization Robert Jennings
@ 2011-01-26 17:26 ` Robert Jennings
  2011-01-26 17:44   ` Pekka Enberg
  2011-01-26 17:27 ` [PATCH 6/7] zram/xvmalloc: Close 32byte hole on 64-bit CPUs Robert Jennings
  2011-01-26 17:27 ` [PATCH 7/7] zram: mark the disk as non-rotating media Robert Jennings
  6 siblings, 1 reply; 26+ messages in thread
From: Robert Jennings @ 2011-01-26 17:26 UTC (permalink / raw)
  To: Nitin Gupta
  Cc: Greg Kroah-Hartman, Robert Jennings, Pekka Enberg, devel, linux-kernel

This wraps the code, noted by comments as being debug code, with
#ifdef DEBUG so that it is compiled out of non-debug kernels.
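
For reference, the mechanism being relied on, as a tiny standalone example
(plain C preprocessor behaviour, nothing zram-specific): the guarded block
only exists in builds where DEBUG is defined.

#include <stdio.h>

int main(void)
{
        printf("always built\n");
#ifdef DEBUG
        /* compiled out entirely unless built with -DDEBUG */
        printf("debug-only bookkeeping built in\n");
#endif
        return 0;
}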

Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
---
 drivers/staging/zram/xvmalloc.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/staging/zram/xvmalloc.c b/drivers/staging/zram/xvmalloc.c
index b3622f1..172514e 100644
--- a/drivers/staging/zram/xvmalloc.c
+++ b/drivers/staging/zram/xvmalloc.c
@@ -219,7 +219,6 @@ static void insert_block(struct xv_pool *pool, struct page *page, u32 offset,
 static void remove_block_head(struct xv_pool *pool,
 			struct block_header *block, u32 slindex)
 {
-	struct block_header *tmpblock;
 	u32 flindex = slindex / BITS_PER_LONG;
 
 	pool->freelist[slindex].page = block->link.next_page;
@@ -232,6 +231,8 @@ static void remove_block_head(struct xv_pool *pool,
 		if (!pool->slbitmap[flindex])
 			__clear_bit(flindex, &pool->flbitmap);
 	} else {
+#ifdef DEBUG
+		struct block_header *tmpblock;
 		/*
 		 * DEBUG ONLY: We need not reinitialize freelist head previous
 		 * pointer to 0 - we never depend on its value. But just for
@@ -242,6 +243,7 @@ static void remove_block_head(struct xv_pool *pool,
 		tmpblock->link.prev_page = 0;
 		tmpblock->link.prev_offset = 0;
 		put_ptr_atomic(tmpblock, KM_USER1);
+#endif
 	}
 }
 
-- 
1.6.0.2



* [PATCH 6/7] zram/xvmalloc: Close 32byte hole on 64-bit CPUs
  2011-01-26 17:21 [PATCH 0/7] zram/xvmalloc: 64K page fixes and optimizations Robert Jennings
                   ` (4 preceding siblings ...)
  2011-01-26 17:26 ` [PATCH 5/7] zram/xvmalloc: wrap debug code in #ifdef DEBUG Robert Jennings
@ 2011-01-26 17:27 ` Robert Jennings
  2011-01-26 17:43   ` Pekka Enberg
  2011-01-26 17:27 ` [PATCH 7/7] zram: mark the disk as non-rotating media Robert Jennings
  6 siblings, 1 reply; 26+ messages in thread
From: Robert Jennings @ 2011-01-26 17:27 UTC (permalink / raw)
  To: Nitin Gupta
  Cc: Greg Kroah-Hartman, Robert Jennings, Pekka Enberg, devel, linux-kernel

By swapping the total_pages statistic with the lock we close a
hole in the structure for 64-bit CPUs.

Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
---
 drivers/staging/zram/xvmalloc_int.h |    7 ++-----
 1 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/staging/zram/xvmalloc_int.h b/drivers/staging/zram/xvmalloc_int.h
index 68db384..e51eea4 100644
--- a/drivers/staging/zram/xvmalloc_int.h
+++ b/drivers/staging/zram/xvmalloc_int.h
@@ -93,12 +93,9 @@ struct block_header {
 struct xv_pool {
 	ulong flbitmap;
 	ulong slbitmap[MAX_FLI];
-	spinlock_t lock;
-
+	u64 total_pages;	/* stats */
 	struct freelist_entry freelist[NUM_FREE_LISTS];
-
-	/* stats */
-	u64 total_pages;
+	spinlock_t lock;
 };
 
 #endif
-- 
1.6.0.2



* [PATCH 7/7] zram: mark the disk as non-rotating media
  2011-01-26 17:21 [PATCH 0/7] zram/xvmalloc: 64K page fixes and optimizations Robert Jennings
                   ` (5 preceding siblings ...)
  2011-01-26 17:27 ` [PATCH 6/7] zram/xvmalloc: Close 32byte hole on 64-bit CPUs Robert Jennings
@ 2011-01-26 17:27 ` Robert Jennings
  2011-01-26 17:44   ` Pekka Enberg
                     ` (2 more replies)
  6 siblings, 3 replies; 26+ messages in thread
From: Robert Jennings @ 2011-01-26 17:27 UTC (permalink / raw)
  To: Nitin Gupta
  Cc: Greg Kroah-Hartman, Robert Jennings, Pekka Enberg, devel, linux-kernel

Adding QUEUE_FLAG_NONROT for the in the request_queue flags.  When used
as a swap device, scan_swap_map will avoid the extra work of optimizing
for rotating media.

Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
---
 drivers/staging/zram/zram_drv.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
index d5e0275..450c618 100644
--- a/drivers/staging/zram/zram_drv.c
+++ b/drivers/staging/zram/zram_drv.c
@@ -593,6 +593,7 @@ static int create_device(struct zram *zram, int device_id)
 		goto out;
 	}
 
+	zram->queue->queue_flags ^= QUEUE_FLAG_NONROT;
 	blk_queue_make_request(zram->queue, zram_make_request);
 	zram->queue->queuedata = zram;
 
-- 
1.6.0.2



* Re: [PATCH 2/7] zram: Prevent overflow in logical block size
  2011-01-26 17:23 ` [PATCH 2/7] zram: Prevent overflow in logical block size Robert Jennings
@ 2011-01-26 17:42   ` Pekka Enberg
  2011-01-26 19:09     ` Robert Jennings
  0 siblings, 1 reply; 26+ messages in thread
From: Pekka Enberg @ 2011-01-26 17:42 UTC (permalink / raw)
  To: Nitin Gupta, Greg Kroah-Hartman, Pekka Enberg, devel, linux-kernel

On Wed, Jan 26, 2011 at 7:23 PM, Robert Jennings <rcj@linux.vnet.ibm.com> wrote:
> On a 64K page kernel, the value PAGE_SIZE passed to
> blk_queue_logical_block_size would overflow the logical block size
> argument (resulting in setting it to 0).
>
> Take the minimum of PAGE_SIZE or 4096 and use this for the block device
> logical block size.
>
> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
> ---
>  drivers/staging/zram/zram_drv.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
> index d0e9e02..d5e0275 100644
> --- a/drivers/staging/zram/zram_drv.c
> +++ b/drivers/staging/zram/zram_drv.c
> @@ -621,7 +621,8 @@ static int create_device(struct zram *zram, int device_id)
>         * and n*PAGE_SIZED sized I/O requests.
>         */
>        blk_queue_physical_block_size(zram->disk->queue, PAGE_SIZE);
> -       blk_queue_logical_block_size(zram->disk->queue, PAGE_SIZE);
> +       blk_queue_logical_block_size(zram->disk->queue,
> +                       (unsigned short) min_t(unsigned int, PAGE_SIZE, 4096));

I don't get it. No architecture supports PAGE_SIZE less than 4K so
that expression always ends up being 4096, no?

>        blk_queue_io_min(zram->disk->queue, PAGE_SIZE);
>        blk_queue_io_opt(zram->disk->queue, PAGE_SIZE);
>
> --
> 1.6.0.2
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>


* Re: [PATCH 6/7] zram/xvmalloc: Close 32byte hole on 64-bit CPUs
  2011-01-26 17:27 ` [PATCH 6/7] zram/xvmalloc: Close 32byte hole on 64-bit CPUs Robert Jennings
@ 2011-01-26 17:43   ` Pekka Enberg
  0 siblings, 0 replies; 26+ messages in thread
From: Pekka Enberg @ 2011-01-26 17:43 UTC (permalink / raw)
  To: Nitin Gupta, Greg Kroah-Hartman, Pekka Enberg, devel, linux-kernel
  Cc: Robert Jennings

On Wed, Jan 26, 2011 at 7:27 PM, Robert Jennings <rcj@linux.vnet.ibm.com> wrote:
> By swapping the total_pages statistic with the lock we close a
> hole in the structure for 64-bit CPUs.
>
> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>

Reviewed-by: Pekka Enberg <penberg@kernel.org>


* Re: [PATCH 5/7] zram/xvmalloc: wrap debug code in #ifdef DEBUG
  2011-01-26 17:26 ` [PATCH 5/7] zram/xvmalloc: wrap debug code in #ifdef DEBUG Robert Jennings
@ 2011-01-26 17:44   ` Pekka Enberg
  2011-01-26 18:27     ` Robert Jennings
  0 siblings, 1 reply; 26+ messages in thread
From: Pekka Enberg @ 2011-01-26 17:44 UTC (permalink / raw)
  To: Nitin Gupta, Greg Kroah-Hartman, Pekka Enberg, devel, linux-kernel
  Cc: Robert Jennings

On Wed, Jan 26, 2011 at 7:26 PM, Robert Jennings <rcj@linux.vnet.ibm.com> wrote:
> This wraps the code, noted by comments as being debug code, with
> #ifdef DEBUG so that it is compiled out of non-debug kernels.
>
> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
> ---
>  drivers/staging/zram/xvmalloc.c |    4 +++-
>  1 files changed, 3 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/staging/zram/xvmalloc.c b/drivers/staging/zram/xvmalloc.c
> index b3622f1..172514e 100644
> --- a/drivers/staging/zram/xvmalloc.c
> +++ b/drivers/staging/zram/xvmalloc.c
> @@ -219,7 +219,6 @@ static void insert_block(struct xv_pool *pool, struct page *page, u32 offset,
>  static void remove_block_head(struct xv_pool *pool,
>                        struct block_header *block, u32 slindex)
>  {
> -       struct block_header *tmpblock;
>        u32 flindex = slindex / BITS_PER_LONG;
>
>        pool->freelist[slindex].page = block->link.next_page;
> @@ -232,6 +231,8 @@ static void remove_block_head(struct xv_pool *pool,
>                if (!pool->slbitmap[flindex])
>                        __clear_bit(flindex, &pool->flbitmap);
>        } else {
> +#ifdef DEBUG

Shouldn't that be something people can enable from Kconfig?

> +               struct block_header *tmpblock;
>                /*
>                 * DEBUG ONLY: We need not reinitialize freelist head previous
>                 * pointer to 0 - we never depend on its value. But just for
> @@ -242,6 +243,7 @@ static void remove_block_head(struct xv_pool *pool,
>                tmpblock->link.prev_page = 0;
>                tmpblock->link.prev_offset = 0;
>                put_ptr_atomic(tmpblock, KM_USER1);
> +#endif
>        }
>  }
>
> --
> 1.6.0.2
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>


* Re: [PATCH 7/7] zram: mark the disk as non-rotating media
  2011-01-26 17:27 ` [PATCH 7/7] zram: mark the disk as non-rotating media Robert Jennings
@ 2011-01-26 17:44   ` Pekka Enberg
  2011-01-26 17:48   ` Dan Carpenter
  2011-01-26 17:52   ` Robert Jennings
  2 siblings, 0 replies; 26+ messages in thread
From: Pekka Enberg @ 2011-01-26 17:44 UTC (permalink / raw)
  To: Nitin Gupta, Greg Kroah-Hartman, Pekka Enberg, devel, linux-kernel
  Cc: Robert Jennings

On Wed, Jan 26, 2011 at 7:27 PM, Robert Jennings <rcj@linux.vnet.ibm.com> wrote:
> Adding QUEUE_FLAG_NONROT for the in the request_queue flags.  When used
> as a swap device, scan_swap_map will avoid the extra work of optimizing
> for rotating media.
>
> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>

Reviewed-by: Pekka Enberg <penberg@kernel.org>


* Re: [PATCH 7/7] zram: mark the disk as non-rotating media
  2011-01-26 17:27 ` [PATCH 7/7] zram: mark the disk as non-rotating media Robert Jennings
  2011-01-26 17:44   ` Pekka Enberg
@ 2011-01-26 17:48   ` Dan Carpenter
  2011-01-26 17:52   ` Robert Jennings
  2 siblings, 0 replies; 26+ messages in thread
From: Dan Carpenter @ 2011-01-26 17:48 UTC (permalink / raw)
  To: Nitin Gupta, Greg Kroah-Hartman, Pekka Enberg, devel, linux-kernel


On Wed, Jan 26, 2011 at 11:27:44AM -0600, Robert Jennings wrote:
> Adding QUEUE_FLAG_NONROT for the in the request_queue flags.  When used
                           ^^^^^^^^^^^^^^
"to the" perhaps?

> as a swap device, scan_swap_map will avoid the extra work of optimizing
> for rotating media.
> 
> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
> ---
>  drivers/staging/zram/zram_drv.c |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
> index d5e0275..450c618 100644
> --- a/drivers/staging/zram/zram_drv.c
> +++ b/drivers/staging/zram/zram_drv.c
> @@ -593,6 +593,7 @@ static int create_device(struct zram *zram, int device_id)
>  		goto out;
>  	}
>  
> +	zram->queue->queue_flags ^= QUEUE_FLAG_NONROT;

QUEUE_FLAG_NONROT is 14.  We want to set the 14 bit.  This won't work.

Use queue_flag_set_unlocked(QUEUE_FLAG_NONROT, zram->queue);

regards,
dan carpenter
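
As a standalone illustration of the point above (user-space sketch, not the
driver; the flag value 14 is the one quoted in the review): XOR-ing the bit
number into the flags word toggles bits 1, 2 and 3 instead of setting bit 14,
which is why the helper suggested above (or an explicit shift) is needed.

#include <stdio.h>

#define QUEUE_FLAG_NONROT       14      /* a bit number, not a mask */

int main(void)
{
        unsigned long queue_flags = 0;

        queue_flags ^= QUEUE_FLAG_NONROT;        /* buggy: 14 is 0b1110 */
        printf("xor with the bit number: %#lx\n", queue_flags);

        queue_flags = 0;
        queue_flags |= 1UL << QUEUE_FLAG_NONROT; /* what was intended */
        printf("actually setting bit 14: %#lx\n", queue_flags);
        return 0;
}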




* Re: [PATCH 1/7] zram/vmalloc: Correct tunings to enable use with 64K pages
  2011-01-26 17:22 ` [PATCH 1/7] zram/vmalloc: Correct tunings to enable use with 64K pages Robert Jennings
@ 2011-01-26 17:48   ` Pekka Enberg
  2011-01-26 18:50     ` Robert Jennings
  0 siblings, 1 reply; 26+ messages in thread
From: Pekka Enberg @ 2011-01-26 17:48 UTC (permalink / raw)
  To: Nitin Gupta, Greg Kroah-Hartman, Pekka Enberg, devel, linux-kernel

On Wed, Jan 26, 2011 at 7:22 PM, Robert Jennings <rcj@linux.vnet.ibm.com> wrote:
> xvmalloc will not currently function with 64K pages.  Newly allocated
> pages will be inserted at an offset beyond the end of the first-level
> index.  This tuning is needed to properly size the allocator for 64K
> pages.
>
> The default shift of 3 (8 bytes between free lists) results in more
> second-level lists than the 64-bit flbitmap in the xv_pool structure can
> cover.  The shift must increase to 4 (16 bytes between second-level list
> entries) so that the number of lists fits the first-level bitmap.
>
> Here are a few statistics for structure sizes on 32- and 64-bit CPUs
> with 4KB and 64KB page sizes.
>
> bits_per_long              32        64        64
> page_size               4,096     4,096    65,536
> xv_align                    4         8         8
> fl_delta                    3         3         4
> num_free_lists            508       508     4,094
> xv_pool size            4,144b    8,216b   66,040b
> per object overhead        32        64        64
> zram struct 0.5GB disk    512KB    1024KB      64KB
>
> This patch maintains the current tunings for 4K pages, adds an optimal
> sizing for 64K pages and adds a safe tuning for any other page sizes.
>
> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
> ---
>  drivers/staging/zram/xvmalloc_int.h |   18 ++++++++++++++++--
>  1 files changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/staging/zram/xvmalloc_int.h b/drivers/staging/zram/xvmalloc_int.h
> index e23ed5c..051a49b 100644
> --- a/drivers/staging/zram/xvmalloc_int.h
> +++ b/drivers/staging/zram/xvmalloc_int.h
> @@ -19,7 +19,11 @@
>  /* User configurable params */
>
>  /* Must be power of two */
> +#ifdef CONFIG_64BIT
> +#define XV_ALIGN_SHIFT 3
> +#else
>  #define XV_ALIGN_SHIFT 2
> +#endif
>  #define XV_ALIGN       (1 << XV_ALIGN_SHIFT)
>  #define XV_ALIGN_MASK  (XV_ALIGN - 1)
>
> @@ -27,8 +31,18 @@
>  #define XV_MIN_ALLOC_SIZE      32
>  #define XV_MAX_ALLOC_SIZE      (PAGE_SIZE - XV_ALIGN)
>
> -/* Free lists are separated by FL_DELTA bytes */
> -#define FL_DELTA_SHIFT 3
> +/*
> + * Free lists are separated by FL_DELTA bytes
> + * This value is 3 for 4k pages and 4 for 64k pages, for any
> + * other page size, a conservative (PAGE_SHIFT - 9) is used.
> + */
> +#if PAGE_SHIFT == 12
> +#define FL_DELTA_SHIFT 3

This is handled by the else branch already, no?

> +#elif PAGE_SHIFT == 16
> +#define FL_DELTA_SHIFT 4
> +#else
> +#define FL_DELTA_SHIFT (PAGE_SHIFT - 9)
> +#endif


* Re: [PATCH 7/7] zram: mark the disk as non-rotating media
  2011-01-26 17:27 ` [PATCH 7/7] zram: mark the disk as non-rotating media Robert Jennings
  2011-01-26 17:44   ` Pekka Enberg
  2011-01-26 17:48   ` Dan Carpenter
@ 2011-01-26 17:52   ` Robert Jennings
  2011-01-26 18:26     ` Robert Jennings
  2 siblings, 1 reply; 26+ messages in thread
From: Robert Jennings @ 2011-01-26 17:52 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: Nitin Gupta, Greg Kroah-Hartman, Robert Jennings, Pekka Enberg,
	devel, linux-kernel

* Dan Carpenter (error27@gmail.com) wrote:
> On Wed, Jan 26, 2011 at 11:27:44AM -0600, Robert Jennings wrote:
> > Adding QUEUE_FLAG_NONROT for the in the request_queue flags.  When used
>                            ^^^^^^^^^^^^^^
> "to the" perhaps?

Yeah, that was edited a few times and never re-read.  Thanks.

> > as a swap device, scan_swap_map will avoid the extra work of optimizing
> > for rotating media.
> > 
> > Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
> > ---
> >  drivers/staging/zram/zram_drv.c |    1 +
> >  1 files changed, 1 insertions(+), 0 deletions(-)
> > 
> > diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
> > index d5e0275..450c618 100644
> > --- a/drivers/staging/zram/zram_drv.c
> > +++ b/drivers/staging/zram/zram_drv.c
> > @@ -593,6 +593,7 @@ static int create_device(struct zram *zram, int device_id)
> >  		goto out;
> >  	}
> >  
> > +	zram->queue->queue_flags ^= QUEUE_FLAG_NONROT;
> 
> QUEUE_FLAG_NONROT is 14.  We want to set the 14 bit.  This won't work.
> 
> Use queue_flag_set_unlocked(QUEUE_FLAG_NONROT, zram->queue);

I'll correct this.  Thank you.

-Rob Jennings


* Re: [PATCH 3/7] zram: Speed insertion of new pages with cached idx
  2011-01-26 17:23 ` [PATCH 3/7] zram: Speed insertion of new pages with cached idx Robert Jennings
@ 2011-01-26 17:53   ` Pekka Enberg
  2011-01-26 19:50     ` Robert Jennings
  0 siblings, 1 reply; 26+ messages in thread
From: Pekka Enberg @ 2011-01-26 17:53 UTC (permalink / raw)
  To: Nitin Gupta, Greg Kroah-Hartman, Pekka Enberg, devel, linux-kernel

On Wed, Jan 26, 2011 at 7:23 PM, Robert Jennings <rcj@linux.vnet.ibm.com> wrote:
> Calculate the first- and second-level indices for a new page when the pool
> is initialized, rather than calculating them on each insertion.
>
> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
> ---
>  drivers/staging/zram/xvmalloc.c     |   13 +++++++++++--
>  drivers/staging/zram/xvmalloc_int.h |    4 ++++
>  2 files changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/staging/zram/xvmalloc.c b/drivers/staging/zram/xvmalloc.c
> index 3fdbb8a..a507f95 100644
> --- a/drivers/staging/zram/xvmalloc.c
> +++ b/drivers/staging/zram/xvmalloc.c
> @@ -184,8 +184,13 @@ static void insert_block(struct xv_pool *pool, struct page *page, u32 offset,
>        u32 flindex, slindex;
>        struct block_header *nextblock;
>
> -       slindex = get_index_for_insert(block->size);
> -       flindex = slindex / BITS_PER_LONG;
> +       if (block->size >= (PAGE_SIZE - XV_ALIGN)) {
> +               slindex = pagesize_slindex;
> +               flindex = pagesize_flindex;
> +       } else {
> +               slindex = get_index_for_insert(block->size);
> +               flindex = slindex / BITS_PER_LONG;
> +       }
>
>        block->link.prev_page = 0;
>        block->link.prev_offset = 0;
> @@ -316,6 +321,10 @@ struct xv_pool *xv_create_pool(void)
>        if (!pool)
>                return NULL;
>
> +       /* cache the first/second-level indices for PAGE_SIZE allocations */
> +       pagesize_slindex = get_index_for_insert(PAGE_SIZE);
> +       pagesize_flindex = pagesize_slindex / BITS_PER_LONG;

Why is this in xv_create_pool(). AFAICT, it can be called multiple
times if there's more than one zram device. Do we really need
variables for these? They look like something GCC constant propagation
should take care of if they would be defines or static inline
functions.

> +
>        spin_lock_init(&pool->lock);
>
>        return pool;
> diff --git a/drivers/staging/zram/xvmalloc_int.h b/drivers/staging/zram/xvmalloc_int.h
> index 051a49b..68db384 100644
> --- a/drivers/staging/zram/xvmalloc_int.h
> +++ b/drivers/staging/zram/xvmalloc_int.h
> @@ -52,6 +52,10 @@
>
>  /* End of user params */
>
> +/* Cache of the first/second-level indices for PAGE_SIZE allocations */
> +static u32 pagesize_slindex;
> +static u32 pagesize_flindex;

This doesn't look right. You shouldn't put variable declarations in
header files.

> +
>  enum blockflags {
>        BLOCK_FREE,
>        PREV_FREE,
> --
> 1.6.0.2
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>


* Re: [PATCH 4/7] zram/xvmalloc: free bit block insertion optimization
  2011-01-26 17:25 ` [PATCH 4/7] zram/xvmalloc: free bit block insertion optimization Robert Jennings
@ 2011-01-26 17:55   ` Pekka Enberg
  2011-01-26 18:40     ` Robert Jennings
  2011-01-26 18:44   ` Nitin Gupta
  1 sibling, 1 reply; 26+ messages in thread
From: Pekka Enberg @ 2011-01-26 17:55 UTC (permalink / raw)
  To: Nitin Gupta, Greg Kroah-Hartman, Pekka Enberg, devel, linux-kernel
  Cc: Robert Jennings

On Wed, Jan 26, 2011 at 7:25 PM, Robert Jennings <rcj@linux.vnet.ibm.com> wrote:
> There is no need to set the bits in the first- and second-level indices
> to indicate a free page when we know that a free page existed at this
> level.
>
> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>

Why is it not necessary? I don't know that part of the code well
enough to tell if this patch is safe or not.

> ---
>  drivers/staging/zram/xvmalloc.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/staging/zram/xvmalloc.c b/drivers/staging/zram/xvmalloc.c
> index a507f95..b3622f1 100644
> --- a/drivers/staging/zram/xvmalloc.c
> +++ b/drivers/staging/zram/xvmalloc.c
> @@ -205,6 +205,8 @@ static void insert_block(struct xv_pool *pool, struct page *page, u32 offset,
>                nextblock->link.prev_page = page;
>                nextblock->link.prev_offset = offset;
>                put_ptr_atomic(nextblock, KM_USER1);
> +               /* If there was a next page then the free bits are set. */
> +               return;
>        }
>
>        __set_bit(slindex % BITS_PER_LONG, &pool->slbitmap[flindex]);
> --
> 1.6.0.2
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>


* Re: [PATCH 7/7] zram: mark the disk as non-rotating media
  2011-01-26 17:52   ` Robert Jennings
@ 2011-01-26 18:26     ` Robert Jennings
  0 siblings, 0 replies; 26+ messages in thread
From: Robert Jennings @ 2011-01-26 18:26 UTC (permalink / raw)
  To: Dan Carpenter, Nitin Gupta, Greg Kroah-Hartman, Pekka Enberg,
	devel, linux-kernel

* Robert Jennings (rcj@linux.vnet.ibm.com) wrote:
> * Dan Carpenter (error27@gmail.com) wrote:
> > On Wed, Jan 26, 2011 at 11:27:44AM -0600, Robert Jennings wrote:
> > > Adding QUEUE_FLAG_NONROT for the in the request_queue flags.  When used
> >                            ^^^^^^^^^^^^^^
> > "to the" perhaps?
> 
> Yeah, that was edited a few times and never re-read.  Thanks.
> 
> > > as a swap device, scan_swap_map will avoid the extra work of optimizing
> > > for rotating media.
> > > 
> > > Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
> > > ---
> > >  drivers/staging/zram/zram_drv.c |    1 +
> > >  1 files changed, 1 insertions(+), 0 deletions(-)
> > > 
> > > diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
> > > index d5e0275..450c618 100644
> > > --- a/drivers/staging/zram/zram_drv.c
> > > +++ b/drivers/staging/zram/zram_drv.c
> > > @@ -593,6 +593,7 @@ static int create_device(struct zram *zram, int device_id)
> > >  		goto out;
> > >  	}
> > >  
> > > +	zram->queue->queue_flags ^= QUEUE_FLAG_NONROT;
> > 
> > QUEUE_FLAG_NONROT is 14.  We want to set the 14 bit.  This won't work.
> > 
> > Use queue_flag_set_unlocked(QUEUE_FLAG_NONROT, zram->queue);

I need to drop this patch entirely.  QUEUE_FLAG_NONROT is set in
zram_init_device and is unnecessary here.


* Re: [PATCH 5/7] zram/xvmalloc: wrap debug code in #ifdef DEBUG
  2011-01-26 17:44   ` Pekka Enberg
@ 2011-01-26 18:27     ` Robert Jennings
  0 siblings, 0 replies; 26+ messages in thread
From: Robert Jennings @ 2011-01-26 18:27 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Nitin Gupta, Greg Kroah-Hartman, Robert Jennings, Pekka Enberg,
	devel, linux-kernel

* Pekka Enberg (penberg@kernel.org) wrote:
> On Wed, Jan 26, 2011 at 7:26 PM, Robert Jennings <rcj@linux.vnet.ibm.com> wrote:
> > This wraps the code, noted by comments as being debug code, with
> > #ifdef DEBUG so that it is compiled out of non-debug kernels.
> >
> > Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
> > ---
> >  drivers/staging/zram/xvmalloc.c |    4 +++-
> >  1 files changed, 3 insertions(+), 1 deletions(-)
> >
> > diff --git a/drivers/staging/zram/xvmalloc.c b/drivers/staging/zram/xvmalloc.c
> > index b3622f1..172514e 100644
> > --- a/drivers/staging/zram/xvmalloc.c
> > +++ b/drivers/staging/zram/xvmalloc.c
> > @@ -219,7 +219,6 @@ static void insert_block(struct xv_pool *pool, struct page *page, u32 offset,
> >  static void remove_block_head(struct xv_pool *pool,
> >                        struct block_header *block, u32 slindex)
> >  {
> > -       struct block_header *tmpblock;
> >        u32 flindex = slindex / BITS_PER_LONG;
> >
> >        pool->freelist[slindex].page = block->link.next_page;
> > @@ -232,6 +231,8 @@ static void remove_block_head(struct xv_pool *pool,
> >                if (!pool->slbitmap[flindex])
> >                        __clear_bit(flindex, &pool->flbitmap);
> >        } else {
> > +#ifdef DEBUG
> 
> Shouldn't that be something people can enable from Kconfig?

It should be, I'll correct this and re-post.



* Re: [PATCH 4/7] zram/xvmalloc: free bit block insertion optimization
  2011-01-26 17:55   ` Pekka Enberg
@ 2011-01-26 18:40     ` Robert Jennings
  2011-01-26 18:51       ` Pekka Enberg
  0 siblings, 1 reply; 26+ messages in thread
From: Robert Jennings @ 2011-01-26 18:40 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Nitin Gupta, Greg Kroah-Hartman, Robert Jennings, Pekka Enberg,
	devel, linux-kernel

* Pekka Enberg (penberg@kernel.org) wrote:
> On Wed, Jan 26, 2011 at 7:25 PM, Robert Jennings <rcj@linux.vnet.ibm.com> wrote:
> > There is no need to set the bits in the first- and second-level indices
> > to indicate a free page when we know that a free page existed at this
> > level.
> >
> > Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
> 
> Why is it not necessary? I don't know that part of the code well
> enough to tell if this patch is safe or not.

This change is in a conditional block which is entered only when there is
an existing data block on the freelist where the insert has taken place.

The new block is pushed onto the freelist stack and this conditional block
is updating links in the prior stack head to point to the new stack head.
After this conditional block the first-/second-level indices are updated
to indicate that there is a free block at this location.

This patch adds an immediate return from the conditional block to avoid
setting bits again to indicate a free block on this freelist. They would
already be set because there was an existing free block on this freelist.
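
To make the invariant concrete, a toy model (plain user-space C, not the
xvmalloc code; a counter stands in for the real linked lists) of a two-level
bitmap over free lists, showing that a push onto an already populated list
never needs to touch the bitmaps:

#include <stdio.h>

#define BITS            64
#define NUM_LISTS       256

static unsigned long flbitmap;                    /* one bit per slbitmap word */
static unsigned long slbitmap[NUM_LISTS / BITS];  /* one bit per free list */
static int list_len[NUM_LISTS];

static void insert(int slindex)
{
        int flindex = slindex / BITS;

        list_len[slindex]++;            /* push onto the freelist head */
        if (list_len[slindex] > 1)
                return;                 /* list was non-empty: bits already set */
        slbitmap[flindex] |= 1UL << (slindex % BITS);
        flbitmap |= 1UL << flindex;
}

int main(void)
{
        insert(70);     /* empty list: sets slbitmap[1] bit 6 and flbitmap bit 1 */
        insert(70);     /* populated list: early return, bitmaps untouched */
        printf("flbitmap=%#lx slbitmap[1]=%#lx entries=%d\n",
               flbitmap, slbitmap[1], list_len[70]);
        return 0;
}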

> > ---
> >  drivers/staging/zram/xvmalloc.c |    2 ++
> >  1 files changed, 2 insertions(+), 0 deletions(-)
> >
> > diff --git a/drivers/staging/zram/xvmalloc.c b/drivers/staging/zram/xvmalloc.c
> > index a507f95..b3622f1 100644
> > --- a/drivers/staging/zram/xvmalloc.c
> > +++ b/drivers/staging/zram/xvmalloc.c
> > @@ -205,6 +205,8 @@ static void insert_block(struct xv_pool *pool, struct page *page, u32 offset,
> >                nextblock->link.prev_page = page;
> >                nextblock->link.prev_offset = offset;
> >                put_ptr_atomic(nextblock, KM_USER1);
> > +               /* If there was a next page then the free bits are set. */
> > +               return;
> >        }
> >
> >        __set_bit(slindex % BITS_PER_LONG, &pool->slbitmap[flindex]);


* Re: [PATCH 4/7] zram/xvmalloc: free bit block insertion optimization
  2011-01-26 17:25 ` [PATCH 4/7] zram/xvmalloc: free bit block insertion optimization Robert Jennings
  2011-01-26 17:55   ` Pekka Enberg
@ 2011-01-26 18:44   ` Nitin Gupta
  1 sibling, 0 replies; 26+ messages in thread
From: Nitin Gupta @ 2011-01-26 18:44 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Pekka Enberg, devel, linux-kernel

On 01/26/2011 12:25 PM, Robert Jennings wrote:
> There is no need to set the bits in the first- and second-level indices
> to indicate a free page when we know that a free page existed at this
> level.
>
> Signed-off-by: Robert Jennings<rcj@linux.vnet.ibm.com>
>

Reviewed-by: Nitin Gupta <ngupta@vflare.org>


Thanks for all the fixes and cleanups.
Nitin



* Re: [PATCH 1/7] zram/vmalloc: Correct tunings to enable use with 64K pages
  2011-01-26 17:48   ` Pekka Enberg
@ 2011-01-26 18:50     ` Robert Jennings
  0 siblings, 0 replies; 26+ messages in thread
From: Robert Jennings @ 2011-01-26 18:50 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Nitin Gupta, Greg Kroah-Hartman, Robert Jennings, Pekka Enberg,
	devel, linux-kernel

* Pekka Enberg <penberg@kernel.org> wrote:
> On Wed, Jan 26, 2011 at 7:22 PM, Robert Jennings <rcj@linux.vnet.ibm.com> wrote:
>> xvmalloc will not currently function with 64K pages.  Newly allocated
>> pages will be inserted at an offset beyond the end of the first-level
>> index.  This tuning is needed to properly size the allocator for 64K
>> pages.
>> 
>> The default shift of 3 (8 bytes between free lists) results in more
>> second-level lists than the 64-bit flbitmap in the xv_pool structure can
>> cover.  The shift must increase to 4 (16 bytes between second-level list
>> entries) so that the number of lists fits the first-level bitmap.
>> 
>> Here are a few statistics for structure sizes on 32- and 64-bit CPUs
>> with 4KB and 64KB page sizes.
>> 
>> bits_per_long              32        64        64
>> page_size               4,096     4,096    65,536
>> xv_align                    4         8         8
>> fl_delta                    3         3         4
>> num_free_lists            508       508     4,094
>> xv_pool size            4,144b    8,216b   66,040b
>> per object overhead        32        64        64
>> zram struct 0.5GB disk    512KB    1024KB      64KB
>> 
>> This patch maintains the current tunings for 4K pages, adds an optimal
>> sizing for 64K pages and adds a safe tuning for any other page sizes.
>> 
>> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
>> ---
>>  drivers/staging/zram/xvmalloc_int.h |   18 ++++++++++++++++--
>>  1 files changed, 16 insertions(+), 2 deletions(-)
>> 
>> diff --git a/drivers/staging/zram/xvmalloc_int.h b/drivers/staging/zram/xvmalloc_int.h
>> index e23ed5c..051a49b 100644
>> --- a/drivers/staging/zram/xvmalloc_int.h
>> +++ b/drivers/staging/zram/xvmalloc_int.h
<snip>
>> @@ -27,8 +31,18 @@
>>  #define XV_MIN_ALLOC_SIZE	32
>>  #define XV_MAX_ALLOC_SIZE	(PAGE_SIZE - XV_ALIGN)
>>  
>> -/* Free lists are separated by FL_DELTA bytes */
>> -#define FL_DELTA_SHIFT	3
>> +/*
>> + * Free lists are separated by FL_DELTA bytes
>> + * This value is 3 for 4k pages and 4 for 64k pages, for any
>> + * other page size, a conservative (PAGE_SHIFT - 9) is used.
>> + */
>> +#if PAGE_SHIFT == 12
>> +#define FL_DELTA_SHIFT 3
>
> This is handled by the else branch already, no?

Yes, it does not need to be there.  I will repost.


* Re: [PATCH 4/7] zram/xvmalloc: free bit block insertion optimization
  2011-01-26 18:40     ` Robert Jennings
@ 2011-01-26 18:51       ` Pekka Enberg
  0 siblings, 0 replies; 26+ messages in thread
From: Pekka Enberg @ 2011-01-26 18:51 UTC (permalink / raw)
  To: Pekka Enberg, Nitin Gupta, Greg Kroah-Hartman, Pekka Enberg,
	devel, linux-kernel
  Cc: Robert Jennings

On Wed, Jan 26, 2011 at 8:40 PM, Robert Jennings <rcj@linux.vnet.ibm.com> wrote:
> * Pekka Enberg (penberg@kernel.org) wrote:
>> On Wed, Jan 26, 2011 at 7:25 PM, Robert Jennings <rcj@linux.vnet.ibm.com> wrote:
>> > There is no need to set the bits in the first- and second-level indices
>> > to indicate a free page when we know that a free page existed at this
>> > level.
>> >
>> > Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
>>
>> Why is it not necessary? I don't know that part of the code well
>> enough to tell if this patch is safe or not.
>
> This change is in a conditional block which is entered only when there is
> an existing data block on the freelist where the insert has taken place.
>
> The new block is pushed onto the freelist stack and this conditional block
> is updating links in the prior stack head to point to the new stack head.
> After this conditional block the first-/second-level indices are updated
> to indicate that there is a free block at this location.
>
> This patch adds an immediate return from the conditional block to avoid
> setting bits again to indicate a free block on this freelist. They would
> already be set because there was an existing free block on this freelist.

Some of that information could be put in the changelog.

Reviewed-by: Pekka Enberg <penberg@kernel.org>


* Re: [PATCH 2/7] zram: Prevent overflow in logical block size
  2011-01-26 17:42   ` Pekka Enberg
@ 2011-01-26 19:09     ` Robert Jennings
  2011-01-26 19:14       ` Pekka Enberg
  0 siblings, 1 reply; 26+ messages in thread
From: Robert Jennings @ 2011-01-26 19:09 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Nitin Gupta, Greg Kroah-Hartman, Robert Jennings, devel, linux-kernel

Pekka Enberg <penberg@kernel.org> wrote:
> On Wed, Jan 26, 2011 at 7:23 PM, Robert Jennings <rcj@linux.vnet.ibm.com> wrote:
>> On a 64K page kernel, the value PAGE_SIZE passed to
>> blk_queue_logical_block_size would overflow the logical block size
>> argument (resulting in setting it to 0).
>> 
>> Take the minimum of PAGE_SIZE or 4096 and use this for the block device
>> logical block size.
>> 
>> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
>> ---
>>  drivers/staging/zram/zram_drv.c |    3 ++-
>>  1 files changed, 2 insertions(+), 1 deletions(-)
>> 
>> diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
>> index d0e9e02..d5e0275 100644
>> --- a/drivers/staging/zram/zram_drv.c
>> +++ b/drivers/staging/zram/zram_drv.c
>> @@ -621,7 +621,8 @@ static int create_device(struct zram *zram, int device_id)
>>  	 * and n*PAGE_SIZED sized I/O requests.
>>  	 */
>>  	blk_queue_physical_block_size(zram->disk->queue, PAGE_SIZE);
>> -	blk_queue_logical_block_size(zram->disk->queue, PAGE_SIZE);
>> +	blk_queue_logical_block_size(zram->disk->queue,
>> +			(unsigned short) min_t(unsigned int, PAGE_SIZE, 4096));
> 
> I don't get it. No architecture supports PAGE_SIZE less than 4K so
> that expression always ends up being 4096, no?

Yes, I will cut out the min_t and just hard-code this to 4096.


* Re: [PATCH 2/7] zram: Prevent overflow in logical block size
  2011-01-26 19:09     ` Robert Jennings
@ 2011-01-26 19:14       ` Pekka Enberg
  0 siblings, 0 replies; 26+ messages in thread
From: Pekka Enberg @ 2011-01-26 19:14 UTC (permalink / raw)
  To: Pekka Enberg, Nitin Gupta, Greg Kroah-Hartman, devel, linux-kernel
  Cc: Robert Jennings

On Wed, Jan 26, 2011 at 9:09 PM, Robert Jennings <rcj@linux.vnet.ibm.com> wrote:
> Pekka Enberg <penberg@kernel.org> wrote:
>> On Wed, Jan 26, 2011 at 7:23 PM, Robert Jennings <rcj@linux.vnet.ibm.com> wrote:
>>> On a 64K page kernel, the value PAGE_SIZE passed to
>>> blk_queue_logical_block_size would overflow the logical block size
>>> argument (resulting in setting it to 0).
>>>
>>> Take the minimum of PAGE_SIZE or 4096 and use this for the block device
>>> logical block size.
>>>
>>> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
>>> ---
>>>  drivers/staging/zram/zram_drv.c |    3 ++-
>>>  1 files changed, 2 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
>>> index d0e9e02..d5e0275 100644
>>> --- a/drivers/staging/zram/zram_drv.c
>>> +++ b/drivers/staging/zram/zram_drv.c
>>> @@ -621,7 +621,8 @@ static int create_device(struct zram *zram, int device_id)
>>>       * and n*PAGE_SIZED sized I/O requests.
>>>       */
>>>      blk_queue_physical_block_size(zram->disk->queue, PAGE_SIZE);
>>> -    blk_queue_logical_block_size(zram->disk->queue, PAGE_SIZE);
>>> +    blk_queue_logical_block_size(zram->disk->queue,
>>> +                    (unsigned short) min_t(unsigned int, PAGE_SIZE, 4096));
>>
>> I don't get it. No architecture supports PAGE_SIZE less than 4K so
>> that expression always ends up being 4096, no?
>
> Yes, I will cut out the min_t and just hard-code this to 4096.

Please make a ZRAM_BLOCK_SIZE constant or something for that.


* Re: [PATCH 3/7] zram: Speed insertion of new pages with cached idx
  2011-01-26 17:53   ` Pekka Enberg
@ 2011-01-26 19:50     ` Robert Jennings
  0 siblings, 0 replies; 26+ messages in thread
From: Robert Jennings @ 2011-01-26 19:50 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Nitin Gupta, Greg Kroah-Hartman, Robert Jennings, devel, linux-kernel

* Pekka Enberg (penberg@cs.helsinki.fi) wrote:
> On Wed, Jan 26, 2011 at 7:23 PM, Robert Jennings <rcj@linux.vnet.ibm.com> wrote:
>> Calculate the first- and second-level indices for a new page when the pool
>> is initialized, rather than calculating them on each insertion.
>> 
>> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
>> ---
>>  drivers/staging/zram/xvmalloc.c     |   13 +++++++++++--
>>  drivers/staging/zram/xvmalloc_int.h |    4 ++++
>>  2 files changed, 15 insertions(+), 2 deletions(-)
>> 
>> diff --git a/drivers/staging/zram/xvmalloc.c b/drivers/staging/zram/xvmalloc.c
>> index 3fdbb8a..a507f95 100644
>> --- a/drivers/staging/zram/xvmalloc.c
>> +++ b/drivers/staging/zram/xvmalloc.c
>> @@ -184,8 +184,13 @@ static void insert_block(struct xv_pool *pool, struct page *page, u32 offset,
>>  	u32 flindex, slindex;
>>  	struct block_header *nextblock;
>>  
>> -	slindex = get_index_for_insert(block->size);
>> -	flindex = slindex / BITS_PER_LONG;
>> +	if (block->size >= (PAGE_SIZE - XV_ALIGN)) {
>> +		slindex = pagesize_slindex;
>> +		flindex = pagesize_flindex;
>> +	} else {
>> +		slindex = get_index_for_insert(block->size);
>> +		flindex = slindex / BITS_PER_LONG;
>> +	}
>>  
>>  	block->link.prev_page = 0;
>>  	block->link.prev_offset = 0;
>> @@ -316,6 +321,10 @@ struct xv_pool *xv_create_pool(void)
>>  	if (!pool)
>>  		return NULL;
>>  
>> +	/* cache the first/second-level indices for PAGE_SIZE allocations */
>> +	pagesize_slindex = get_index_for_insert(PAGE_SIZE);
>> +	pagesize_flindex = pagesize_slindex / BITS_PER_LONG;
> 
> Why is this in xv_create_pool(). AFAICT, it can be called multiple
> times if there's more than one zram device. Do we really need
> variables for these? They look like something GCC constant propagation
> should take care of if they would be defines or static inline
> functions.

It should have been a define rather than in xv_create_pool, but as I read
more about GCC constant propagation and look at get_index_for_insert,
I believe that this patch is unnecessary.  For sizes near PAGE_SIZE
(>XV_MAX_ALLOC_SIZE) I believe GCC constant propagation should do
exactly what I thought I was trying to do.  I will drop this patch.
Thank you for your reviews.


Thread overview: 26+ messages
2011-01-26 17:21 [PATCH 0/7] zram/xvmalloc: 64K page fixes and optimizations Robert Jennings
2011-01-26 17:22 ` [PATCH 1/7] zram/vmalloc: Correct tunings to enable use with 64K pages Robert Jennings
2011-01-26 17:48   ` Pekka Enberg
2011-01-26 18:50     ` Robert Jennings
2011-01-26 17:23 ` [PATCH 2/7] zram: Prevent overflow in logical block size Robert Jennings
2011-01-26 17:42   ` Pekka Enberg
2011-01-26 19:09     ` Robert Jennings
2011-01-26 19:14       ` Pekka Enberg
2011-01-26 17:23 ` [PATCH 3/7] zram: Speed insertion of new pages with cached idx Robert Jennings
2011-01-26 17:53   ` Pekka Enberg
2011-01-26 19:50     ` Robert Jennings
2011-01-26 17:25 ` [PATCH 4/7] zram/xvmalloc: free bit block insertion optimization Robert Jennings
2011-01-26 17:55   ` Pekka Enberg
2011-01-26 18:40     ` Robert Jennings
2011-01-26 18:51       ` Pekka Enberg
2011-01-26 18:44   ` Nitin Gupta
2011-01-26 17:26 ` [PATCH 5/7] zram/xvmalloc: wrap debug code in #ifdef DEBUG Robert Jennings
2011-01-26 17:44   ` Pekka Enberg
2011-01-26 18:27     ` Robert Jennings
2011-01-26 17:27 ` [PATCH 6/7] zram/xvmalloc: Close 32byte hole on 64-bit CPUs Robert Jennings
2011-01-26 17:43   ` Pekka Enberg
2011-01-26 17:27 ` [PATCH 7/7] zram: mark the disk as non-rotating media Robert Jennings
2011-01-26 17:44   ` Pekka Enberg
2011-01-26 17:48   ` Dan Carpenter
2011-01-26 17:52   ` Robert Jennings
2011-01-26 18:26     ` Robert Jennings
