* [PATCH] zram: try to avoid worst-case scenario on same element pages
@ 2020-01-10  7:40 Taejoon Song
  2020-01-10 16:45 ` Minchan Kim
From: Taejoon Song @ 2020-01-10  7:40 UTC
  To: minchan, ngupta, sergey.senozhatsky.work, axboe
  Cc: linux-kernel, linux-block, yjay.kim, Taejoon Song

The worst-case scenario when detecting same-element pages is a page in
which almost all elements look the same at first glance and only the
last few elements differ.

Since equal elements tend to be grouped at the beginning of a page,
comparing the first element with the last element before looping
through all elements gives us a chance to quickly detect
non-same-element pages.

1. The test was done on an LG webOS TV (64-bit arch).
2. The swapped-out pages were dumped (~819200 pages).
3. The pages were analyzed offline with a simple test script that
   counts the iterations and measures the speed.

On a 64-bit arch with 4 KiB pages, the worst-case iteration count is
PAGE_SIZE / 8 bytes = 512. The speed is based only on the time spent in
the page_same_filled() function. The average results are listed below:

                                   Num of Iter    Speed(MB/s)
Looping-Forward (Orig)                 38            99265
Looping-Backward                       36           102725
Last-element-check (This Patch)        33           125072

The results show that the average iteration count decreases by 13% and
the speed increases by 25% with this patch. The patch does not increase
the overall time complexity, though.

I also ran a simpler version that uses a backward loop. Just looping
backward also brings some improvement, but less than this patch.

Signed-off-by: Taejoon Song <taejoon.song@lge.com>
---
 drivers/block/zram/zram_drv.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 4285e75..71d5946 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -207,14 +207,17 @@ static inline void zram_fill_page(void *ptr, unsigned long len,
 
 static bool page_same_filled(void *ptr, unsigned long *element)
 {
-	unsigned int pos;
 	unsigned long *page;
 	unsigned long val;
+	unsigned int pos, last_pos = PAGE_SIZE / sizeof(*page) - 1;
 
 	page = (unsigned long *)ptr;
 	val = page[0];
 
-	for (pos = 1; pos < PAGE_SIZE / sizeof(*page); pos++) {
+	if (val != page[last_pos])
+		return false;
+
+	for (pos = 1; pos < last_pos; pos++) {
 		if (val != page[pos])
 			return false;
 	}
-- 
2.7.4
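
For readers who want to reproduce the iteration counting described in the
commit message, here is a minimal userspace sketch, not the author's test
script. It assumes 4 KiB pages; SKETCH_PAGE_SIZE, NWORDS, same_forward() and
same_last_check() are hypothetical names introduced only for this example,
and the counters report the number of comparisons performed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL /* assumption: 4 KiB pages */
#define NWORDS (SKETCH_PAGE_SIZE / sizeof(unsigned long))

/* Original strategy: compare every word against page[0], front to back. */
static bool same_forward(const unsigned long *page, unsigned int *iters)
{
	unsigned long val;
	size_t pos;

	assert(page && iters);
	val = page[0];
	*iters = 0;
	for (pos = 1; pos < NWORDS; pos++) {
		(*iters)++;
		if (val != page[pos])
			return false;
	}
	return true;
}

/* Patched strategy: compare first and last word, then scan the middle. */
static bool same_last_check(const unsigned long *page, unsigned int *iters)
{
	unsigned long val;
	size_t pos, last_pos = NWORDS - 1;

	assert(page && iters);
	val = page[0];
	*iters = 1; /* the first-vs-last comparison */
	if (val != page[last_pos])
		return false;
	for (pos = 1; pos < last_pos; pos++) {
		(*iters)++;
		if (val != page[pos])
			return false;
	}
	return true;
}
```

On a page that differs only in its last word, same_forward() performs
NWORDS - 1 comparisons before rejecting it, while same_last_check() rejects
it after one. On a genuinely same-filled page both perform NWORDS - 1
comparisons, which matches the claim that the overall complexity is unchanged.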



* Re: [PATCH] zram: try to avoid worst-case scenario on same element pages
  2020-01-10  7:40 [PATCH] zram: try to avoid worst-case scenario on same element pages Taejoon Song
@ 2020-01-10 16:45 ` Minchan Kim
From: Minchan Kim @ 2020-01-10 16:45 UTC
  To: Taejoon Song, Andrew Morton
  Cc: ngupta, sergey.senozhatsky.work, axboe, linux-kernel,
	linux-block, yjay.kim

Hi Andrew,

The previous version had an off-by-one error in the loop that checks every
unsigned long in the page, so it skipped one unsigned long in PAGE_SIZE.
That could make zram believe a page is same-filled when it is not.
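
To illustrate the off-by-one, here is a minimal userspace sketch, not the
kernel code itself. It assumes 4 KiB pages; SKETCH_PAGE_SIZE, NWORDS,
same_filled_v1(), same_filled_v2() and probe_off_by_one() are hypothetical
names used only for this example. The v1 bound is the one from the earlier
posting (pos < last_pos - 1), and the v2 bound is the corrected one
(pos < last_pos).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL /* assumption: 4 KiB pages */
#define NWORDS (SKETCH_PAGE_SIZE / sizeof(unsigned long))

/* Earlier bound: the word at index last_pos - 1 is never compared. */
static bool same_filled_v1(const unsigned long *page)
{
	size_t pos, last_pos = NWORDS - 1;
	unsigned long val = page[0];

	if (val != page[last_pos])
		return false;
	for (pos = 1; pos < last_pos - 1; pos++) {
		if (val != page[pos])
			return false;
	}
	return true;
}

/* Corrected bound: every word between first and last is compared. */
static bool same_filled_v2(const unsigned long *page)
{
	size_t pos, last_pos = NWORDS - 1;
	unsigned long val = page[0];

	if (val != page[last_pos])
		return false;
	for (pos = 1; pos < last_pos; pos++) {
		if (val != page[pos])
			return false;
	}
	return true;
}

/* Run both checks on a page that differs only at index NWORDS - 2. */
static void probe_off_by_one(bool *v1_result, bool *v2_result)
{
	static unsigned long page[NWORDS];
	size_t i;

	assert(v1_result && v2_result);
	for (i = 0; i < NWORDS; i++)
		page[i] = 0;
	page[NWORDS - 2] = 0xdeadbeefUL; /* the word the v1 loop skips */

	*v1_result = same_filled_v1(page);
	*v2_result = same_filled_v2(page);
}
```

With this input, the v1 check reports the page as same-filled even though
one word differs, while the v2 check rejects it.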

This is the new patch in which Taejoon fixed it, so could you replace [1]
with this version?

Thanks.

[1] mmotm: zram-try-to-avoid-worst-case-scenario-on-same-element-pages.patch

On Fri, Jan 10, 2020 at 04:40:01PM +0900, Taejoon Song wrote:
> The worst-case scenario when detecting same-element pages is a page in
> which almost all elements look the same at first glance and only the
> last few elements differ.
> 
> Since equal elements tend to be grouped at the beginning of a page,
> comparing the first element with the last element before looping
> through all elements gives us a chance to quickly detect
> non-same-element pages.
> 
> 1. The test was done on an LG webOS TV (64-bit arch).
> 2. The swapped-out pages were dumped (~819200 pages).
> 3. The pages were analyzed offline with a simple test script that
>    counts the iterations and measures the speed.
> 
> On a 64-bit arch with 4 KiB pages, the worst-case iteration count is
> PAGE_SIZE / 8 bytes = 512. The speed is based only on the time spent in
> the page_same_filled() function. The average results are listed below:
> 
>                                    Num of Iter    Speed(MB/s)
> Looping-Forward (Orig)                 38            99265
> Looping-Backward                       36           102725
> Last-element-check (This Patch)        33           125072
> 
> The results show that the average iteration count decreases by 13% and
> the speed increases by 25% with this patch. The patch does not increase
> the overall time complexity, though.
> 
> I also ran a simpler version that uses a backward loop. Just looping
> backward also brings some improvement, but less than this patch.
> 
> Signed-off-by: Taejoon Song <taejoon.song@lge.com>

Acked-by: Minchan Kim <minchan@kernel.org>

> ---
>  drivers/block/zram/zram_drv.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index 4285e75..71d5946 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -207,14 +207,17 @@ static inline void zram_fill_page(void *ptr, unsigned long len,
>  
>  static bool page_same_filled(void *ptr, unsigned long *element)
>  {
> -	unsigned int pos;
>  	unsigned long *page;
>  	unsigned long val;
> +	unsigned int pos, last_pos = PAGE_SIZE / sizeof(*page) - 1;
>  
>  	page = (unsigned long *)ptr;
>  	val = page[0];
>  
> -	for (pos = 1; pos < PAGE_SIZE / sizeof(*page); pos++) {
> +	if (val != page[last_pos])
> +		return false;
> +
> +	for (pos = 1; pos < last_pos; pos++) {

FYI, this is the part that was fixed since the previous version.

>  		if (val != page[pos])
>  			return false;
>  	}
> -- 
> 2.7.4
> 


* Re: [PATCH] zram: try to avoid worst-case scenario on same element pages
  2019-12-04  1:53 Taejoon Song
@ 2019-12-04 22:54 ` Minchan Kim
From: Minchan Kim @ 2019-12-04 22:54 UTC
  To: Taejoon Song, Andrew Morton
  Cc: ngupta, sergey.senozhatsky.work, axboe, linux-kernel,
	linux-block, yjay.kim

On Wed, Dec 04, 2019 at 10:53:38AM +0900, Taejoon Song wrote:
> The worst-case scenario when detecting same-element pages is a page in
> which almost all elements look the same at first glance and only the
> last few elements differ.
> 
> Since equal elements tend to be grouped at the beginning of a page,
> comparing the first element with the last element before looping
> through all elements gives us a chance to quickly detect
> non-same-element pages.
> 
> 1. The test was done on an LG webOS TV (64-bit arch).
> 2. The swapped-out pages were dumped (~819200 pages).
> 3. The pages were analyzed offline with a simple test script that
>    counts the iterations and measures the speed.
> 
> On a 64-bit arch with 4 KiB pages, the worst-case iteration count is
> PAGE_SIZE / 8 bytes = 512. The speed is based only on the time spent in
> the page_same_filled() function. The average results are listed below:
> 
>                                    Num of Iter    Speed(MB/s)
> Looping-Forward (Orig)                 38            99265
> Looping-Backward                       36           102725
> Last-element-check (This Patch)        33           125072
> 
> The results show that the average iteration count decreases by 13% and
> the speed increases by 25% with this patch. The patch does not increase
> the overall time complexity, though.
> 
> I also ran a simpler version that uses a backward loop. Just looping
> backward also brings some improvement, but less than this patch.
> 
> Signed-off-by: Taejoon Song <taejoon.song@lge.com>
Acked-by: Minchan Kim <minchan@kernel.org>

I think it's a very reasonable optimization with a small cost.


* [PATCH] zram: try to avoid worst-case scenario on same element pages
@ 2019-12-04  1:53 Taejoon Song
  2019-12-04 22:54 ` Minchan Kim
From: Taejoon Song @ 2019-12-04  1:53 UTC
  To: minchan, ngupta, sergey.senozhatsky.work, axboe
  Cc: linux-kernel, linux-block, yjay.kim, Taejoon Song

The worst-case scenario when detecting same-element pages is a page in
which almost all elements look the same at first glance and only the
last few elements differ.

Since equal elements tend to be grouped at the beginning of a page,
comparing the first element with the last element before looping
through all elements gives us a chance to quickly detect
non-same-element pages.

1. The test was done on an LG webOS TV (64-bit arch).
2. The swapped-out pages were dumped (~819200 pages).
3. The pages were analyzed offline with a simple test script that
   counts the iterations and measures the speed.

On a 64-bit arch with 4 KiB pages, the worst-case iteration count is
PAGE_SIZE / 8 bytes = 512. The speed is based only on the time spent in
the page_same_filled() function. The average results are listed below:

                                   Num of Iter    Speed(MB/s)
Looping-Forward (Orig)                 38            99265
Looping-Backward                       36           102725
Last-element-check (This Patch)        33           125072

The results show that the average iteration count decreases by 13% and
the speed increases by 25% with this patch. The patch does not increase
the overall time complexity, though.

I also ran a simpler version that uses a backward loop. Just looping
backward also brings some improvement, but less than this patch.

Signed-off-by: Taejoon Song <taejoon.song@lge.com>
---
 drivers/block/zram/zram_drv.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 4285e75..afadd7f 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -207,14 +207,17 @@ static inline void zram_fill_page(void *ptr, unsigned long len,
 
 static bool page_same_filled(void *ptr, unsigned long *element)
 {
-	unsigned int pos;
 	unsigned long *page;
 	unsigned long val;
+	unsigned int pos, last_pos = PAGE_SIZE / sizeof(*page) - 1;
 
 	page = (unsigned long *)ptr;
 	val = page[0];
 
-	for (pos = 1; pos < PAGE_SIZE / sizeof(*page); pos++) {
+	if (val != page[last_pos])
+		return false;
+
+	for (pos = 1; pos < last_pos - 1; pos++) {
 		if (val != page[pos])
 			return false;
 	}
-- 
2.7.4



* [PATCH] zram: try to avoid worst-case scenario on same element pages
@ 2019-12-03 10:43 Taejoon Song
From: Taejoon Song @ 2019-12-03 10:43 UTC
  To: minchan, ngupta, sergey.senozhatsky.work, axboe
  Cc: linux-kernel, linux-block, yjay.kim, Taejoon Song

The worst-case scenario when detecting same-element pages is a page in
which almost all elements look the same at first glance and only the
last few elements differ.

Since equal elements tend to be grouped at the beginning of a page,
comparing the first element with the last element before looping
through all elements gives us a chance to quickly detect
non-same-element pages.

1. The test was done on an LG webOS TV (64-bit arch).
2. The swapped-out pages were dumped (~819200 pages).
3. The pages were analyzed offline with a simple test script that
   counts the iterations and measures the speed.

On a 64-bit arch with 4 KiB pages, the worst-case iteration count is
PAGE_SIZE / 8 bytes = 512. The speed is based only on the time spent in
the page_same_filled() function. The average results are listed below:

                                   Num of Iter    Speed(MB/s)
Looping-Forward (Orig)                 38            99265
Looping-Backward                       36           102725
Last-element-check (This Patch)        33           125072

The results show that the average iteration count decreases by 13% and
the speed increases by 25% with this patch. The patch does not increase
the overall time complexity, though.

I also ran a simpler version that uses a backward loop. Just looping
backward also brings some improvement, but less than this patch.

Signed-off-by: Taejoon Song <taejoon.song@lge.com>
---
 drivers/block/zram/zram_drv.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 4285e75..afadd7f 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -207,14 +207,17 @@ static inline void zram_fill_page(void *ptr, unsigned long len,
 
 static bool page_same_filled(void *ptr, unsigned long *element)
 {
-	unsigned int pos;
 	unsigned long *page;
 	unsigned long val;
+	unsigned int pos, last_pos = PAGE_SIZE / sizeof(*page) - 1;
 
 	page = (unsigned long *)ptr;
 	val = page[0];
 
-	for (pos = 1; pos < PAGE_SIZE / sizeof(*page); pos++) {
+	if (val != page[last_pos])
+		return false;
+
+	for (pos = 1; pos < last_pos - 1; pos++) {
 		if (val != page[pos])
 			return false;
 	}
-- 
2.7.4

