linux-kernel.vger.kernel.org archive mirror
* [PATCH] zram: Add a huge_idle writeback mode
@ 2022-03-15 17:22 Brian Geffon
  2022-03-15 17:28 ` Matthew Wilcox
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Brian Geffon @ 2022-03-15 17:22 UTC
  To: Andrew Morton
  Cc: Minchan Kim, Nitin Gupta, Sergey Senozhatsky, linux-kernel,
	linux-doc, linux-block, Brian Geffon

Today it's only possible to write back as a page, idle, or huge.
A user might want to writeback pages which are huge and idle first
as these idle pages do not require decompression and make a good
first pass for writeback.

Signed-off-by: Brian Geffon <bgeffon@google.com>
---
 Documentation/admin-guide/blockdev/zram.rst |  6 ++++++
 drivers/block/zram/zram_drv.c               | 10 ++++++----
 2 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/Documentation/admin-guide/blockdev/zram.rst b/Documentation/admin-guide/blockdev/zram.rst
index 3e11926a4df9..af1123bfaf92 100644
--- a/Documentation/admin-guide/blockdev/zram.rst
+++ b/Documentation/admin-guide/blockdev/zram.rst
@@ -343,6 +343,12 @@ Admin can request writeback of those idle pages at right timing via::
 
 With the command, zram writeback idle pages from memory to the storage.
 
+Additionally, if a user chooses to write back only huge and idle pages,
+this can be accomplished with::
+
+        echo huge_idle > /sys/block/zramX/writeback
+
+
 If admin want to write a specific page in zram device to backing device,
 they could write a page index into the interface.
 
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index cb253d80d72b..f196902ae554 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -643,8 +643,8 @@ static int read_from_bdev_async(struct zram *zram, struct bio_vec *bvec,
 #define PAGE_WB_SIG "page_index="
 
 #define PAGE_WRITEBACK 0
-#define HUGE_WRITEBACK 1
-#define IDLE_WRITEBACK 2
+#define HUGE_WRITEBACK (1<<0)
+#define IDLE_WRITEBACK (1<<1)
 
 
 static ssize_t writeback_store(struct device *dev,
@@ -664,6 +664,8 @@ static ssize_t writeback_store(struct device *dev,
 		mode = IDLE_WRITEBACK;
 	else if (sysfs_streq(buf, "huge"))
 		mode = HUGE_WRITEBACK;
+	else if (sysfs_streq(buf, "huge_idle"))
+		mode = IDLE_WRITEBACK | HUGE_WRITEBACK;
 	else {
 		if (strncmp(buf, PAGE_WB_SIG, sizeof(PAGE_WB_SIG) - 1))
 			return -EINVAL;
@@ -725,10 +727,10 @@ static ssize_t writeback_store(struct device *dev,
 				zram_test_flag(zram, index, ZRAM_UNDER_WB))
 			goto next;
 
-		if (mode == IDLE_WRITEBACK &&
+		if (mode & IDLE_WRITEBACK &&
 			  !zram_test_flag(zram, index, ZRAM_IDLE))
 			goto next;
-		if (mode == HUGE_WRITEBACK &&
+		if (mode & HUGE_WRITEBACK &&
 			  !zram_test_flag(zram, index, ZRAM_HUGE))
 			goto next;
 		/*
-- 
2.35.1.723.g4982287a31-goog
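
For readers following the thread, the mode handling this patch introduces can be sketched in plain userspace C. This is a minimal sketch, not the driver code: `parse_mode()`, `should_skip()` and the `SLOT_*` flags are hypothetical stand-ins for the string matching in `writeback_store()` and for `zram_test_flag()` with `ZRAM_IDLE` / `ZRAM_HUGE`. Only the three `*_WRITEBACK` values are taken from the patch.

```c
#include <stdbool.h>
#include <string.h>

/* Mode bits as defined by the patch. */
#define PAGE_WRITEBACK 0
#define HUGE_WRITEBACK (1 << 0)
#define IDLE_WRITEBACK (1 << 1)

/* Hypothetical per-slot flags standing in for ZRAM_IDLE / ZRAM_HUGE. */
#define SLOT_IDLE (1 << 0)
#define SLOT_HUGE (1 << 1)

/* Hypothetical helper: map a mode string to mode bits, -1 if unknown. */
static int parse_mode(const char *buf)
{
	if (!strcmp(buf, "idle"))
		return IDLE_WRITEBACK;
	if (!strcmp(buf, "huge"))
		return HUGE_WRITEBACK;
	if (!strcmp(buf, "huge_idle"))
		return IDLE_WRITEBACK | HUGE_WRITEBACK;
	return -1;
}

/* Hypothetical helper: true if a slot with these flags must be skipped. */
static bool should_skip(int mode, int slot_flags)
{
	/* Each requested bit is tested on its own, so the bits combine. */
	if ((mode & IDLE_WRITEBACK) && !(slot_flags & SLOT_IDLE))
		return true;
	if ((mode & HUGE_WRITEBACK) && !(slot_flags & SLOT_HUGE))
		return true;
	return false;
}
```

With `huge_idle` both bits are set, so a slot passes the filter only if it is both idle and huge. With the old `==` comparisons neither check would have fired for the combined value, which is why the patch switches to `&`. Note that the driver uses `sysfs_streq()`, which tolerates a trailing newline, whereas `strcmp()` in this sketch does not.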


* Re: [PATCH] zram: Add a huge_idle writeback mode
  2022-03-15 17:22 [PATCH] zram: Add a huge_idle writeback mode Brian Geffon
@ 2022-03-15 17:28 ` Matthew Wilcox
  2022-03-15 17:34   ` Brian Geffon
  2022-03-18 16:41 ` Minchan Kim
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 12+ messages in thread
From: Matthew Wilcox @ 2022-03-15 17:28 UTC
  To: Brian Geffon
  Cc: Andrew Morton, Minchan Kim, Nitin Gupta, Sergey Senozhatsky,
	linux-kernel, linux-doc, linux-block, linux-mm

On Tue, Mar 15, 2022 at 10:22:21AM -0700, Brian Geffon wrote:
> Today it's only possible to write back as a page, idle, or huge.
> A user might want to writeback pages which are huge and idle first
> as these idle pages do not require decompression and make a good
> first pass for writeback.

We're moving towards having many different sizes of page in play,
not just PMD and PTE sizes.  Is this patch actually a good idea in
a case where we have, eg, a 32kB anonymous page on a system with 4kB
pages?  How should zram handle this case?  What's our cut-off for
declaring a page to be "huge"?


* Re: [PATCH] zram: Add a huge_idle writeback mode
  2022-03-15 17:28 ` Matthew Wilcox
@ 2022-03-15 17:34   ` Brian Geffon
  2022-03-15 17:44     ` Matthew Wilcox
  0 siblings, 1 reply; 12+ messages in thread
From: Brian Geffon @ 2022-03-15 17:34 UTC
  To: Matthew Wilcox
  Cc: Andrew Morton, Minchan Kim, Nitin Gupta, Sergey Senozhatsky,
	LKML, linux-doc, linux-block, linux-mm

On Tue, Mar 15, 2022 at 1:28 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Tue, Mar 15, 2022 at 10:22:21AM -0700, Brian Geffon wrote:
> > Today it's only possible to write back as a page, idle, or huge.
> > A user might want to writeback pages which are huge and idle first
> > as these idle pages do not require decompression and make a good
> > first pass for writeback.
>
> We're moving towards having many different sizes of page in play,
> not just PMD and PTE sizes.  Is this patch actually a good idea in
> a case where we have, eg, a 32kB anonymous page on a system with 4kB
> pages?  How should zram handle this case?  What's our cut-off for
> declaring a page to be "huge"?
>

Huge isn't a great term IMO, but it is what it is. ZRAM_HUGE is used
to identify pages which are incompressible. Since zram is a block
device which presents PAGE_SIZE-sized blocks, do these new changes
involving many different page sizes matter here? They seem orthogonal
to the block subsystem. Correct me if I'm misunderstanding.

Thanks
Brian

* Re: [PATCH] zram: Add a huge_idle writeback mode
  2022-03-15 17:34   ` Brian Geffon
@ 2022-03-15 17:44     ` Matthew Wilcox
  2022-03-16  0:01       ` Brian Geffon
  0 siblings, 1 reply; 12+ messages in thread
From: Matthew Wilcox @ 2022-03-15 17:44 UTC
  To: Brian Geffon
  Cc: Andrew Morton, Minchan Kim, Nitin Gupta, Sergey Senozhatsky,
	LKML, linux-doc, linux-block, linux-mm

On Tue, Mar 15, 2022 at 01:34:21PM -0400, Brian Geffon wrote:
> On Tue, Mar 15, 2022 at 1:28 PM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > On Tue, Mar 15, 2022 at 10:22:21AM -0700, Brian Geffon wrote:
> > > Today it's only possible to write back as a page, idle, or huge.
> > > A user might want to writeback pages which are huge and idle first
> > > as these idle pages do not require decompression and make a good
> > > first pass for writeback.
> >
> > We're moving towards having many different sizes of page in play,
> > not just PMD and PTE sizes.  Is this patch actually a good idea in
> > a case where we have, eg, a 32kB anonymous page on a system with 4kB
> > pages?  How should zram handle this case?  What's our cut-off for
> > declaring a page to be "huge"?
> >
> 
> Huge isn't a great term IMO, but it is what it is. ZRAM_HUGE is used
> to identify pages which are incompressible. Since zram is a block
> device which presents PAGE_SIZE-sized blocks, do these new changes
> involving many different page sizes matter here? They seem orthogonal
> to the block subsystem. Correct me if I'm misunderstanding.

Oh, so ZRAM's concept of huge is not the same as the "huge" in
"hugetlbfs" or "THP"?  That's not at all confusing ...

* Re: [PATCH] zram: Add a huge_idle writeback mode
  2022-03-15 17:44     ` Matthew Wilcox
@ 2022-03-16  0:01       ` Brian Geffon
  0 siblings, 0 replies; 12+ messages in thread
From: Brian Geffon @ 2022-03-16  0:01 UTC
  To: Matthew Wilcox
  Cc: Andrew Morton, Minchan Kim, Nitin Gupta, Sergey Senozhatsky,
	LKML, linux-doc, linux-block, linux-mm

On Tue, Mar 15, 2022 at 1:44 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Tue, Mar 15, 2022 at 01:34:21PM -0400, Brian Geffon wrote:
> > On Tue, Mar 15, 2022 at 1:28 PM Matthew Wilcox <willy@infradead.org> wrote:
> > >
> > > On Tue, Mar 15, 2022 at 10:22:21AM -0700, Brian Geffon wrote:
> > > > Today it's only possible to write back as a page, idle, or huge.
> > > > A user might want to writeback pages which are huge and idle first
> > > > as these idle pages do not require decompression and make a good
> > > > first pass for writeback.
> > >
> > > We're moving towards having many different sizes of page in play,
> > > not just PMD and PTE sizes.  Is this patch actually a good idea in
> > > a case where we have, eg, a 32kB anonymous page on a system with 4kB
> > > pages?  How should zram handle this case?  What's our cut-off for
> > > declaring a page to be "huge"?
> > >
> >
> > Huge isn't a great term IMO, but it is what it is. ZRAM_HUGE is used
> > to identify pages which are incompressible. Since zram is a block
> > device which presents PAGE_SIZE-sized blocks, do these new changes
> > involving many different page sizes matter here? They seem orthogonal
> > to the block subsystem. Correct me if I'm misunderstanding.
>
> Oh, so ZRAM's concept of huge is not the same as the "huge" in
> "hugetlbfs" or "THP"?  That's not at all confusing ...

I do not disagree, but there isn't much that can be done about it at
this point, given that the sysfs file takes an argument called "huge".

* Re: [PATCH] zram: Add a huge_idle writeback mode
  2022-03-15 17:22 [PATCH] zram: Add a huge_idle writeback mode Brian Geffon
  2022-03-15 17:28 ` Matthew Wilcox
@ 2022-03-18 16:41 ` Minchan Kim
  2022-03-18 16:51   ` Brian Geffon
  2022-03-21 14:50 ` Brian Geffon
  2022-03-22 21:58 ` [PATCH v2] " Brian Geffon
  3 siblings, 1 reply; 12+ messages in thread
From: Minchan Kim @ 2022-03-18 16:41 UTC
  To: Brian Geffon
  Cc: Andrew Morton, Nitin Gupta, Sergey Senozhatsky, linux-kernel,
	linux-doc, linux-block

On Tue, Mar 15, 2022 at 10:22:21AM -0700, Brian Geffon wrote:
> Today it's only possible to write back as a page, idle, or huge.
> A user might want to writeback pages which are huge and idle first
> as these idle pages do not require decompression and make a good
> first pass for writeback.

Hi Brian,

I am not sure how much the decompression overhead matters for idle page
writeback, since it's already a *very slow* path in zram, but I agree that
it would be a good first pass, since the memory savings from writing back
huge pages would be cost-efficient.

Just out of curiosity, do you have a real use case?

> 
> Signed-off-by: Brian Geffon <bgeffon@google.com>
> ---
>  Documentation/admin-guide/blockdev/zram.rst |  6 ++++++
>  drivers/block/zram/zram_drv.c               | 10 ++++++----
>  2 files changed, 12 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/admin-guide/blockdev/zram.rst b/Documentation/admin-guide/blockdev/zram.rst
> index 3e11926a4df9..af1123bfaf92 100644
> --- a/Documentation/admin-guide/blockdev/zram.rst
> +++ b/Documentation/admin-guide/blockdev/zram.rst
> @@ -343,6 +343,12 @@ Admin can request writeback of those idle pages at right timing via::
>  
>  With the command, zram writeback idle pages from memory to the storage.
>  
> +Additionally, if a user chooses to write back only huge and idle pages,
> +this can be accomplished with::
> +
> +        echo huge_idle > /sys/block/zramX/writeback
> +
> +
>  If admin want to write a specific page in zram device to backing device,
>  they could write a page index into the interface.
>  
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index cb253d80d72b..f196902ae554 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -643,8 +643,8 @@ static int read_from_bdev_async(struct zram *zram, struct bio_vec *bvec,
>  #define PAGE_WB_SIG "page_index="
>  
>  #define PAGE_WRITEBACK 0
> -#define HUGE_WRITEBACK 1
> -#define IDLE_WRITEBACK 2
> +#define HUGE_WRITEBACK (1<<0)
> +#define IDLE_WRITEBACK (1<<1)
>  
>  
>  static ssize_t writeback_store(struct device *dev,
> @@ -664,6 +664,8 @@ static ssize_t writeback_store(struct device *dev,
>  		mode = IDLE_WRITEBACK;
>  	else if (sysfs_streq(buf, "huge"))
>  		mode = HUGE_WRITEBACK;
> +	else if (sysfs_streq(buf, "huge_idle"))
> +		mode = IDLE_WRITEBACK | HUGE_WRITEBACK;
>  	else {
>  		if (strncmp(buf, PAGE_WB_SIG, sizeof(PAGE_WB_SIG) - 1))
>  			return -EINVAL;
> @@ -725,10 +727,10 @@ static ssize_t writeback_store(struct device *dev,
>  				zram_test_flag(zram, index, ZRAM_UNDER_WB))
>  			goto next;
>  
> -		if (mode == IDLE_WRITEBACK &&
> +		if (mode & IDLE_WRITEBACK &&
>  			  !zram_test_flag(zram, index, ZRAM_IDLE))
>  			goto next;
> -		if (mode == HUGE_WRITEBACK &&
> +		if (mode & HUGE_WRITEBACK &&
>  			  !zram_test_flag(zram, index, ZRAM_HUGE))
>  			goto next;
>  		/*
> -- 
> 2.35.1.723.g4982287a31-goog
> 

* Re: [PATCH] zram: Add a huge_idle writeback mode
  2022-03-18 16:41 ` Minchan Kim
@ 2022-03-18 16:51   ` Brian Geffon
  2022-03-18 17:30     ` Minchan Kim
  0 siblings, 1 reply; 12+ messages in thread
From: Brian Geffon @ 2022-03-18 16:51 UTC
  To: Minchan Kim
  Cc: Andrew Morton, Nitin Gupta, Sergey Senozhatsky, LKML, linux-doc,
	linux-block

On Fri, Mar 18, 2022 at 12:41 PM Minchan Kim <minchan@kernel.org> wrote:
>
> On Tue, Mar 15, 2022 at 10:22:21AM -0700, Brian Geffon wrote:
> > Today it's only possible to write back as a page, idle, or huge.
> > A user might want to writeback pages which are huge and idle first
> > as these idle pages do not require decompression and make a good
> > first pass for writeback.
>
> Hi Brian,
>
> I am not sure how much the decompression overhead matters for idle page
> writeback, since it's already a *very slow* path in zram, but I agree that
> it would be a good first pass, since the memory savings from writing back
> huge pages would be cost-efficient.
>
> Just out of curiosity, do you have a real use case?

Hi Minchan,
Thank you for taking a look. When we are thinking about writeback,
we're trying to be very sensitive to our devices' storage endurance;
for this reason we will have a fairly conservative writeback limit.
Given that, we want to make sure we're maximizing what lands on disk
while still minimizing the refault time. We could take the approach
of always writing back huge pages, but then we may end up with very
quick refaults, which would be a huge waste of time. So idle writeback
is a must for us, and being able to write back the pages which have
maximum value (huge) would be very useful.

Brian
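
The trade-off described in this message can be put in numbers. The following is a minimal sketch under stated assumptions: a 4096-byte page, one page of storage writes per written-back slot, and memory freed equal to the size the slot occupied in zram. `reclaimed_bytes()` and every size used with it are hypothetical illustrations, not measurements from this thread.

```c
#include <stddef.h>

#define PAGE_SIZE_BYTES 4096 /* assumed page size */

/*
 * Hypothetical model: each written-back slot costs one page of storage
 * writes and frees however many bytes the slot occupied in zram.
 * Returns the bytes freed when at most `limit` slots are written back.
 */
static size_t reclaimed_bytes(const size_t *stored_sizes, size_t n,
			      size_t limit)
{
	size_t freed = 0;

	for (size_t i = 0; i < n && i < limit; i++)
		freed += stored_sizes[i];
	return freed;
}
```

Under this model, a writeback limit of two slots spent on incompressible slots of `PAGE_SIZE_BYTES` each frees 8192 bytes, while the same two page writes spent on slots that compressed to, say, 1200 and 900 bytes free only 2100 bytes. That is the sense in which huge slots give the most memory back per write to storage.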




>
> >
> > Signed-off-by: Brian Geffon <bgeffon@google.com>
> > ---
> >  Documentation/admin-guide/blockdev/zram.rst |  6 ++++++
> >  drivers/block/zram/zram_drv.c               | 10 ++++++----
> >  2 files changed, 12 insertions(+), 4 deletions(-)
> >
> > diff --git a/Documentation/admin-guide/blockdev/zram.rst b/Documentation/admin-guide/blockdev/zram.rst
> > index 3e11926a4df9..af1123bfaf92 100644
> > --- a/Documentation/admin-guide/blockdev/zram.rst
> > +++ b/Documentation/admin-guide/blockdev/zram.rst
> > @@ -343,6 +343,12 @@ Admin can request writeback of those idle pages at right timing via::
> >
> >  With the command, zram writeback idle pages from memory to the storage.
> >
> > +Additionally, if a user chooses to write back only huge and idle pages,
> > +this can be accomplished with::
> > +
> > +        echo huge_idle > /sys/block/zramX/writeback
> > +
> > +
> >  If admin want to write a specific page in zram device to backing device,
> >  they could write a page index into the interface.
> >
> > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> > index cb253d80d72b..f196902ae554 100644
> > --- a/drivers/block/zram/zram_drv.c
> > +++ b/drivers/block/zram/zram_drv.c
> > @@ -643,8 +643,8 @@ static int read_from_bdev_async(struct zram *zram, struct bio_vec *bvec,
> >  #define PAGE_WB_SIG "page_index="
> >
> >  #define PAGE_WRITEBACK 0
> > -#define HUGE_WRITEBACK 1
> > -#define IDLE_WRITEBACK 2
> > +#define HUGE_WRITEBACK (1<<0)
> > +#define IDLE_WRITEBACK (1<<1)
> >
> >
> >  static ssize_t writeback_store(struct device *dev,
> > @@ -664,6 +664,8 @@ static ssize_t writeback_store(struct device *dev,
> >               mode = IDLE_WRITEBACK;
> >       else if (sysfs_streq(buf, "huge"))
> >               mode = HUGE_WRITEBACK;
> > +     else if (sysfs_streq(buf, "huge_idle"))
> > +             mode = IDLE_WRITEBACK | HUGE_WRITEBACK;
> >       else {
> >               if (strncmp(buf, PAGE_WB_SIG, sizeof(PAGE_WB_SIG) - 1))
> >                       return -EINVAL;
> > @@ -725,10 +727,10 @@ static ssize_t writeback_store(struct device *dev,
> >                               zram_test_flag(zram, index, ZRAM_UNDER_WB))
> >                       goto next;
> >
> > -             if (mode == IDLE_WRITEBACK &&
> > +             if (mode & IDLE_WRITEBACK &&
> >                         !zram_test_flag(zram, index, ZRAM_IDLE))
> >                       goto next;
> > -             if (mode == HUGE_WRITEBACK &&
> > +             if (mode & HUGE_WRITEBACK &&
> >                         !zram_test_flag(zram, index, ZRAM_HUGE))
> >                       goto next;
> >               /*
> > --
> > 2.35.1.723.g4982287a31-goog
> >

* Re: [PATCH] zram: Add a huge_idle writeback mode
  2022-03-18 16:51   ` Brian Geffon
@ 2022-03-18 17:30     ` Minchan Kim
  0 siblings, 0 replies; 12+ messages in thread
From: Minchan Kim @ 2022-03-18 17:30 UTC
  To: Brian Geffon
  Cc: Andrew Morton, Nitin Gupta, Sergey Senozhatsky, LKML, linux-doc,
	linux-block

On Fri, Mar 18, 2022 at 12:51:14PM -0400, Brian Geffon wrote:
> On Fri, Mar 18, 2022 at 12:41 PM Minchan Kim <minchan@kernel.org> wrote:
> >
> > On Tue, Mar 15, 2022 at 10:22:21AM -0700, Brian Geffon wrote:
> > > Today it's only possible to write back as a page, idle, or huge.
> > > A user might want to writeback pages which are huge and idle first
> > > as these idle pages do not require decompression and make a good
> > > first pass for writeback.
> >
> > Hi Brian,
> >
> > I am not sure how much the decompression overhead matters for idle page
> > writeback, since it's already a *very slow* path in zram, but I agree that
> > it would be a good first pass, since the memory savings from writing back
> > huge pages would be cost-efficient.
> >
> > Just out of curiosity, do you have a real use case?
> 
> Hi Minchan,
> Thank you for taking a look. When we are thinking about writeback,
> we're trying to be very sensitive to our devices' storage endurance;
> for this reason we will have a fairly conservative writeback limit.
> Given that, we want to make sure we're maximizing what lands on disk
> while still minimizing the refault time. We could take the approach
> of always writing back huge pages, but then we may end up with very
> quick refaults, which would be a huge waste of time. So idle writeback
> is a must for us, and being able to write back the pages which have
> maximum value (huge) would be very useful.

Thanks for sharing your thoughts. That really makes sense to me, and it
would be great if it went into the description.

* [PATCH] zram: Add a huge_idle writeback mode
  2022-03-15 17:22 [PATCH] zram: Add a huge_idle writeback mode Brian Geffon
  2022-03-15 17:28 ` Matthew Wilcox
  2022-03-18 16:41 ` Minchan Kim
@ 2022-03-21 14:50 ` Brian Geffon
  2022-03-22 21:13   ` Minchan Kim
  2022-03-22 21:58 ` [PATCH v2] " Brian Geffon
  3 siblings, 1 reply; 12+ messages in thread
From: Brian Geffon @ 2022-03-21 14:50 UTC
  To: Andrew Morton
  Cc: Minchan Kim, Nitin Gupta, Sergey Senozhatsky, linux-kernel,
	linux-doc, linux-block, Brian Geffon

Today it's only possible to write back as a page, idle, or huge.
A user might want to writeback pages which are huge and idle first
as these idle pages do not require decompression and make a good
first pass for writeback.

Idle writeback specifically has the advantage that a refault is
unlikely given that the page has been swapped for some amount of
time without being refaulted.

Huge writeback has the advantage that you're guaranteed to get
the maximum benefit from a single page writeback, that is, you're
reclaiming one full page of memory. Pages which are compressed in
zram being written back result in some benefit which is always
less than a page size because of the fact that it was compressed.

This change allows for users to write back huge pages which are
also idle.

Signed-off-by: Brian Geffon <bgeffon@google.com>
---
 Documentation/admin-guide/blockdev/zram.rst |  6 ++++++
 drivers/block/zram/zram_drv.c               | 10 ++++++----
 2 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/Documentation/admin-guide/blockdev/zram.rst b/Documentation/admin-guide/blockdev/zram.rst
index 3e11926a4df9..af1123bfaf92 100644
--- a/Documentation/admin-guide/blockdev/zram.rst
+++ b/Documentation/admin-guide/blockdev/zram.rst
@@ -343,6 +343,12 @@ Admin can request writeback of those idle pages at right timing via::
 
 With the command, zram writeback idle pages from memory to the storage.
 
+Additionally, if a user chooses to write back only huge and idle pages,
+this can be accomplished with::
+
+        echo huge_idle > /sys/block/zramX/writeback
+
+
 If admin want to write a specific page in zram device to backing device,
 they could write a page index into the interface.
 
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index cb253d80d72b..f196902ae554 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -643,8 +643,8 @@ static int read_from_bdev_async(struct zram *zram, struct bio_vec *bvec,
 #define PAGE_WB_SIG "page_index="
 
 #define PAGE_WRITEBACK 0
-#define HUGE_WRITEBACK 1
-#define IDLE_WRITEBACK 2
+#define HUGE_WRITEBACK (1<<0)
+#define IDLE_WRITEBACK (1<<1)
 
 
 static ssize_t writeback_store(struct device *dev,
@@ -664,6 +664,8 @@ static ssize_t writeback_store(struct device *dev,
 		mode = IDLE_WRITEBACK;
 	else if (sysfs_streq(buf, "huge"))
 		mode = HUGE_WRITEBACK;
+	else if (sysfs_streq(buf, "huge_idle"))
+		mode = IDLE_WRITEBACK | HUGE_WRITEBACK;
 	else {
 		if (strncmp(buf, PAGE_WB_SIG, sizeof(PAGE_WB_SIG) - 1))
 			return -EINVAL;
@@ -725,10 +727,10 @@ static ssize_t writeback_store(struct device *dev,
 				zram_test_flag(zram, index, ZRAM_UNDER_WB))
 			goto next;
 
-		if (mode == IDLE_WRITEBACK &&
+		if (mode & IDLE_WRITEBACK &&
 			  !zram_test_flag(zram, index, ZRAM_IDLE))
 			goto next;
-		if (mode == HUGE_WRITEBACK &&
+		if (mode & HUGE_WRITEBACK &&
 			  !zram_test_flag(zram, index, ZRAM_HUGE))
 			goto next;
 		/*
-- 
2.35.1.894.gb6a874cedc-goog


* Re: [PATCH] zram: Add a huge_idle writeback mode
  2022-03-21 14:50 ` Brian Geffon
@ 2022-03-22 21:13   ` Minchan Kim
  2022-03-22 21:52     ` Brian Geffon
  0 siblings, 1 reply; 12+ messages in thread
From: Minchan Kim @ 2022-03-22 21:13 UTC
  To: Brian Geffon
  Cc: Andrew Morton, Nitin Gupta, Sergey Senozhatsky, linux-kernel,
	linux-doc, linux-block

On Mon, Mar 21, 2022 at 07:50:37AM -0700, Brian Geffon wrote:
> Today it's only possible to write back as a page, idle, or huge.
> A user might want to writeback pages which are huge and idle first
> as these idle pages do not require decompression and make a good
> first pass for writeback.
> 
> Idle writeback specifically has the advantage that a refault is
> unlikely given that the page has been swapped for some amount of
> time without being refaulted.
> 
> Huge writeback has the advantage that you're guaranteed to get
> the maximum benefit from a single page writeback, that is, you're
> reclaiming one full page of memory. Pages which are compressed in
> zram being written back result in some benefit which is always
> less than a page size because of the fact that it was compressed.
> 
> This change allows for users to write back huge pages which are
> also idle.

Hey Brian,

I really want to include your explanation about storage endurance,
because it's a real issue.

So, could you add the text below to the description?

From your previous reply
"
we're trying to be very sensitive to our devices' storage endurance;
for this reason we will have a fairly conservative writeback limit.
Given that, we want to make sure we're maximizing what lands on disk
while still minimizing the refault time. We could take the approach
of always writing back huge pages, but then we may end up with very
quick refaults, which would be a huge waste of time. So idle writeback
is a must for us, and being able to write back the pages which have
maximum value (huge) would be very useful
"

> 
> Signed-off-by: Brian Geffon <bgeffon@google.com>

Other than that, feel free to add my
Acked-by: Minchan Kim <minchan@kernel.org>

Thanks.

* Re: [PATCH] zram: Add a huge_idle writeback mode
  2022-03-22 21:13   ` Minchan Kim
@ 2022-03-22 21:52     ` Brian Geffon
  0 siblings, 0 replies; 12+ messages in thread
From: Brian Geffon @ 2022-03-22 21:52 UTC
  To: Minchan Kim
  Cc: Andrew Morton, Nitin Gupta, Sergey Senozhatsky, LKML, linux-doc,
	linux-block

On Tue, Mar 22, 2022 at 5:13 PM Minchan Kim <minchan@kernel.org> wrote:
>
> On Mon, Mar 21, 2022 at 07:50:37AM -0700, Brian Geffon wrote:
> > Today it's only possible to write back as a page, idle, or huge.
> > A user might want to writeback pages which are huge and idle first
> > as these idle pages do not require decompression and make a good
> > first pass for writeback.
> >
> > Idle writeback specifically has the advantage that a refault is
> > unlikely given that the page has been swapped for some amount of
> > time without being refaulted.
> >
> > Huge writeback has the advantage that you're guaranteed to get
> > the maximum benefit from a single page writeback, that is, you're
> > reclaiming one full page of memory. Pages which are compressed in
> > zram being written back result in some benefit which is always
> > less than a page size because of the fact that it was compressed.
> >
> > This change allows for users to write back huge pages which are
> > also idle.
>
> Hey Brian,
>
> I really want to include your explanation about storage endurance,
> because it's a real issue.
>
> So, could you add the text below to the description?

Sure thing.

>
> From your previous reply
> "
> we're trying to be very sensitive to our devices' storage endurance;
> for this reason we will have a fairly conservative writeback limit.
> Given that, we want to make sure we're maximizing what lands on disk
> while still minimizing the refault time. We could take the approach
> of always writing back huge pages, but then we may end up with very
> quick refaults, which would be a huge waste of time. So idle writeback
> is a must for us, and being able to write back the pages which have
> maximum value (huge) would be very useful
> "
>
> >
> > Signed-off-by: Brian Geffon <bgeffon@google.com>
>
> Other than that, feel free to add my
> Acked-by: Minchan Kim <minchan@kernel.org>

Thanks Minchan.

>
> Thanks.

* [PATCH v2] zram: Add a huge_idle writeback mode
  2022-03-15 17:22 [PATCH] zram: Add a huge_idle writeback mode Brian Geffon
                   ` (2 preceding siblings ...)
  2022-03-21 14:50 ` Brian Geffon
@ 2022-03-22 21:58 ` Brian Geffon
  3 siblings, 0 replies; 12+ messages in thread
From: Brian Geffon @ 2022-03-22 21:58 UTC
  To: Andrew Morton
  Cc: Minchan Kim, Nitin Gupta, Sergey Senozhatsky, linux-kernel,
	linux-doc, linux-block, Brian Geffon

Today it's only possible to write back as a page, idle, or huge.
A user might want to writeback pages which are huge and idle first
as these idle pages do not require decompression and make a good
first pass for writeback.

Idle writeback specifically has the advantage that a refault is
unlikely given that the page has been swapped for some amount of
time without being refaulted.

Huge writeback has the advantage that you're guaranteed to get
the maximum benefit from a single page writeback, that is, you're
reclaiming one full page of memory. Pages which are compressed in
zram being written back result in some benefit which is always
less than a page size because of the fact that it was compressed.

The primary use of this is for minimizing refaults in situations
where the device has to be sensitive to storage endurance. On
ChromeOS we have devices with slow eMMC and repeated writes and
refaults can negatively affect performance and endurance.

Signed-off-by: Brian Geffon <bgeffon@google.com>
Acked-by: Minchan Kim <minchan@kernel.org>
---
 Documentation/admin-guide/blockdev/zram.rst |  5 +++++
 drivers/block/zram/zram_drv.c               | 10 ++++++----
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/Documentation/admin-guide/blockdev/zram.rst b/Documentation/admin-guide/blockdev/zram.rst
index 54fe63745ed8..c73b16930449 100644
--- a/Documentation/admin-guide/blockdev/zram.rst
+++ b/Documentation/admin-guide/blockdev/zram.rst
@@ -343,6 +343,11 @@ Admin can request writeback of those idle pages at right timing via::
 
 With the command, zram will writeback idle pages from memory to the storage.
 
+Additionally, if a user chooses to write back only huge and idle pages,
+this can be accomplished with::
+
+        echo huge_idle > /sys/block/zramX/writeback
+
 If an admin wants to write a specific page in zram device to the backing device,
 they could write a page index into the interface.
 
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index e9474b02012d..8562a7cce558 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -639,8 +639,8 @@ static int read_from_bdev_async(struct zram *zram, struct bio_vec *bvec,
 #define PAGE_WB_SIG "page_index="
 
 #define PAGE_WRITEBACK 0
-#define HUGE_WRITEBACK 1
-#define IDLE_WRITEBACK 2
+#define HUGE_WRITEBACK (1<<0)
+#define IDLE_WRITEBACK (1<<1)
 
 
 static ssize_t writeback_store(struct device *dev,
@@ -660,6 +660,8 @@ static ssize_t writeback_store(struct device *dev,
 		mode = IDLE_WRITEBACK;
 	else if (sysfs_streq(buf, "huge"))
 		mode = HUGE_WRITEBACK;
+	else if (sysfs_streq(buf, "huge_idle"))
+		mode = IDLE_WRITEBACK | HUGE_WRITEBACK;
 	else {
 		if (strncmp(buf, PAGE_WB_SIG, sizeof(PAGE_WB_SIG) - 1))
 			return -EINVAL;
@@ -721,10 +723,10 @@ static ssize_t writeback_store(struct device *dev,
 				zram_test_flag(zram, index, ZRAM_UNDER_WB))
 			goto next;
 
-		if (mode == IDLE_WRITEBACK &&
+		if (mode & IDLE_WRITEBACK &&
 			  !zram_test_flag(zram, index, ZRAM_IDLE))
 			goto next;
-		if (mode == HUGE_WRITEBACK &&
+		if (mode & HUGE_WRITEBACK &&
 			  !zram_test_flag(zram, index, ZRAM_HUGE))
 			goto next;
 		/*
-- 
2.35.1.894.gb6a874cedc-goog
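
The documentation hunk above drives the new mode with `echo`. The same request can be made from a C program; the following is a minimal sketch, assuming the attribute lives at a path such as `/sys/block/zram0/writeback` (the patch only names `zramX`). `request_writeback()` is a hypothetical helper, not part of any kernel or libc API.

```c
#include <stdio.h>

/*
 * Hypothetical helper: write a writeback mode string such as "huge_idle"
 * to a zram writeback attribute. Returns 0 on success, -1 on failure.
 */
static int request_writeback(const char *path, const char *mode)
{
	FILE *f = fopen(path, "w");
	int ret = 0;

	if (!f)
		return -1;
	if (fputs(mode, f) == EOF)
		ret = -1;
	/* stdio buffers the data, so a rejected write may only show up here. */
	if (fclose(f) == EOF)
		ret = -1;
	return ret;
}
```

A caller would use something like `request_writeback("/sys/block/zram0/writeback", "huge_idle")`. Since the driver returns `-EINVAL` for an unrecognised mode string, checking the return value is one way to notice a kernel that predates this patch.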


Thread overview: 12+ messages
2022-03-15 17:22 [PATCH] zram: Add a huge_idle writeback mode Brian Geffon
2022-03-15 17:28 ` Matthew Wilcox
2022-03-15 17:34   ` Brian Geffon
2022-03-15 17:44     ` Matthew Wilcox
2022-03-16  0:01       ` Brian Geffon
2022-03-18 16:41 ` Minchan Kim
2022-03-18 16:51   ` Brian Geffon
2022-03-18 17:30     ` Minchan Kim
2022-03-21 14:50 ` Brian Geffon
2022-03-22 21:13   ` Minchan Kim
2022-03-22 21:52     ` Brian Geffon
2022-03-22 21:58 ` [PATCH v2] " Brian Geffon
