zsmalloc: consider ZS_ALMOST_FULL as migrate source

Message ID 1436491929-6617-1-git-send-email-minchan@kernel.org
State New, archived

Commit Message

Minchan Kim July 10, 2015, 1:32 a.m. UTC
There is no reason to prevent selecting ZS_ALMOST_FULL as a migration
source if we cannot find a source in ZS_ALMOST_EMPTY.

With this patch, zs_can_compact will return a more exact result.

Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 mm/zsmalloc.c |   19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)
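
For reference, the loop the patch adds walks class->fullness_list from
ZS_ALMOST_EMPTY down to ZS_ALMOST_FULL; that relies on how the fullness
groups are ordered in mm/zsmalloc.c of this era, sketched below
(reconstructed for context, not part of the posted patch):

enum fullness_group {
	ZS_ALMOST_FULL,		/* lowest value: the loop stops here */
	ZS_ALMOST_EMPTY,	/* the loop starts here and counts down */
	_ZS_NR_FULLNESS_GROUPS,	/* size of class->fullness_list[] */

	ZS_EMPTY,
	ZS_FULL
};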

Comments

Sergey Senozhatsky July 10, 2015, 1:58 a.m. UTC | #1
On (07/10/15 10:32), Minchan Kim wrote:
> There is no reason to prevent selecting ZS_ALMOST_FULL as a migration
> source if we cannot find a source in ZS_ALMOST_EMPTY.
> 
> With this patch, zs_can_compact will return a more exact result.
> 

wouldn't that be too aggressive?

draining 'only ZS_ALMOST_EMPTY classes' sounds safer than draining
'ZS_ALMOST_EMPTY and ZS_ALMOST_FULL classes'. you seemed to be worried
that compaction can leave no unused objects in classes, which will
result in zspage allocation happening right after compaction. it looks
like the chances of causing a zspage allocation are even higher here.
don't you think so?

> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>  mm/zsmalloc.c |   19 ++++++++++++-------
>  1 file changed, 12 insertions(+), 7 deletions(-)
> 
> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> index 8c78bcb..7bd7dde 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -1687,12 +1687,20 @@ static enum fullness_group putback_zspage(struct zs_pool *pool,
>  static struct page *isolate_source_page(struct size_class *class)
>  {
>  	struct page *page;
> +	int i;
> +	bool found = false;
>  
> -	page = class->fullness_list[ZS_ALMOST_EMPTY];
> -	if (page)
> -		remove_zspage(page, class, ZS_ALMOST_EMPTY);
> +	for (i = ZS_ALMOST_EMPTY; i >= ZS_ALMOST_FULL; i--) {
> +		page = class->fullness_list[i];
> +		if (!page)
> +			continue;
>  
> -	return page;
> +		remove_zspage(page, class, i);
> +		found = true;
> +		break;
> +	}
> +
> +	return found ? page : NULL;
>  }
>  
>  /*
> @@ -1706,9 +1714,6 @@ static unsigned long zs_can_compact(struct size_class *class)
>  {
>  	unsigned long obj_wasted;
>  
> -	if (!zs_stat_get(class, CLASS_ALMOST_EMPTY))
> -		return 0;
> -

well, you asked to add this check like a week or two ago (it's not even
in -next yet) and now you remove it.

>  	obj_wasted = zs_stat_get(class, OBJ_ALLOCATED) -
>  		zs_stat_get(class, OBJ_USED);
>  

	-ss
Sergey Senozhatsky July 10, 2015, 2:06 a.m. UTC | #2
On (07/10/15 10:32), Minchan Kim wrote:
>  static struct page *isolate_source_page(struct size_class *class)
>  {
>  	struct page *page;
> +	int i;
> +	bool found = false;
>  

why use 'bool found'? just return `page', which will be either NULL
or !NULL?

	-ss

> -	page = class->fullness_list[ZS_ALMOST_EMPTY];
> -	if (page)
> -		remove_zspage(page, class, ZS_ALMOST_EMPTY);
> +	for (i = ZS_ALMOST_EMPTY; i >= ZS_ALMOST_FULL; i--) {
> +		page = class->fullness_list[i];
> +		if (!page)
> +			continue;
>  
> -	return page;
> +		remove_zspage(page, class, i);
> +		found = true;
> +		break;
> +	}
> +
> +	return found ? page : NULL;
>  }

	-ss
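
A minimal sketch of the simplification suggested above (initialize page
to NULL and return it directly, making the 'found' flag unnecessary);
this only illustrates the suggestion and is not the resent patch:

static struct page *isolate_source_page(struct size_class *class)
{
	int i;
	struct page *page = NULL;

	/* prefer ZS_ALMOST_EMPTY, fall back to ZS_ALMOST_FULL */
	for (i = ZS_ALMOST_EMPTY; i >= ZS_ALMOST_FULL; i--) {
		page = class->fullness_list[i];
		if (!page)
			continue;

		remove_zspage(page, class, i);
		break;
	}

	/* NULL when neither list had a zspage to migrate from */
	return page;
}
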
Minchan Kim July 10, 2015, 2:29 a.m. UTC | #3
Hi Sergey,

On Fri, Jul 10, 2015 at 10:58:28AM +0900, Sergey Senozhatsky wrote:
> On (07/10/15 10:32), Minchan Kim wrote:
> > There is no reason to prevent selecting ZS_ALMOST_FULL as a migration
> > source if we cannot find a source in ZS_ALMOST_EMPTY.
> > 
> > With this patch, zs_can_compact will return a more exact result.
> > 
> 
> wouldn't that be too aggressive?
> 
> draining 'only ZS_ALMOST_EMPTY classes' sounds safer than draining
> 'ZS_ALMOST_EMPTY and ZS_ALMOST_FULL classes'. you seemed to be worried
> that compaction can leave no unused objects in classes, which will
> result in zspage allocation happening right after compaction. it looks
> like the chances of causing a zspage allocation are even higher here.
> don't you think so?

Good question.

My worry was failure of order-0 page allocation in the zram-swap path
when memory pressure is really heavy, but I stopped insisting on it
at some point. The reasons I changed my mind are:

1. The system is almost dead if there are no order-0 pages.
2. Even if the old behavior works well, that's not by design, just luck.


> 
> > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > ---
> >  mm/zsmalloc.c |   19 ++++++++++++-------
> >  1 file changed, 12 insertions(+), 7 deletions(-)
> > 
> > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> > index 8c78bcb..7bd7dde 100644
> > --- a/mm/zsmalloc.c
> > +++ b/mm/zsmalloc.c
> > @@ -1687,12 +1687,20 @@ static enum fullness_group putback_zspage(struct zs_pool *pool,
> >  static struct page *isolate_source_page(struct size_class *class)
> >  {
> >  	struct page *page;
> > +	int i;
> > +	bool found = false;
> >  
> > -	page = class->fullness_list[ZS_ALMOST_EMPTY];
> > -	if (page)
> > -		remove_zspage(page, class, ZS_ALMOST_EMPTY);
> > +	for (i = ZS_ALMOST_EMPTY; i >= ZS_ALMOST_FULL; i--) {
> > +		page = class->fullness_list[i];
> > +		if (!page)
> > +			continue;
> >  
> > -	return page;
> > +		remove_zspage(page, class, i);
> > +		found = true;
> > +		break;
> > +	}
> > +
> > +	return found ? page : NULL;
> >  }
> >  
> >  /*
> > @@ -1706,9 +1714,6 @@ static unsigned long zs_can_compact(struct size_class *class)
> >  {
> >  	unsigned long obj_wasted;
> >  
> > -	if (!zs_stat_get(class, CLASS_ALMOST_EMPTY))
> > -		return 0;
> > -
> 
> well, you asked to add this check like a week or two ago (it's not even
> in -next yet) and now you remove it.

The reason I wanted to check CLASS_ALMOST_EMPTY was to make zs_can_compact exact.
But with this patch, we can achieve that without the check above.
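
For context, a sketch of how zs_can_compact() reads with the early
return gone; only the obj_wasted computation is quoted from the patch,
the tail is reconstructed here, so treat it as approximate:

static unsigned long zs_can_compact(struct size_class *class)
{
	unsigned long obj_wasted;

	/* objects backed by zspages but not handed out to users */
	obj_wasted = zs_stat_get(class, OBJ_ALLOCATED) -
		zs_stat_get(class, OBJ_USED);

	/* how many whole zspages' worth of objects that waste amounts to */
	obj_wasted /= get_maxobj_per_zspage(class->size,
			class->pages_per_zspage);

	/* expressed as the number of pages compaction could free */
	return obj_wasted * class->pages_per_zspage;
}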

> 
> >  	obj_wasted = zs_stat_get(class, OBJ_ALLOCATED) -
> >  		zs_stat_get(class, OBJ_USED);
> >  
> 
> 	-ss
Minchan Kim July 10, 2015, 2:34 a.m. UTC | #4
On Fri, Jul 10, 2015 at 11:06:24AM +0900, Sergey Senozhatsky wrote:
> On (07/10/15 10:32), Minchan Kim wrote:
> >  static struct page *isolate_source_page(struct size_class *class)
> >  {
> >  	struct page *page;
> > +	int i;
> > +	bool found = false;
> >  
> 
> why use 'bool found'? just return `page', which will be either NULL
> or !NULL?

It seems I sent my old version, which had a bug during testing. :(
I will resend with the fix.

Thanks, Sergey!

> 
> 	-ss
> 
> > -	page = class->fullness_list[ZS_ALMOST_EMPTY];
> > -	if (page)
> > -		remove_zspage(page, class, ZS_ALMOST_EMPTY);
> > +	for (i = ZS_ALMOST_EMPTY; i >= ZS_ALMOST_FULL; i--) {
> > +		page = class->fullness_list[i];
> > +		if (!page)
> > +			continue;
> >  
> > -	return page;
> > +		remove_zspage(page, class, i);
> > +		found = true;
> > +		break;
> > +	}
> > +
> > +	return found ? page : NULL;
> >  }
> 
> 	-ss
Sergey Senozhatsky July 10, 2015, 4:19 a.m. UTC | #5
On (07/10/15 11:29), Minchan Kim wrote:
> Good question.
> 
> My worry was failure of order-0 page allocation in the zram-swap path
> when memory pressure is really heavy, but I stopped insisting on it
> at some point. The reasons I changed my mind are:
> 
> 1. The system is almost dead if there are no order-0 pages.
> 2. Even if the old behavior works well, that's not by design, just luck.

I mean, I find your argument that some level of fragmentation
can be useful to be valid, to some degree.


hm... by the way,

unsigned long zs_malloc(struct zs_pool *pool, size_t size)
{
...
   size += ZS_HANDLE_SIZE;
   class = pool->size_class[get_size_class_index(size)];
...
   if (!first_page) {
	   spin_unlock(&class->lock);
	   first_page = alloc_zspage(class, pool->flags);
	   if (unlikely(!first_page)) {
		   free_handle(pool, handle);
		   return 0;
	   }
   ...

I'm thinking now, does it make sense to try harder here? if
alloc_zspage() failed, then maybe we can try any of the unused
objects from an 'upper' (larger/next) class? there might be
plenty of them.
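
A rough sketch of that fallback idea, not part of the posted patch; the
helper name is hypothetical, and it assumes this era's zs_size_classes
global and class->index layout:

/*
 * Hypothetical helper: after alloc_zspage() fails for 'failed', look
 * for a larger size class that still has zspages with free objects.
 */
static struct size_class *find_fallback_class(struct zs_pool *pool,
					      struct size_class *failed)
{
	int i;

	for (i = failed->index + 1; i < zs_size_classes; i++) {
		struct size_class *class = pool->size_class[i];

		/* merged size classes share one struct; skip the aliases */
		if (class->index != i)
			continue;

		/* zspages on these lists still have unused objects */
		if (class->fullness_list[ZS_ALMOST_EMPTY] ||
		    class->fullness_list[ZS_ALMOST_FULL])
			return class;
	}

	return NULL;
}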

	-ss
Minchan Kim July 10, 2015, 5:21 a.m. UTC | #6
On Fri, Jul 10, 2015 at 01:19:29PM +0900, Sergey Senozhatsky wrote:
> On (07/10/15 11:29), Minchan Kim wrote:
> > Good question.
> > 
> > My worry was failure of order-0 page allocation in the zram-swap path
> > when memory pressure is really heavy, but I stopped insisting on it
> > at some point. The reasons I changed my mind are:
> > 
> > 1. The system is almost dead if there are no order-0 pages.
> > 2. Even if the old behavior works well, that's not by design, just luck.
> 
> I mean, I find your argument that some level of fragmentation
> can be useful to be valid, to some degree.

The benefit I had in mind was to prevent failure of allocation.

> 
> 
> hm... by the way,
> 
> unsigned long zs_malloc(struct zs_pool *pool, size_t size)
> {
> ...
>    size += ZS_HANDLE_SIZE;
>    class = pool->size_class[get_size_class_index(size)];
> ...
>    if (!first_page) {
> 	   spin_unlock(&class->lock);
> 	   first_page = alloc_zspage(class, pool->flags);
> 	   if (unlikely(!first_page)) {
> 		   free_handle(pool, handle);
> 		   return 0;
> 	   }
>    ...
> 
> I'm thinking now, does it make sense to try harder here? if
> alloc_zspage() failed, then maybe we can try any of the unused
> objects from an 'upper' (larger/next) class? there might be
> plenty of them.

I actually thought about that, but I haven't had any such report from
the community or from my company's product division until now.
But with auto-compaction, the chance would be higher than before,
so let's keep an eye on it (I think users can spot it easily because
the swap layer reports a write failure).

If it happens (i.e., someone reports it), we could try to compact
first and then, if that fails, fall back to an upper class as a last
resort.

Thanks.
> 
> 	-ss
Sergey Senozhatsky July 10, 2015, 5:34 a.m. UTC | #7
On (07/10/15 14:21), Minchan Kim wrote:
> > I mean, I find your argument that some level of fragmentation
> > can be useful to be valid, to some degree.
> 
> The benefit I had in mind was to prevent failure of allocation.
> 

Sure. I tested the patch.

cat /sys/block/zram0/mm_stat
3122102272 2882639758 2890366976        0 2969432064       55    79294

cat /sys/block/zram0/stat
    7212        0    57696       73  7513254        0 60106032    52096     0    52106    52113

Compaction stats:

[14637.002961] compaction nr:89 (full:528 part:3027)  ~= 0.148

Nothing `alarming'.


> > I'm thinking now, does it make sense to try harder here? if
> > alloc_zspage() failed, then maybe we can try any of the unused
> > objects from an 'upper' (larger/next) class? there might be
> > plenty of them.
> 
> I actually thought about that, but I haven't had any such report from
> the community or from my company's product division until now.
> But with auto-compaction, the chance would be higher than before,
> so let's keep an eye on it (I think users can spot it easily because
> the swap layer reports a write failure).
> 
> If it happens (i.e., someone reports it), we could try to compact
> first and then, if that fails, fall back to an upper class as a last
> resort.
> 

OK.

	-ss

Patch

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 8c78bcb..7bd7dde 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1687,12 +1687,20 @@  static enum fullness_group putback_zspage(struct zs_pool *pool,
 static struct page *isolate_source_page(struct size_class *class)
 {
 	struct page *page;
+	int i;
+	bool found = false;
 
-	page = class->fullness_list[ZS_ALMOST_EMPTY];
-	if (page)
-		remove_zspage(page, class, ZS_ALMOST_EMPTY);
+	for (i = ZS_ALMOST_EMPTY; i >= ZS_ALMOST_FULL; i--) {
+		page = class->fullness_list[i];
+		if (!page)
+			continue;
 
-	return page;
+		remove_zspage(page, class, i);
+		found = true;
+		break;
+	}
+
+	return found ? page : NULL;
 }
 
 /*
@@ -1706,9 +1714,6 @@  static unsigned long zs_can_compact(struct size_class *class)
 {
 	unsigned long obj_wasted;
 
-	if (!zs_stat_get(class, CLASS_ALMOST_EMPTY))
-		return 0;
-
 	obj_wasted = zs_stat_get(class, OBJ_ALLOCATED) -
 		zs_stat_get(class, OBJ_USED);