* [PATCH 0/3] zsmalloc: small compaction improvements @ 2015-07-11 9:45 Sergey Senozhatsky 2015-07-11 9:45 ` [PATCH 1/3] zsmalloc: factor out zs_pages_to_compact() Sergey Senozhatsky ` (3 more replies) 0 siblings, 4 replies; 15+ messages in thread From: Sergey Senozhatsky @ 2015-07-11 9:45 UTC (permalink / raw) To: Minchan Kim Cc: Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky, Sergey Senozhatsky Hello, First two patches introduce new zsmalloc zs_pages_to_compact() symbol and change zram's `compact' sysfs attribute to be read-write: -- write triggers compaction, no changes -- read returns the number of pages that compaction can potentially free This lets user space to make a bit better decisions and to avoid unneeded (which will not result in any significant memory savings) compaction calls: Example: if [ `cat /sys/block/zram<id>/compact` -gt 10 ]; then echo 1 > /sys/block/zram<id>/compact; fi Up until now user space could not tell whether compaction will result in any gain. The third patch removes class locking around zs_can_compact() in zs_pages_to_compact(), the motivation and details are provided in the commit message. Sergey Senozhatsky (3): zsmalloc: factor out zs_pages_to_compact() zram: make compact a read-write sysfs node zsmalloc: do not take class lock in zs_pages_to_compact() Documentation/ABI/testing/sysfs-block-zram | 7 +++--- Documentation/blockdev/zram.txt | 4 +++- drivers/block/zram/zram_drv.c | 16 ++++++++++++- include/linux/zsmalloc.h | 1 + mm/zsmalloc.c | 37 +++++++++++++++++------------- 5 files changed, 44 insertions(+), 21 deletions(-) -- 2.4.5 ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 1/3] zsmalloc: factor out zs_pages_to_compact() 2015-07-11 9:45 [PATCH 0/3] zsmalloc: small compaction improvements Sergey Senozhatsky @ 2015-07-11 9:45 ` Sergey Senozhatsky 2015-07-11 9:45 ` [PATCH 2/3] zram: make compact a read-write sysfs node Sergey Senozhatsky ` (2 subsequent siblings) 3 siblings, 0 replies; 15+ messages in thread From: Sergey Senozhatsky @ 2015-07-11 9:45 UTC (permalink / raw) To: Minchan Kim Cc: Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky, Sergey Senozhatsky Factor out the code that calculates how many pages compaction can free into zs_pages_to_compact() function and export it as zsmalloc API symbol. We still use it in zs_shrinker_count(), just like we did before, and at the same time we now let zram know this number (and provide it to user space) so user space can make better assumptions about manual compaction effectiveness. Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> --- include/linux/zsmalloc.h | 1 + mm/zsmalloc.c | 39 +++++++++++++++++++++++---------------- 2 files changed, 24 insertions(+), 16 deletions(-) diff --git a/include/linux/zsmalloc.h b/include/linux/zsmalloc.h index 6398dfa..8f4de78 100644 --- a/include/linux/zsmalloc.h +++ b/include/linux/zsmalloc.h @@ -53,6 +53,7 @@ void zs_unmap_object(struct zs_pool *pool, unsigned long handle); unsigned long zs_get_total_pages(struct zs_pool *pool); unsigned long zs_compact(struct zs_pool *pool); +unsigned long zs_pages_to_compact(struct zs_pool *pool); void zs_pool_stats(struct zs_pool *pool, struct zs_pool_stats *stats); #endif diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c index c10885c..b10a228 100644 --- a/mm/zsmalloc.c +++ b/mm/zsmalloc.c @@ -1798,6 +1798,28 @@ void zs_pool_stats(struct zs_pool *pool, struct zs_pool_stats *stats) } EXPORT_SYMBOL_GPL(zs_pool_stats); +unsigned long zs_pages_to_compact(struct zs_pool *pool) +{ + unsigned long pages_to_free = 0; + int i; + struct size_class *class; + + for (i = zs_size_classes - 1; i >= 0; i--) { + class = pool->size_class[i]; + if (!class) + continue; + if (class->index != i) + continue; + + spin_lock(&class->lock); + pages_to_free += zs_can_compact(class); + spin_unlock(&class->lock); + } + + return pages_to_free; +} +EXPORT_SYMBOL_GPL(zs_pages_to_compact); + static unsigned long zs_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc) { @@ -1819,28 +1841,13 @@ static unsigned long zs_shrinker_scan(struct shrinker *shrinker, static unsigned long zs_shrinker_count(struct shrinker *shrinker, struct shrink_control *sc) { - int i; - struct size_class *class; - unsigned long pages_to_free = 0; struct zs_pool *pool = container_of(shrinker, struct zs_pool, shrinker); if (!pool->shrinker_enabled) return 0; - for (i = zs_size_classes - 1; i >= 0; i--) { - class = pool->size_class[i]; - if (!class) - continue; - if (class->index != i) - continue; - - spin_lock(&class->lock); - pages_to_free += zs_can_compact(class); - spin_unlock(&class->lock); - } - - return pages_to_free; + return zs_pages_to_compact(pool); } static void zs_unregister_shrinker(struct zs_pool *pool) -- 2.4.5 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 2/3] zram: make compact a read-write sysfs node 2015-07-11 9:45 [PATCH 0/3] zsmalloc: small compaction improvements Sergey Senozhatsky 2015-07-11 9:45 ` [PATCH 1/3] zsmalloc: factor out zs_pages_to_compact() Sergey Senozhatsky @ 2015-07-11 9:45 ` Sergey Senozhatsky 2015-07-11 9:45 ` [PATCH 3/3] zsmalloc: do not take class lock in zs_pages_to_compact() Sergey Senozhatsky 2015-07-13 23:36 ` [PATCH 0/3] zsmalloc: small compaction improvements Minchan Kim 3 siblings, 0 replies; 15+ messages in thread From: Sergey Senozhatsky @ 2015-07-11 9:45 UTC (permalink / raw) To: Minchan Kim Cc: Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky, Sergey Senozhatsky Change zram's `compact' sysfs node to be a read-write attribute. Write triggers zsmalloc compaction, just as before, read returns the number of pages that zsmalloc can potentially compact. User space now has a chance to estimate possible compaction memory savings and avoid unnecessary compactions. Example: if [ `cat /sys/block/zram<id>/compact` -gt 10 ]; then echo 1 > /sys/block/zram<id>/compact; fi Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> --- Documentation/ABI/testing/sysfs-block-zram | 7 ++++--- Documentation/blockdev/zram.txt | 4 +++- drivers/block/zram/zram_drv.c | 16 +++++++++++++++- 3 files changed, 22 insertions(+), 5 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-block-zram b/Documentation/ABI/testing/sysfs-block-zram index 2e69e83..0093998 100644 --- a/Documentation/ABI/testing/sysfs-block-zram +++ b/Documentation/ABI/testing/sysfs-block-zram @@ -146,9 +146,10 @@ What: /sys/block/zram<id>/compact Date: August 2015 Contact: Minchan Kim <minchan@kernel.org> Description: - The compact file is write-only and trigger compaction for - allocator zrm uses. The allocator moves some objects so that - it could free fragment space. + The compact file is read/write. Write triggers underlying + allocator's memory compaction, which may result in memory + savings. Read returns the number of pages that compaction + can potentially (but not guaranteed to) free. What: /sys/block/zram<id>/io_stat Date: August 2015 diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt index 62435bb..1854f62 100644 --- a/Documentation/blockdev/zram.txt +++ b/Documentation/blockdev/zram.txt @@ -146,7 +146,9 @@ mem_limit RW the maximum amount of memory ZRAM can use to store the compressed data pages_compacted RO the number of pages freed during compaction (available only via zram<id>/mm_stat node) -compact WO trigger memory compaction +compact RW write triggers memory compaction, read shows how many + pages can potentially (but not necessarily will) be + compacted WARNING ======= diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c index f5ef9e0..def9b8a 100644 --- a/drivers/block/zram/zram_drv.c +++ b/drivers/block/zram/zram_drv.c @@ -404,6 +404,20 @@ static ssize_t compact_store(struct device *dev, return len; } +static ssize_t compact_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct zram *zram = dev_to_zram(dev); + unsigned long num_pages = 0; + + down_read(&zram->init_lock); + if (init_done(zram)) + num_pages = zs_pages_to_compact(zram->meta->mem_pool); + up_read(&zram->init_lock); + + return scnprintf(buf, PAGE_SIZE, "%lu\n", num_pages); +} + static ssize_t io_stat_show(struct device *dev, struct device_attribute *attr, char *buf) { @@ -1145,7 +1159,7 @@ static const struct block_device_operations zram_devops = { .owner = THIS_MODULE }; -static DEVICE_ATTR_WO(compact); +static DEVICE_ATTR_RW(compact); static DEVICE_ATTR_RW(disksize); static DEVICE_ATTR_RO(initstate); static DEVICE_ATTR_WO(reset); -- 2.4.5 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 3/3] zsmalloc: do not take class lock in zs_pages_to_compact() 2015-07-11 9:45 [PATCH 0/3] zsmalloc: small compaction improvements Sergey Senozhatsky 2015-07-11 9:45 ` [PATCH 1/3] zsmalloc: factor out zs_pages_to_compact() Sergey Senozhatsky 2015-07-11 9:45 ` [PATCH 2/3] zram: make compact a read-write sysfs node Sergey Senozhatsky @ 2015-07-11 9:45 ` Sergey Senozhatsky 2015-07-15 4:07 ` Sergey Senozhatsky 2015-07-13 23:36 ` [PATCH 0/3] zsmalloc: small compaction improvements Minchan Kim 3 siblings, 1 reply; 15+ messages in thread From: Sergey Senozhatsky @ 2015-07-11 9:45 UTC (permalink / raw) To: Minchan Kim Cc: Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky, Sergey Senozhatsky We can avoid taking class ->lock around zs_can_compact() in zs_pages_to_compact(), because the number that we return back is outdated in general case, by design. We have different source that are able to change class's state right after we return from zs_can_compact() -- ongoing IO operations, manually triggered compaction or automatic compaction, or all three simultaneously. We re-do this calculations during compaction on a per class basis anyway. zs_unregister_shrinker() will not return until we have an active shrinker, so classes won't unexpectedly disappear while zs_pages_to_compact(), invoked by zs_shrinker_count(), iterates them. When called from zram, we are protected by zram's ->init_lock, so, again, classes will be there until zs_pages_to_compact() iterates them. Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> --- mm/zsmalloc.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c index b10a228..824c182 100644 --- a/mm/zsmalloc.c +++ b/mm/zsmalloc.c @@ -1811,9 +1811,7 @@ unsigned long zs_pages_to_compact(struct zs_pool *pool) if (class->index != i) continue; - spin_lock(&class->lock); pages_to_free += zs_can_compact(class); - spin_unlock(&class->lock); } return pages_to_free; -- 2.4.5 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 3/3] zsmalloc: do not take class lock in zs_pages_to_compact() 2015-07-11 9:45 ` [PATCH 3/3] zsmalloc: do not take class lock in zs_pages_to_compact() Sergey Senozhatsky @ 2015-07-15 4:07 ` Sergey Senozhatsky 2015-07-15 23:38 ` Minchan Kim 0 siblings, 1 reply; 15+ messages in thread From: Sergey Senozhatsky @ 2015-07-15 4:07 UTC (permalink / raw) To: Minchan Kim Cc: Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky, Sergey Senozhatsky On (07/11/15 18:45), Sergey Senozhatsky wrote: [..] > We re-do this calculations during compaction on a per class basis > anyway. > > zs_unregister_shrinker() will not return until we have an active > shrinker, so classes won't unexpectedly disappear while > zs_pages_to_compact(), invoked by zs_shrinker_count(), iterates > them. > > When called from zram, we are protected by zram's ->init_lock, > so, again, classes will be there until zs_pages_to_compact() > iterates them. > > Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> > --- > mm/zsmalloc.c | 2 -- > 1 file changed, 2 deletions(-) > > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c > index b10a228..824c182 100644 > --- a/mm/zsmalloc.c > +++ b/mm/zsmalloc.c > @@ -1811,9 +1811,7 @@ unsigned long zs_pages_to_compact(struct zs_pool *pool) > if (class->index != i) > continue; > > - spin_lock(&class->lock); > pages_to_free += zs_can_compact(class); > - spin_unlock(&class->lock); > } > > return pages_to_free; This patch still makes sense. Agree? -ss ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 3/3] zsmalloc: do not take class lock in zs_pages_to_compact() 2015-07-15 4:07 ` Sergey Senozhatsky @ 2015-07-15 23:38 ` Minchan Kim 2015-07-15 23:59 ` Sergey Senozhatsky 0 siblings, 1 reply; 15+ messages in thread From: Minchan Kim @ 2015-07-15 23:38 UTC (permalink / raw) To: Sergey Senozhatsky Cc: Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky Hi Sergey, On Wed, Jul 15, 2015 at 01:07:03PM +0900, Sergey Senozhatsky wrote: > On (07/11/15 18:45), Sergey Senozhatsky wrote: > [..] > > We re-do this calculations during compaction on a per class basis > > anyway. > > > > zs_unregister_shrinker() will not return until we have an active > > shrinker, so classes won't unexpectedly disappear while > > zs_pages_to_compact(), invoked by zs_shrinker_count(), iterates > > them. > > > > When called from zram, we are protected by zram's ->init_lock, > > so, again, classes will be there until zs_pages_to_compact() > > iterates them. > > > > Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> > > --- > > mm/zsmalloc.c | 2 -- > > 1 file changed, 2 deletions(-) > > > > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c > > index b10a228..824c182 100644 > > --- a/mm/zsmalloc.c > > +++ b/mm/zsmalloc.c > > @@ -1811,9 +1811,7 @@ unsigned long zs_pages_to_compact(struct zs_pool *pool) > > if (class->index != i) > > continue; > > > > - spin_lock(&class->lock); > > pages_to_free += zs_can_compact(class); > > - spin_unlock(&class->lock); > > } > > > > return pages_to_free; > > This patch still makes sense. Agree? There is already race window between shrink_count and shrink_slab so it would be okay if we return stale stat with removing the lock if the difference is not huge. Even, now we don't obey nr_to_scan of shrinker in zs_shrinker_scan so the such accuracy would be pointless. Please resend the patch and correct zs_can_compact's comment. Thanks. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 3/3] zsmalloc: do not take class lock in zs_pages_to_compact() 2015-07-15 23:38 ` Minchan Kim @ 2015-07-15 23:59 ` Sergey Senozhatsky 0 siblings, 0 replies; 15+ messages in thread From: Sergey Senozhatsky @ 2015-07-15 23:59 UTC (permalink / raw) To: Minchan Kim Cc: Sergey Senozhatsky, Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky Hi, On (07/16/15 08:38), Minchan Kim wrote: > > > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c > > > index b10a228..824c182 100644 > > > --- a/mm/zsmalloc.c > > > +++ b/mm/zsmalloc.c > > > @@ -1811,9 +1811,7 @@ unsigned long zs_pages_to_compact(struct zs_pool *pool) > > > if (class->index != i) > > > continue; > > > > > > - spin_lock(&class->lock); > > > pages_to_free += zs_can_compact(class); > > > - spin_unlock(&class->lock); > > > } > > > > > > return pages_to_free; > > > > This patch still makes sense. Agree? > > There is already race window between shrink_count and shrink_slab so > it would be okay if we return stale stat with removing the lock if > the difference is not huge. > > Even, now we don't obey nr_to_scan of shrinker in zs_shrinker_scan > so the such accuracy would be pointless. Yeah, automatic shrinker may work concurrently with the user triggered one, so it may be hard (time consuming) to release the exact amount of pages that we returned from _count(). We can look at `sc->nr_to_reclaim' to avoid releasing more pages than shrinker wants us to release, but I'd probably prefer to keep the existing behaviour if we were called by the shrinker. OK, will resend later today. -ss ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/3] zsmalloc: small compaction improvements 2015-07-11 9:45 [PATCH 0/3] zsmalloc: small compaction improvements Sergey Senozhatsky ` (2 preceding siblings ...) 2015-07-11 9:45 ` [PATCH 3/3] zsmalloc: do not take class lock in zs_pages_to_compact() Sergey Senozhatsky @ 2015-07-13 23:36 ` Minchan Kim 2015-07-14 0:31 ` Sergey Senozhatsky 3 siblings, 1 reply; 15+ messages in thread From: Minchan Kim @ 2015-07-13 23:36 UTC (permalink / raw) To: Sergey Senozhatsky Cc: Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky Hello Sergey, On Sat, Jul 11, 2015 at 06:45:29PM +0900, Sergey Senozhatsky wrote: > Hello, > > First two patches introduce new zsmalloc zs_pages_to_compact() > symbol and change zram's `compact' sysfs attribute to be > read-write: > -- write triggers compaction, no changes > -- read returns the number of pages that compaction can > potentially free > > This lets user space to make a bit better decisions and to > avoid unneeded (which will not result in any significant > memory savings) compaction calls: > > Example: > > if [ `cat /sys/block/zram<id>/compact` -gt 10 ]; then > echo 1 > /sys/block/zram<id>/compact; > fi > > Up until now user space could not tell whether compaction > will result in any gain. First of all, thanks for the looking this. Question: What is motivation? IOW, did you see big overhead by user-triggered compaction? so, do you want to throttle it by userspace? > > The third patch removes class locking around zs_can_compact() > in zs_pages_to_compact(), the motivation and details are > provided in the commit message. > > Sergey Senozhatsky (3): > zsmalloc: factor out zs_pages_to_compact() > zram: make compact a read-write sysfs node > zsmalloc: do not take class lock in zs_pages_to_compact() > > Documentation/ABI/testing/sysfs-block-zram | 7 +++--- > Documentation/blockdev/zram.txt | 4 +++- > drivers/block/zram/zram_drv.c | 16 ++++++++++++- > include/linux/zsmalloc.h | 1 + > mm/zsmalloc.c | 37 +++++++++++++++++------------- > 5 files changed, 44 insertions(+), 21 deletions(-) > > -- > 2.4.5 > -- Kind regards, Minchan Kim ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/3] zsmalloc: small compaction improvements 2015-07-13 23:36 ` [PATCH 0/3] zsmalloc: small compaction improvements Minchan Kim @ 2015-07-14 0:31 ` Sergey Senozhatsky 2015-07-14 0:55 ` Minchan Kim 0 siblings, 1 reply; 15+ messages in thread From: Sergey Senozhatsky @ 2015-07-14 0:31 UTC (permalink / raw) To: Minchan Kim Cc: Sergey Senozhatsky, Andrew Morton, linux-mm, linux-kernel, Sergey Senozhatsky Hello Minchan, On (07/14/15 08:36), Minchan Kim wrote: [..] > > if [ `cat /sys/block/zram<id>/compact` -gt 10 ]; then > > echo 1 > /sys/block/zram<id>/compact; > > fi > > > > Up until now user space could not tell whether compaction > > will result in any gain. > > First of all, thanks for the looking this. > > Question: > > What is motivation? > IOW, did you see big overhead by user-triggered compaction? so, > do you want to throttle it by userspace? It depends on 'big overhead' definition, of course. We don't care that much when compaction is issued by the shrinker, because things are getting bad and we can sacrifice performance. But user triggered compaction on a I/O pressured device can needlessly slow things down, especially now, when we drain ALMOST_FULL classes. /sys/block/zram<id>/compact is a black box. We provide it, we don't throttle it in the kernel, and user space is absolutely clueless when it invokes compaction. From some remote (or alternative) point of view compaction can be seen as "zsmalloc's cache flush" (unused objects make write path quicker - no zspage allocation needed) and it won't hurt to give user space some numbers so it can decide if the whole thing is worth it (that decision is, once again, I/O pattern and setup specific -- some users may be interested in compaction only if it will reduce zsmalloc's memory consumption by, say, 15%). -ss ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/3] zsmalloc: small compaction improvements 2015-07-14 0:31 ` Sergey Senozhatsky @ 2015-07-14 0:55 ` Minchan Kim 2015-07-14 12:29 ` Sergey Senozhatsky 0 siblings, 1 reply; 15+ messages in thread From: Minchan Kim @ 2015-07-14 0:55 UTC (permalink / raw) To: Sergey Senozhatsky Cc: Sergey Senozhatsky, Andrew Morton, linux-mm, linux-kernel On Tue, Jul 14, 2015 at 09:31:32AM +0900, Sergey Senozhatsky wrote: > Hello Minchan, > > On (07/14/15 08:36), Minchan Kim wrote: > [..] > > > if [ `cat /sys/block/zram<id>/compact` -gt 10 ]; then > > > echo 1 > /sys/block/zram<id>/compact; > > > fi > > > > > > Up until now user space could not tell whether compaction > > > will result in any gain. > > > > First of all, thanks for the looking this. > > > > Question: > > > > What is motivation? > > IOW, did you see big overhead by user-triggered compaction? so, > > do you want to throttle it by userspace? > > It depends on 'big overhead' definition, of course. We don't care > that much when compaction is issued by the shrinker, because things > are getting bad and we can sacrifice performance. But user triggered > compaction on a I/O pressured device can needlessly slow things down, > especially now, when we drain ALMOST_FULL classes. You mean performance overhead by additional alloc_pages? If so, you mean ALMOST_EMPTY|ALMOST_FULL, not only ALMOST_FULL? So, it's performance enhance patch? Please give the some number to justify patchset. > > /sys/block/zram<id>/compact is a black box. We provide it, we don't > throttle it in the kernel, and user space is absolutely clueless when > it invokes compaction. From some remote (or alternative) point of But we have zs_can_compact so it can effectively skip the class if it is not proper class. > view compaction can be seen as "zsmalloc's cache flush" (unused objects > make write path quicker - no zspage allocation needed) and it won't > hurt to give user space some numbers so it can decide if the whole > thing is worth it (that decision is, once again, I/O pattern and > setup specific -- some users may be interested in compaction only > if it will reduce zsmalloc's memory consumption by, say, 15%). Again, your claim is performace so I need number. If it's really horrible, I guess below interface makes user handy without peeking nr_can_compact ad doing compact. /* Tell zram to compact if fragment ration is higher 15% */ echo 15% > /sys/block/zram0/compact or echo 15% > /sys/block/zram/compact_condition Anyway, we need a number before starting discussion. Thanks. > > -ss -- Kind regards, Minchan Kim ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/3] zsmalloc: small compaction improvements 2015-07-14 0:55 ` Minchan Kim @ 2015-07-14 12:29 ` Sergey Senozhatsky 2015-07-14 16:52 ` Minchan Kim 0 siblings, 1 reply; 15+ messages in thread From: Sergey Senozhatsky @ 2015-07-14 12:29 UTC (permalink / raw) To: Minchan Kim Cc: Sergey Senozhatsky, Sergey Senozhatsky, Andrew Morton, linux-mm, linux-kernel On (07/14/15 09:55), Minchan Kim wrote: > > It depends on 'big overhead' definition, of course. We don't care > > that much when compaction is issued by the shrinker, because things > > are getting bad and we can sacrifice performance. But user triggered > > compaction on a I/O pressured device can needlessly slow things down, > > especially now, when we drain ALMOST_FULL classes. > > You mean performance overhead by additional alloc_pages? not only performance, but yes, performance mostly. > If so, you mean ALMOST_EMPTY|ALMOST_FULL, not only ALMOST_FULL? of course, I meant your recent patch here. should have been 'we _ALSO_ drain ALMOST_FULL classes' > > So, it's performance enhance patch? > Please give the some number to justify patchset. alrighty... again... > > > > /sys/block/zram<id>/compact is a black box. We provide it, we don't > > throttle it in the kernel, and user space is absolutely clueless when > > it invokes compaction. From some remote (or alternative) point of > > But we have zs_can_compact so it can effectively skip the class if it > is not proper class. user triggered compaction can compact too much. in its current state triggering a compaction from user space is like playing a lottery or a russian roulette. a simple script for i in {1..1000}; do echo -n 'compact... '; cat /sys/block/zram0/compact; echo 1 > /sys/block/zram0/compact; sleep 1; done (and this is not so crazy. love it or not, but this is the only way how user space can use compaction at the moment). the output ... compact... 0 compact... 0 compact... 0 compact... 0 compact... 0 compact... 0 compact... 409 compact... 3550 compact... 0 compact... 0 compact... 0 compact... 2129 compact... 765 compact... 0 compact... 0 compact... 0 compact... 784 compact... 0 compact... 0 compact... 0 compact... 0 ... (f.e., we compacted 3550 pages on device being under I/O pressure. that's quite a lot, don't you think so?). first -- no enforced compaction second -- with enforced compaction ./iozone -t 8 -R -r 4K -s 200M -I +Z w/o w/compaction " Initial write " 549240.49 538710.62 " Rewrite " 2447973.19 2442312.38 " Read " 5533620.69 5611562.00 " Re-read " 5689199.81 4916373.62 " Reverse Read " 4094576.16 5280551.56 " Stride read " 5382067.75 5395350.00 " Random read " 5384945.56 5298079.62 " Mixed workload " 3986770.06 3918897.78 " Random write " 2290869.12 2201346.50 " Pwrite " 502619.36 493527.64 " Pread " 2675312.28 2732118.19 " Fwrite " 4198686.06 3373855.09 " Fread " 18054401.25 17895797.00 > > view compaction can be seen as "zsmalloc's cache flush" (unused objects > > make write path quicker - no zspage allocation needed) and it won't > > hurt to give user space some numbers so it can decide if the whole > > thing is worth it (that decision is, once again, I/O pattern and > > setup specific -- some users may be interested in compaction only > > if it will reduce zsmalloc's memory consumption by, say, 15%). > > Again, your claim is performace so I need number. > If it's really horrible, I guess below interface makes user handy > without peeking nr_can_compact ad doing compact. > > /* Tell zram to compact if fragment ration is higher 15% */ > echo 15% > /sys/block/zram0/compact > or > echo 15% > /sys/block/zram/compact_condition no, this is the least of the things we need to do -- enforced and pre-defined policy engine in zram/zsmalloc 'that will work for all'. user space has almost all the numbers to do it, the only missing bit of the puzzle is `nr_can_compact' number. it's up to user space then to decide how it wishes things to be done. for example: "don't compact if compaction will flush 35% of zsmalloc pages on a I/O pressured device" or something else. -ss ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/3] zsmalloc: small compaction improvements 2015-07-14 12:29 ` Sergey Senozhatsky @ 2015-07-14 16:52 ` Minchan Kim 2015-07-15 0:21 ` Sergey Senozhatsky 0 siblings, 1 reply; 15+ messages in thread From: Minchan Kim @ 2015-07-14 16:52 UTC (permalink / raw) To: Sergey Senozhatsky Cc: Sergey Senozhatsky, Andrew Morton, linux-mm, linux-kernel Hi Sergey, On Tue, Jul 14, 2015 at 09:29:32PM +0900, Sergey Senozhatsky wrote: > On (07/14/15 09:55), Minchan Kim wrote: > > > It depends on 'big overhead' definition, of course. We don't care > > > that much when compaction is issued by the shrinker, because things > > > are getting bad and we can sacrifice performance. But user triggered > > > compaction on a I/O pressured device can needlessly slow things down, > > > especially now, when we drain ALMOST_FULL classes. > > > > You mean performance overhead by additional alloc_pages? > > not only performance, but yes, performance mostly. > > > If so, you mean ALMOST_EMPTY|ALMOST_FULL, not only ALMOST_FULL? > > of course, I meant your recent patch here. should have been 'we _ALSO_ > drain ALMOST_FULL classes' > > > > > So, it's performance enhance patch? > > Please give the some number to justify patchset. > > alrighty... again... > > > > > > > /sys/block/zram<id>/compact is a black box. We provide it, we don't > > > throttle it in the kernel, and user space is absolutely clueless when > > > it invokes compaction. From some remote (or alternative) point of > > > > But we have zs_can_compact so it can effectively skip the class if it > > is not proper class. > > user triggered compaction can compact too much. > in its current state triggering a compaction from user space is like > playing a lottery or a russian roulette. We were on different page. I thought the motivation from this patchset is to prevent compaction overhead by frequent user-driven compaction request because user don't know how they can get free pages by compaction so they should ask compact frequently with blind. That's why I mentioned zs_can_compact which can skip compaction effectively. But your montivation is too much drain so loses performance. First of all, user-triggered compaction should be highly avoided because we introduced auto-compaction now so it should be enough. If auto-compaction has trobule with someone's requirement, we should fix/enhance it rather than enhancing manual compact. For example, if there is some problems in compaction of VM, we don't enhance manual compact(ie, /proc/sys/vm/compact_memory). Instead, we enhance dynamic compaction. If there is some problems in reclaim algorithm, we don't enhance /proc/sys/vm/drop_caches but fixes alrogithm itself. Because it's really hard for user to decide when/how it triggers and even every system doesn't have such userspace program. If user want to compact by manual, we should catch up the voice from them and try to find why they are not enough with auto-compaction and fix it rather than relying on user's decision with manual compaction. Nonetheless, if user decied to trigger manual compaction, he should have a cost(ie, performance lose). It's natural for all caches that performance and cache size is proportional. If you have a concern about too much drain, we should fix auto-compaction which can reclaim pages as many as VM request, rather than every zspages of all classes. Slab shrinker already passes a hint how many objects subsystem should reclaim so we could stop reclaim at some point, not draining all classes. But the problem in this approach is we need balance reclaiming for all classes with round-robin to prevent too much draining from a class, I think it is not hard but I really suspect we need it in this point. > > a simple script > > for i in {1..1000}; do > echo -n 'compact... '; > cat /sys/block/zram0/compact; > echo 1 > /sys/block/zram0/compact; > sleep 1; > done > > (and this is not so crazy. love it or not, but this is the only way > how user space can use compaction at the moment). > > the output > ... > compact... 0 > compact... 0 > compact... 0 > compact... 0 > compact... 0 > compact... 0 > compact... 409 > compact... 3550 > compact... 0 > compact... 0 > compact... 0 > compact... 2129 > compact... 765 > compact... 0 > compact... 0 > compact... 0 > compact... 784 > compact... 0 > compact... 0 > compact... 0 > compact... 0 > ... > > (f.e., we compacted 3550 pages on device being under I/O pressure. > that's quite a lot, don't you think so?). > > first -- no enforced compaction > second -- with enforced compaction > > ./iozone -t 8 -R -r 4K -s 200M -I +Z > > w/o w/compaction > " Initial write " 549240.49 538710.62 > " Rewrite " 2447973.19 2442312.38 > " Read " 5533620.69 5611562.00 > " Re-read " 5689199.81 4916373.62 > " Reverse Read " 4094576.16 5280551.56 > " Stride read " 5382067.75 5395350.00 > " Random read " 5384945.56 5298079.62 > " Mixed workload " 3986770.06 3918897.78 > " Random write " 2290869.12 2201346.50 > " Pwrite " 502619.36 493527.64 > " Pread " 2675312.28 2732118.19 > " Fwrite " 4198686.06 3373855.09 > " Fread " 18054401.25 17895797.00 It seems to be not significant if we consider it's benchmark to focus to reveal performance and even sometime w/compaction is better. Although it might be apparenlty slow, it is expected. Because run-time compaction overhead and small cache size by compaction will absolutely affect performance so user should avoid such manual compaction and relies on cache eviction to VM reclaim. It's true for all caches in current linux kernel. > > > > > view compaction can be seen as "zsmalloc's cache flush" (unused objects > > > make write path quicker - no zspage allocation needed) and it won't > > > hurt to give user space some numbers so it can decide if the whole > > > thing is worth it (that decision is, once again, I/O pattern and > > > setup specific -- some users may be interested in compaction only > > > if it will reduce zsmalloc's memory consumption by, say, 15%). > > > > Again, your claim is performace so I need number. > > If it's really horrible, I guess below interface makes user handy > > without peeking nr_can_compact ad doing compact. > > > > /* Tell zram to compact if fragment ration is higher 15% */ > > echo 15% > /sys/block/zram0/compact > > or > > echo 15% > /sys/block/zram/compact_condition > > no, this is the least of the things we need to do -- enforced and > pre-defined policy engine in zram/zsmalloc 'that will work for all'. > user space has almost all the numbers to do it, the only missing bit > of the puzzle is `nr_can_compact' number. it's up to user space then > to decide how it wishes things to be done. for example: I don't think so. It seems you claims kernel should provide basic stat and user should implement the policy based on it. Normally it might be true but if kernel can do it better, it's better to handle it in kernel space. In case of your interface, every such system should have a userspace manager to control it. IOW, if there is no such program in the system, the feature is void. But with my suggestion, we could make default value(ex, 30 or 15) and user can change it anytime. At least, it will work without any userspace intervention. Other flaw of your interface is we cannot provide right vaule from reading /sys/block/zram0/compact, which is really racy so user's decision is really fragile to work. It's bad for kernel to provide such vague value to the user for his policy decision. NACK. -- Kind regards, Minchan Kim ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/3] zsmalloc: small compaction improvements 2015-07-14 16:52 ` Minchan Kim @ 2015-07-15 0:21 ` Sergey Senozhatsky 2015-07-15 0:24 ` Minchan Kim 0 siblings, 1 reply; 15+ messages in thread From: Sergey Senozhatsky @ 2015-07-15 0:21 UTC (permalink / raw) To: Minchan Kim Cc: Sergey Senozhatsky, Sergey Senozhatsky, Andrew Morton, linux-mm, linux-kernel On (07/15/15 01:52), Minchan Kim wrote: > > alrighty... again... > > > > > > > > > > /sys/block/zram<id>/compact is a black box. We provide it, we don't > > > > throttle it in the kernel, and user space is absolutely clueless when > > > > it invokes compaction. From some remote (or alternative) point of > > > > > > But we have zs_can_compact so it can effectively skip the class if it > > > is not proper class. > > > > user triggered compaction can compact too much. > > in its current state triggering a compaction from user space is like > > playing a lottery or a russian roulette. > > We were on different page. > I thought the motivation from this patchset is to prevent compaction > overhead by frequent user-driven compaction request because user > don't know how they can get free pages by compaction so they should > ask compact frequently with blind. this is exactly the motivation for this patchset. seriously. whatever. -ss ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/3] zsmalloc: small compaction improvements 2015-07-15 0:21 ` Sergey Senozhatsky @ 2015-07-15 0:24 ` Minchan Kim 2015-07-15 11:16 ` Sergey Senozhatsky 0 siblings, 1 reply; 15+ messages in thread From: Minchan Kim @ 2015-07-15 0:24 UTC (permalink / raw) To: Sergey Senozhatsky Cc: Sergey Senozhatsky, Andrew Morton, linux-mm, linux-kernel On Wed, Jul 15, 2015 at 09:21:06AM +0900, Sergey Senozhatsky wrote: > On (07/15/15 01:52), Minchan Kim wrote: > > > alrighty... again... > > > > > > > > > > > > > /sys/block/zram<id>/compact is a black box. We provide it, we don't > > > > > throttle it in the kernel, and user space is absolutely clueless when > > > > > it invokes compaction. From some remote (or alternative) point of > > > > > > > > But we have zs_can_compact so it can effectively skip the class if it > > > > is not proper class. > > > > > > user triggered compaction can compact too much. > > > in its current state triggering a compaction from user space is like > > > playing a lottery or a russian roulette. > > > > We were on different page. > > > I thought the motivation from this patchset is to prevent compaction > > overhead by frequent user-driven compaction request because user > > don't know how they can get free pages by compaction so they should > > ask compact frequently with blind. > > this is exactly the motivation for this patchset. seriously. User should rely on the auto-compaction. > > whatever. > > -ss -- Kind regards, Minchan Kim ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 0/3] zsmalloc: small compaction improvements 2015-07-15 0:24 ` Minchan Kim @ 2015-07-15 11:16 ` Sergey Senozhatsky 0 siblings, 0 replies; 15+ messages in thread From: Sergey Senozhatsky @ 2015-07-15 11:16 UTC (permalink / raw) To: Minchan Kim Cc: Sergey Senozhatsky, Sergey Senozhatsky, Andrew Morton, linux-mm, linux-kernel On (07/15/15 09:24), Minchan Kim wrote: > On Wed, Jul 15, 2015 at 09:21:06AM +0900, Sergey Senozhatsky wrote: > > On (07/15/15 01:52), Minchan Kim wrote: > > > > alrighty... again... > > > > > > > > > > > > > > > > /sys/block/zram<id>/compact is a black box. We provide it, we don't > > > > > > throttle it in the kernel, and user space is absolutely clueless when > > > > > > it invokes compaction. From some remote (or alternative) point of > > > > > > > > > > But we have zs_can_compact so it can effectively skip the class if it > > > > > is not proper class. > > > > > > > > user triggered compaction can compact too much. > > > > in its current state triggering a compaction from user space is like > > > > playing a lottery or a russian roulette. > > > > > > We were on different page. > > > > > I thought the motivation from this patchset is to prevent compaction > > > overhead by frequent user-driven compaction request because user > > > don't know how they can get free pages by compaction so they should > > > ask compact frequently with blind. > > > > this is exactly the motivation for this patchset. seriously. > > User should rely on the auto-compaction. yep, which will be available in 5-6 months... right behind the corner. -ss ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2015-07-15 23:59 UTC | newest] Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2015-07-11 9:45 [PATCH 0/3] zsmalloc: small compaction improvements Sergey Senozhatsky 2015-07-11 9:45 ` [PATCH 1/3] zsmalloc: factor out zs_pages_to_compact() Sergey Senozhatsky 2015-07-11 9:45 ` [PATCH 2/3] zram: make compact a read-write sysfs node Sergey Senozhatsky 2015-07-11 9:45 ` [PATCH 3/3] zsmalloc: do not take class lock in zs_pages_to_compact() Sergey Senozhatsky 2015-07-15 4:07 ` Sergey Senozhatsky 2015-07-15 23:38 ` Minchan Kim 2015-07-15 23:59 ` Sergey Senozhatsky 2015-07-13 23:36 ` [PATCH 0/3] zsmalloc: small compaction improvements Minchan Kim 2015-07-14 0:31 ` Sergey Senozhatsky 2015-07-14 0:55 ` Minchan Kim 2015-07-14 12:29 ` Sergey Senozhatsky 2015-07-14 16:52 ` Minchan Kim 2015-07-15 0:21 ` Sergey Senozhatsky 2015-07-15 0:24 ` Minchan Kim 2015-07-15 11:16 ` Sergey Senozhatsky
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).