linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH V3] zram:calculate available memory when zram is used
@ 2021-06-07 15:39 yongw.pur
  2021-06-08  9:28 ` Greg KH
  2021-06-08  9:29 ` Greg KH
  0 siblings, 2 replies; 6+ messages in thread
From: yongw.pur @ 2021-06-07 15:39 UTC (permalink / raw)
  To: gregkh, minchan, ngupta, senozhatsky, axboe, akpm, songmuchun,
	david, linux-kernel, linux-block, linux-mm, willy, linux-api,
	lu.zhongjun, yang.yang29, zhang.wenya1, wang.yong12

From: wangyong <wang.yong12@zte.com.cn>

When zram is used, available+Swap free memory is obviously bigger than we
actually can use, because zram can compress memory by compression
algorithm and zram compressed data will occupy memory too.

So, we can count the compression ratio of zram in the kernel. The space
will be saved by zram and other swap device are calculated as follows:
zram[swapfree - swapfree * compress ratio] + swapdev[swapfree]
We can evaluate the available memory of the whole system as:
MemAvailable+zram[swapfree - swapfree * compress ratio]+swapdev[swapfree]

Add an entry to the /proc/meminfo file, returns swap will save space.
Which name is more appropriate is still under consideration.
There are several alternative names: SwapAvailable, SwapSaved,
SwapCompressible, Which is better?

Adding new entries has little effect on user program, since parsers
usually parse by keywords

Changes from v2:
*Add interface description document
*Other mistakes and problems fix

Changes from v1:
*Use a new interface to return memory savings when using swap devices
*Zram add min_compr_ratio attr

Signed-off-by: wangyong <wang.yong12@zte.com.cn>
---
 Documentation/admin-guide/blockdev/zram.rst |  6 ++
 Documentation/filesystems/proc.rst          |  4 ++
 drivers/block/zram/zcomp.h                  |  8 +++
 drivers/block/zram/zram_drv.c               | 19 ++++++
 drivers/block/zram/zram_drv.h               |  1 +
 fs/proc/meminfo.c                           |  1 +
 include/linux/swap.h                        | 11 ++++
 mm/swapfile.c                               | 95 +++++++++++++++++++++++++++++
 mm/vmscan.c                                 |  1 +
 9 files changed, 146 insertions(+)

diff --git a/Documentation/admin-guide/blockdev/zram.rst b/Documentation/admin-guide/blockdev/zram.rst
index 700329d..3b7c4c4 100644
--- a/Documentation/admin-guide/blockdev/zram.rst
+++ b/Documentation/admin-guide/blockdev/zram.rst
@@ -283,6 +283,12 @@ a single line of text and contains the following stats separated by whitespace:
 		Unit: 4K bytes
  ============== =============================================================
 
+File /sys/block/zram<id>/min_compr_ratio
+
+The min_compr_ratio file represents the min_compr_ratio during zram swapping out.The calculation formula is as follows:
+(orig_size * 100) / compr_data_size
+
+
 9) Deactivate
 =============
 
diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
index 042c418..15d35ae 100644
--- a/Documentation/filesystems/proc.rst
+++ b/Documentation/filesystems/proc.rst
@@ -961,6 +961,7 @@ You may not have all of these fields.
     LowFree:          4432 kB
     SwapTotal:           0 kB
     SwapFree:            0 kB
+    SwapAvailable:       0 kB
     Dirty:             968 kB
     Writeback:           0 kB
     AnonPages:      861800 kB
@@ -1032,6 +1033,9 @@ SwapTotal
 SwapFree
               Memory which has been evicted from RAM, and is temporarily
               on the disk
+SwapAvailable
+              The memory savings when use swap devices. it takes zram
+              compression ratio into considerations, when zram is used    
 Dirty
               Memory which is waiting to get written back to the disk
 Writeback
diff --git a/drivers/block/zram/zcomp.h b/drivers/block/zram/zcomp.h
index 40f6420..9c9cb96 100644
--- a/drivers/block/zram/zcomp.h
+++ b/drivers/block/zram/zcomp.h
@@ -40,4 +40,12 @@ int zcomp_decompress(struct zcomp_strm *zstrm,
 		const void *src, unsigned int src_len, void *dst);
 
 bool zcomp_set_max_streams(struct zcomp *comp, int num_strm);
+#ifdef CONFIG_ZRAM
+int get_zram_major(void);
+#else
+int get_zram_major(void)
+{
+	return -1;
+}
+#endif
 #endif /* _ZCOMP_H_ */
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index fcaf275..8f527e0 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -59,6 +59,10 @@ static void zram_free_page(struct zram *zram, size_t index);
 static int zram_bvec_read(struct zram *zram, struct bio_vec *bvec,
 				u32 index, int offset, struct bio *bio);
 
+int get_zram_major(void)
+{
+	return zram_major;
+}
 
 static int zram_slot_trylock(struct zram *zram, u32 index)
 {
@@ -1040,6 +1044,19 @@ static ssize_t compact_store(struct device *dev,
 	return len;
 }
 
+static ssize_t min_compr_ratio_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	struct zram *zram = dev_to_zram(dev);
+	ssize_t ret;
+
+	down_read(&zram->init_lock);
+	ret = scnprintf(buf, PAGE_SIZE, "%d\n", atomic_read(&zram->stats.min_compr_ratio));
+	up_read(&zram->init_lock);
+
+	return ret;
+}
+
 static ssize_t io_stat_show(struct device *dev,
 		struct device_attribute *attr, char *buf)
 {
@@ -1132,6 +1149,7 @@ static ssize_t debug_stat_show(struct device *dev,
 	return ret;
 }
 
+static DEVICE_ATTR_RO(min_compr_ratio);
 static DEVICE_ATTR_RO(io_stat);
 static DEVICE_ATTR_RO(mm_stat);
 #ifdef CONFIG_ZRAM_WRITEBACK
@@ -1859,6 +1877,7 @@ static struct attribute *zram_disk_attrs[] = {
 	&dev_attr_idle.attr,
 	&dev_attr_max_comp_streams.attr,
 	&dev_attr_comp_algorithm.attr,
+	&dev_attr_min_compr_ratio.attr,
 #ifdef CONFIG_ZRAM_WRITEBACK
 	&dev_attr_backing_dev.attr,
 	&dev_attr_writeback.attr,
diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
index 80c3b43..5717e06 100644
--- a/drivers/block/zram/zram_drv.h
+++ b/drivers/block/zram/zram_drv.h
@@ -88,6 +88,7 @@ struct zram_stats {
 	atomic64_t bd_reads;		/* no. of reads from backing device */
 	atomic64_t bd_writes;		/* no. of writes from backing device */
 #endif
+	atomic_t min_compr_ratio;
 };
 
 struct zram {
diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
index 6fa761c..34a174b 100644
--- a/fs/proc/meminfo.c
+++ b/fs/proc/meminfo.c
@@ -86,6 +86,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
 
 	show_val_kb(m, "SwapTotal:      ", i.totalswap);
 	show_val_kb(m, "SwapFree:       ", i.freeswap);
+	show_val_kb(m, "SwapAvailable:	", count_avail_swaps());
 	show_val_kb(m, "Dirty:          ",
 		    global_node_page_state(NR_FILE_DIRTY));
 	show_val_kb(m, "Writeback:      ",
diff --git a/include/linux/swap.h b/include/linux/swap.h
index bb48893..deed141 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -515,6 +515,8 @@ extern int init_swap_address_space(unsigned int type, unsigned long nr_pages);
 extern void exit_swap_address_space(unsigned int type);
 extern struct swap_info_struct *get_swap_device(swp_entry_t entry);
 sector_t swap_page_sector(struct page *page);
+extern void update_zram_zstats(void);
+extern u64 count_avail_swaps(void);
 
 static inline void put_swap_device(struct swap_info_struct *si)
 {
@@ -689,6 +691,15 @@ static inline swp_entry_t get_swap_page(struct page *page)
 	return entry;
 }
 
+void update_zram_zstats(void)
+{
+}
+
+u64 count_avail_swaps(void)
+{
+	return 0;
+}
+
 #endif /* CONFIG_SWAP */
 
 #ifdef CONFIG_THP_SWAP
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 1e07d1c..5ce5100 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -44,6 +44,7 @@
 #include <asm/tlbflush.h>
 #include <linux/swapops.h>
 #include <linux/swap_cgroup.h>
+#include "../drivers/block/zram/zram_drv.h"
 
 static bool swap_count_continued(struct swap_info_struct *, pgoff_t,
 				 unsigned char);
@@ -3387,6 +3388,100 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
 	return error;
 }
 
+u64 count_avail_swap(struct swap_info_struct *si)
+{
+	u64 result;
+	struct zram *z;
+	unsigned int free;
+	unsigned int ratio;
+
+	result = 0;
+	if (!si)
+		return 0;
+
+	//zram calculate available mem
+	if (si->flags & SWP_USED && si->swap_map) {
+		if (si->bdev->bd_disk->major == get_zram_major()) {
+			z = (struct zram *)si->bdev->bd_disk->private_data;
+			down_read(&z->init_lock);
+			ratio = atomic_read(&z->stats.min_compr_ratio);
+			free = (si->pages << (PAGE_SHIFT - 10))
+				- (si->inuse_pages << (PAGE_SHIFT - 10));
+			if (!ratio)
+				result += free / 2;
+			else
+				result = free * (100 - 10000 / ratio) / 100;
+			up_read(&z->init_lock);
+		} else
+			result += (si->pages << (PAGE_SHIFT - 10))
+					- (si->inuse_pages << (PAGE_SHIFT - 10));
+	}
+
+	return result;
+}
+
+u64 count_avail_swaps(void)
+{
+	int type;
+	u64 result;
+	struct swap_info_struct *si;
+
+	result = 0;
+	spin_lock(&swap_lock);
+	for (type = 0; type < nr_swapfiles; type++) {
+		si = swap_info[type];
+		spin_lock(&si->lock);
+		result += count_avail_swap(si);
+		spin_unlock(&si->lock);
+	}
+	spin_unlock(&swap_lock);
+
+	return result;
+}
+
+void update_zram_zstat(struct swap_info_struct *si)
+{
+	int ratio;
+	struct zram *z;
+	struct zram_stats *stat;
+	u64 orig_size, compr_data_size;
+
+	if (!si)
+		return;
+
+	//update zram min compress ratio
+	if (si->flags & SWP_USED && si->swap_map) {
+		if (si->bdev->bd_disk->major == get_zram_major()) {
+			z = (struct zram *)si->bdev->bd_disk->private_data;
+			down_read(&z->init_lock);
+			stat = &z->stats;
+			ratio = atomic_read(&stat->min_compr_ratio);
+			orig_size = atomic64_read(&stat->pages_stored) << PAGE_SHIFT;
+			compr_data_size = atomic64_read(&stat->compr_data_size);
+			if (compr_data_size && (!ratio
+					    || ((orig_size * 100) / compr_data_size < ratio)))
+				atomic_set(&stat->min_compr_ratio,
+					   (orig_size * 100) / compr_data_size);
+			up_read(&z->init_lock);
+		}
+	}
+}
+
+void update_zram_zstats(void)
+{
+	int type;
+	struct swap_info_struct *si;
+
+	spin_lock(&swap_lock);
+	for (type = 0; type < nr_swapfiles; type++) {
+		si = swap_info[type];
+		spin_lock(&si->lock);
+		update_zram_zstat(si);
+		spin_unlock(&si->lock);
+	}
+	spin_unlock(&swap_lock);
+}
+
 void si_swapinfo(struct sysinfo *val)
 {
 	unsigned int type;
diff --git a/mm/vmscan.c b/mm/vmscan.c
index eb31452..ffaf59b 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4159,6 +4159,7 @@ static int kswapd(void *p)
 						alloc_order);
 		reclaim_order = balance_pgdat(pgdat, alloc_order,
 						highest_zoneidx);
+		update_zram_zstats();
 		if (reclaim_order < alloc_order)
 			goto kswapd_try_sleep;
 	}
-- 
2.7.4


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC PATCH V3] zram:calculate available memory when zram is used
  2021-06-07 15:39 [RFC PATCH V3] zram:calculate available memory when zram is used yongw.pur
@ 2021-06-08  9:28 ` Greg KH
  2021-06-09 14:58   ` yong w
  2021-06-08  9:29 ` Greg KH
  1 sibling, 1 reply; 6+ messages in thread
From: Greg KH @ 2021-06-08  9:28 UTC (permalink / raw)
  To: yongw.pur
  Cc: minchan, ngupta, senozhatsky, axboe, akpm, songmuchun, david,
	linux-kernel, linux-block, linux-mm, willy, linux-api,
	lu.zhongjun, yang.yang29, zhang.wenya1, wang.yong12

On Mon, Jun 07, 2021 at 08:39:14AM -0700, yongw.pur@gmail.com wrote:
> From: wangyong <wang.yong12@zte.com.cn>
> 
> When zram is used, available+Swap free memory is obviously bigger than we
> actually can use, because zram can compress memory by compression
> algorithm and zram compressed data will occupy memory too.
> 
> So, we can count the compression ratio of zram in the kernel. The space
> will be saved by zram and other swap device are calculated as follows:
> zram[swapfree - swapfree * compress ratio] + swapdev[swapfree]
> We can evaluate the available memory of the whole system as:
> MemAvailable+zram[swapfree - swapfree * compress ratio]+swapdev[swapfree]
> 
> Add an entry to the /proc/meminfo file, returns swap will save space.
> Which name is more appropriate is still under consideration.
> There are several alternative names: SwapAvailable, SwapSaved,
> SwapCompressible, Which is better?
> 
> Adding new entries has little effect on user program, since parsers
> usually parse by keywords
> 
> Changes from v2:
> *Add interface description document
> *Other mistakes and problems fix
> 
> Changes from v1:
> *Use a new interface to return memory savings when using swap devices
> *Zram add min_compr_ratio attr

These "Changes" need to go below the --- line please.

> 
> Signed-off-by: wangyong <wang.yong12@zte.com.cn>
> ---
>  Documentation/admin-guide/blockdev/zram.rst |  6 ++
>  Documentation/filesystems/proc.rst          |  4 ++
>  drivers/block/zram/zcomp.h                  |  8 +++
>  drivers/block/zram/zram_drv.c               | 19 ++++++
>  drivers/block/zram/zram_drv.h               |  1 +
>  fs/proc/meminfo.c                           |  1 +
>  include/linux/swap.h                        | 11 ++++
>  mm/swapfile.c                               | 95 +++++++++++++++++++++++++++++
>  mm/vmscan.c                                 |  1 +
>  9 files changed, 146 insertions(+)
> 
> diff --git a/Documentation/admin-guide/blockdev/zram.rst b/Documentation/admin-guide/blockdev/zram.rst
> index 700329d..3b7c4c4 100644
> --- a/Documentation/admin-guide/blockdev/zram.rst
> +++ b/Documentation/admin-guide/blockdev/zram.rst
> @@ -283,6 +283,12 @@ a single line of text and contains the following stats separated by whitespace:
>  		Unit: 4K bytes
>   ============== =============================================================
>  
> +File /sys/block/zram<id>/min_compr_ratio
> +
> +The min_compr_ratio file represents the min_compr_ratio during zram swapping out.The calculation formula is as follows:
> +(orig_size * 100) / compr_data_size
> +
> +


sysfs files need to be documented in Documentation/ABI/ files.  You can
reference them in other documentation files, but they need to be in the
ABI/ directory as well.

Also please wrap your lines at the proper length and use a ' ' after a
'.'




>  9) Deactivate
>  =============
>  
> diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
> index 042c418..15d35ae 100644
> --- a/Documentation/filesystems/proc.rst
> +++ b/Documentation/filesystems/proc.rst
> @@ -961,6 +961,7 @@ You may not have all of these fields.
>      LowFree:          4432 kB
>      SwapTotal:           0 kB
>      SwapFree:            0 kB
> +    SwapAvailable:       0 kB
>      Dirty:             968 kB
>      Writeback:           0 kB
>      AnonPages:      861800 kB
> @@ -1032,6 +1033,9 @@ SwapTotal
>  SwapFree
>                Memory which has been evicted from RAM, and is temporarily
>                on the disk
> +SwapAvailable
> +              The memory savings when use swap devices. it takes zram
> +              compression ratio into considerations, when zram is used    

Trailing whitespace?

Did you run your patch through scripts/checkpatch.pl first before
sending it out?


>  Dirty
>                Memory which is waiting to get written back to the disk
>  Writeback
> diff --git a/drivers/block/zram/zcomp.h b/drivers/block/zram/zcomp.h
> index 40f6420..9c9cb96 100644
> --- a/drivers/block/zram/zcomp.h
> +++ b/drivers/block/zram/zcomp.h
> @@ -40,4 +40,12 @@ int zcomp_decompress(struct zcomp_strm *zstrm,
>  		const void *src, unsigned int src_len, void *dst);
>  
>  bool zcomp_set_max_streams(struct zcomp *comp, int num_strm);
> +#ifdef CONFIG_ZRAM
> +int get_zram_major(void);
> +#else
> +int get_zram_major(void)
> +{
> +	return -1;
> +}
> +#endif
>  #endif /* _ZCOMP_H_ */
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index fcaf275..8f527e0 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -59,6 +59,10 @@ static void zram_free_page(struct zram *zram, size_t index);
>  static int zram_bvec_read(struct zram *zram, struct bio_vec *bvec,
>  				u32 index, int offset, struct bio *bio);
>  
> +int get_zram_major(void)
> +{
> +	return zram_major;

Why does anyone need the zram major number?


> +}
>  
>  static int zram_slot_trylock(struct zram *zram, u32 index)
>  {
> @@ -1040,6 +1044,19 @@ static ssize_t compact_store(struct device *dev,
>  	return len;
>  }
>  
> +static ssize_t min_compr_ratio_show(struct device *dev,
> +		struct device_attribute *attr, char *buf)
> +{
> +	struct zram *zram = dev_to_zram(dev);
> +	ssize_t ret;
> +
> +	down_read(&zram->init_lock);
> +	ret = scnprintf(buf, PAGE_SIZE, "%d\n", atomic_read(&zram->stats.min_compr_ratio));
> +	up_read(&zram->init_lock);

You are using an atomic variable _AND_ a read lock?  Are you sure that
makes sense?

And please use sysfs_emit() for sysfs files.


> +
> +	return ret;
> +}
> +
>  static ssize_t io_stat_show(struct device *dev,
>  		struct device_attribute *attr, char *buf)
>  {
> @@ -1132,6 +1149,7 @@ static ssize_t debug_stat_show(struct device *dev,
>  	return ret;
>  }
>  
> +static DEVICE_ATTR_RO(min_compr_ratio);
>  static DEVICE_ATTR_RO(io_stat);
>  static DEVICE_ATTR_RO(mm_stat);
>  #ifdef CONFIG_ZRAM_WRITEBACK
> @@ -1859,6 +1877,7 @@ static struct attribute *zram_disk_attrs[] = {
>  	&dev_attr_idle.attr,
>  	&dev_attr_max_comp_streams.attr,
>  	&dev_attr_comp_algorithm.attr,
> +	&dev_attr_min_compr_ratio.attr,
>  #ifdef CONFIG_ZRAM_WRITEBACK
>  	&dev_attr_backing_dev.attr,
>  	&dev_attr_writeback.attr,
> diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
> index 80c3b43..5717e06 100644
> --- a/drivers/block/zram/zram_drv.h
> +++ b/drivers/block/zram/zram_drv.h
> @@ -88,6 +88,7 @@ struct zram_stats {
>  	atomic64_t bd_reads;		/* no. of reads from backing device */
>  	atomic64_t bd_writes;		/* no. of writes from backing device */
>  #endif
> +	atomic_t min_compr_ratio;
>  };
>  
>  struct zram {
> diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
> index 6fa761c..34a174b 100644
> --- a/fs/proc/meminfo.c
> +++ b/fs/proc/meminfo.c
> @@ -86,6 +86,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
>  
>  	show_val_kb(m, "SwapTotal:      ", i.totalswap);
>  	show_val_kb(m, "SwapFree:       ", i.freeswap);
> +	show_val_kb(m, "SwapAvailable:	", count_avail_swaps());
>  	show_val_kb(m, "Dirty:          ",
>  		    global_node_page_state(NR_FILE_DIRTY));
>  	show_val_kb(m, "Writeback:      ",
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index bb48893..deed141 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -515,6 +515,8 @@ extern int init_swap_address_space(unsigned int type, unsigned long nr_pages);
>  extern void exit_swap_address_space(unsigned int type);
>  extern struct swap_info_struct *get_swap_device(swp_entry_t entry);
>  sector_t swap_page_sector(struct page *page);
> +extern void update_zram_zstats(void);
> +extern u64 count_avail_swaps(void);
>  
>  static inline void put_swap_device(struct swap_info_struct *si)
>  {
> @@ -689,6 +691,15 @@ static inline swp_entry_t get_swap_page(struct page *page)
>  	return entry;
>  }
>  
> +void update_zram_zstats(void)
> +{
> +}
> +
> +u64 count_avail_swaps(void)
> +{
> +	return 0;
> +}
> +
>  #endif /* CONFIG_SWAP */
>  
>  #ifdef CONFIG_THP_SWAP
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 1e07d1c..5ce5100 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -44,6 +44,7 @@
>  #include <asm/tlbflush.h>
>  #include <linux/swapops.h>
>  #include <linux/swap_cgroup.h>
> +#include "../drivers/block/zram/zram_drv.h"

That's a big hint that this is not correct, please do not do this :(

The core kernel should not depend on a random block driver's code.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC PATCH V3] zram:calculate available memory when zram is used
  2021-06-07 15:39 [RFC PATCH V3] zram:calculate available memory when zram is used yongw.pur
  2021-06-08  9:28 ` Greg KH
@ 2021-06-08  9:29 ` Greg KH
  2021-06-09 14:23   ` yong w
  1 sibling, 1 reply; 6+ messages in thread
From: Greg KH @ 2021-06-08  9:29 UTC (permalink / raw)
  To: yongw.pur
  Cc: minchan, ngupta, senozhatsky, axboe, akpm, songmuchun, david,
	linux-kernel, linux-block, linux-mm, willy, linux-api,
	lu.zhongjun, yang.yang29, zhang.wenya1, wang.yong12

On Mon, Jun 07, 2021 at 08:39:14AM -0700, yongw.pur@gmail.com wrote:
> From: wangyong <wang.yong12@zte.com.cn>
> 
> When zram is used, available+Swap free memory is obviously bigger than we
> actually can use, because zram can compress memory by compression
> algorithm and zram compressed data will occupy memory too.
> 
> So, we can count the compression ratio of zram in the kernel. The space
> will be saved by zram and other swap device are calculated as follows:
> zram[swapfree - swapfree * compress ratio] + swapdev[swapfree]
> We can evaluate the available memory of the whole system as:
> MemAvailable+zram[swapfree - swapfree * compress ratio]+swapdev[swapfree]

Why is this needed to be exported by userspace?  Who is going to use
this information and why can't they just calculate it from the exported
information already?

What tool requires this new information and what is it going to be used
for?

And why add a block driver's information to a core proc file?  Shouldn't
the information only be in the block driver's sysfs directory only?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC PATCH V3] zram:calculate available memory when zram is used
  2021-06-08  9:29 ` Greg KH
@ 2021-06-09 14:23   ` yong w
  2021-06-09 14:59     ` Greg KH
  0 siblings, 1 reply; 6+ messages in thread
From: yong w @ 2021-06-09 14:23 UTC (permalink / raw)
  To: Greg KH
  Cc: minchan, ngupta, senozhatsky, axboe, akpm, songmuchun,
	David Hildenbrand, linux-kernel, linux-block, linux-mm, willy,
	linux-api, lu.zhongjun, yang.yang29, zhang.wenya1, wang.yong12

Greg KH <gregkh@linuxfoundation.org> 于2021年6月8日周二 下午5:29写道:

>
> On Mon, Jun 07, 2021 at 08:39:14AM -0700, yongw.pur@gmail.com wrote:
> > From: wangyong <wang.yong12@zte.com.cn>
> >
> > When zram is used, available+Swap free memory is obviously bigger than we
> > actually can use, because zram can compress memory by compression
> > algorithm and zram compressed data will occupy memory too.
> >
> > So, we can count the compression ratio of zram in the kernel. The space
> > will be saved by zram and other swap device are calculated as follows:
> > zram[swapfree - swapfree * compress ratio] + swapdev[swapfree]
> > We can evaluate the available memory of the whole system as:
> > MemAvailable+zram[swapfree - swapfree * compress ratio]+swapdev[swapfree]
>
> Why is this needed to be exported by userspace?  Who is going to use
> this information and why can't they just calculate it from the exported
> information already?

In embedded devices, it is necessary to assess how much memory is available.
If the memory allocation is based on available+swapfree, it may cause oom and
affect the normal use of the devices. And it is more accurate and safe
to calculate
the swap available memory through minimum compression ratio.

Although mm_stat interface provides compressed information,but it is not easy to
get the minimum compression ratio during swaping out. Kernel provides a common
interface, which makes it easier to use and judge the state of system memory

> What tool requires this new information and what is it going to be used
> for?

It can be used in embedded devices to evaluate the memory condition
and avoid causing oom; Also If we wants to know more accurate available
memory information when zram is used.

> And why add a block driver's information to a core proc file?  Shouldn't
> the information only be in the block driver's sysfs directory only?
>
> thanks,
>
> greg k-h

I thought it would be better to put it there.
In the first patch, MemAvailable returned with swap available memory, and
David recommended a separate interface.

thanks

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC PATCH V3] zram:calculate available memory when zram is used
  2021-06-08  9:28 ` Greg KH
@ 2021-06-09 14:58   ` yong w
  0 siblings, 0 replies; 6+ messages in thread
From: yong w @ 2021-06-09 14:58 UTC (permalink / raw)
  To: Greg KH
  Cc: minchan, ngupta, senozhatsky, axboe, akpm, songmuchun,
	David Hildenbrand, linux-kernel, linux-block, linux-mm, willy,
	linux-api, lu.zhongjun, yang.yang29, zhang.wenya1, wang.yong12

Greg KH <gregkh@linuxfoundation.org> 于2021年6月8日周二 下午5:28写道:
>
> On Mon, Jun 07, 2021 at 08:39:14AM -0700, yongw.pur@gmail.com wrote:
> > From: wangyong <wang.yong12@zte.com.cn>
> >
> > When zram is used, available+Swap free memory is obviously bigger than we
> > actually can use, because zram can compress memory by compression
> > algorithm and zram compressed data will occupy memory too.
> >
> > So, we can count the compression ratio of zram in the kernel. The space
> > will be saved by zram and other swap device are calculated as follows:
> > zram[swapfree - swapfree * compress ratio] + swapdev[swapfree]
> > We can evaluate the available memory of the whole system as:
> > MemAvailable+zram[swapfree - swapfree * compress ratio]+swapdev[swapfree]
> >
> > Add an entry to the /proc/meminfo file, returns swap will save space.
> > Which name is more appropriate is still under consideration.
> > There are several alternative names: SwapAvailable, SwapSaved,
> > SwapCompressible, Which is better?
> >
> > Adding new entries has little effect on user program, since parsers
> > usually parse by keywords
> >
> > Changes from v2:
> > *Add interface description document
> > *Other mistakes and problems fix
> >
> > Changes from v1:
> > *Use a new interface to return memory savings when using swap devices
> > *Zram add min_compr_ratio attr
>
> These "Changes" need to go below the --- line please.
>
> >
> > Signed-off-by: wangyong <wang.yong12@zte.com.cn>
> > ---
> >  Documentation/admin-guide/blockdev/zram.rst |  6 ++
> >  Documentation/filesystems/proc.rst          |  4 ++
> >  drivers/block/zram/zcomp.h                  |  8 +++
> >  drivers/block/zram/zram_drv.c               | 19 ++++++
> >  drivers/block/zram/zram_drv.h               |  1 +
> >  fs/proc/meminfo.c                           |  1 +
> >  include/linux/swap.h                        | 11 ++++
> >  mm/swapfile.c                               | 95 +++++++++++++++++++++++++++++
> >  mm/vmscan.c                                 |  1 +
> >  9 files changed, 146 insertions(+)
> >
> > diff --git a/Documentation/admin-guide/blockdev/zram.rst b/Documentation/admin-guide/blockdev/zram.rst
> > index 700329d..3b7c4c4 100644
> > --- a/Documentation/admin-guide/blockdev/zram.rst
> > +++ b/Documentation/admin-guide/blockdev/zram.rst
> > @@ -283,6 +283,12 @@ a single line of text and contains the following stats separated by whitespace:
> >               Unit: 4K bytes
> >   ============== =============================================================
> >
> > +File /sys/block/zram<id>/min_compr_ratio
> > +
> > +The min_compr_ratio file represents the min_compr_ratio during zram swapping out.The calculation formula is as follows:
> > +(orig_size * 100) / compr_data_size
> > +
> > +
>
>
> sysfs files need to be documented in Documentation/ABI/ files.  You can
> reference them in other documentation files, but they need to be in the
> ABI/ directory as well.
>
> Also please wrap your lines at the proper length and use a ' ' after a
> '.'

OK, I'll be careful next time.

>
> >  9) Deactivate
> >  =============
> >
> > diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
> > index 042c418..15d35ae 100644
> > --- a/Documentation/filesystems/proc.rst
> > +++ b/Documentation/filesystems/proc.rst
> > @@ -961,6 +961,7 @@ You may not have all of these fields.
> >      LowFree:          4432 kB
> >      SwapTotal:           0 kB
> >      SwapFree:            0 kB
> > +    SwapAvailable:       0 kB
> >      Dirty:             968 kB
> >      Writeback:           0 kB
> >      AnonPages:      861800 kB
> > @@ -1032,6 +1033,9 @@ SwapTotal
> >  SwapFree
> >                Memory which has been evicted from RAM, and is temporarily
> >                on the disk
> > +SwapAvailable
> > +              The memory savings when use swap devices. it takes zram
> > +              compression ratio into considerations, when zram is used
>
> Trailing whitespace?
>
> Did you run your patch through scripts/checkpatch.pl first before
> sending it out?
>

OK, I'll be careful next time.

>
> >  Dirty
> >                Memory which is waiting to get written back to the disk
> >  Writeback
> > diff --git a/drivers/block/zram/zcomp.h b/drivers/block/zram/zcomp.h
> > index 40f6420..9c9cb96 100644
> > --- a/drivers/block/zram/zcomp.h
> > +++ b/drivers/block/zram/zcomp.h
> > @@ -40,4 +40,12 @@ int zcomp_decompress(struct zcomp_strm *zstrm,
> >               const void *src, unsigned int src_len, void *dst);
> >
> >  bool zcomp_set_max_streams(struct zcomp *comp, int num_strm);
> > +#ifdef CONFIG_ZRAM
> > +int get_zram_major(void);
> > +#else
> > +int get_zram_major(void)
> > +{
> > +     return -1;
> > +}
> > +#endif
> >  #endif /* _ZCOMP_H_ */
> > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> > index fcaf275..8f527e0 100644
> > --- a/drivers/block/zram/zram_drv.c
> > +++ b/drivers/block/zram/zram_drv.c
> > @@ -59,6 +59,10 @@ static void zram_free_page(struct zram *zram, size_t index);
> >  static int zram_bvec_read(struct zram *zram, struct bio_vec *bvec,
> >                               u32 index, int offset, struct bio *bio);
> >
> > +int get_zram_major(void)
> > +{
> > +     return zram_major;
>
> Why does anyone need the zram major number?
>

Swapfile.c will use zram major to determine whether it is a zram device.
I plan to change it to internal interface later.

>
> > +}
> >
> >  static int zram_slot_trylock(struct zram *zram, u32 index)
> >  {
> > @@ -1040,6 +1044,19 @@ static ssize_t compact_store(struct device *dev,
> >       return len;
> >  }
> >
> > +static ssize_t min_compr_ratio_show(struct device *dev,
> > +             struct device_attribute *attr, char *buf)
> > +{
> > +     struct zram *zram = dev_to_zram(dev);
> > +     ssize_t ret;
> > +
> > +     down_read(&zram->init_lock);
> > +     ret = scnprintf(buf, PAGE_SIZE, "%d\n", atomic_read(&zram->stats.min_compr_ratio));
> > +     up_read(&zram->init_lock);
>
> You are using an atomic variable _AND_ a read lock?  Are you sure that
> makes sense?
>
> And please use sysfs_emit() for sysfs files.
>

It refers to mm_stat_show functioin.
    down_read(&zram->init_lock);
     ....
     ret = scnprintf(buf, PAGE_SIZE,
                        "%8llu %8llu %8llu %8lu %8ld %8llu %8lu %8llu %8llu\n",
                        ...
                        (u64)atomic64_read(&zram->stats.same_pages),
                        atomic_long_read(&pool_stats.pages_compacted),
                        (u64)atomic64_read(&zram->stats.huge_pages),
                        (u64)atomic64_read(&zram->stats.huge_pages_since));
        up_read(&zram->init_lock);

>
> > +
> > +     return ret;
> > +}
> > +
> >  static ssize_t io_stat_show(struct device *dev,
> >               struct device_attribute *attr, char *buf)
> >  {
> > @@ -1132,6 +1149,7 @@ static ssize_t debug_stat_show(struct device *dev,
> >       return ret;
> >  }
> >
> > +static DEVICE_ATTR_RO(min_compr_ratio);
> >  static DEVICE_ATTR_RO(io_stat);
> >  static DEVICE_ATTR_RO(mm_stat);
> >  #ifdef CONFIG_ZRAM_WRITEBACK
> > @@ -1859,6 +1877,7 @@ static struct attribute *zram_disk_attrs[] = {
> >       &dev_attr_idle.attr,
> >       &dev_attr_max_comp_streams.attr,
> >       &dev_attr_comp_algorithm.attr,
> > +     &dev_attr_min_compr_ratio.attr,
> >  #ifdef CONFIG_ZRAM_WRITEBACK
> >       &dev_attr_backing_dev.attr,
> >       &dev_attr_writeback.attr,
> > diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
> > index 80c3b43..5717e06 100644
> > --- a/drivers/block/zram/zram_drv.h
> > +++ b/drivers/block/zram/zram_drv.h
> > @@ -88,6 +88,7 @@ struct zram_stats {
> >       atomic64_t bd_reads;            /* no. of reads from backing device */
> >       atomic64_t bd_writes;           /* no. of writes from backing device */
> >  #endif
> > +     atomic_t min_compr_ratio;
> >  };
> >
> >  struct zram {
> > diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
> > index 6fa761c..34a174b 100644
> > --- a/fs/proc/meminfo.c
> > +++ b/fs/proc/meminfo.c
> > @@ -86,6 +86,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
> >
> >       show_val_kb(m, "SwapTotal:      ", i.totalswap);
> >       show_val_kb(m, "SwapFree:       ", i.freeswap);
> > +     show_val_kb(m, "SwapAvailable:  ", count_avail_swaps());
> >       show_val_kb(m, "Dirty:          ",
> >                   global_node_page_state(NR_FILE_DIRTY));
> >       show_val_kb(m, "Writeback:      ",
> > diff --git a/include/linux/swap.h b/include/linux/swap.h
> > index bb48893..deed141 100644
> > --- a/include/linux/swap.h
> > +++ b/include/linux/swap.h
> > @@ -515,6 +515,8 @@ extern int init_swap_address_space(unsigned int type, unsigned long nr_pages);
> >  extern void exit_swap_address_space(unsigned int type);
> >  extern struct swap_info_struct *get_swap_device(swp_entry_t entry);
> >  sector_t swap_page_sector(struct page *page);
> > +extern void update_zram_zstats(void);
> > +extern u64 count_avail_swaps(void);
> >
> >  static inline void put_swap_device(struct swap_info_struct *si)
> >  {
> > @@ -689,6 +691,15 @@ static inline swp_entry_t get_swap_page(struct page *page)
> >       return entry;
> >  }
> >
> > +void update_zram_zstats(void)
> > +{
> > +}
> > +
> > +u64 count_avail_swaps(void)
> > +{
> > +     return 0;
> > +}
> > +
> >  #endif /* CONFIG_SWAP */
> >
> >  #ifdef CONFIG_THP_SWAP
> > diff --git a/mm/swapfile.c b/mm/swapfile.c
> > index 1e07d1c..5ce5100 100644
> > --- a/mm/swapfile.c
> > +++ b/mm/swapfile.c
> > @@ -44,6 +44,7 @@
> >  #include <asm/tlbflush.h>
> >  #include <linux/swapops.h>
> >  #include <linux/swap_cgroup.h>
> > +#include "../drivers/block/zram/zram_drv.h"
>
> That's a big hint that this is not correct, please do not do this :(
>
> The core kernel should not depend on a random block driver's code.
>
> thanks,
>
> greg k-h
Yes,  it's a problem.
Using callback function to realize this function is under consideration.

thanks for your reply.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC PATCH V3] zram:calculate available memory when zram is used
  2021-06-09 14:23   ` yong w
@ 2021-06-09 14:59     ` Greg KH
  0 siblings, 0 replies; 6+ messages in thread
From: Greg KH @ 2021-06-09 14:59 UTC (permalink / raw)
  To: yong w
  Cc: minchan, ngupta, senozhatsky, axboe, akpm, songmuchun,
	David Hildenbrand, linux-kernel, linux-block, linux-mm, willy,
	linux-api, lu.zhongjun, yang.yang29, zhang.wenya1, wang.yong12

On Wed, Jun 09, 2021 at 10:23:36PM +0800, yong w wrote:
> Greg KH <gregkh@linuxfoundation.org> 于2021年6月8日周二 下午5:29写道:
> 
> >
> > On Mon, Jun 07, 2021 at 08:39:14AM -0700, yongw.pur@gmail.com wrote:
> > > From: wangyong <wang.yong12@zte.com.cn>
> > >
> > > When zram is used, available+Swap free memory is obviously bigger than we
> > > actually can use, because zram can compress memory by compression
> > > algorithm and zram compressed data will occupy memory too.
> > >
> > > So, we can count the compression ratio of zram in the kernel. The space
> > > will be saved by zram and other swap device are calculated as follows:
> > > zram[swapfree - swapfree * compress ratio] + swapdev[swapfree]
> > > We can evaluate the available memory of the whole system as:
> > > MemAvailable+zram[swapfree - swapfree * compress ratio]+swapdev[swapfree]
> >
> > Why is this needed to be exported by userspace?  Who is going to use
> > this information and why can't they just calculate it from the exported
> > information already?
> 
> In embedded devices, it is necessary to assess how much memory is available.

Why is that any different from a server?  Or a laptop?  Or any other
system running Linux?  "embedded" isn't special here :)

> If the memory allocation is based on available+swapfree, it may cause oom and
> affect the normal use of the devices. And it is more accurate and safe
> to calculate
> the swap available memory through minimum compression ratio.
> 
> Although mm_stat interface provides compressed information,but it is not easy to
> get the minimum compression ratio during swaping out. Kernel provides a common
> interface, which makes it easier to use and judge the state of system memory

If you are running up against this type of limit, how is a proc file
guess going to help much?  What are you going to do based on the result?
And as it's always going to be a guess, how reliable is it?

> > What tool requires this new information and what is it going to be used
> > for?
> 
> It can be used in embedded devices to evaluate the memory condition
> and avoid causing oom; Also If we wants to know more accurate available
> memory information when zram is used.

Why not rely on the oom logic instead?  What is wrong with that as this
is always going to be just a guess.  You are never going to be able to
react fast enough to reading this value to be able to do anything better
than you could through the existing oom notifier/logic, right?

> > And why add a block driver's information to a core proc file?  Shouldn't
> > the information only be in the block driver's sysfs directory only?
> >
> > thanks,
> >
> > greg k-h
> 
> I thought it would be better to put it there.

If no one needs it, why add it?  :)

> In the first patch, MemAvailable returned with swap available memory, and
> David recommended a separate interface.

A sysfs file makes more sense to me, and seems simpler.

But again, this is just a guess, trying to do real work based on it
feels really risky.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-06-09 15:00 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-07 15:39 [RFC PATCH V3] zram:calculate available memory when zram is used yongw.pur
2021-06-08  9:28 ` Greg KH
2021-06-09 14:58   ` yong w
2021-06-08  9:29 ` Greg KH
2021-06-09 14:23   ` yong w
2021-06-09 14:59     ` Greg KH

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).