linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v7 0/4] use bin_attribute to break the size limitation of cpumap ABI
@ 2021-07-15 11:58 Barry Song
  2021-07-15 11:58 ` [PATCH v7 1/4] cpumask: introduce cpumap_print_to_buf to support large bitmask and list Barry Song
                   ` (5 more replies)
  0 siblings, 6 replies; 35+ messages in thread
From: Barry Song @ 2021-07-15 11:58 UTC (permalink / raw)
  To: gregkh, akpm, andriy.shevchenko, yury.norov, linux-kernel
  Cc: dave.hansen, linux, rafael, rdunlap, agordeev, sbrivio,
	jianpeng.ma, valentin.schneider, peterz, bristot, guodong.xu,
	tangchengchang, prime.zeng, yangyicong, tim.c.chen, linuxarm,
	Barry Song

v7:
  - update doc in code for new APIs according to the comments of
    Andy Shevchenko;
  - other minor cleanup and commit log fix according to the comments
    of Andy Shevchenko

v6:
  -minor cleanup according to Andy Shevchenko's comment;
  -take bitmap_print_to_buf back according to Yury Norov's comment and
   fix the documents; and also take the bitmap testcase back.
  -Sorry, Yury, I don't think it is doable to move memory allocation
   to drivers.
   Considering a driver like topology.c, we have M CPUs and each CPU
   has N various nodes like core_siblings, package_cpus, die_cpus etc,
   we can't know the size of each node of each CPU in advance. The best
   time to get the size of each node is really when users read the sysfs
   node. otherwise, we have to scan M*N nodes in drivers in advance to
   figure out the exact size of buffers we need.
   On the other hand, it is crazily tricky to ask a bundle of drivers to
   find a proper place to save the pointer of allocated buffers so that
   they can be re-used in second read of the same bin_attribute node.
   And I doubt it is really useful to save the address of buffers
   somewhere. Immediately freeing it seems to be a correct option to
   avoid runtime waste of memory. We can't predict when users will read
   topology ABI and which node users will read.
   Finally, reading topology things wouldn't be the really cpu-bound
   things in user applications, hardly this kind of ABI things can be
   a performance bottleneck. Users use numactl and lstopo commands to
   read ABIs but nobody will do it again and again. And a normal
   application won't do topology repeatly. So the overhead caused by
   malloc/free in the new bitmap API doesn't really matter.
   if we really want a place to re-used the buffer and avoid malloc/free,
   it seems this should be done in some common place rather than each
   driver. still it is hard to find the best place.

   Thanks for the comments of Yury and Andy in v5.

v5:
  -remove the bitmap API bitmap_print_to_buf, alternatively, only provide
   cpumap_print_to_buf API as what we really care about is cpumask for
   this moment. we can freely take bitmap_print_to_buf back once we find
   the second user.
   hopefully this can alleviate Yury's worries on possible abuse of a new
   bitmap API.
  -correct the document of cpumap_print_to_buf;
  -In patch1, clearly explain why we need this new API in commit log;
  -Also refine the commit log of patch 2 and 3;
  -As the modification is narrowed to the scope of cpumask, the kunit
   test of bitmap_print_to_buf doesn't apply in the new patchset. so
   test case patch4/4 is removed.

   Thanks for the comments of Greg, Yury, Andy. Thanks for Jonathan's
   review.

v4:
  -add test cases for bitmap_print_to_buf API;
  -add Reviewed-by of Jonathan Cameron for patches 1-3, thanks!

v3:
  -fixed the strlen issue and patch #1,#2,#3 minor formatting issues, thanks
   to Andy Shevchenko and Jonathan Cameron.

v2:
  -split the original patch #1 into two patches and use kasprintf() in
  -patch #1 to simplify the code. do some minor formatting adjustments.

Background:

the whole story began from this thread when Jonatah and me tried to add a
new topology level-cluster which exists on kunpeng920 and X86 Jacobsville:
https://lore.kernel.org/lkml/YFRGIedW1fUlnmi+@kroah.com/
https://lore.kernel.org/lkml/YFR2kwakbcGiI37w@kroah.com/

in the discussion, Greg had some concern about the potential one page size
limitation of sysfs ABI for topology. Greg's comment is reasonable
and I think we should address the problem.

For this moment, numa node, cpu topology and some other drivers are using
cpu bitmap and list to expose hardware topology. When cpu number is large,
the page buffer of sysfs won't be able to hold the whole bitmask or list.
This doesn't really happen nowadays for bitmask as the maximum NR_CPUS
is 8196 for X86_64 and 4096 for ARM64 since
8196 * 9 / 32 = 2305
is still smaller than 4KB page size.

So the existing BUILD_BUG_ON() in drivers/base/node.c is pretty much
preventing future problems when hardware gets more and more CPUs:
static ssize_t node_read_cpumap(struct device *dev, bool list, char *buf)
{
 	cpumask_var_t mask;
 	struct node *node_dev = to_node(dev);

	/* 2008/04/07: buf currently PAGE_SIZE, need 9 chars per 32 bits. */
	BUILD_BUG_ON((NR_CPUS/32 * 9) > (PAGE_SIZE-1));
}

But those ABIs exposing cpu lists are much more tricky as a list could be
like: 0, 3, 5, 7, 9, 11... etc. so nobody knows the size till the last
moment. Comparing to bitmask, list is easier to exceed one page.

In the previous discussion, Greg and Dave Hansen preferred to remove this
kind of limitation totally and remove the BUILD_BUG_ON() in
drivers/base/node.c together:
https://lore.kernel.org/lkml/1619679819-45256-2-git-send-email-tiantao6@hisilicon.com/
https://lore.kernel.org/lkml/YIueOR4fOYa1dSAb@kroah.com/

Todo:

right now, only topology and node are addressed. there are many other
drivers are calling cpumap_print_to_pagebuf() and have the similar
problems. we are going to address them one by one after this patchset
settles down.

Barry Song (1):
  lib: test_bitmap: add bitmap_print_to_buf test cases

Tian Tao (3):
  cpumask: introduce cpumap_print_to_buf to support large bitmask and
    list
  topology: use bin_attribute to break the size limitation of cpumap ABI
  drivers/base/node.c: use bin_attribute to break the size limitation of
    cpumap ABI

 drivers/base/node.c     |  51 +++++++++-----
 drivers/base/topology.c | 115 ++++++++++++++++--------------
 include/linux/bitmap.h  |   2 +
 include/linux/cpumask.h |  63 +++++++++++++++++
 lib/bitmap.c            |  78 +++++++++++++++++++++
 lib/test_bitmap.c       | 150 ++++++++++++++++++++++++++++++++++++++++
 6 files changed, 389 insertions(+), 70 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v7 1/4] cpumask: introduce cpumap_print_to_buf to support large bitmask and list
  2021-07-15 11:58 [PATCH v7 0/4] use bin_attribute to break the size limitation of cpumap ABI Barry Song
@ 2021-07-15 11:58 ` Barry Song
  2021-07-15 15:28   ` Yury Norov
  2021-07-15 11:58 ` [PATCH v7 2/4] topology: use bin_attribute to break the size limitation of cpumap ABI Barry Song
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 35+ messages in thread
From: Barry Song @ 2021-07-15 11:58 UTC (permalink / raw)
  To: gregkh, akpm, andriy.shevchenko, yury.norov, linux-kernel
  Cc: dave.hansen, linux, rafael, rdunlap, agordeev, sbrivio,
	jianpeng.ma, valentin.schneider, peterz, bristot, guodong.xu,
	tangchengchang, prime.zeng, yangyicong, tim.c.chen, linuxarm,
	Tian Tao, Barry Song

From: Tian Tao <tiantao6@hisilicon.com>

The existing cpumap_print_to_pagebuf() is used by cpu topology and other
drivers to export hexadecimal bitmask and decimal list to userspace by
sysfs ABI.

Right now, those drivers are using a normal attribute for this kind of
ABIs. A normal attribute typically has show entry as below:

static ssize_t example_dev_show(struct device *dev,
                struct device_attribute *attr, char *buf)
{
	...
	return cpumap_print_to_pagebuf(true, buf, &pmu_mmdc->cpu);
}
show entry of attribute has no offset and count parameters and this
means the file is limited to one page only.

cpumap_print_to_pagebuf() API works terribly well for this kind of
normal attribute with buf parameter and without offset, count:

static inline ssize_t
cpumap_print_to_pagebuf(bool list, char *buf, const struct cpumask *mask)
{
	return bitmap_print_to_pagebuf(list, buf, cpumask_bits(mask),
				       nr_cpu_ids);
}

The problem is once we have many cpus, we have a chance to make bitmask
or list more than one page. Especially for list, it could be as complex
as 0,3,5,7,9,...... We have no simple way to know it exact size.

It turns out bin_attribute is a way to break this limit. bin_attribute
has show entry as below:
static ssize_t
example_bin_attribute_show(struct file *filp, struct kobject *kobj,
             struct bin_attribute *attr, char *buf,
             loff_t offset, size_t count)
{
	...
}

With the new offset and count parameters, this makes sysfs ABI be able
to support file size more than one page. For example, offset could be
>= 4096.

This patch introduces cpumap_print_to_buf() and its bitmap infrastructure
bitmap_print_to_buf() so that those drivers can move to bin_attribute to
support large bitmask and list. At the same time, we have to pass those
corresponding parameters such as offset, count from bin_attribute to this
new API.

Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stefano Brivio <sbrivio@redhat.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: "Ma, Jianpeng" <jianpeng.ma@intel.com>
Cc: Yury Norov <yury.norov@gmail.com>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
---
 v7:
 explanation deserves to be a paragraph in code according to Andy's
 comments;

 include/linux/bitmap.h  |  2 ++
 include/linux/cpumask.h | 63 +++++++++++++++++++++++++++++++++
 lib/bitmap.c            | 78 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 143 insertions(+)

diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index a36cfcec4e77..0de6effa2797 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -226,6 +226,8 @@ void bitmap_copy_le(unsigned long *dst, const unsigned long *src, unsigned int n
 unsigned int bitmap_ord_to_pos(const unsigned long *bitmap, unsigned int ord, unsigned int nbits);
 int bitmap_print_to_pagebuf(bool list, char *buf,
 				   const unsigned long *maskp, int nmaskbits);
+int bitmap_print_to_buf(bool list, char *buf, const unsigned long *maskp,
+			int nmaskbits, loff_t off, size_t count);
 
 #define BITMAP_FIRST_WORD_MASK(start) (~0UL << ((start) & (BITS_PER_LONG - 1)))
 #define BITMAP_LAST_WORD_MASK(nbits) (~0UL >> (-(nbits) & (BITS_PER_LONG - 1)))
diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index f3689a52bfd0..f81ade866cf7 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -983,6 +983,69 @@ cpumap_print_to_pagebuf(bool list, char *buf, const struct cpumask *mask)
 				      nr_cpu_ids);
 }
 
+/**
+ * cpumap_print_to_buf  - copies the cpumask into the buffer
+ * @list: indicates whether the cpumap must be list
+ *      true:  print in decimal list format
+ *      false: print in hexadecimal bitmask format
+ *
+ * The existing cpumap_print_to_pagebuf() is used by cpu topology and other
+ * drivers to export hexadecimal bitmask and decimal list to userspace by
+ * sysfs ABI.
+ * Drivers might be using a normal attribute for this kind of ABIs. A
+ * normal attribute typically has show entry as below:
+ * static ssize_t example_attribute_show(struct device *dev,
+ * 		struct device_attribute *attr, char *buf)
+ * {
+ * 	...
+ * 	return cpumap_print_to_pagebuf(true, buf, &pmu_mmdc->cpu);
+ * }
+ * show entry of attribute has no offset and count parameters. this means
+ * the file is limited to one page only.
+ * cpumap_print_to_pagebuf() API works terribly well for this kind of
+ * normal attribute with buf parameter and without offset, count:
+ * cpumap_print_to_pagebuf(bool list, char *buf, const struct cpumask *mask)
+ * {
+ * }
+ * The problem is once we have many cpus, we have a chance to make bitmask
+ * or list more than one page. Especially for list, it could be as complex
+ * as 0,3,5,7,9,... We have no simple way to know it exact size.
+ * It turns out bin_attribute is a way to break this limit. bin_attribute
+ * has show entry as below:
+ * static ssize_t
+ * example_bin_attribute_show(struct file *filp, struct kobject *kobj,
+ * 		struct bin_attribute *attr, char *buf,
+ * 		loff_t offset, size_t count)
+ * {
+ * 	...
+ * }
+ * With the new offset and count parameters, this makes sysfs ABI be able
+ * to support file size more than one page. For example, offset could be
+ * >= 4096.
+ * cpumap_print_to_buf() makes those drivers be able to to support large
+ * bitmask and list after they move to use bin_attribute. In result, we
+ * have to pass the corresponding parameters such as off, count from
+ * bin_attribute show entry to this API.
+ *
+ * @mask: the cpumask to copy
+ * @buf: the buffer to copy into
+ * @off: in the string from which we are copying, We copy to @buf
+ * @count: the maximum number of bytes to print
+ *
+ * The function copies the cpumask into the buffer either as comma-separated
+ * list of cpus or hex values of cpumask; Typically used by bin_attribute to
+ * export cpumask bitmask and list ABI.
+ *
+ * Returns the length of how many bytes have been copied.
+ */
+static inline ssize_t
+cpumap_print_to_buf(bool list, char *buf, const struct cpumask *mask,
+		loff_t off, size_t count)
+{
+	return bitmap_print_to_buf(list, buf, cpumask_bits(mask),
+				   nr_cpu_ids, off, count);
+}
+
 #if NR_CPUS <= BITS_PER_LONG
 #define CPU_MASK_ALL							\
 (cpumask_t) { {								\
diff --git a/lib/bitmap.c b/lib/bitmap.c
index 9401d39e4722..56bcffe2fa8c 100644
--- a/lib/bitmap.c
+++ b/lib/bitmap.c
@@ -487,6 +487,84 @@ int bitmap_print_to_pagebuf(bool list, char *buf, const unsigned long *maskp,
 }
 EXPORT_SYMBOL(bitmap_print_to_pagebuf);
 
+/**
+ * bitmap_print_to_buf  - convert bitmap to list or hex format ASCII string
+ * @list: indicates whether the bitmap must be list
+ *      true:  print in decimal list format
+ *      false: print in hexadecimal bitmask format
+ *
+ * The bitmap_print_to_pagebuf() is used indirectly via its cpumap wrapper
+ * cpumap_print_to_pagebuf() or directly by drivers to export hexadecimal
+ * bitmask and decimal list to userspace by sysfs ABI.
+ * Drivers might be using a normal attribute for this kind of ABIs. A
+ * normal attribute typically has show entry as below:
+ * static ssize_t example_attribute_show(struct device *dev,
+ * 		struct device_attribute *attr, char *buf)
+ * {
+ * 	...
+ * 	return bitmap_print_to_pagebuf(true, buf, &mask, nr_trig_max);
+ * }
+ * show entry of attribute has no offset and count parameters and this
+ * means the file is limited to one page only.
+ * bitmap_print_to_pagebuf() API works terribly well for this kind of
+ * normal attribute with buf parameter and without offset, count:
+ * bitmap_print_to_pagebuf(bool list, char *buf, const unsigned long *maskp,
+ * 			   int nmaskbits)
+ * {
+ * }
+ * The problem is once we have a large bitmap, we have a chance to get a
+ * bitmask or list more than one page. Especially for list, it could be
+ * as complex as 0,3,5,7,9,... We have no simple way to know it exact size.
+ * It turns out bin_attribute is a way to break this limit. bin_attribute
+ * has show entry as below:
+ * static ssize_t
+ * example_bin_attribute_show(struct file *filp, struct kobject *kobj,
+ * 		struct bin_attribute *attr, char *buf,
+ * 		loff_t offset, size_t count)
+ * {
+ * 	...
+ * }
+ * With the new offset and count parameters, this makes sysfs ABI be able
+ * to support file size more than one page. For example, offset could be
+ * >= 4096.
+ * bitmap_print_to_buf() and its cpumap wrapper cpumap_print_to_buf() makes
+ * those drivers be able to support large bitmask and list after they move
+ * to use bin_attribute. In result, we have to pass the corresponding
+ * parameters such as off, count from bin_attribute show entry to this API.
+ *
+ * @buf: buffer into which string is placed
+ * @maskp: pointer to bitmap to convert
+ * @nmaskbits: size of bitmap, in bits
+ * @off: in the string from which we are copying, We copy to @buf
+ * @count: the maximum number of bytes to print
+ *
+ * The role of cpumap_print_to_buf() and cpumap_print_to_pagebuf() is similar,
+ * the difference is that bitmap_print_to_pagebuf() mainly serves sysfs
+ * attribute with the assumption the destination buffer is exactly one page
+ * and won't be more than one page. cpumap_print_to_buf(), on the other hand,
+ * mainly serves bin_attribute which doesn't work with exact one page, and it
+ * can break the size limit of converted decimal list and hexadecimal bitmask.
+ *
+ * Returns the number of characters actually printed to @buf
+ */
+int bitmap_print_to_buf(bool list, char *buf, const unsigned long *maskp,
+		int nmaskbits, loff_t off, size_t count)
+{
+	const char *fmt = list ? "%*pbl\n" : "%*pb\n";
+	ssize_t size;
+	void *data;
+
+	data = kasprintf(GFP_KERNEL, fmt, nmaskbits, maskp);
+	if (!data)
+		return -ENOMEM;
+
+	size = memory_read_from_buffer(buf, count, &off, data, strlen(data) + 1);
+	kfree(data);
+
+	return size;
+}
+EXPORT_SYMBOL(bitmap_print_to_buf);
+
 /*
  * Region 9-38:4/10 describes the following bitmap structure:
  * 0	   9  12    18			38	     N
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v7 2/4] topology: use bin_attribute to break the size limitation of cpumap ABI
  2021-07-15 11:58 [PATCH v7 0/4] use bin_attribute to break the size limitation of cpumap ABI Barry Song
  2021-07-15 11:58 ` [PATCH v7 1/4] cpumask: introduce cpumap_print_to_buf to support large bitmask and list Barry Song
@ 2021-07-15 11:58 ` Barry Song
  2021-07-16  8:49   ` Song Bao Hua (Barry Song)
  2021-07-15 11:58 ` [PATCH v7 3/4] drivers/base/node.c: " Barry Song
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 35+ messages in thread
From: Barry Song @ 2021-07-15 11:58 UTC (permalink / raw)
  To: gregkh, akpm, andriy.shevchenko, yury.norov, linux-kernel
  Cc: dave.hansen, linux, rafael, rdunlap, agordeev, sbrivio,
	jianpeng.ma, valentin.schneider, peterz, bristot, guodong.xu,
	tangchengchang, prime.zeng, yangyicong, tim.c.chen, linuxarm,
	Tian Tao, Barry Song

From: Tian Tao <tiantao6@hisilicon.com>

Reading /sys/devices/system/cpu/cpuX/topology/ returns cpu topology.
However, the size of this file is limited to PAGE_SIZE because of
the limitation for sysfs attribute.
This patch moves to use bin_attribute to extend the ABI to be more
than one page so that cpumap bitmask and list won't be potentially
trimmed.

Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
---
 drivers/base/topology.c | 115 ++++++++++++++++++++++------------------
 1 file changed, 63 insertions(+), 52 deletions(-)

diff --git a/drivers/base/topology.c b/drivers/base/topology.c
index 4d254fcc93d1..dd3980124e33 100644
--- a/drivers/base/topology.c
+++ b/drivers/base/topology.c
@@ -21,25 +21,27 @@ static ssize_t name##_show(struct device *dev,				\
 	return sysfs_emit(buf, "%d\n", topology_##name(dev->id));	\
 }
 
-#define define_siblings_show_map(name, mask)				\
-static ssize_t name##_show(struct device *dev,				\
-			   struct device_attribute *attr, char *buf)	\
-{									\
-	return cpumap_print_to_pagebuf(false, buf, topology_##mask(dev->id));\
+#define define_siblings_read_func(name, mask)					\
+static ssize_t name##_read(struct file *file, struct kobject *kobj,		\
+				  struct bin_attribute *attr, char *buf,	\
+				  loff_t off, size_t count)			\
+{										\
+	struct device *dev = kobj_to_dev(kobj);                                 \
+										\
+	return cpumap_print_to_buf(false, buf, topology_##mask(dev->id),	\
+				   off, count);                                 \
+}										\
+										\
+static ssize_t name##_list_read(struct file *file, struct kobject *kobj,	\
+				  struct bin_attribute *attr, char *buf,	\
+				  loff_t off, size_t count)			\
+{										\
+	struct device *dev = kobj_to_dev(kobj);					\
+										\
+	return cpumap_print_to_buf(true, buf, topology_##mask(dev->id),		\
+				   off, count);					\
 }
 
-#define define_siblings_show_list(name, mask)				\
-static ssize_t name##_list_show(struct device *dev,			\
-				struct device_attribute *attr,		\
-				char *buf)				\
-{									\
-	return cpumap_print_to_pagebuf(true, buf, topology_##mask(dev->id));\
-}
-
-#define define_siblings_show_func(name, mask)	\
-	define_siblings_show_map(name, mask);	\
-	define_siblings_show_list(name, mask)
-
 define_id_show_func(physical_package_id);
 static DEVICE_ATTR_RO(physical_package_id);
 
@@ -49,71 +51,80 @@ static DEVICE_ATTR_RO(die_id);
 define_id_show_func(core_id);
 static DEVICE_ATTR_RO(core_id);
 
-define_siblings_show_func(thread_siblings, sibling_cpumask);
-static DEVICE_ATTR_RO(thread_siblings);
-static DEVICE_ATTR_RO(thread_siblings_list);
+define_siblings_read_func(thread_siblings, sibling_cpumask);
+static BIN_ATTR_RO(thread_siblings, 0);
+static BIN_ATTR_RO(thread_siblings_list, 0);
 
-define_siblings_show_func(core_cpus, sibling_cpumask);
-static DEVICE_ATTR_RO(core_cpus);
-static DEVICE_ATTR_RO(core_cpus_list);
+define_siblings_read_func(core_cpus, sibling_cpumask);
+static BIN_ATTR_RO(core_cpus, 0);
+static BIN_ATTR_RO(core_cpus_list, 0);
 
-define_siblings_show_func(core_siblings, core_cpumask);
-static DEVICE_ATTR_RO(core_siblings);
-static DEVICE_ATTR_RO(core_siblings_list);
+define_siblings_read_func(core_siblings, core_cpumask);
+static BIN_ATTR_RO(core_siblings, 0);
+static BIN_ATTR_RO(core_siblings_list, 0);
 
-define_siblings_show_func(die_cpus, die_cpumask);
-static DEVICE_ATTR_RO(die_cpus);
-static DEVICE_ATTR_RO(die_cpus_list);
+define_siblings_read_func(die_cpus, die_cpumask);
+static BIN_ATTR_RO(die_cpus, 0);
+static BIN_ATTR_RO(die_cpus_list, 0);
 
-define_siblings_show_func(package_cpus, core_cpumask);
-static DEVICE_ATTR_RO(package_cpus);
-static DEVICE_ATTR_RO(package_cpus_list);
+define_siblings_read_func(package_cpus, core_cpumask);
+static BIN_ATTR_RO(package_cpus, 0);
+static BIN_ATTR_RO(package_cpus_list, 0);
 
 #ifdef CONFIG_SCHED_BOOK
 define_id_show_func(book_id);
 static DEVICE_ATTR_RO(book_id);
-define_siblings_show_func(book_siblings, book_cpumask);
-static DEVICE_ATTR_RO(book_siblings);
-static DEVICE_ATTR_RO(book_siblings_list);
+define_siblings_read_func(book_siblings, book_cpumask);
+static BIN_ATTR_RO(book_siblings, 0);
+static BIN_ATTR_RO(book_siblings_list, 0);
 #endif
 
 #ifdef CONFIG_SCHED_DRAWER
 define_id_show_func(drawer_id);
 static DEVICE_ATTR_RO(drawer_id);
-define_siblings_show_func(drawer_siblings, drawer_cpumask);
-static DEVICE_ATTR_RO(drawer_siblings);
-static DEVICE_ATTR_RO(drawer_siblings_list);
+define_siblings_read_func(drawer_siblings, drawer_cpumask);
+static BIN_ATTR_RO(drawer_siblings, 0);
+static BIN_ATTR_RO(drawer_siblings_list, 0);
 #endif
 
+static struct bin_attribute *bin_attrs[] = {
+	&bin_attr_core_cpus,
+	&bin_attr_core_cpus_list,
+	&bin_attr_thread_siblings,
+	&bin_attr_thread_siblings_list,
+	&bin_attr_core_siblings,
+	&bin_attr_core_siblings_list,
+	&bin_attr_die_cpus,
+	&bin_attr_die_cpus_list,
+	&bin_attr_package_cpus,
+	&bin_attr_package_cpus_list,
+#ifdef CONFIG_SCHED_BOOK
+	&bin_attr_book_siblings,
+	&bin_attr_book_siblings_list,
+#endif
+#ifdef CONFIG_SCHED_DRAWER
+	&bin_attr_drawer_siblings,
+	&bin_attr_drawer_siblings_list,
+#endif
+	NULL
+};
+
 static struct attribute *default_attrs[] = {
 	&dev_attr_physical_package_id.attr,
 	&dev_attr_die_id.attr,
 	&dev_attr_core_id.attr,
-	&dev_attr_thread_siblings.attr,
-	&dev_attr_thread_siblings_list.attr,
-	&dev_attr_core_cpus.attr,
-	&dev_attr_core_cpus_list.attr,
-	&dev_attr_core_siblings.attr,
-	&dev_attr_core_siblings_list.attr,
-	&dev_attr_die_cpus.attr,
-	&dev_attr_die_cpus_list.attr,
-	&dev_attr_package_cpus.attr,
-	&dev_attr_package_cpus_list.attr,
 #ifdef CONFIG_SCHED_BOOK
 	&dev_attr_book_id.attr,
-	&dev_attr_book_siblings.attr,
-	&dev_attr_book_siblings_list.attr,
 #endif
 #ifdef CONFIG_SCHED_DRAWER
 	&dev_attr_drawer_id.attr,
-	&dev_attr_drawer_siblings.attr,
-	&dev_attr_drawer_siblings_list.attr,
 #endif
 	NULL
 };
 
 static const struct attribute_group topology_attr_group = {
 	.attrs = default_attrs,
+	.bin_attrs = bin_attrs,
 	.name = "topology"
 };
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v7 3/4] drivers/base/node.c: use bin_attribute to break the size limitation of cpumap ABI
  2021-07-15 11:58 [PATCH v7 0/4] use bin_attribute to break the size limitation of cpumap ABI Barry Song
  2021-07-15 11:58 ` [PATCH v7 1/4] cpumask: introduce cpumap_print_to_buf to support large bitmask and list Barry Song
  2021-07-15 11:58 ` [PATCH v7 2/4] topology: use bin_attribute to break the size limitation of cpumap ABI Barry Song
@ 2021-07-15 11:58 ` Barry Song
  2021-07-15 11:58 ` [PATCH v7 4/4] lib: test_bitmap: add bitmap_print_to_buf test cases Barry Song
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 35+ messages in thread
From: Barry Song @ 2021-07-15 11:58 UTC (permalink / raw)
  To: gregkh, akpm, andriy.shevchenko, yury.norov, linux-kernel
  Cc: dave.hansen, linux, rafael, rdunlap, agordeev, sbrivio,
	jianpeng.ma, valentin.schneider, peterz, bristot, guodong.xu,
	tangchengchang, prime.zeng, yangyicong, tim.c.chen, linuxarm,
	Tian Tao, Barry Song

From: Tian Tao <tiantao6@hisilicon.com>

Reading /sys/devices/system/cpu/cpuX/nodeX/ returns cpumap and cpulist.
However, the size of this file is limited to PAGE_SIZE because of the
limitation for sysfs attribute.

This patch moves to use bin_attribute to extend the ABI to be more
than one page so that cpumap bitmask and list won't be potentially
trimmed.

Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
---
 -v7:
 sys -> /sys, thanks for Andy's comments

 drivers/base/node.c | 51 +++++++++++++++++++++++++++++----------------
 1 file changed, 33 insertions(+), 18 deletions(-)

diff --git a/drivers/base/node.c b/drivers/base/node.c
index 4a4ae868ad9f..89a72aba72a3 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -27,42 +27,44 @@ static struct bus_type node_subsys = {
 };
 
 
-static ssize_t node_read_cpumap(struct device *dev, bool list, char *buf)
+static ssize_t node_read_cpumap(struct device *dev, bool list, char *buf,
+				loff_t off, size_t count)
 {
 	ssize_t n;
 	cpumask_var_t mask;
 	struct node *node_dev = to_node(dev);
 
-	/* 2008/04/07: buf currently PAGE_SIZE, need 9 chars per 32 bits. */
-	BUILD_BUG_ON((NR_CPUS/32 * 9) > (PAGE_SIZE-1));
-
 	if (!alloc_cpumask_var(&mask, GFP_KERNEL))
 		return 0;
 
 	cpumask_and(mask, cpumask_of_node(node_dev->dev.id), cpu_online_mask);
-	n = cpumap_print_to_pagebuf(list, buf, mask);
+	n = cpumap_print_to_buf(list, buf, mask, off, count);
 	free_cpumask_var(mask);
 
 	return n;
 }
 
-static inline ssize_t cpumap_show(struct device *dev,
-				  struct device_attribute *attr,
-				  char *buf)
+static inline ssize_t cpumap_read(struct file *file, struct kobject *kobj,
+				  struct bin_attribute *attr, char *buf,
+				  loff_t off, size_t count)
 {
-	return node_read_cpumap(dev, false, buf);
+	struct device *dev = kobj_to_dev(kobj);
+
+	return node_read_cpumap(dev, false, buf, off, count);
 }
 
-static DEVICE_ATTR_RO(cpumap);
+static BIN_ATTR_RO(cpumap, 0);
 
-static inline ssize_t cpulist_show(struct device *dev,
-				   struct device_attribute *attr,
-				   char *buf)
+static inline ssize_t cpulist_read(struct file *file, struct kobject *kobj,
+				   struct bin_attribute *attr, char *buf,
+				   loff_t off, size_t count)
 {
-	return node_read_cpumap(dev, true, buf);
+	struct device *dev = kobj_to_dev(kobj);
+
+	return node_read_cpumap(dev, true, buf, off, count);
 }
 
-static DEVICE_ATTR_RO(cpulist);
+static BIN_ATTR_RO(cpulist, 0);
 
 /**
  * struct node_access_nodes - Access class device to hold user visible
@@ -557,15 +559,28 @@ static ssize_t node_read_distance(struct device *dev,
 static DEVICE_ATTR(distance, 0444, node_read_distance, NULL);
 
 static struct attribute *node_dev_attrs[] = {
-	&dev_attr_cpumap.attr,
-	&dev_attr_cpulist.attr,
 	&dev_attr_meminfo.attr,
 	&dev_attr_numastat.attr,
 	&dev_attr_distance.attr,
 	&dev_attr_vmstat.attr,
 	NULL
 };
-ATTRIBUTE_GROUPS(node_dev);
+
+static struct bin_attribute *node_dev_bin_attrs[] = {
+	&bin_attr_cpumap,
+	&bin_attr_cpulist,
+	NULL
+};
+
+static const struct attribute_group node_dev_group = {
+	.attrs = node_dev_attrs,
+	.bin_attrs = node_dev_bin_attrs
+};
+
+static const struct attribute_group *node_dev_groups[] = {
+	&node_dev_group,
+	NULL
+};
 
 #ifdef CONFIG_HUGETLBFS
 /*
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH v7 4/4] lib: test_bitmap: add bitmap_print_to_buf test cases
  2021-07-15 11:58 [PATCH v7 0/4] use bin_attribute to break the size limitation of cpumap ABI Barry Song
                   ` (2 preceding siblings ...)
  2021-07-15 11:58 ` [PATCH v7 3/4] drivers/base/node.c: " Barry Song
@ 2021-07-15 11:58 ` Barry Song
  2021-07-15 12:09   ` Andy Shevchenko
  2021-07-15 15:36   ` Yury Norov
  2021-07-28 13:41 ` [PATCH v7 0/4] use bin_attribute to break the size limitation of cpumap ABI Greg KH
  2021-07-28 18:58 ` [PATCH] bitmap: extend comment to bitmap_print_to_buf Yury Norov
  5 siblings, 2 replies; 35+ messages in thread
From: Barry Song @ 2021-07-15 11:58 UTC (permalink / raw)
  To: gregkh, akpm, andriy.shevchenko, yury.norov, linux-kernel
  Cc: dave.hansen, linux, rafael, rdunlap, agordeev, sbrivio,
	jianpeng.ma, valentin.schneider, peterz, bristot, guodong.xu,
	tangchengchang, prime.zeng, yangyicong, tim.c.chen, linuxarm,
	Barry Song

The added test items cover both cases where bitmap buf of the printed
result is greater than and less than 4KB.
And it also covers the case where offset for bitmap_print_to_buf is
non-zero which will happen when printed buf is larger than one page
in sysfs bin_attribute.

Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
---
 -v7:
 minor cleanup according to Andy's comments

 lib/test_bitmap.c | 150 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 150 insertions(+)

diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index 4ea73f5aed41..eb8ebaf12865 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -19,6 +19,7 @@
 KSTM_MODULE_GLOBALS();
 
 static char pbl_buffer[PAGE_SIZE] __initdata;
+static char print_buf[PAGE_SIZE * 2] __initdata;
 
 static const unsigned long exp1[] __initconst = {
 	BITMAP_FROM_U64(1),
@@ -156,6 +157,20 @@ static bool __init __check_eq_clump8(const char *srcfile, unsigned int line,
 	return true;
 }
 
+static bool __init
+__check_eq_str(const char *srcfile, unsigned int line,
+		const char *exp_str, const char *str,
+		unsigned int len)
+{
+	bool eq;
+
+	eq = strncmp(exp_str, str, len) == 0;
+	if (!eq)
+		pr_err("[%s:%u] expected %s, got %s\n", srcfile, line, exp_str, str);
+
+	return eq;
+}
+
 #define __expect_eq(suffix, ...)					\
 	({								\
 		int result = 0;						\
@@ -173,6 +188,7 @@ static bool __init __check_eq_clump8(const char *srcfile, unsigned int line,
 #define expect_eq_pbl(...)		__expect_eq(pbl, ##__VA_ARGS__)
 #define expect_eq_u32_array(...)	__expect_eq(u32_array, ##__VA_ARGS__)
 #define expect_eq_clump8(...)		__expect_eq(clump8, ##__VA_ARGS__)
+#define expect_eq_str(...)		__expect_eq(str, ##__VA_ARGS__)
 
 static void __init test_zero_clear(void)
 {
@@ -660,6 +676,139 @@ static void __init test_bitmap_cut(void)
 	}
 }
 
+struct test_bitmap_print {
+	const unsigned long *bitmap;
+	unsigned long nbits;
+	const char *mask;
+	const char *list;
+};
+
+static const unsigned long small_bitmap[] __initconst = {
+	BITMAP_FROM_U64(0x3333333311111111ULL),
+};
+
+static const char small_mask[] __initconst = "33333333,11111111\n";
+static const char small_list[] __initconst = "0,4,8,12,16,20,24,28,32-33,36-37,40-41,44-45,48-49,52-53,56-57,60-61\n";
+
+static const unsigned long large_bitmap[] __initconst = {
+	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
+	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
+	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
+	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
+	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
+	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
+	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
+	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
+	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
+	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
+	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
+	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
+	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
+	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
+	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
+	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
+	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
+	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
+	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
+	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
+};
+
+static const char large_mask[] __initconst = "33333333,11111111,33333333,11111111,"
+					"33333333,11111111,33333333,11111111,"
+					"33333333,11111111,33333333,11111111,"
+					"33333333,11111111,33333333,11111111,"
+					"33333333,11111111,33333333,11111111,"
+					"33333333,11111111,33333333,11111111,"
+					"33333333,11111111,33333333,11111111,"
+					"33333333,11111111,33333333,11111111,"
+					"33333333,11111111,33333333,11111111,"
+					"33333333,11111111,33333333,11111111,"
+					"33333333,11111111,33333333,11111111,"
+					"33333333,11111111,33333333,11111111,"
+					"33333333,11111111,33333333,11111111,"
+					"33333333,11111111,33333333,11111111,"
+					"33333333,11111111,33333333,11111111,"
+					"33333333,11111111,33333333,11111111,"
+					"33333333,11111111,33333333,11111111,"
+					"33333333,11111111,33333333,11111111,"
+					"33333333,11111111,33333333,11111111,"
+					"33333333,11111111,33333333,11111111\n";
+
+static const char large_list[] __initconst = /* more than 4KB */
+	"0,4,8,12,16,20,24,28,32-33,36-37,40-41,44-45,48-49,52-53,56-57,60-61,64,68,72,76,80,84,88,92,96-97,100-101,104-1"
+	"05,108-109,112-113,116-117,120-121,124-125,128,132,136,140,144,148,152,156,160-161,164-165,168-169,172-173,176-1"
+	"77,180-181,184-185,188-189,192,196,200,204,208,212,216,220,224-225,228-229,232-233,236-237,240-241,244-245,248-2"
+	"49,252-253,256,260,264,268,272,276,280,284,288-289,292-293,296-297,300-301,304-305,308-309,312-313,316-317,320,3"
+	"24,328,332,336,340,344,348,352-353,356-357,360-361,364-365,368-369,372-373,376-377,380-381,384,388,392,396,400,4"
+	"04,408,412,416-417,420-421,424-425,428-429,432-433,436-437,440-441,444-445,448,452,456,460,464,468,472,476,480-4"
+	"81,484-485,488-489,492-493,496-497,500-501,504-505,508-509,512,516,520,524,528,532,536,540,544-545,548-549,552-5"
+	"53,556-557,560-561,564-565,568-569,572-573,576,580,584,588,592,596,600,604,608-609,612-613,616-617,620-621,624-6"
+	"25,628-629,632-633,636-637,640,644,648,652,656,660,664,668,672-673,676-677,680-681,684-685,688-689,692-693,696-6"
+	"97,700-701,704,708,712,716,720,724,728,732,736-737,740-741,744-745,748-749,752-753,756-757,760-761,764-765,768,7"
+	"72,776,780,784,788,792,796,800-801,804-805,808-809,812-813,816-817,820-821,824-825,828-829,832,836,840,844,848,8"
+	"52,856,860,864-865,868-869,872-873,876-877,880-881,884-885,888-889,892-893,896,900,904,908,912,916,920,924,928-9"
+	"29,932-933,936-937,940-941,944-945,948-949,952-953,956-957,960,964,968,972,976,980,984,988,992-993,996-997,1000-"
+	"1001,1004-1005,1008-1009,1012-1013,1016-1017,1020-1021,1024,1028,1032,1036,1040,1044,1048,1052,1056-1057,1060-10"
+	"61,1064-1065,1068-1069,1072-1073,1076-1077,1080-1081,1084-1085,1088,1092,1096,1100,1104,1108,1112,1116,1120-1121"
+	",1124-1125,1128-1129,1132-1133,1136-1137,1140-1141,1144-1145,1148-1149,1152,1156,1160,1164,1168,1172,1176,1180,1"
+	"184-1185,1188-1189,1192-1193,1196-1197,1200-1201,1204-1205,1208-1209,1212-1213,1216,1220,1224,1228,1232,1236,124"
+	"0,1244,1248-1249,1252-1253,1256-1257,1260-1261,1264-1265,1268-1269,1272-1273,1276-1277,1280,1284,1288,1292,1296,"
+	"1300,1304,1308,1312-1313,1316-1317,1320-1321,1324-1325,1328-1329,1332-1333,1336-1337,1340-1341,1344,1348,1352,13"
+	"56,1360,1364,1368,1372,1376-1377,1380-1381,1384-1385,1388-1389,1392-1393,1396-1397,1400-1401,1404-1405,1408,1412"
+	",1416,1420,1424,1428,1432,1436,1440-1441,1444-1445,1448-1449,1452-1453,1456-1457,1460-1461,1464-1465,1468-1469,1"
+	"472,1476,1480,1484,1488,1492,1496,1500,1504-1505,1508-1509,1512-1513,1516-1517,1520-1521,1524-1525,1528-1529,153"
+	"2-1533,1536,1540,1544,1548,1552,1556,1560,1564,1568-1569,1572-1573,1576-1577,1580-1581,1584-1585,1588-1589,1592-"
+	"1593,1596-1597,1600,1604,1608,1612,1616,1620,1624,1628,1632-1633,1636-1637,1640-1641,1644-1645,1648-1649,1652-16"
+	"53,1656-1657,1660-1661,1664,1668,1672,1676,1680,1684,1688,1692,1696-1697,1700-1701,1704-1705,1708-1709,1712-1713"
+	",1716-1717,1720-1721,1724-1725,1728,1732,1736,1740,1744,1748,1752,1756,1760-1761,1764-1765,1768-1769,1772-1773,1"
+	"776-1777,1780-1781,1784-1785,1788-1789,1792,1796,1800,1804,1808,1812,1816,1820,1824-1825,1828-1829,1832-1833,183"
+	"6-1837,1840-1841,1844-1845,1848-1849,1852-1853,1856,1860,1864,1868,1872,1876,1880,1884,1888-1889,1892-1893,1896-"
+	"1897,1900-1901,1904-1905,1908-1909,1912-1913,1916-1917,1920,1924,1928,1932,1936,1940,1944,1948,1952-1953,1956-19"
+	"57,1960-1961,1964-1965,1968-1969,1972-1973,1976-1977,1980-1981,1984,1988,1992,1996,2000,2004,2008,2012,2016-2017"
+	",2020-2021,2024-2025,2028-2029,2032-2033,2036-2037,2040-2041,2044-2045,2048,2052,2056,2060,2064,2068,2072,2076,2"
+	"080-2081,2084-2085,2088-2089,2092-2093,2096-2097,2100-2101,2104-2105,2108-2109,2112,2116,2120,2124,2128,2132,213"
+	"6,2140,2144-2145,2148-2149,2152-2153,2156-2157,2160-2161,2164-2165,2168-2169,2172-2173,2176,2180,2184,2188,2192,"
+	"2196,2200,2204,2208-2209,2212-2213,2216-2217,2220-2221,2224-2225,2228-2229,2232-2233,2236-2237,2240,2244,2248,22"
+	"52,2256,2260,2264,2268,2272-2273,2276-2277,2280-2281,2284-2285,2288-2289,2292-2293,2296-2297,2300-2301,2304,2308"
+	",2312,2316,2320,2324,2328,2332,2336-2337,2340-2341,2344-2345,2348-2349,2352-2353,2356-2357,2360-2361,2364-2365,2"
+	"368,2372,2376,2380,2384,2388,2392,2396,2400-2401,2404-2405,2408-2409,2412-2413,2416-2417,2420-2421,2424-2425,242"
+	"8-2429,2432,2436,2440,2444,2448,2452,2456,2460,2464-2465,2468-2469,2472-2473,2476-2477,2480-2481,2484-2485,2488-"
+	"2489,2492-2493,2496,2500,2504,2508,2512,2516,2520,2524,2528-2529,2532-2533,2536-2537,2540-2541,2544-2545,2548-25"
+	"49,2552-2553,2556-2557\n";
+
+static const struct test_bitmap_print test_print[] __initconst = {
+	{ small_bitmap, sizeof(small_bitmap) * BITS_PER_BYTE, small_mask, small_list },
+	{ large_bitmap, sizeof(large_bitmap) * BITS_PER_BYTE, large_mask, large_list },
+};
+
+static void __init test_bitmap_print_buf(void)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(test_print); i++) {
+		const struct test_bitmap_print *t = &test_print[i];
+		int n;
+
+		n = bitmap_print_to_buf(false, print_buf, t->bitmap, t->nbits,
+					0, 2 * PAGE_SIZE);
+		expect_eq_uint(strlen(t->mask) + 1, n);
+		expect_eq_str(t->mask, print_buf, n);
+
+		n = bitmap_print_to_buf(true, print_buf, t->bitmap, t->nbits,
+					0, 2 * PAGE_SIZE);
+		expect_eq_uint(strlen(t->list) + 1, n);
+		expect_eq_str(t->list, print_buf, n);
+
+		/* test bitmap_print_to_buf by non-zero offset */
+		if (strlen(t->list) > PAGE_SIZE) {
+			n = bitmap_print_to_buf(true, print_buf, t->bitmap, t->nbits,
+						PAGE_SIZE, PAGE_SIZE);
+			expect_eq_uint(strlen(t->list) + 1 - PAGE_SIZE, n);
+			expect_eq_str(t->list + PAGE_SIZE, print_buf, n);
+		}
+	}
+}
+
 static void __init selftest(void)
 {
 	test_zero_clear();
@@ -672,6 +821,7 @@ static void __init selftest(void)
 	test_mem_optimisations();
 	test_for_each_set_clump8();
 	test_bitmap_cut();
+	test_bitmap_print_buf();
 }
 
 KSTM_MODULE_LOADERS(test_bitmap);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH v7 4/4] lib: test_bitmap: add bitmap_print_to_buf test cases
  2021-07-15 11:58 ` [PATCH v7 4/4] lib: test_bitmap: add bitmap_print_to_buf test cases Barry Song
@ 2021-07-15 12:09   ` Andy Shevchenko
  2021-07-15 20:47     ` Yury Norov
  2021-07-15 15:36   ` Yury Norov
  1 sibling, 1 reply; 35+ messages in thread
From: Andy Shevchenko @ 2021-07-15 12:09 UTC (permalink / raw)
  To: Barry Song
  Cc: gregkh, akpm, yury.norov, linux-kernel, dave.hansen, linux,
	rafael, rdunlap, agordeev, sbrivio, jianpeng.ma,
	valentin.schneider, peterz, bristot, guodong.xu, tangchengchang,
	prime.zeng, yangyicong, tim.c.chen, linuxarm

On Thu, Jul 15, 2021 at 11:58:56PM +1200, Barry Song wrote:
> The added test items cover both cases where bitmap buf of the printed
> result is greater than and less than 4KB.
> And it also covers the case where offset for bitmap_print_to_buf is
> non-zero which will happen when printed buf is larger than one page
> in sysfs bin_attribute.

More test cases is always a good thing, thanks!
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

> Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
> ---
>  -v7:
>  minor cleanup according to Andy's comments
> 
>  lib/test_bitmap.c | 150 ++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 150 insertions(+)
> 
> diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
> index 4ea73f5aed41..eb8ebaf12865 100644
> --- a/lib/test_bitmap.c
> +++ b/lib/test_bitmap.c
> @@ -19,6 +19,7 @@
>  KSTM_MODULE_GLOBALS();
>  
>  static char pbl_buffer[PAGE_SIZE] __initdata;
> +static char print_buf[PAGE_SIZE * 2] __initdata;
>  
>  static const unsigned long exp1[] __initconst = {
>  	BITMAP_FROM_U64(1),
> @@ -156,6 +157,20 @@ static bool __init __check_eq_clump8(const char *srcfile, unsigned int line,
>  	return true;
>  }
>  
> +static bool __init
> +__check_eq_str(const char *srcfile, unsigned int line,
> +		const char *exp_str, const char *str,
> +		unsigned int len)
> +{
> +	bool eq;
> +
> +	eq = strncmp(exp_str, str, len) == 0;
> +	if (!eq)
> +		pr_err("[%s:%u] expected %s, got %s\n", srcfile, line, exp_str, str);
> +
> +	return eq;
> +}
> +
>  #define __expect_eq(suffix, ...)					\
>  	({								\
>  		int result = 0;						\
> @@ -173,6 +188,7 @@ static bool __init __check_eq_clump8(const char *srcfile, unsigned int line,
>  #define expect_eq_pbl(...)		__expect_eq(pbl, ##__VA_ARGS__)
>  #define expect_eq_u32_array(...)	__expect_eq(u32_array, ##__VA_ARGS__)
>  #define expect_eq_clump8(...)		__expect_eq(clump8, ##__VA_ARGS__)
> +#define expect_eq_str(...)		__expect_eq(str, ##__VA_ARGS__)
>  
>  static void __init test_zero_clear(void)
>  {
> @@ -660,6 +676,139 @@ static void __init test_bitmap_cut(void)
>  	}
>  }
>  
> +struct test_bitmap_print {
> +	const unsigned long *bitmap;
> +	unsigned long nbits;
> +	const char *mask;
> +	const char *list;
> +};
> +
> +static const unsigned long small_bitmap[] __initconst = {
> +	BITMAP_FROM_U64(0x3333333311111111ULL),
> +};
> +
> +static const char small_mask[] __initconst = "33333333,11111111\n";
> +static const char small_list[] __initconst = "0,4,8,12,16,20,24,28,32-33,36-37,40-41,44-45,48-49,52-53,56-57,60-61\n";
> +
> +static const unsigned long large_bitmap[] __initconst = {
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +};
> +
> +static const char large_mask[] __initconst = "33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111\n";
> +
> +static const char large_list[] __initconst = /* more than 4KB */
> +	"0,4,8,12,16,20,24,28,32-33,36-37,40-41,44-45,48-49,52-53,56-57,60-61,64,68,72,76,80,84,88,92,96-97,100-101,104-1"
> +	"05,108-109,112-113,116-117,120-121,124-125,128,132,136,140,144,148,152,156,160-161,164-165,168-169,172-173,176-1"
> +	"77,180-181,184-185,188-189,192,196,200,204,208,212,216,220,224-225,228-229,232-233,236-237,240-241,244-245,248-2"
> +	"49,252-253,256,260,264,268,272,276,280,284,288-289,292-293,296-297,300-301,304-305,308-309,312-313,316-317,320,3"
> +	"24,328,332,336,340,344,348,352-353,356-357,360-361,364-365,368-369,372-373,376-377,380-381,384,388,392,396,400,4"
> +	"04,408,412,416-417,420-421,424-425,428-429,432-433,436-437,440-441,444-445,448,452,456,460,464,468,472,476,480-4"
> +	"81,484-485,488-489,492-493,496-497,500-501,504-505,508-509,512,516,520,524,528,532,536,540,544-545,548-549,552-5"
> +	"53,556-557,560-561,564-565,568-569,572-573,576,580,584,588,592,596,600,604,608-609,612-613,616-617,620-621,624-6"
> +	"25,628-629,632-633,636-637,640,644,648,652,656,660,664,668,672-673,676-677,680-681,684-685,688-689,692-693,696-6"
> +	"97,700-701,704,708,712,716,720,724,728,732,736-737,740-741,744-745,748-749,752-753,756-757,760-761,764-765,768,7"
> +	"72,776,780,784,788,792,796,800-801,804-805,808-809,812-813,816-817,820-821,824-825,828-829,832,836,840,844,848,8"
> +	"52,856,860,864-865,868-869,872-873,876-877,880-881,884-885,888-889,892-893,896,900,904,908,912,916,920,924,928-9"
> +	"29,932-933,936-937,940-941,944-945,948-949,952-953,956-957,960,964,968,972,976,980,984,988,992-993,996-997,1000-"
> +	"1001,1004-1005,1008-1009,1012-1013,1016-1017,1020-1021,1024,1028,1032,1036,1040,1044,1048,1052,1056-1057,1060-10"
> +	"61,1064-1065,1068-1069,1072-1073,1076-1077,1080-1081,1084-1085,1088,1092,1096,1100,1104,1108,1112,1116,1120-1121"
> +	",1124-1125,1128-1129,1132-1133,1136-1137,1140-1141,1144-1145,1148-1149,1152,1156,1160,1164,1168,1172,1176,1180,1"
> +	"184-1185,1188-1189,1192-1193,1196-1197,1200-1201,1204-1205,1208-1209,1212-1213,1216,1220,1224,1228,1232,1236,124"
> +	"0,1244,1248-1249,1252-1253,1256-1257,1260-1261,1264-1265,1268-1269,1272-1273,1276-1277,1280,1284,1288,1292,1296,"
> +	"1300,1304,1308,1312-1313,1316-1317,1320-1321,1324-1325,1328-1329,1332-1333,1336-1337,1340-1341,1344,1348,1352,13"
> +	"56,1360,1364,1368,1372,1376-1377,1380-1381,1384-1385,1388-1389,1392-1393,1396-1397,1400-1401,1404-1405,1408,1412"
> +	",1416,1420,1424,1428,1432,1436,1440-1441,1444-1445,1448-1449,1452-1453,1456-1457,1460-1461,1464-1465,1468-1469,1"
> +	"472,1476,1480,1484,1488,1492,1496,1500,1504-1505,1508-1509,1512-1513,1516-1517,1520-1521,1524-1525,1528-1529,153"
> +	"2-1533,1536,1540,1544,1548,1552,1556,1560,1564,1568-1569,1572-1573,1576-1577,1580-1581,1584-1585,1588-1589,1592-"
> +	"1593,1596-1597,1600,1604,1608,1612,1616,1620,1624,1628,1632-1633,1636-1637,1640-1641,1644-1645,1648-1649,1652-16"
> +	"53,1656-1657,1660-1661,1664,1668,1672,1676,1680,1684,1688,1692,1696-1697,1700-1701,1704-1705,1708-1709,1712-1713"
> +	",1716-1717,1720-1721,1724-1725,1728,1732,1736,1740,1744,1748,1752,1756,1760-1761,1764-1765,1768-1769,1772-1773,1"
> +	"776-1777,1780-1781,1784-1785,1788-1789,1792,1796,1800,1804,1808,1812,1816,1820,1824-1825,1828-1829,1832-1833,183"
> +	"6-1837,1840-1841,1844-1845,1848-1849,1852-1853,1856,1860,1864,1868,1872,1876,1880,1884,1888-1889,1892-1893,1896-"
> +	"1897,1900-1901,1904-1905,1908-1909,1912-1913,1916-1917,1920,1924,1928,1932,1936,1940,1944,1948,1952-1953,1956-19"
> +	"57,1960-1961,1964-1965,1968-1969,1972-1973,1976-1977,1980-1981,1984,1988,1992,1996,2000,2004,2008,2012,2016-2017"
> +	",2020-2021,2024-2025,2028-2029,2032-2033,2036-2037,2040-2041,2044-2045,2048,2052,2056,2060,2064,2068,2072,2076,2"
> +	"080-2081,2084-2085,2088-2089,2092-2093,2096-2097,2100-2101,2104-2105,2108-2109,2112,2116,2120,2124,2128,2132,213"
> +	"6,2140,2144-2145,2148-2149,2152-2153,2156-2157,2160-2161,2164-2165,2168-2169,2172-2173,2176,2180,2184,2188,2192,"
> +	"2196,2200,2204,2208-2209,2212-2213,2216-2217,2220-2221,2224-2225,2228-2229,2232-2233,2236-2237,2240,2244,2248,22"
> +	"52,2256,2260,2264,2268,2272-2273,2276-2277,2280-2281,2284-2285,2288-2289,2292-2293,2296-2297,2300-2301,2304,2308"
> +	",2312,2316,2320,2324,2328,2332,2336-2337,2340-2341,2344-2345,2348-2349,2352-2353,2356-2357,2360-2361,2364-2365,2"
> +	"368,2372,2376,2380,2384,2388,2392,2396,2400-2401,2404-2405,2408-2409,2412-2413,2416-2417,2420-2421,2424-2425,242"
> +	"8-2429,2432,2436,2440,2444,2448,2452,2456,2460,2464-2465,2468-2469,2472-2473,2476-2477,2480-2481,2484-2485,2488-"
> +	"2489,2492-2493,2496,2500,2504,2508,2512,2516,2520,2524,2528-2529,2532-2533,2536-2537,2540-2541,2544-2545,2548-25"
> +	"49,2552-2553,2556-2557\n";
> +
> +static const struct test_bitmap_print test_print[] __initconst = {
> +	{ small_bitmap, sizeof(small_bitmap) * BITS_PER_BYTE, small_mask, small_list },
> +	{ large_bitmap, sizeof(large_bitmap) * BITS_PER_BYTE, large_mask, large_list },
> +};
> +
> +static void __init test_bitmap_print_buf(void)
> +{
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(test_print); i++) {
> +		const struct test_bitmap_print *t = &test_print[i];
> +		int n;
> +
> +		n = bitmap_print_to_buf(false, print_buf, t->bitmap, t->nbits,
> +					0, 2 * PAGE_SIZE);
> +		expect_eq_uint(strlen(t->mask) + 1, n);
> +		expect_eq_str(t->mask, print_buf, n);
> +
> +		n = bitmap_print_to_buf(true, print_buf, t->bitmap, t->nbits,
> +					0, 2 * PAGE_SIZE);
> +		expect_eq_uint(strlen(t->list) + 1, n);
> +		expect_eq_str(t->list, print_buf, n);
> +
> +		/* test bitmap_print_to_buf by non-zero offset */
> +		if (strlen(t->list) > PAGE_SIZE) {
> +			n = bitmap_print_to_buf(true, print_buf, t->bitmap, t->nbits,
> +						PAGE_SIZE, PAGE_SIZE);
> +			expect_eq_uint(strlen(t->list) + 1 - PAGE_SIZE, n);
> +			expect_eq_str(t->list + PAGE_SIZE, print_buf, n);
> +		}
> +	}
> +}
> +
>  static void __init selftest(void)
>  {
>  	test_zero_clear();
> @@ -672,6 +821,7 @@ static void __init selftest(void)
>  	test_mem_optimisations();
>  	test_for_each_set_clump8();
>  	test_bitmap_cut();
> +	test_bitmap_print_buf();
>  }
>  
>  KSTM_MODULE_LOADERS(test_bitmap);
> -- 
> 2.25.1
> 

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v7 1/4] cpumask: introduce cpumap_print_to_buf to support large bitmask and list
  2021-07-15 11:58 ` [PATCH v7 1/4] cpumask: introduce cpumap_print_to_buf to support large bitmask and list Barry Song
@ 2021-07-15 15:28   ` Yury Norov
  2021-07-15 21:08     ` Song Bao Hua (Barry Song)
  2021-07-16  0:57     ` Song Bao Hua (Barry Song)
  0 siblings, 2 replies; 35+ messages in thread
From: Yury Norov @ 2021-07-15 15:28 UTC (permalink / raw)
  To: Barry Song
  Cc: gregkh, akpm, andriy.shevchenko, linux-kernel, dave.hansen,
	linux, rafael, rdunlap, agordeev, sbrivio, jianpeng.ma,
	valentin.schneider, peterz, bristot, guodong.xu, tangchengchang,
	prime.zeng, yangyicong, tim.c.chen, linuxarm, Tian Tao

On Thu, Jul 15, 2021 at 11:58:53PM +1200, Barry Song wrote:
> (10.1.198.147)
> X-CFilter-Loop: Reflected
> Status: O
> Content-Length: 10263
> Lines: 252
> 
> From: Tian Tao <tiantao6@hisilicon.com>

[...]

> +int bitmap_print_to_buf(bool list, char *buf, const unsigned long *maskp,
> +		int nmaskbits, loff_t off, size_t count)
> +{
> +	const char *fmt = list ? "%*pbl\n" : "%*pb\n";
> +	ssize_t size;
> +	void *data;
> +
> +	data = kasprintf(GFP_KERNEL, fmt, nmaskbits, maskp);
> +	if (!data)
> +		return -ENOMEM;
> +
> +	size = memory_read_from_buffer(buf, count, &off, data, strlen(data) + 1);
> +	kfree(data);
> +
> +	return size;
> +}
> +EXPORT_SYMBOL(bitmap_print_to_buf);

In discussion to v4 of this series I've pointed out inefficiency of
this approach. Now it's v7, but the problem is still there.

1. You make user of your API guess aboout proper @count without any
   hint. This is worse than how it works now with pure vsnprintf().
2. For big bitmaps and small @count, your code will make enormous
   amount of unneeded work. For example, if the full length of string
   representation of bitmap is 1M, and length of the output buffer is
   1k, one will have to call bitmap_print_to_buf() 1000 times.  With
   current design it assumes that you allocate the full amount of memory
   1000 times, free it 1000 times and print huge bitmap 1000 times to
   just return small part of it.

NAK for this, and please stop submitting wrong approach again and
again.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v7 4/4] lib: test_bitmap: add bitmap_print_to_buf test cases
  2021-07-15 11:58 ` [PATCH v7 4/4] lib: test_bitmap: add bitmap_print_to_buf test cases Barry Song
  2021-07-15 12:09   ` Andy Shevchenko
@ 2021-07-15 15:36   ` Yury Norov
  1 sibling, 0 replies; 35+ messages in thread
From: Yury Norov @ 2021-07-15 15:36 UTC (permalink / raw)
  To: Barry Song
  Cc: gregkh, akpm, andriy.shevchenko, linux-kernel, dave.hansen,
	linux, rafael, rdunlap, agordeev, sbrivio, jianpeng.ma,
	valentin.schneider, peterz, bristot, guodong.xu, tangchengchang,
	prime.zeng, yangyicong, tim.c.chen, linuxarm

On Thu, Jul 15, 2021 at 11:58:56PM +1200, Barry Song wrote:
> 
> The added test items cover both cases where bitmap buf of the printed
> result is greater than and less than 4KB.
> And it also covers the case where offset for bitmap_print_to_buf is
> non-zero which will happen when printed buf is larger than one page
> in sysfs bin_attribute.
> 
> Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
> ---
>  -v7:
>  minor cleanup according to Andy's comments
> 
>  lib/test_bitmap.c | 150 ++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 150 insertions(+)
> 
> diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
> index 4ea73f5aed41..eb8ebaf12865 100644
> --- a/lib/test_bitmap.c
> +++ b/lib/test_bitmap.c
> @@ -19,6 +19,7 @@
>  KSTM_MODULE_GLOBALS();
>  
>  static char pbl_buffer[PAGE_SIZE] __initdata;
> +static char print_buf[PAGE_SIZE * 2] __initdata;
>  
>  static const unsigned long exp1[] __initconst = {
>  	BITMAP_FROM_U64(1),
> @@ -156,6 +157,20 @@ static bool __init __check_eq_clump8(const char *srcfile, unsigned int line,
>  	return true;
>  }
>  
> +static bool __init
> +__check_eq_str(const char *srcfile, unsigned int line,
> +		const char *exp_str, const char *str,
> +		unsigned int len)
> +{
> +	bool eq;
> +
> +	eq = strncmp(exp_str, str, len) == 0;
> +	if (!eq)
> +		pr_err("[%s:%u] expected %s, got %s\n", srcfile, line, exp_str, str);
> +
> +	return eq;
> +}
> +
>  #define __expect_eq(suffix, ...)					\
>  	({								\
>  		int result = 0;						\
> @@ -173,6 +188,7 @@ static bool __init __check_eq_clump8(const char *srcfile, unsigned int line,
>  #define expect_eq_pbl(...)		__expect_eq(pbl, ##__VA_ARGS__)
>  #define expect_eq_u32_array(...)	__expect_eq(u32_array, ##__VA_ARGS__)
>  #define expect_eq_clump8(...)		__expect_eq(clump8, ##__VA_ARGS__)
> +#define expect_eq_str(...)		__expect_eq(str, ##__VA_ARGS__)
>  
>  static void __init test_zero_clear(void)
>  {
> @@ -660,6 +676,139 @@ static void __init test_bitmap_cut(void)
>  	}
>  }
>  
> +struct test_bitmap_print {
> +	const unsigned long *bitmap;
> +	unsigned long nbits;
> +	const char *mask;
> +	const char *list;
> +};
> +
> +static const unsigned long small_bitmap[] __initconst = {
> +	BITMAP_FROM_U64(0x3333333311111111ULL),
> +};
> +
> +static const char small_mask[] __initconst = "33333333,11111111\n";
> +static const char small_list[] __initconst = "0,4,8,12,16,20,24,28,32-33,36-37,40-41,44-45,48-49,52-53,56-57,60-61\n";
> +
> +static const unsigned long large_bitmap[] __initconst = {
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +	BITMAP_FROM_U64(0x3333333311111111ULL), BITMAP_FROM_U64(0x3333333311111111ULL),
> +};
> +
> +static const char large_mask[] __initconst = "33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111,"
> +					"33333333,11111111,33333333,11111111\n";
> +
> +static const char large_list[] __initconst = /* more than 4KB */
> +	"0,4,8,12,16,20,24,28,32-33,36-37,40-41,44-45,48-49,52-53,56-57,60-61,64,68,72,76,80,84,88,92,96-97,100-101,104-1"
> +	"05,108-109,112-113,116-117,120-121,124-125,128,132,136,140,144,148,152,156,160-161,164-165,168-169,172-173,176-1"
> +	"77,180-181,184-185,188-189,192,196,200,204,208,212,216,220,224-225,228-229,232-233,236-237,240-241,244-245,248-2"
> +	"49,252-253,256,260,264,268,272,276,280,284,288-289,292-293,296-297,300-301,304-305,308-309,312-313,316-317,320,3"
> +	"24,328,332,336,340,344,348,352-353,356-357,360-361,364-365,368-369,372-373,376-377,380-381,384,388,392,396,400,4"
> +	"04,408,412,416-417,420-421,424-425,428-429,432-433,436-437,440-441,444-445,448,452,456,460,464,468,472,476,480-4"
> +	"81,484-485,488-489,492-493,496-497,500-501,504-505,508-509,512,516,520,524,528,532,536,540,544-545,548-549,552-5"
> +	"53,556-557,560-561,564-565,568-569,572-573,576,580,584,588,592,596,600,604,608-609,612-613,616-617,620-621,624-6"
> +	"25,628-629,632-633,636-637,640,644,648,652,656,660,664,668,672-673,676-677,680-681,684-685,688-689,692-693,696-6"
> +	"97,700-701,704,708,712,716,720,724,728,732,736-737,740-741,744-745,748-749,752-753,756-757,760-761,764-765,768,7"
> +	"72,776,780,784,788,792,796,800-801,804-805,808-809,812-813,816-817,820-821,824-825,828-829,832,836,840,844,848,8"
> +	"52,856,860,864-865,868-869,872-873,876-877,880-881,884-885,888-889,892-893,896,900,904,908,912,916,920,924,928-9"
> +	"29,932-933,936-937,940-941,944-945,948-949,952-953,956-957,960,964,968,972,976,980,984,988,992-993,996-997,1000-"
> +	"1001,1004-1005,1008-1009,1012-1013,1016-1017,1020-1021,1024,1028,1032,1036,1040,1044,1048,1052,1056-1057,1060-10"
> +	"61,1064-1065,1068-1069,1072-1073,1076-1077,1080-1081,1084-1085,1088,1092,1096,1100,1104,1108,1112,1116,1120-1121"
> +	",1124-1125,1128-1129,1132-1133,1136-1137,1140-1141,1144-1145,1148-1149,1152,1156,1160,1164,1168,1172,1176,1180,1"
> +	"184-1185,1188-1189,1192-1193,1196-1197,1200-1201,1204-1205,1208-1209,1212-1213,1216,1220,1224,1228,1232,1236,124"
> +	"0,1244,1248-1249,1252-1253,1256-1257,1260-1261,1264-1265,1268-1269,1272-1273,1276-1277,1280,1284,1288,1292,1296,"
> +	"1300,1304,1308,1312-1313,1316-1317,1320-1321,1324-1325,1328-1329,1332-1333,1336-1337,1340-1341,1344,1348,1352,13"
> +	"56,1360,1364,1368,1372,1376-1377,1380-1381,1384-1385,1388-1389,1392-1393,1396-1397,1400-1401,1404-1405,1408,1412"
> +	",1416,1420,1424,1428,1432,1436,1440-1441,1444-1445,1448-1449,1452-1453,1456-1457,1460-1461,1464-1465,1468-1469,1"
> +	"472,1476,1480,1484,1488,1492,1496,1500,1504-1505,1508-1509,1512-1513,1516-1517,1520-1521,1524-1525,1528-1529,153"
> +	"2-1533,1536,1540,1544,1548,1552,1556,1560,1564,1568-1569,1572-1573,1576-1577,1580-1581,1584-1585,1588-1589,1592-"
> +	"1593,1596-1597,1600,1604,1608,1612,1616,1620,1624,1628,1632-1633,1636-1637,1640-1641,1644-1645,1648-1649,1652-16"
> +	"53,1656-1657,1660-1661,1664,1668,1672,1676,1680,1684,1688,1692,1696-1697,1700-1701,1704-1705,1708-1709,1712-1713"
> +	",1716-1717,1720-1721,1724-1725,1728,1732,1736,1740,1744,1748,1752,1756,1760-1761,1764-1765,1768-1769,1772-1773,1"
> +	"776-1777,1780-1781,1784-1785,1788-1789,1792,1796,1800,1804,1808,1812,1816,1820,1824-1825,1828-1829,1832-1833,183"
> +	"6-1837,1840-1841,1844-1845,1848-1849,1852-1853,1856,1860,1864,1868,1872,1876,1880,1884,1888-1889,1892-1893,1896-"
> +	"1897,1900-1901,1904-1905,1908-1909,1912-1913,1916-1917,1920,1924,1928,1932,1936,1940,1944,1948,1952-1953,1956-19"
> +	"57,1960-1961,1964-1965,1968-1969,1972-1973,1976-1977,1980-1981,1984,1988,1992,1996,2000,2004,2008,2012,2016-2017"
> +	",2020-2021,2024-2025,2028-2029,2032-2033,2036-2037,2040-2041,2044-2045,2048,2052,2056,2060,2064,2068,2072,2076,2"
> +	"080-2081,2084-2085,2088-2089,2092-2093,2096-2097,2100-2101,2104-2105,2108-2109,2112,2116,2120,2124,2128,2132,213"
> +	"6,2140,2144-2145,2148-2149,2152-2153,2156-2157,2160-2161,2164-2165,2168-2169,2172-2173,2176,2180,2184,2188,2192,"
> +	"2196,2200,2204,2208-2209,2212-2213,2216-2217,2220-2221,2224-2225,2228-2229,2232-2233,2236-2237,2240,2244,2248,22"
> +	"52,2256,2260,2264,2268,2272-2273,2276-2277,2280-2281,2284-2285,2288-2289,2292-2293,2296-2297,2300-2301,2304,2308"
> +	",2312,2316,2320,2324,2328,2332,2336-2337,2340-2341,2344-2345,2348-2349,2352-2353,2356-2357,2360-2361,2364-2365,2"
> +	"368,2372,2376,2380,2384,2388,2392,2396,2400-2401,2404-2405,2408-2409,2412-2413,2416-2417,2420-2421,2424-2425,242"
> +	"8-2429,2432,2436,2440,2444,2448,2452,2456,2460,2464-2465,2468-2469,2472-2473,2476-2477,2480-2481,2484-2485,2488-"
> +	"2489,2492-2493,2496,2500,2504,2508,2512,2516,2520,2524,2528-2529,2532-2533,2536-2537,2540-2541,2544-2545,2548-25"
> +	"49,2552-2553,2556-2557\n";

Indeed it's longer than 80 chars per line, even than 100 chars. 

I think it's not possible for reviewers to manually check
correctness of such a huge test line. Could you consider using
bitmap_parselist() + strncmp() instead of manual inspecting.

> +static const struct test_bitmap_print test_print[] __initconst = {
> +	{ small_bitmap, sizeof(small_bitmap) * BITS_PER_BYTE, small_mask, small_list },
> +	{ large_bitmap, sizeof(large_bitmap) * BITS_PER_BYTE, large_mask, large_list },
> +};
> +
> +static void __init test_bitmap_print_buf(void)
> +{
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(test_print); i++) {
> +		const struct test_bitmap_print *t = &test_print[i];
> +		int n;
> +
> +		n = bitmap_print_to_buf(false, print_buf, t->bitmap, t->nbits,
> +					0, 2 * PAGE_SIZE);
> +		expect_eq_uint(strlen(t->mask) + 1, n);
> +		expect_eq_str(t->mask, print_buf, n);
> +
> +		n = bitmap_print_to_buf(true, print_buf, t->bitmap, t->nbits,
> +					0, 2 * PAGE_SIZE);
> +		expect_eq_uint(strlen(t->list) + 1, n);
> +		expect_eq_str(t->list, print_buf, n);
> +
> +		/* test bitmap_print_to_buf by non-zero offset */
> +		if (strlen(t->list) > PAGE_SIZE) {
> +			n = bitmap_print_to_buf(true, print_buf, t->bitmap, t->nbits,
> +						PAGE_SIZE, PAGE_SIZE);
> +			expect_eq_uint(strlen(t->list) + 1 - PAGE_SIZE, n);
> +			expect_eq_str(t->list + PAGE_SIZE, print_buf, n);
> +		}
> +	}
> +}
> +
>  static void __init selftest(void)
>  {
>  	test_zero_clear();
> @@ -672,6 +821,7 @@ static void __init selftest(void)
>  	test_mem_optimisations();
>  	test_for_each_set_clump8();
>  	test_bitmap_cut();
> +	test_bitmap_print_buf();
>  }
>  
>  KSTM_MODULE_LOADERS(test_bitmap);
> -- 
> 2.25.1

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v7 4/4] lib: test_bitmap: add bitmap_print_to_buf test cases
  2021-07-15 12:09   ` Andy Shevchenko
@ 2021-07-15 20:47     ` Yury Norov
  2021-07-15 21:32       ` Andy Shevchenko
  0 siblings, 1 reply; 35+ messages in thread
From: Yury Norov @ 2021-07-15 20:47 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Barry Song, gregkh, akpm, linux-kernel, dave.hansen, linux,
	rafael, rdunlap, agordeev, sbrivio, jianpeng.ma,
	valentin.schneider, peterz, bristot, guodong.xu, tangchengchang,
	prime.zeng, yangyicong, tim.c.chen, linuxarm

On Thu, Jul 15, 2021 at 03:09:39PM +0300, Andy Shevchenko wrote:
> On Thu, Jul 15, 2021 at 11:58:56PM +1200, Barry Song wrote:
> > The added test items cover both cases where bitmap buf of the printed
> > result is greater than and less than 4KB.
> > And it also covers the case where offset for bitmap_print_to_buf is
> > non-zero which will happen when printed buf is larger than one page
> > in sysfs bin_attribute.
> 
> More test cases is always a good thing, thanks!

Generally yes. But in this case... I believe, Barry didn't write that
huge line below by himself. Most probably he copy-pasted the output of
his bitmap_print_buf() into the test. If so, this code tests nothing,
and just enforces current behavior of snprintf.

> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> 
> > Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
> > ---
> >  -v7:
> >  minor cleanup according to Andy's comments

[...]

> > +static const char large_list[] __initconst = /* more than 4KB */
> > +	"0,4,8,12,16,20,24,28,32-33,36-37,40-41,44-45,48-49,52-53,56-57,60-61,64,68,72,76,80,84,88,92,96-97,100-101,104-1"
> > +	"05,108-109,112-113,116-117,120-121,124-125,128,132,136,140,144,148,152,156,160-161,164-165,168-169,172-173,176-1"
> > +	"77,180-181,184-185,188-189,192,196,200,204,208,212,216,220,224-225,228-229,232-233,236-237,240-241,244-245,248-2"

I don't like this behavior of the code: each individual line is not a
valid bitmap_list. I would prefer to split original bitmap and print
list representation of parts in a compatible format; considering a
receiving part of this splitting machinery.

Thanks,
Yury

> > +	"49,252-253,256,260,264,268,272,276,280,284,288-289,292-293,296-297,300-301,304-305,308-309,312-313,316-317,320,3"
> > +	"24,328,332,336,340,344,348,352-353,356-357,360-361,364-365,368-369,372-373,376-377,380-381,384,388,392,396,400,4"
> > +	"04,408,412,416-417,420-421,424-425,428-429,432-433,436-437,440-441,444-445,448,452,456,460,464,468,472,476,480-4"
> > +	"81,484-485,488-489,492-493,496-497,500-501,504-505,508-509,512,516,520,524,528,532,536,540,544-545,548-549,552-5"
> > +	"53,556-557,560-561,564-565,568-569,572-573,576,580,584,588,592,596,600,604,608-609,612-613,616-617,620-621,624-6"
> > +	"25,628-629,632-633,636-637,640,644,648,652,656,660,664,668,672-673,676-677,680-681,684-685,688-689,692-693,696-6"
> > +	"97,700-701,704,708,712,716,720,724,728,732,736-737,740-741,744-745,748-749,752-753,756-757,760-761,764-765,768,7"
> > +	"72,776,780,784,788,792,796,800-801,804-805,808-809,812-813,816-817,820-821,824-825,828-829,832,836,840,844,848,8"
> > +	"52,856,860,864-865,868-869,872-873,876-877,880-881,884-885,888-889,892-893,896,900,904,908,912,916,920,924,928-9"
> > +	"29,932-933,936-937,940-941,944-945,948-949,952-953,956-957,960,964,968,972,976,980,984,988,992-993,996-997,1000-"
> > +	"1001,1004-1005,1008-1009,1012-1013,1016-1017,1020-1021,1024,1028,1032,1036,1040,1044,1048,1052,1056-1057,1060-10"
> > +	"61,1064-1065,1068-1069,1072-1073,1076-1077,1080-1081,1084-1085,1088,1092,1096,1100,1104,1108,1112,1116,1120-1121"
> > +	",1124-1125,1128-1129,1132-1133,1136-1137,1140-1141,1144-1145,1148-1149,1152,1156,1160,1164,1168,1172,1176,1180,1"
> > +	"184-1185,1188-1189,1192-1193,1196-1197,1200-1201,1204-1205,1208-1209,1212-1213,1216,1220,1224,1228,1232,1236,124"
> > +	"0,1244,1248-1249,1252-1253,1256-1257,1260-1261,1264-1265,1268-1269,1272-1273,1276-1277,1280,1284,1288,1292,1296,"
> > +	"1300,1304,1308,1312-1313,1316-1317,1320-1321,1324-1325,1328-1329,1332-1333,1336-1337,1340-1341,1344,1348,1352,13"
> > +	"56,1360,1364,1368,1372,1376-1377,1380-1381,1384-1385,1388-1389,1392-1393,1396-1397,1400-1401,1404-1405,1408,1412"
> > +	",1416,1420,1424,1428,1432,1436,1440-1441,1444-1445,1448-1449,1452-1453,1456-1457,1460-1461,1464-1465,1468-1469,1"
> > +	"472,1476,1480,1484,1488,1492,1496,1500,1504-1505,1508-1509,1512-1513,1516-1517,1520-1521,1524-1525,1528-1529,153"
> > +	"2-1533,1536,1540,1544,1548,1552,1556,1560,1564,1568-1569,1572-1573,1576-1577,1580-1581,1584-1585,1588-1589,1592-"
> > +	"1593,1596-1597,1600,1604,1608,1612,1616,1620,1624,1628,1632-1633,1636-1637,1640-1641,1644-1645,1648-1649,1652-16"
> > +	"53,1656-1657,1660-1661,1664,1668,1672,1676,1680,1684,1688,1692,1696-1697,1700-1701,1704-1705,1708-1709,1712-1713"
> > +	",1716-1717,1720-1721,1724-1725,1728,1732,1736,1740,1744,1748,1752,1756,1760-1761,1764-1765,1768-1769,1772-1773,1"
> > +	"776-1777,1780-1781,1784-1785,1788-1789,1792,1796,1800,1804,1808,1812,1816,1820,1824-1825,1828-1829,1832-1833,183"
> > +	"6-1837,1840-1841,1844-1845,1848-1849,1852-1853,1856,1860,1864,1868,1872,1876,1880,1884,1888-1889,1892-1893,1896-"
> > +	"1897,1900-1901,1904-1905,1908-1909,1912-1913,1916-1917,1920,1924,1928,1932,1936,1940,1944,1948,1952-1953,1956-19"
> > +	"57,1960-1961,1964-1965,1968-1969,1972-1973,1976-1977,1980-1981,1984,1988,1992,1996,2000,2004,2008,2012,2016-2017"
> > +	",2020-2021,2024-2025,2028-2029,2032-2033,2036-2037,2040-2041,2044-2045,2048,2052,2056,2060,2064,2068,2072,2076,2"
> > +	"080-2081,2084-2085,2088-2089,2092-2093,2096-2097,2100-2101,2104-2105,2108-2109,2112,2116,2120,2124,2128,2132,213"
> > +	"6,2140,2144-2145,2148-2149,2152-2153,2156-2157,2160-2161,2164-2165,2168-2169,2172-2173,2176,2180,2184,2188,2192,"
> > +	"2196,2200,2204,2208-2209,2212-2213,2216-2217,2220-2221,2224-2225,2228-2229,2232-2233,2236-2237,2240,2244,2248,22"
> > +	"52,2256,2260,2264,2268,2272-2273,2276-2277,2280-2281,2284-2285,2288-2289,2292-2293,2296-2297,2300-2301,2304,2308"
> > +	",2312,2316,2320,2324,2328,2332,2336-2337,2340-2341,2344-2345,2348-2349,2352-2353,2356-2357,2360-2361,2364-2365,2"
> > +	"368,2372,2376,2380,2384,2388,2392,2396,2400-2401,2404-2405,2408-2409,2412-2413,2416-2417,2420-2421,2424-2425,242"
> > +	"8-2429,2432,2436,2440,2444,2448,2452,2456,2460,2464-2465,2468-2469,2472-2473,2476-2477,2480-2481,2484-2485,2488-"
> > +	"2489,2492-2493,2496,2500,2504,2508,2512,2516,2520,2524,2528-2529,2532-2533,2536-2537,2540-2541,2544-2545,2548-25"
> > +	"49,2552-2553,2556-2557\n";
> > +
> > +static const struct test_bitmap_print test_print[] __initconst = {
> > +	{ small_bitmap, sizeof(small_bitmap) * BITS_PER_BYTE, small_mask, small_list },
> > +	{ large_bitmap, sizeof(large_bitmap) * BITS_PER_BYTE, large_mask, large_list },
> > +};
> > +
> > +static void __init test_bitmap_print_buf(void)
> > +{
> > +	int i;
> > +
> > +	for (i = 0; i < ARRAY_SIZE(test_print); i++) {
> > +		const struct test_bitmap_print *t = &test_print[i];
> > +		int n;
> > +
> > +		n = bitmap_print_to_buf(false, print_buf, t->bitmap, t->nbits,
> > +					0, 2 * PAGE_SIZE);
> > +		expect_eq_uint(strlen(t->mask) + 1, n);
> > +		expect_eq_str(t->mask, print_buf, n);
> > +
> > +		n = bitmap_print_to_buf(true, print_buf, t->bitmap, t->nbits,
> > +					0, 2 * PAGE_SIZE);
> > +		expect_eq_uint(strlen(t->list) + 1, n);
> > +		expect_eq_str(t->list, print_buf, n);
> > +
> > +		/* test bitmap_print_to_buf by non-zero offset */
> > +		if (strlen(t->list) > PAGE_SIZE) {
> > +			n = bitmap_print_to_buf(true, print_buf, t->bitmap, t->nbits,
> > +						PAGE_SIZE, PAGE_SIZE);
> > +			expect_eq_uint(strlen(t->list) + 1 - PAGE_SIZE, n);
> > +			expect_eq_str(t->list + PAGE_SIZE, print_buf, n);
> > +		}
> > +	}
> > +}
> > +
> >  static void __init selftest(void)
> >  {
> >  	test_zero_clear();
> > @@ -672,6 +821,7 @@ static void __init selftest(void)
> >  	test_mem_optimisations();
> >  	test_for_each_set_clump8();
> >  	test_bitmap_cut();
> > +	test_bitmap_print_buf();
> >  }
> >  
> >  KSTM_MODULE_LOADERS(test_bitmap);
> > -- 
> > 2.25.1
> > 
> 
> -- 
> With Best Regards,
> Andy Shevchenko
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH v7 1/4] cpumask: introduce cpumap_print_to_buf to support large bitmask and list
  2021-07-15 15:28   ` Yury Norov
@ 2021-07-15 21:08     ` Song Bao Hua (Barry Song)
  2021-07-16  0:57     ` Song Bao Hua (Barry Song)
  1 sibling, 0 replies; 35+ messages in thread
From: Song Bao Hua (Barry Song) @ 2021-07-15 21:08 UTC (permalink / raw)
  To: Yury Norov
  Cc: gregkh, akpm, andriy.shevchenko, linux-kernel, dave.hansen,
	linux, rafael, rdunlap, agordeev, sbrivio, jianpeng.ma,
	valentin.schneider, peterz, bristot, guodong.xu, tangchengchang,
	Zengtao (B), yangyicong, tim.c.chen, Linuxarm, tiantao (H)



> -----Original Message-----
> From: Yury Norov [mailto:yury.norov@gmail.com]
> Sent: Friday, July 16, 2021 3:29 AM
> To: Song Bao Hua (Barry Song) <song.bao.hua@hisilicon.com>
> Cc: gregkh@linuxfoundation.org; akpm@linux-foundation.org;
> andriy.shevchenko@linux.intel.com; linux-kernel@vger.kernel.org;
> dave.hansen@intel.com; linux@rasmusvillemoes.dk; rafael@kernel.org;
> rdunlap@infradead.org; agordeev@linux.ibm.com; sbrivio@redhat.com;
> jianpeng.ma@intel.com; valentin.schneider@arm.com; peterz@infradead.org;
> bristot@redhat.com; guodong.xu@linaro.org; tangchengchang
> <tangchengchang@huawei.com>; Zengtao (B) <prime.zeng@hisilicon.com>;
> yangyicong <yangyicong@huawei.com>; tim.c.chen@linux.intel.com; Linuxarm
> <linuxarm@huawei.com>; tiantao (H) <tiantao6@hisilicon.com>
> Subject: Re: [PATCH v7 1/4] cpumask: introduce cpumap_print_to_buf to support
> large bitmask and list
> 
> On Thu, Jul 15, 2021 at 11:58:53PM +1200, Barry Song wrote:
> > (10.1.198.147)
> > X-CFilter-Loop: Reflected
> > Status: O
> > Content-Length: 10263
> > Lines: 252
> >
> > From: Tian Tao <tiantao6@hisilicon.com>
> 
> [...]
> 
> > +int bitmap_print_to_buf(bool list, char *buf, const unsigned long *maskp,
> > +		int nmaskbits, loff_t off, size_t count)
> > +{
> > +	const char *fmt = list ? "%*pbl\n" : "%*pb\n";
> > +	ssize_t size;
> > +	void *data;
> > +
> > +	data = kasprintf(GFP_KERNEL, fmt, nmaskbits, maskp);
> > +	if (!data)
> > +		return -ENOMEM;
> > +
> > +	size = memory_read_from_buffer(buf, count, &off, data, strlen(data) + 1);
> > +	kfree(data);
> > +
> > +	return size;
> > +}
> > +EXPORT_SYMBOL(bitmap_print_to_buf);
> 
> In discussion to v4 of this series I've pointed out inefficiency of
> this approach. Now it's v7, but the problem is still there.
> 
> 1. You make user of your API guess aboout proper @count without any
>    hint. This is worse than how it works now with pure vsnprintf().

This isn't true. While this count comes from sysfs bin_attribute,
sysfs show() entry guarantee the count is proper and inside the
valid range of the buffer. Otherwise, sysfs bin_attribute has totally
broken for all users.

> 2. For big bitmaps and small @count, your code will make enormous
>    amount of unneeded work. For example, if the full length of string
>    representation of bitmap is 1M, and length of the output buffer is
>    1k, one will have to call bitmap_print_to_buf() 1000 times.  With
>    current design it assumes that you allocate the full amount of memory
>    1000 times, free it 1000 times and print huge bitmap 1000 times to
>    just return small part of it.

This isn't true either. Nobody is actually holding a cpumap like 1MB.
4KB has been used in current kernel for a long time, no machine
has really complained it is not enough. So I would expect the real
case would be one time for majority, perhaps twice for some machines
which we haven't seen yet.

> 
> NAK for this, and please stop submitting wrong approach again and
> again.

I have always answered your email and explained with a lot of word,
but you totally ignored my explanation and didn't even answer my
explanation in v5 and v6. That seems quite unfair.

Considering a driver which has M cpus and N different topology 
entries in its show entry:

example_bin_attribute_show(struct file *filp, struct kobject *kobj,
             struct bin_attribute *attr, char *buf,
             loff_t offset, size_t count)
{
	...
}
In case what you say is true and this show() is called 1000 times
with different offset if the buffer is as big as 1MB.
How would the code work by reusing a buffer allocated in advance,
like below?

example_bin_attribute_show(struct file *filp, struct kobject *kobj,
             struct bin_attribute *attr, char *buf,
             loff_t offset, size_t count)
{
    //1st time:
	char *bitmap_buf = bitmap_buffer_allocate(....);
	save bitmap_buf to somewhere?

    //2nd~1000 time
    reuse the bitmap_buf?
    
    //1000time
    Free the bitmap buf?
}

Or like below?

char *global_bitmap_buf = bitmap_buffer_allocate(....)?

example_bin_attribute_show(struct file *filp, struct kobject *kobj,
             struct bin_attribute *attr, char *buf,
             loff_t offset, size_t count)
{
    //1st - 1000 time?
    Reuse the global_bitmap_buf?
}

Neither of the above way is good to me. The 1st one is not doable
at all. The second one is not doable either. M*N entries will have
different size and the buffer being re-used seems to be wasting
memory.

I'd appreciate if you could post some pseudo code so that v8 could
make some actual difference and go to the way you prefer.
On the other hand, we also need Greg's Ack on driver changes which
might happen afterwards.

So please post some pseudo code rather than simply put a NAK
by ignoring my long explanation.

Thanks
Barry


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v7 4/4] lib: test_bitmap: add bitmap_print_to_buf test cases
  2021-07-15 20:47     ` Yury Norov
@ 2021-07-15 21:32       ` Andy Shevchenko
  2021-07-15 23:23         ` Yury Norov
  0 siblings, 1 reply; 35+ messages in thread
From: Andy Shevchenko @ 2021-07-15 21:32 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Barry Song, Greg Kroah-Hartman, Andrew Morton,
	Linux Kernel Mailing List, Dave Hansen, Rasmus Villemoes,
	Rafael J. Wysocki, Randy Dunlap, Alexander Gordeev,
	Stefano Brivio, Ma, Jianpeng, Valentin Schneider,
	Peter Zijlstra (Intel),
	Daniel Bristot de Oliveira, Guodong Xu, tangchengchang,
	Zengtao (B),
	yangyicong, tim.c.chen, Linuxarm

On Thu, Jul 15, 2021 at 11:48 PM Yury Norov <yury.norov@gmail.com> wrote:
> On Thu, Jul 15, 2021 at 03:09:39PM +0300, Andy Shevchenko wrote:
> > On Thu, Jul 15, 2021 at 11:58:56PM +1200, Barry Song wrote:
> > > The added test items cover both cases where bitmap buf of the printed
> > > result is greater than and less than 4KB.
> > > And it also covers the case where offset for bitmap_print_to_buf is
> > > non-zero which will happen when printed buf is larger than one page
> > > in sysfs bin_attribute.
> >
> > More test cases is always a good thing, thanks!
>
> Generally yes. But in this case... I believe, Barry didn't write that
> huge line below by himself. Most probably he copy-pasted the output of
> his bitmap_print_buf() into the test. If so, this code tests nothing,
> and just enforces current behavior of snprintf.

I'm not sure I got what you are telling me. The big line is to test
strings that are bigger than 4k.

...

> > > +static const char large_list[] __initconst = /* more than 4KB */
> > > +   "0,4,8,12,16,20,24,28,32-33,36-37,40-41,44-45,48-49,52-53,56-57,60-61,64,68,72,76,80,84,88,92,96-97,100-101,104-1"
> > > +   "05,108-109,112-113,116-117,120-121,124-125,128,132,136,140,144,148,152,156,160-161,164-165,168-169,172-173,176-1"
> > > +   "77,180-181,184-185,188-189,192,196,200,204,208,212,216,220,224-225,228-229,232-233,236-237,240-241,244-245,248-2"
>
> I don't like this behavior of the code: each individual line is not a
> valid bitmap_list. I would prefer to split original bitmap and print
> list representation of parts in a compatible format; considering a
> receiving part of this splitting machinery.

I agree that split is not the best here, but after all it's only 1
line and this is on purpose.

-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v7 4/4] lib: test_bitmap: add bitmap_print_to_buf test cases
  2021-07-15 21:32       ` Andy Shevchenko
@ 2021-07-15 23:23         ` Yury Norov
  2021-07-16  0:41           ` Song Bao Hua (Barry Song)
  2021-07-21 11:40           ` Greg Kroah-Hartman
  0 siblings, 2 replies; 35+ messages in thread
From: Yury Norov @ 2021-07-15 23:23 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Andy Shevchenko, Barry Song, Greg Kroah-Hartman, Andrew Morton,
	Linux Kernel Mailing List, Dave Hansen, Rasmus Villemoes,
	Rafael J. Wysocki, Randy Dunlap, Alexander Gordeev,
	Stefano Brivio, Ma, Jianpeng, Valentin Schneider,
	Peter Zijlstra (Intel),
	Daniel Bristot de Oliveira, Guodong Xu, tangchengchang,
	Zengtao (B),
	yangyicong, tim.c.chen, Linuxarm

On Fri, Jul 16, 2021 at 12:32:45AM +0300, Andy Shevchenko wrote:
> On Thu, Jul 15, 2021 at 11:48 PM Yury Norov <yury.norov@gmail.com> wrote:
> > On Thu, Jul 15, 2021 at 03:09:39PM +0300, Andy Shevchenko wrote:
> > > On Thu, Jul 15, 2021 at 11:58:56PM +1200, Barry Song wrote:
> > > > The added test items cover both cases where bitmap buf of the printed
> > > > result is greater than and less than 4KB.
> > > > And it also covers the case where offset for bitmap_print_to_buf is
> > > > non-zero which will happen when printed buf is larger than one page
> > > > in sysfs bin_attribute.
> > >
> > > More test cases is always a good thing, thanks!
> >
> > Generally yes. But in this case... I believe, Barry didn't write that
> > huge line below by himself. Most probably he copy-pasted the output of
> > his bitmap_print_buf() into the test. If so, this code tests nothing,
> > and just enforces current behavior of snprintf.
> 
> I'm not sure I got what you are telling me. The big line is to test
> strings that are bigger than 4k.

I'm trying to say that human are not able to verify correctness of
this line. The test is supposed to check bitmap_print_to_buf(), but
reference output itself is generated by bitmap_print_to_buf(). This
test will always pass by design, even if there's an error somewhere
in the middle, isn't it?

> 
> ...
> 
> > > > +static const char large_list[] __initconst = /* more than 4KB */
> > > > +   "0,4,8,12,16,20,24,28,32-33,36-37,40-41,44-45,48-49,52-53,56-57,60-61,64,68,72,76,80,84,88,92,96-97,100-101,104-1"
> > > > +   "05,108-109,112-113,116-117,120-121,124-125,128,132,136,140,144,148,152,156,160-161,164-165,168-169,172-173,176-1"
> > > > +   "77,180-181,184-185,188-189,192,196,200,204,208,212,216,220,224-225,228-229,232-233,236-237,240-241,244-245,248-2"
> >
> > I don't like this behavior of the code: each individual line is not a
> > valid bitmap_list. I would prefer to split original bitmap and print
> > list representation of parts in a compatible format; considering a
> > receiving part of this splitting machinery.
> 
> I agree that split is not the best here, but after all it's only 1
> line and this is on purpose.

What I see is that bitmap_print_to_buf() is called many times, and
each time it returns something that is not a valid bitmap list string.
If the caller was be able to concatenate all the lines returned by
bitmap_print_to_buf(), he'd probably get correct result. But in such
case, why don't he use scnprintf("pbl") directly?

If there exists a real case where scnprintf("pbl") output is too long
(or execution time is too slow) that we need to split, the right
approach is to split the original bitmap, not an output of
scnprintf("pbl").

And I still don't understand, what prevents the higher levels from
calling the scnprintf() directly in this specific case?

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH v7 4/4] lib: test_bitmap: add bitmap_print_to_buf test cases
  2021-07-15 23:23         ` Yury Norov
@ 2021-07-16  0:41           ` Song Bao Hua (Barry Song)
  2021-07-21 11:40           ` Greg Kroah-Hartman
  1 sibling, 0 replies; 35+ messages in thread
From: Song Bao Hua (Barry Song) @ 2021-07-16  0:41 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko
  Cc: Andy Shevchenko, Greg Kroah-Hartman, Andrew Morton,
	Linux Kernel Mailing List, Dave Hansen, Rasmus Villemoes,
	Rafael J. Wysocki, Randy Dunlap, Alexander Gordeev,
	Stefano Brivio, Ma, Jianpeng, Valentin Schneider,
	Peter Zijlstra (Intel),
	Daniel Bristot de Oliveira, Guodong Xu, tangchengchang,
	Zengtao (B),
	yangyicong, tim.c.chen, Linuxarm



> -----Original Message-----
> From: Yury Norov [mailto:yury.norov@gmail.com]
> Sent: Friday, July 16, 2021 11:24 AM
> To: Andy Shevchenko <andy.shevchenko@gmail.com>
> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>; Song Bao Hua (Barry
> Song) <song.bao.hua@hisilicon.com>; Greg Kroah-Hartman
> <gregkh@linuxfoundation.org>; Andrew Morton <akpm@linux-foundation.org>;
> Linux Kernel Mailing List <linux-kernel@vger.kernel.org>; Dave Hansen
> <dave.hansen@intel.com>; Rasmus Villemoes <linux@rasmusvillemoes.dk>; Rafael
> J. Wysocki <rafael@kernel.org>; Randy Dunlap <rdunlap@infradead.org>;
> Alexander Gordeev <agordeev@linux.ibm.com>; Stefano Brivio
> <sbrivio@redhat.com>; Ma, Jianpeng <jianpeng.ma@intel.com>; Valentin
> Schneider <valentin.schneider@arm.com>; Peter Zijlstra (Intel)
> <peterz@infradead.org>; Daniel Bristot de Oliveira <bristot@redhat.com>;
> Guodong Xu <guodong.xu@linaro.org>; tangchengchang
> <tangchengchang@huawei.com>; Zengtao (B) <prime.zeng@hisilicon.com>;
> yangyicong <yangyicong@huawei.com>; tim.c.chen@linux.intel.com; Linuxarm
> <linuxarm@huawei.com>
> Subject: Re: [PATCH v7 4/4] lib: test_bitmap: add bitmap_print_to_buf test cases
> 
> On Fri, Jul 16, 2021 at 12:32:45AM +0300, Andy Shevchenko wrote:
> > On Thu, Jul 15, 2021 at 11:48 PM Yury Norov <yury.norov@gmail.com> wrote:
> > > On Thu, Jul 15, 2021 at 03:09:39PM +0300, Andy Shevchenko wrote:
> > > > On Thu, Jul 15, 2021 at 11:58:56PM +1200, Barry Song wrote:
> > > > > The added test items cover both cases where bitmap buf of the printed
> > > > > result is greater than and less than 4KB.
> > > > > And it also covers the case where offset for bitmap_print_to_buf is
> > > > > non-zero which will happen when printed buf is larger than one page
> > > > > in sysfs bin_attribute.
> > > >
> > > > More test cases is always a good thing, thanks!
> > >
> > > Generally yes. But in this case... I believe, Barry didn't write that
> > > huge line below by himself. Most probably he copy-pasted the output of
> > > his bitmap_print_buf() into the test. If so, this code tests nothing,
> > > and just enforces current behavior of snprintf.
> >
> > I'm not sure I got what you are telling me. The big line is to test
> > strings that are bigger than 4k.
> 
> I'm trying to say that human are not able to verify correctness of
> this line. The test is supposed to check bitmap_print_to_buf(), but
> reference output itself is generated by bitmap_print_to_buf(). This
> test will always pass by design, even if there's an error somewhere
> in the middle, isn't it?

So would it be better to compare the output of bitmap_print_to_buf()
with the output of the direct scnprintf("pbl")?  In this case, we have
to assume scnprintf("pbl") is always right.

> 
> >
> > ...
> >
> > > > > +static const char large_list[] __initconst = /* more than 4KB */
> > > > > +
> "0,4,8,12,16,20,24,28,32-33,36-37,40-41,44-45,48-49,52-53,56-57,60-61,64,6
> 8,72,76,80,84,88,92,96-97,100-101,104-1"
> > > > > +
> "05,108-109,112-113,116-117,120-121,124-125,128,132,136,140,144,148,152,15
> 6,160-161,164-165,168-169,172-173,176-1"
> > > > > +
> "77,180-181,184-185,188-189,192,196,200,204,208,212,216,220,224-225,228-22
> 9,232-233,236-237,240-241,244-245,248-2"
> > >
> > > I don't like this behavior of the code: each individual line is not a
> > > valid bitmap_list. I would prefer to split original bitmap and print
> > > list representation of parts in a compatible format; considering a
> > > receiving part of this splitting machinery.
> >
> > I agree that split is not the best here, but after all it's only 1
> > line and this is on purpose.
> 
> What I see is that bitmap_print_to_buf() is called many times, and
> each time it returns something that is not a valid bitmap list string.
> If the caller was be able to concatenate all the lines returned by
> bitmap_print_to_buf(), he'd probably get correct result. But in such
> case, why don't he use scnprintf("pbl") directly?

I still don't understand what you mean, for a sys ABI like list, its
main users are random "cat" commands:
$ cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list
0-1

Users "cat" the file to get a human-readable list. Users who are
"catting" sysfs have no bitmap to run scnprintf("pbl").

> 
> If there exists a real case where scnprintf("pbl") output is too long
> (or execution time is too slow) that we need to split, the right
> approach is to split the original bitmap, not an output of
> scnprintf("pbl").

The whole patchset is printing bitmap to a list or bitmask
so that users can "cat" the files to get a human-readable list.
But due to the limit of sysfs attribute, the list will be
trimmed if the printed result is more than 4KB.

> 
> And I still don't understand, what prevents the higher levels from
> calling the scnprintf() directly in this specific case?

Again, I don't understand what you mean. Higher levels like those drivers
can, for sure, call scnprintf() directly, then we have to copy this kind
of code here and there. With a cpumap API, those drivers can simply call
an API.

Anyway, it is really difficult for me to understand the approach
you are expecting. I'd really appreciate you can post some pseudo
code for the modification of drivers to make it clear. In my personal
opinion, the current approach from tian tao for patch1-3 is the best
choice from the perspective of the API's users - sysfs bin_attribute.

Thanks
Barry


^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH v7 1/4] cpumask: introduce cpumap_print_to_buf to support large bitmask and list
  2021-07-15 15:28   ` Yury Norov
  2021-07-15 21:08     ` Song Bao Hua (Barry Song)
@ 2021-07-16  0:57     ` Song Bao Hua (Barry Song)
  1 sibling, 0 replies; 35+ messages in thread
From: Song Bao Hua (Barry Song) @ 2021-07-16  0:57 UTC (permalink / raw)
  To: Yury Norov
  Cc: gregkh, akpm, andriy.shevchenko, linux-kernel, dave.hansen,
	linux, rafael, rdunlap, agordeev, sbrivio, jianpeng.ma,
	valentin.schneider, peterz, bristot, guodong.xu, tangchengchang,
	Zengtao (B), yangyicong, tim.c.chen, Linuxarm, tiantao (H)



> -----Original Message-----
> From: Song Bao Hua (Barry Song)
> Sent: Friday, July 16, 2021 9:08 AM
> To: 'Yury Norov' <yury.norov@gmail.com>
> Cc: gregkh@linuxfoundation.org; akpm@linux-foundation.org;
> andriy.shevchenko@linux.intel.com; linux-kernel@vger.kernel.org;
> dave.hansen@intel.com; linux@rasmusvillemoes.dk; rafael@kernel.org;
> rdunlap@infradead.org; agordeev@linux.ibm.com; sbrivio@redhat.com;
> jianpeng.ma@intel.com; valentin.schneider@arm.com; peterz@infradead.org;
> bristot@redhat.com; guodong.xu@linaro.org; tangchengchang
> <tangchengchang@huawei.com>; Zengtao (B) <prime.zeng@hisilicon.com>;
> yangyicong <yangyicong@huawei.com>; tim.c.chen@linux.intel.com; Linuxarm
> <linuxarm@huawei.com>; tiantao (H) <tiantao6@hisilicon.com>
> Subject: RE: [PATCH v7 1/4] cpumask: introduce cpumap_print_to_buf to support
> large bitmask and list
> 
> 
> 
> > -----Original Message-----
> > From: Yury Norov [mailto:yury.norov@gmail.com]
> > Sent: Friday, July 16, 2021 3:29 AM
> > To: Song Bao Hua (Barry Song) <song.bao.hua@hisilicon.com>
> > Cc: gregkh@linuxfoundation.org; akpm@linux-foundation.org;
> > andriy.shevchenko@linux.intel.com; linux-kernel@vger.kernel.org;
> > dave.hansen@intel.com; linux@rasmusvillemoes.dk; rafael@kernel.org;
> > rdunlap@infradead.org; agordeev@linux.ibm.com; sbrivio@redhat.com;
> > jianpeng.ma@intel.com; valentin.schneider@arm.com; peterz@infradead.org;
> > bristot@redhat.com; guodong.xu@linaro.org; tangchengchang
> > <tangchengchang@huawei.com>; Zengtao (B) <prime.zeng@hisilicon.com>;
> > yangyicong <yangyicong@huawei.com>; tim.c.chen@linux.intel.com; Linuxarm
> > <linuxarm@huawei.com>; tiantao (H) <tiantao6@hisilicon.com>
> > Subject: Re: [PATCH v7 1/4] cpumask: introduce cpumap_print_to_buf to support
> > large bitmask and list
> >
> > On Thu, Jul 15, 2021 at 11:58:53PM +1200, Barry Song wrote:
> > > (10.1.198.147)
> > > X-CFilter-Loop: Reflected
> > > Status: O
> > > Content-Length: 10263
> > > Lines: 252
> > >
> > > From: Tian Tao <tiantao6@hisilicon.com>
> >
> > [...]
> >
> > > +int bitmap_print_to_buf(bool list, char *buf, const unsigned long *maskp,
> > > +		int nmaskbits, loff_t off, size_t count)
> > > +{
> > > +	const char *fmt = list ? "%*pbl\n" : "%*pb\n";
> > > +	ssize_t size;
> > > +	void *data;
> > > +
> > > +	data = kasprintf(GFP_KERNEL, fmt, nmaskbits, maskp);
> > > +	if (!data)
> > > +		return -ENOMEM;
> > > +
> > > +	size = memory_read_from_buffer(buf, count, &off, data, strlen(data) +
> 1);
> > > +	kfree(data);
> > > +
> > > +	return size;
> > > +}
> > > +EXPORT_SYMBOL(bitmap_print_to_buf);
> >
> > In discussion to v4 of this series I've pointed out inefficiency of
> > this approach. Now it's v7, but the problem is still there.
> >
> > 1. You make user of your API guess aboout proper @count without any
> >    hint. This is worse than how it works now with pure vsnprintf().
> 
> This isn't true. While this count comes from sysfs bin_attribute,
> sysfs show() entry guarantee the count is proper and inside the
> valid range of the buffer. Otherwise, sysfs bin_attribute has totally
> broken for all users.

In addition, "You make user of your API guess about proper @count
without any hint" isn't true either. For a sysfs ABI and other
files, users will get EOF when it arrives the end of the file.
This is exactly the hint for users to stop further reading.

The new APIs are serving ABI purpose. hardly other kernel modules
will want a printed list.

Thanks
Barry


^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH v7 2/4] topology: use bin_attribute to break the size limitation of cpumap ABI
  2021-07-15 11:58 ` [PATCH v7 2/4] topology: use bin_attribute to break the size limitation of cpumap ABI Barry Song
@ 2021-07-16  8:49   ` Song Bao Hua (Barry Song)
  2021-07-16 20:04     ` Yury Norov
  0 siblings, 1 reply; 35+ messages in thread
From: Song Bao Hua (Barry Song) @ 2021-07-16  8:49 UTC (permalink / raw)
  To: gregkh, akpm, andriy.shevchenko, yury.norov, linux-kernel
  Cc: dave.hansen, linux, rafael, rdunlap, agordeev, sbrivio,
	jianpeng.ma, valentin.schneider, peterz, bristot, guodong.xu,
	tangchengchang, Zengtao (B),
	yangyicong, tim.c.chen, Linuxarm, tiantao (H)

Hi Yury,
Not sure if I have totally got your idea. But please see if the below
is closer to what you prefer.

I haven't really tested it. But this approach can somehow solve the
problem you mentioned(malloc/free and printing is done 1000times for
a 1MB buffer which is read 1K each time).

Bitmap provides some API to alloc and return print_buf:

+ssize_t bitmap_get_print_buf(bool list, char **buf, const unsigned long *maskp,
+               int nmaskbits)
+{
+       const char *fmt = list ? "%*pbl\n" : "%*pb\n";
+       ssize_t size;
+
+       size = snprintf(NULL, 0, fmt, nmaskbits, maskp);
+       *buf = kmalloc_track_caller(size + 1, GFP_KERNEL);
+       scnprintf(*buf, size + 1, fmt, nmaskbits, maskp);
+
+       return size + 1;
+}
+
+static inline ssize_t
+cpumap_get_print_buf(bool list, char **buf, const struct cpumask *mask)
+{
+       return bitmap_get_print_buf(list, buf, cpumask_bits(mask),
+                       nr_cpu_ids);
+}
+
+struct bitmap_print_buf
+{
+       char *buf;
+       ssize_t size;
+};
+

In bin_attribute, move to get and save the buffer while sysfs entry is
read at the first time and free it when file arrives EOF:

 #define define_id_show_func(name)                                      \
 static ssize_t name##_show(struct device *dev,                         \
                           struct device_attribute *attr, char *buf)    \
@@ -27,9 +53,27 @@ static ssize_t name##_read(struct file *file, struct kobject *kobj,          \
                                  loff_t off, size_t count)                     \
 {                                                                              \
        struct device *dev = kobj_to_dev(kobj);                                 \
-                                                                               \
-       return cpumap_print_to_buf(false, buf, topology_##mask(dev->id),        \
-                                  off, count);                                 \
+       struct bitmap_print_buf *bmb = dev_get_drvdata(dev);                    \
+       if (!bmb) {                                                             \
+               bmb = devm_kzalloc(dev, sizeof(*bmb), GFP_KERNEL);              \
+               if (!bmb)                                                       \
+                       return -ENOMEM;                                         \
+               dev_set_drvdata(dev, bmb);                                      \
+       }                                                                       \
+       /* for the 1st time, getting the printed buffer */                \
+       if (!bmb->buf)                                                           \
+               bmb->size = cpumap_get_print_buf(false, &bmb->buf,       \
+                                    topology_##mask(dev->id));                 \
+       /* when we arrive EOF, free the printed buffer */                       \
+       if (off >= bmb->size) {                                                 \
+               kfree(bmb->buf);  bmb->buf = NULL;                                   \
+               return 0;                                                       \
+       }                                                                       \
+       /* while a large printed buffer is read many times, we reuse         \
+        * the buffer we get at the 1st time                                    \
+        */                                                                     \
+       strncpy(buf, bmb->buf + off, count);                                    \
+       return min(count,  bmb->size - off);                                    \
 }                                                                              \
                                                                                \
This means a huge change in drivers though I am not sure Greg is
a fan of this approach. Anyway, "1000 times" is not a real case.
Typically we will arrive EOF after one time.

Thanks
Barry

> -----Original Message-----
> From: Song Bao Hua (Barry Song)
> Sent: Thursday, July 15, 2021 11:59 PM
> To: gregkh@linuxfoundation.org; akpm@linux-foundation.org;
> andriy.shevchenko@linux.intel.com; yury.norov@gmail.com;
> linux-kernel@vger.kernel.org
> Cc: dave.hansen@intel.com; linux@rasmusvillemoes.dk; rafael@kernel.org;
> rdunlap@infradead.org; agordeev@linux.ibm.com; sbrivio@redhat.com;
> jianpeng.ma@intel.com; valentin.schneider@arm.com; peterz@infradead.org;
> bristot@redhat.com; guodong.xu@linaro.org; tangchengchang
> <tangchengchang@huawei.com>; Zengtao (B) <prime.zeng@hisilicon.com>;
> yangyicong <yangyicong@huawei.com>; tim.c.chen@linux.intel.com; Linuxarm
> <linuxarm@huawei.com>; tiantao (H) <tiantao6@hisilicon.com>; Song Bao Hua
> (Barry Song) <song.bao.hua@hisilicon.com>
> Subject: [PATCH v7 2/4] topology: use bin_attribute to break the size limitation
> of cpumap ABI
> 
> From: Tian Tao <tiantao6@hisilicon.com>
> 
> Reading /sys/devices/system/cpu/cpuX/topology/ returns cpu topology.
> However, the size of this file is limited to PAGE_SIZE because of
> the limitation for sysfs attribute.
> This patch moves to use bin_attribute to extend the ABI to be more
> than one page so that cpumap bitmask and list won't be potentially
> trimmed.
> 
> Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
> ---
>  drivers/base/topology.c | 115 ++++++++++++++++++++++------------------
>  1 file changed, 63 insertions(+), 52 deletions(-)
> 
> diff --git a/drivers/base/topology.c b/drivers/base/topology.c
> index 4d254fcc93d1..dd3980124e33 100644
> --- a/drivers/base/topology.c
> +++ b/drivers/base/topology.c
> @@ -21,25 +21,27 @@ static ssize_t name##_show(struct device *dev,
> 	\
>  	return sysfs_emit(buf, "%d\n", topology_##name(dev->id));	\
>  }
> 
> -#define define_siblings_show_map(name, mask)				\
> -static ssize_t name##_show(struct device *dev,				\
> -			   struct device_attribute *attr, char *buf)	\
> -{									\
> -	return cpumap_print_to_pagebuf(false, buf, topology_##mask(dev->id));\
> +#define define_siblings_read_func(name, mask)					\
> +static ssize_t name##_read(struct file *file, struct kobject *kobj,		\
> +				  struct bin_attribute *attr, char *buf,	\
> +				  loff_t off, size_t count)			\
> +{										\
> +	struct device *dev = kobj_to_dev(kobj);                                 \
> +										\
> +	return cpumap_print_to_buf(false, buf, topology_##mask(dev->id),	\
> +				   off, count);                                 \
> +}										\
> +										\
> +static ssize_t name##_list_read(struct file *file, struct kobject *kobj,	\
> +				  struct bin_attribute *attr, char *buf,	\
> +				  loff_t off, size_t count)			\
> +{										\
> +	struct device *dev = kobj_to_dev(kobj);					\
> +										\
> +	return cpumap_print_to_buf(true, buf, topology_##mask(dev->id),		\
> +				   off, count);					\
>  }
> 
> -#define define_siblings_show_list(name, mask)				\
> -static ssize_t name##_list_show(struct device *dev,			\
> -				struct device_attribute *attr,		\
> -				char *buf)				\
> -{									\
> -	return cpumap_print_to_pagebuf(true, buf, topology_##mask(dev->id));\
> -}
> -
> -#define define_siblings_show_func(name, mask)	\
> -	define_siblings_show_map(name, mask);	\
> -	define_siblings_show_list(name, mask)
> -
>  define_id_show_func(physical_package_id);
>  static DEVICE_ATTR_RO(physical_package_id);
> 
> @@ -49,71 +51,80 @@ static DEVICE_ATTR_RO(die_id);
>  define_id_show_func(core_id);
>  static DEVICE_ATTR_RO(core_id);
> 
> -define_siblings_show_func(thread_siblings, sibling_cpumask);
> -static DEVICE_ATTR_RO(thread_siblings);
> -static DEVICE_ATTR_RO(thread_siblings_list);
> +define_siblings_read_func(thread_siblings, sibling_cpumask);
> +static BIN_ATTR_RO(thread_siblings, 0);
> +static BIN_ATTR_RO(thread_siblings_list, 0);
> 
> -define_siblings_show_func(core_cpus, sibling_cpumask);
> -static DEVICE_ATTR_RO(core_cpus);
> -static DEVICE_ATTR_RO(core_cpus_list);
> +define_siblings_read_func(core_cpus, sibling_cpumask);
> +static BIN_ATTR_RO(core_cpus, 0);
> +static BIN_ATTR_RO(core_cpus_list, 0);
> 
> -define_siblings_show_func(core_siblings, core_cpumask);
> -static DEVICE_ATTR_RO(core_siblings);
> -static DEVICE_ATTR_RO(core_siblings_list);
> +define_siblings_read_func(core_siblings, core_cpumask);
> +static BIN_ATTR_RO(core_siblings, 0);
> +static BIN_ATTR_RO(core_siblings_list, 0);
> 
> -define_siblings_show_func(die_cpus, die_cpumask);
> -static DEVICE_ATTR_RO(die_cpus);
> -static DEVICE_ATTR_RO(die_cpus_list);
> +define_siblings_read_func(die_cpus, die_cpumask);
> +static BIN_ATTR_RO(die_cpus, 0);
> +static BIN_ATTR_RO(die_cpus_list, 0);
> 
> -define_siblings_show_func(package_cpus, core_cpumask);
> -static DEVICE_ATTR_RO(package_cpus);
> -static DEVICE_ATTR_RO(package_cpus_list);
> +define_siblings_read_func(package_cpus, core_cpumask);
> +static BIN_ATTR_RO(package_cpus, 0);
> +static BIN_ATTR_RO(package_cpus_list, 0);
> 
>  #ifdef CONFIG_SCHED_BOOK
>  define_id_show_func(book_id);
>  static DEVICE_ATTR_RO(book_id);
> -define_siblings_show_func(book_siblings, book_cpumask);
> -static DEVICE_ATTR_RO(book_siblings);
> -static DEVICE_ATTR_RO(book_siblings_list);
> +define_siblings_read_func(book_siblings, book_cpumask);
> +static BIN_ATTR_RO(book_siblings, 0);
> +static BIN_ATTR_RO(book_siblings_list, 0);
>  #endif
> 
>  #ifdef CONFIG_SCHED_DRAWER
>  define_id_show_func(drawer_id);
>  static DEVICE_ATTR_RO(drawer_id);
> -define_siblings_show_func(drawer_siblings, drawer_cpumask);
> -static DEVICE_ATTR_RO(drawer_siblings);
> -static DEVICE_ATTR_RO(drawer_siblings_list);
> +define_siblings_read_func(drawer_siblings, drawer_cpumask);
> +static BIN_ATTR_RO(drawer_siblings, 0);
> +static BIN_ATTR_RO(drawer_siblings_list, 0);
>  #endif
> 
> +static struct bin_attribute *bin_attrs[] = {
> +	&bin_attr_core_cpus,
> +	&bin_attr_core_cpus_list,
> +	&bin_attr_thread_siblings,
> +	&bin_attr_thread_siblings_list,
> +	&bin_attr_core_siblings,
> +	&bin_attr_core_siblings_list,
> +	&bin_attr_die_cpus,
> +	&bin_attr_die_cpus_list,
> +	&bin_attr_package_cpus,
> +	&bin_attr_package_cpus_list,
> +#ifdef CONFIG_SCHED_BOOK
> +	&bin_attr_book_siblings,
> +	&bin_attr_book_siblings_list,
> +#endif
> +#ifdef CONFIG_SCHED_DRAWER
> +	&bin_attr_drawer_siblings,
> +	&bin_attr_drawer_siblings_list,
> +#endif
> +	NULL
> +};
> +
>  static struct attribute *default_attrs[] = {
>  	&dev_attr_physical_package_id.attr,
>  	&dev_attr_die_id.attr,
>  	&dev_attr_core_id.attr,
> -	&dev_attr_thread_siblings.attr,
> -	&dev_attr_thread_siblings_list.attr,
> -	&dev_attr_core_cpus.attr,
> -	&dev_attr_core_cpus_list.attr,
> -	&dev_attr_core_siblings.attr,
> -	&dev_attr_core_siblings_list.attr,
> -	&dev_attr_die_cpus.attr,
> -	&dev_attr_die_cpus_list.attr,
> -	&dev_attr_package_cpus.attr,
> -	&dev_attr_package_cpus_list.attr,
>  #ifdef CONFIG_SCHED_BOOK
>  	&dev_attr_book_id.attr,
> -	&dev_attr_book_siblings.attr,
> -	&dev_attr_book_siblings_list.attr,
>  #endif
>  #ifdef CONFIG_SCHED_DRAWER
>  	&dev_attr_drawer_id.attr,
> -	&dev_attr_drawer_siblings.attr,
> -	&dev_attr_drawer_siblings_list.attr,
>  #endif
>  	NULL
>  };
> 
>  static const struct attribute_group topology_attr_group = {
>  	.attrs = default_attrs,
> +	.bin_attrs = bin_attrs,
>  	.name = "topology"
>  };
> 
> --
> 2.25.1


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v7 2/4] topology: use bin_attribute to break the size limitation of cpumap ABI
  2021-07-16  8:49   ` Song Bao Hua (Barry Song)
@ 2021-07-16 20:04     ` Yury Norov
  2021-07-17  0:16       ` Song Bao Hua (Barry Song)
  0 siblings, 1 reply; 35+ messages in thread
From: Yury Norov @ 2021-07-16 20:04 UTC (permalink / raw)
  To: Song Bao Hua (Barry Song)
  Cc: gregkh, akpm, andriy.shevchenko, linux-kernel, dave.hansen,
	linux, rafael, rdunlap, agordeev, sbrivio, jianpeng.ma,
	valentin.schneider, peterz, bristot, guodong.xu, tangchengchang,
	Zengtao (B), yangyicong, tim.c.chen, Linuxarm, tiantao (H)

On Fri, Jul 16, 2021 at 08:49:58AM +0000, Song Bao Hua (Barry Song) wrote:
> Hi Yury,
> Not sure if I have totally got your idea. But please see if the below
> is closer to what you prefer.
> 
> I haven't really tested it. But this approach can somehow solve the
> problem you mentioned(malloc/free and printing is done 1000times for
> a 1MB buffer which is read 1K each time).
> 
> Bitmap provides some API to alloc and return print_buf:
> 
> +ssize_t bitmap_get_print_buf(bool list, char **buf, const unsigned long *maskp,
> +               int nmaskbits)
> +{
> +       const char *fmt = list ? "%*pbl\n" : "%*pb\n";
> +       ssize_t size;
> +
> +       size = snprintf(NULL, 0, fmt, nmaskbits, maskp);
> +       *buf = kmalloc_track_caller(size + 1, GFP_KERNEL);
> +       scnprintf(*buf, size + 1, fmt, nmaskbits, maskp);
> +
> +       return size + 1;
> +}
> +
> +static inline ssize_t
> +cpumap_get_print_buf(bool list, char **buf, const struct cpumask *mask)
> +{
> +       return bitmap_get_print_buf(list, buf, cpumask_bits(mask),
> +                       nr_cpu_ids);
> +}
> +
> +struct bitmap_print_buf
> +{
> +       char *buf;
> +       ssize_t size;
> +};
> +
> 
> In bin_attribute, move to get and save the buffer while sysfs entry is
> read at the first time and free it when file arrives EOF:
> 
>  #define define_id_show_func(name)                                      \
>  static ssize_t name##_show(struct device *dev,                         \
>                            struct device_attribute *attr, char *buf)    \
> @@ -27,9 +53,27 @@ static ssize_t name##_read(struct file *file, struct kobject *kobj,          \
>                                   loff_t off, size_t count)                     \
>  {                                                                              \
>         struct device *dev = kobj_to_dev(kobj);                                 \
> -                                                                               \
> -       return cpumap_print_to_buf(false, buf, topology_##mask(dev->id),        \
> -                                  off, count);                                 \
> +       struct bitmap_print_buf *bmb = dev_get_drvdata(dev);                    \
> +       if (!bmb) {                                                             \
> +               bmb = devm_kzalloc(dev, sizeof(*bmb), GFP_KERNEL);              \
> +               if (!bmb)                                                       \
> +                       return -ENOMEM;                                         \
> +               dev_set_drvdata(dev, bmb);                                      \
> +       }                                                                       \
> +       /* for the 1st time, getting the printed buffer */                \
> +       if (!bmb->buf)                                                           \
> +               bmb->size = cpumap_get_print_buf(false, &bmb->buf,       \
> +                                    topology_##mask(dev->id));                 \
> +       /* when we arrive EOF, free the printed buffer */                       \
> +       if (off >= bmb->size) {                                                 \
> +               kfree(bmb->buf);  bmb->buf = NULL;                                   \
> +               return 0;                                                       \
> +       }                                                                       \
> +       /* while a large printed buffer is read many times, we reuse         \
> +        * the buffer we get at the 1st time                                    \
> +        */                                                                     \
> +       strncpy(buf, bmb->buf + off, count);                                    \
> +       return min(count,  bmb->size - off);                                    \
>  }                                                                              \
>                                                                                 \
> This means a huge change in drivers though I am not sure Greg is
> a fan of this approach. Anyway, "1000 times" is not a real case.
> Typically we will arrive EOF after one time.

Not a real case in your driver doesn't mean not a real case at all.
You're adding your code to the very basic core layer, and so you'd
consider all possible scenarios, not only your particular one. 

On the other hand, if you add your function(s) at drivers/base/node.c
and explain that O(nbits**2) 'is not a real case' in _this_ driver -
I think it will be acceptable. Maybe this is your choice...
 
> Thanks
> Barry

> Not sure if I have totally got your idea. But please see if the below
> is closer to what you prefer.
>
> I haven't really tested it. But this approach can somehow solve the
> problem you mentioned(malloc/free and printing is done 1000times for
> a 1MB buffer which is read 1K each time).
> 
> Bitmap provides some API to alloc and return print_buf:

I'm not too familar to the topology things, and in fact never exposed
anything thru the sysfs.

From general knowledge, it's better to allocate memory for the string
on file creation to avoid situation when kernel has no memory to allocate
it when user tries to read.

So from my perspective, the most straightforward solution would be:

register(cpumask)
{
        size_t max_size = cpumap_max_string_size(list, cpumask)
        void *buf = kmalloc(max_size, ...);

        sysfs_create_file(..., buf)
}

unregister()
{
        kfree(buf);
}

show()
{
        snprintf(buf, max_size, "*pbl", cpumask);
}

This would require to add bitmap_max_string_size(list, bitmap, nbits),
but it's O(1), and I think, others will find it helpful.

Again, I'm not professional with sysfs, and fully admit that it might
be wrong.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH v7 2/4] topology: use bin_attribute to break the size limitation of cpumap ABI
  2021-07-16 20:04     ` Yury Norov
@ 2021-07-17  0:16       ` Song Bao Hua (Barry Song)
  2021-07-17  1:12         ` Yury Norov
  0 siblings, 1 reply; 35+ messages in thread
From: Song Bao Hua (Barry Song) @ 2021-07-17  0:16 UTC (permalink / raw)
  To: Yury Norov
  Cc: gregkh, akpm, andriy.shevchenko, linux-kernel, dave.hansen,
	linux, rafael, rdunlap, agordeev, sbrivio, jianpeng.ma,
	valentin.schneider, peterz, bristot, guodong.xu, tangchengchang,
	Zengtao (B), yangyicong, tim.c.chen, Linuxarm, tiantao (H)



> -----Original Message-----
> From: Yury Norov [mailto:yury.norov@gmail.com]
> Sent: Saturday, July 17, 2021 8:04 AM
> To: Song Bao Hua (Barry Song) <song.bao.hua@hisilicon.com>
> Cc: gregkh@linuxfoundation.org; akpm@linux-foundation.org;
> andriy.shevchenko@linux.intel.com; linux-kernel@vger.kernel.org;
> dave.hansen@intel.com; linux@rasmusvillemoes.dk; rafael@kernel.org;
> rdunlap@infradead.org; agordeev@linux.ibm.com; sbrivio@redhat.com;
> jianpeng.ma@intel.com; valentin.schneider@arm.com; peterz@infradead.org;
> bristot@redhat.com; guodong.xu@linaro.org; tangchengchang
> <tangchengchang@huawei.com>; Zengtao (B) <prime.zeng@hisilicon.com>;
> yangyicong <yangyicong@huawei.com>; tim.c.chen@linux.intel.com; Linuxarm
> <linuxarm@huawei.com>; tiantao (H) <tiantao6@hisilicon.com>
> Subject: Re: [PATCH v7 2/4] topology: use bin_attribute to break the size
> limitation of cpumap ABI
> 
> On Fri, Jul 16, 2021 at 08:49:58AM +0000, Song Bao Hua (Barry Song) wrote:
> > Hi Yury,
> > Not sure if I have totally got your idea. But please see if the below
> > is closer to what you prefer.
> >
> > I haven't really tested it. But this approach can somehow solve the
> > problem you mentioned(malloc/free and printing is done 1000times for
> > a 1MB buffer which is read 1K each time).
> >
> > Bitmap provides some API to alloc and return print_buf:
> >
> > +ssize_t bitmap_get_print_buf(bool list, char **buf, const unsigned long
> *maskp,
> > +               int nmaskbits)
> > +{
> > +       const char *fmt = list ? "%*pbl\n" : "%*pb\n";
> > +       ssize_t size;
> > +
> > +       size = snprintf(NULL, 0, fmt, nmaskbits, maskp);
> > +       *buf = kmalloc_track_caller(size + 1, GFP_KERNEL);
> > +       scnprintf(*buf, size + 1, fmt, nmaskbits, maskp);
> > +
> > +       return size + 1;
> > +}
> > +
> > +static inline ssize_t
> > +cpumap_get_print_buf(bool list, char **buf, const struct cpumask *mask)
> > +{
> > +       return bitmap_get_print_buf(list, buf, cpumask_bits(mask),
> > +                       nr_cpu_ids);
> > +}
> > +
> > +struct bitmap_print_buf
> > +{
> > +       char *buf;
> > +       ssize_t size;
> > +};
> > +
> >
> > In bin_attribute, move to get and save the buffer while sysfs entry is
> > read at the first time and free it when file arrives EOF:
> >
> >  #define define_id_show_func(name)                                      \
> >  static ssize_t name##_show(struct device *dev,                         \
> >                            struct device_attribute *attr, char *buf)    \
> > @@ -27,9 +53,27 @@ static ssize_t name##_read(struct file *file, struct kobject
> *kobj,          \
> >                                   loff_t off, size_t count)                     \
> >  {                                                                              \
> >         struct device *dev = kobj_to_dev(kobj);                                 \
> > -                                                                               \
> > -       return cpumap_print_to_buf(false, buf, topology_##mask(dev->id),
> \
> > -                                  off, count);                                 \
> > +       struct bitmap_print_buf *bmb = dev_get_drvdata(dev);                    \
> > +       if (!bmb) {                                                             \
> > +               bmb = devm_kzalloc(dev, sizeof(*bmb), GFP_KERNEL);              \
> > +               if (!bmb)                                                       \
> > +                       return -ENOMEM;                                         \
> > +               dev_set_drvdata(dev, bmb);                                      \
> > +       }                                                                       \
> > +       /* for the 1st time, getting the printed buffer */                \
> > +       if (!bmb->buf)                                                           \
> > +               bmb->size = cpumap_get_print_buf(false, &bmb->buf,       \
> > +                                    topology_##mask(dev->id));                 \
> > +       /* when we arrive EOF, free the printed buffer */                       \
> > +       if (off >= bmb->size) {                                                 \
> > +               kfree(bmb->buf);  bmb->buf = NULL;
> \
> > +               return 0;                                                       \
> > +       }                                                                       \
> > +       /* while a large printed buffer is read many times, we reuse         \
> > +        * the buffer we get at the 1st time                                    \
> > +        */                                                                     \
> > +       strncpy(buf, bmb->buf + off, count);                                    \
> > +       return min(count,  bmb->size - off);                                    \
> >  }                                                                              \
> >                                                                                 \
> > This means a huge change in drivers though I am not sure Greg is
> > a fan of this approach. Anyway, "1000 times" is not a real case.
> > Typically we will arrive EOF after one time.
> 
> Not a real case in your driver doesn't mean not a real case at all.
> You're adding your code to the very basic core layer, and so you'd
> consider all possible scenarios, not only your particular one.
> 

Generally yes. And generally I agree with your point. That point is
exactly what I have always adhered to in software design. But my point
I have been arguing is that the new APIs are for sysfs ABI only. So it
is not particular one, it is common.

In my understanding, only ABI things to users will need to print bitmap
to mask or list. Rarely a kernel module will need it. Kernel modules
would be running real and/or/andnot bit operations on binary bitmap
directly.

Anyway, I'm glad to take a step in the way you prefer.

> On the other hand, if you add your function(s) at drivers/base/node.c
> and explain that O(nbits**2) 'is not a real case' in _this_ driver -
> I think it will be acceptable. Maybe this is your choice...
> 
> > Thanks
> > Barry
> 
> > Not sure if I have totally got your idea. But please see if the below
> > is closer to what you prefer.
> >
> > I haven't really tested it. But this approach can somehow solve the
> > problem you mentioned(malloc/free and printing is done 1000times for
> > a 1MB buffer which is read 1K each time).
> >
> > Bitmap provides some API to alloc and return print_buf:
> 
> I'm not too familar to the topology things, and in fact never exposed
> anything thru the sysfs.
> 
> From general knowledge, it's better to allocate memory for the string
> on file creation to avoid situation when kernel has no memory to allocate
> it when user tries to read.
> 
> So from my perspective, the most straightforward solution would be:
> 
> register(cpumask)
> {
>         size_t max_size = cpumap_max_string_size(list, cpumask)
>         void *buf = kmalloc(max_size, ...);
> 
>         sysfs_create_file(..., buf)
> }
> 
> unregister()
> {
>         kfree(buf);
> }
> 
> show()
> {
>         snprintf(buf, max_size, "*pbl", cpumask);
> }
> 

Generally good idea. However, for sysfs ABI entries, it might not be
that true.

A sysfs entry might never be read for its whole life. As I explained
before, a sysfs entry - especially for list,  is randomly "cat" by users.
Many of them won't be read forever. And after they are read once, they
will probably never be read again. The operations to read ABI could be
random and rare. Performance wouldn't be a concern.

To avoid holding the memory which might never be used, it is better to
allocate and free the memory during runtime. I mean to allocate in show()
and free in show(), aka, to do it on demand.

For example, for a server with 256CPU and each cpu has dozens of sysfs ABI
entries, only a few of sysfs list entries might be randomly "cat" by users.
Holding 256*entries memory doesn't look good.

> This would require to add bitmap_max_string_size(list, bitmap, nbits),
> but it's O(1), and I think, others will find it helpful.

What about getting size and memory at the same time?

ssize_t bitmap_get_print_buf(bool list, char **buf, const unsigned long
 *maskp, int nmaskbits)

ssize_t cpumap_get_print_buf(bool list, char **buf, const struct cpumask *mask);

This API returns the size of printed buffer, and it also gets the
printed result saved in *buf. Then drivers don't need to do three
steps:

1. get cpumap buffer size which is your cpumap_max_string_size()
2. allocate memory for buffer according to size got in step 1
3. print bitmap(cpumap) to buffer by "pbl"

It will only need to call bitmap_get_print_buf() and all three
things are done inside bitmap_get_print_buf().

How to use the size and memory allocated in cpumap_get_print_buf
will be totally up to users.

The other benefit for this is that if we get string size during initialization,
and then we print in show() entries, the size got at the beginning might be not
enough as system topology might have changed. Sysfs ABI reflects the status of
system at this moment.

> 
> Again, I'm not professional with sysfs, and fully admit that it might
> be wrong.

Never mind.

Thanks
Barry


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v7 2/4] topology: use bin_attribute to break the size limitation of cpumap ABI
  2021-07-17  0:16       ` Song Bao Hua (Barry Song)
@ 2021-07-17  1:12         ` Yury Norov
  2021-07-19  9:07           ` andriy.shevchenko
  0 siblings, 1 reply; 35+ messages in thread
From: Yury Norov @ 2021-07-17  1:12 UTC (permalink / raw)
  To: Song Bao Hua (Barry Song)
  Cc: gregkh, akpm, andriy.shevchenko, linux-kernel, dave.hansen,
	linux, rafael, rdunlap, agordeev, sbrivio, jianpeng.ma,
	valentin.schneider, peterz, bristot, guodong.xu, tangchengchang,
	Zengtao (B), yangyicong, tim.c.chen, Linuxarm, tiantao (H)

On Sat, Jul 17, 2021 at 12:16:48AM +0000, Song Bao Hua (Barry Song) wrote:
> 
> 
> > -----Original Message-----
> > From: Yury Norov [mailto:yury.norov@gmail.com]
> > Sent: Saturday, July 17, 2021 8:04 AM
> > To: Song Bao Hua (Barry Song) <song.bao.hua@hisilicon.com>
> > Cc: gregkh@linuxfoundation.org; akpm@linux-foundation.org;
> > andriy.shevchenko@linux.intel.com; linux-kernel@vger.kernel.org;
> > dave.hansen@intel.com; linux@rasmusvillemoes.dk; rafael@kernel.org;
> > rdunlap@infradead.org; agordeev@linux.ibm.com; sbrivio@redhat.com;
> > jianpeng.ma@intel.com; valentin.schneider@arm.com; peterz@infradead.org;
> > bristot@redhat.com; guodong.xu@linaro.org; tangchengchang
> > <tangchengchang@huawei.com>; Zengtao (B) <prime.zeng@hisilicon.com>;
> > yangyicong <yangyicong@huawei.com>; tim.c.chen@linux.intel.com; Linuxarm
> > <linuxarm@huawei.com>; tiantao (H) <tiantao6@hisilicon.com>
> > Subject: Re: [PATCH v7 2/4] topology: use bin_attribute to break the size
> > limitation of cpumap ABI
> > 
> > On Fri, Jul 16, 2021 at 08:49:58AM +0000, Song Bao Hua (Barry Song) wrote:
> > > Hi Yury,
> > > Not sure if I have totally got your idea. But please see if the below
> > > is closer to what you prefer.
> > >
> > > I haven't really tested it. But this approach can somehow solve the
> > > problem you mentioned(malloc/free and printing is done 1000times for
> > > a 1MB buffer which is read 1K each time).
> > >
> > > Bitmap provides some API to alloc and return print_buf:
> > >
> > > +ssize_t bitmap_get_print_buf(bool list, char **buf, const unsigned long
> > *maskp,
> > > +               int nmaskbits)
> > > +{
> > > +       const char *fmt = list ? "%*pbl\n" : "%*pb\n";
> > > +       ssize_t size;
> > > +
> > > +       size = snprintf(NULL, 0, fmt, nmaskbits, maskp);
> > > +       *buf = kmalloc_track_caller(size + 1, GFP_KERNEL);
> > > +       scnprintf(*buf, size + 1, fmt, nmaskbits, maskp);
> > > +
> > > +       return size + 1;
> > > +}
> > > +
> > > +static inline ssize_t
> > > +cpumap_get_print_buf(bool list, char **buf, const struct cpumask *mask)
> > > +{
> > > +       return bitmap_get_print_buf(list, buf, cpumask_bits(mask),
> > > +                       nr_cpu_ids);
> > > +}
> > > +
> > > +struct bitmap_print_buf
> > > +{
> > > +       char *buf;
> > > +       ssize_t size;
> > > +};
> > > +
> > >
> > > In bin_attribute, move to get and save the buffer while sysfs entry is
> > > read at the first time and free it when file arrives EOF:
> > >
> > >  #define define_id_show_func(name)                                      \
> > >  static ssize_t name##_show(struct device *dev,                         \
> > >                            struct device_attribute *attr, char *buf)    \
> > > @@ -27,9 +53,27 @@ static ssize_t name##_read(struct file *file, struct kobject
> > *kobj,          \
> > >                                   loff_t off, size_t count)                     \
> > >  {                                                                              \
> > >         struct device *dev = kobj_to_dev(kobj);                                 \
> > > -                                                                               \
> > > -       return cpumap_print_to_buf(false, buf, topology_##mask(dev->id),
> > \
> > > -                                  off, count);                                 \
> > > +       struct bitmap_print_buf *bmb = dev_get_drvdata(dev);                    \
> > > +       if (!bmb) {                                                             \
> > > +               bmb = devm_kzalloc(dev, sizeof(*bmb), GFP_KERNEL);              \
> > > +               if (!bmb)                                                       \
> > > +                       return -ENOMEM;                                         \
> > > +               dev_set_drvdata(dev, bmb);                                      \
> > > +       }                                                                       \
> > > +       /* for the 1st time, getting the printed buffer */                \
> > > +       if (!bmb->buf)                                                           \
> > > +               bmb->size = cpumap_get_print_buf(false, &bmb->buf,       \
> > > +                                    topology_##mask(dev->id));                 \
> > > +       /* when we arrive EOF, free the printed buffer */                       \
> > > +       if (off >= bmb->size) {                                                 \
> > > +               kfree(bmb->buf);  bmb->buf = NULL;
> > \
> > > +               return 0;                                                       \
> > > +       }                                                                       \
> > > +       /* while a large printed buffer is read many times, we reuse         \
> > > +        * the buffer we get at the 1st time                                    \
> > > +        */                                                                     \
> > > +       strncpy(buf, bmb->buf + off, count);                                    \
> > > +       return min(count,  bmb->size - off);                                    \
> > >  }                                                                              \
> > >                                                                                 \
> > > This means a huge change in drivers though I am not sure Greg is
> > > a fan of this approach. Anyway, "1000 times" is not a real case.
> > > Typically we will arrive EOF after one time.
> > 
> > Not a real case in your driver doesn't mean not a real case at all.
> > You're adding your code to the very basic core layer, and so you'd
> > consider all possible scenarios, not only your particular one.
> > 
> 
> Generally yes. And generally I agree with your point. That point is
> exactly what I have always adhered to in software design. But my point
> I have been arguing is that the new APIs are for sysfs ABI only. So it
> is not particular one, it is common.
> 
> In my understanding, only ABI things to users will need to print bitmap
> to mask or list. Rarely a kernel module will need it. Kernel modules
> would be running real and/or/andnot bit operations on binary bitmap
> directly.
> 
> Anyway, I'm glad to take a step in the way you prefer.
> 
> > On the other hand, if you add your function(s) at drivers/base/node.c
> > and explain that O(nbits**2) 'is not a real case' in _this_ driver -
> > I think it will be acceptable. Maybe this is your choice...
> > 
> > > Thanks
> > > Barry
> > 
> > > Not sure if I have totally got your idea. But please see if the below
> > > is closer to what you prefer.
> > >
> > > I haven't really tested it. But this approach can somehow solve the
> > > problem you mentioned(malloc/free and printing is done 1000times for
> > > a 1MB buffer which is read 1K each time).
> > >
> > > Bitmap provides some API to alloc and return print_buf:
> > 
> > I'm not too familar to the topology things, and in fact never exposed
> > anything thru the sysfs.
> > 
> > From general knowledge, it's better to allocate memory for the string
> > on file creation to avoid situation when kernel has no memory to allocate
> > it when user tries to read.
> > 
> > So from my perspective, the most straightforward solution would be:
> > 
> > register(cpumask)
> > {
> >         size_t max_size = cpumap_max_string_size(list, cpumask)
> >         void *buf = kmalloc(max_size, ...);
> > 
> >         sysfs_create_file(..., buf)
> > }
> > 
> > unregister()
> > {
> >         kfree(buf);
> > }
> > 
> > show()
> > {
> >         snprintf(buf, max_size, "*pbl", cpumask);
> > }
> > 
> 
> Generally good idea. However, for sysfs ABI entries, it might not be
> that true.
> 
> A sysfs entry might never be read for its whole life. As I explained
> before, a sysfs entry - especially for list,  is randomly "cat" by users.
> Many of them won't be read forever. And after they are read once, they
> will probably never be read again. The operations to read ABI could be
> random and rare. Performance wouldn't be a concern.
> 
> To avoid holding the memory which might never be used, it is better to
> allocate and free the memory during runtime. I mean to allocate in show()
> and free in show(), aka, to do it on demand.
> 
> For example, for a server with 256CPU and each cpu has dozens of sysfs ABI
> entries, only a few of sysfs list entries might be randomly "cat" by users.
> Holding 256*entries memory doesn't look good.

Ok, makes sense.
 
> > This would require to add bitmap_max_string_size(list, bitmap, nbits),
> > but it's O(1), and I think, others will find it helpful.
> 
> What about getting size and memory at the same time?

1. We already have kasprintf()
2. It breaks coding style.

Documentation/process/coding-style.rst:
        Functions should be short and sweet, and do just one thing. 

From practical point of view, there should be some balance between
granularity and ease-of-use. But in this case, bitmap_list cries for 
a function that will help to estimate size of output buffer. And it's
easy to imagine a case where the estimated length of bitmap is needed
explicitly:

        size_t max_size = bitmap_max_string_size(nbits);
        char *buf = kmalloc(PAGE_ALIGN(max_size) * nr_cpus);

Thought, I don't insist. In your driver you can do:

        size_t size = snprintf(NULL, 0, ...);
        void *buf = kmalloc(size);

It will be fully correct, and you already have everything you need.
 
> ssize_t bitmap_get_print_buf(bool list, char **buf, const unsigned long
>  *maskp, int nmaskbits)
> 
> ssize_t cpumap_get_print_buf(bool list, char **buf, const struct cpumask *mask);
>
> This API returns the size of printed buffer, and it also gets the
> printed result saved in *buf. Then drivers don't need to do three
> steps:
> 
> 1. get cpumap buffer size which is your cpumap_max_string_size()
> 2. allocate memory for buffer according to size got in step 1
> 3. print bitmap(cpumap) to buffer by "pbl"
> 
> It will only need to call bitmap_get_print_buf() and all three
> things are done inside bitmap_get_print_buf().
> 
> How to use the size and memory allocated in cpumap_get_print_buf
> will be totally up to users.
> 
> The other benefit for this is that if we get string size during initialization,
> and then we print in show() entries, the size got at the beginning might be not
> enough as system topology might have changed. Sysfs ABI reflects the status of
> system at this moment.
> 
> > 
> > Again, I'm not professional with sysfs, and fully admit that it might
> > be wrong.
> 
> Never mind.
> 
> Thanks
> Barry

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v7 2/4] topology: use bin_attribute to break the size limitation of cpumap ABI
  2021-07-17  1:12         ` Yury Norov
@ 2021-07-19  9:07           ` andriy.shevchenko
  2021-07-19 11:10             ` Song Bao Hua (Barry Song)
  0 siblings, 1 reply; 35+ messages in thread
From: andriy.shevchenko @ 2021-07-19  9:07 UTC (permalink / raw)
  To: Yury Norov
  Cc: Song Bao Hua (Barry Song),
	gregkh, akpm, linux-kernel, dave.hansen, linux, rafael, rdunlap,
	agordeev, sbrivio, jianpeng.ma, valentin.schneider, peterz,
	bristot, guodong.xu, tangchengchang, Zengtao (B),
	yangyicong, tim.c.chen, Linuxarm, tiantao (H)

On Fri, Jul 16, 2021 at 06:12:21PM -0700, Yury Norov wrote:
> On Sat, Jul 17, 2021 at 12:16:48AM +0000, Song Bao Hua (Barry Song) wrote:
> > > From: Yury Norov [mailto:yury.norov@gmail.com]
> > > Sent: Saturday, July 17, 2021 8:04 AM
> > > On Fri, Jul 16, 2021 at 08:49:58AM +0000, Song Bao Hua (Barry Song) wrote:

...

> > Generally good idea. However, for sysfs ABI entries, it might not be
> > that true.
> > 
> > A sysfs entry might never be read for its whole life. As I explained
> > before, a sysfs entry - especially for list,  is randomly "cat" by users.
> > Many of them won't be read forever. And after they are read once, they
> > will probably never be read again. The operations to read ABI could be
> > random and rare. Performance wouldn't be a concern.
> > 
> > To avoid holding the memory which might never be used, it is better to
> > allocate and free the memory during runtime. I mean to allocate in show()
> > and free in show(), aka, to do it on demand.
> > 
> > For example, for a server with 256CPU and each cpu has dozens of sysfs ABI
> > entries, only a few of sysfs list entries might be randomly "cat" by users.
> > Holding 256*entries memory doesn't look good.
> 
> Ok, makes sense.
>  
> > > This would require to add bitmap_max_string_size(list, bitmap, nbits),
> > > but it's O(1), and I think, others will find it helpful.
> > 
> > What about getting size and memory at the same time?
> 
> 1. We already have kasprintf()
> 2. It breaks coding style.
> 
> Documentation/process/coding-style.rst:
>         Functions should be short and sweet, and do just one thing. 
> 
> From practical point of view, there should be some balance between
> granularity and ease-of-use. But in this case, bitmap_list cries for 
> a function that will help to estimate size of output buffer.

According to the vsnprintf() logic the estimated size is what it returns. If
user supplies too few bytes available, the comparison with the returned value
can tell caller that space wasn't big enough.

> And it's
> easy to imagine a case where the estimated length of bitmap is needed
> explicitly:
> 
>         size_t max_size = bitmap_max_string_size(nbits);
>         char *buf = kmalloc(PAGE_ALIGN(max_size) * nr_cpus);
> 
> Thought, I don't insist. In your driver you can do:
> 
>         size_t size = snprintf(NULL, 0, ...);
>         void *buf = kmalloc(size);
> 
> It will be fully correct, and you already have everything you need.
>  
> > ssize_t bitmap_get_print_buf(bool list, char **buf, const unsigned long
> >  *maskp, int nmaskbits)
> > 
> > ssize_t cpumap_get_print_buf(bool list, char **buf, const struct cpumask *mask);
> >
> > This API returns the size of printed buffer, and it also gets the
> > printed result saved in *buf. Then drivers don't need to do three
> > steps:
> > 
> > 1. get cpumap buffer size which is your cpumap_max_string_size()
> > 2. allocate memory for buffer according to size got in step 1
> > 3. print bitmap(cpumap) to buffer by "pbl"
> > 
> > It will only need to call bitmap_get_print_buf() and all three
> > things are done inside bitmap_get_print_buf().
> > 
> > How to use the size and memory allocated in cpumap_get_print_buf
> > will be totally up to users.
> > 
> > The other benefit for this is that if we get string size during initialization,
> > and then we print in show() entries, the size got at the beginning might be not
> > enough as system topology might have changed. Sysfs ABI reflects the status of
> > system at this moment.

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH v7 2/4] topology: use bin_attribute to break the size limitation of cpumap ABI
  2021-07-19  9:07           ` andriy.shevchenko
@ 2021-07-19 11:10             ` Song Bao Hua (Barry Song)
  2021-07-19 17:10               ` Yury Norov
  0 siblings, 1 reply; 35+ messages in thread
From: Song Bao Hua (Barry Song) @ 2021-07-19 11:10 UTC (permalink / raw)
  To: andriy.shevchenko, Yury Norov
  Cc: gregkh, akpm, linux-kernel, dave.hansen, linux, rafael, rdunlap,
	agordeev, sbrivio, jianpeng.ma, valentin.schneider, peterz,
	bristot, guodong.xu, tangchengchang, Zengtao (B),
	yangyicong, tim.c.chen, Linuxarm, tiantao (H)



> -----Original Message-----
> From: andriy.shevchenko@linux.intel.com
> [mailto:andriy.shevchenko@linux.intel.com]
> Sent: Monday, July 19, 2021 9:07 PM
> To: Yury Norov <yury.norov@gmail.com>
> Cc: Song Bao Hua (Barry Song) <song.bao.hua@hisilicon.com>;
> gregkh@linuxfoundation.org; akpm@linux-foundation.org;
> linux-kernel@vger.kernel.org; dave.hansen@intel.com;
> linux@rasmusvillemoes.dk; rafael@kernel.org; rdunlap@infradead.org;
> agordeev@linux.ibm.com; sbrivio@redhat.com; jianpeng.ma@intel.com;
> valentin.schneider@arm.com; peterz@infradead.org; bristot@redhat.com;
> guodong.xu@linaro.org; tangchengchang <tangchengchang@huawei.com>; Zengtao (B)
> <prime.zeng@hisilicon.com>; yangyicong <yangyicong@huawei.com>;
> tim.c.chen@linux.intel.com; Linuxarm <linuxarm@huawei.com>; tiantao (H)
> <tiantao6@hisilicon.com>
> Subject: Re: [PATCH v7 2/4] topology: use bin_attribute to break the size
> limitation of cpumap ABI
> 
> On Fri, Jul 16, 2021 at 06:12:21PM -0700, Yury Norov wrote:
> > On Sat, Jul 17, 2021 at 12:16:48AM +0000, Song Bao Hua (Barry Song) wrote:
> > > > From: Yury Norov [mailto:yury.norov@gmail.com]
> > > > Sent: Saturday, July 17, 2021 8:04 AM
> > > > On Fri, Jul 16, 2021 at 08:49:58AM +0000, Song Bao Hua (Barry Song) wrote:
> 
> ...
> 
> > > Generally good idea. However, for sysfs ABI entries, it might not be
> > > that true.
> > >
> > > A sysfs entry might never be read for its whole life. As I explained
> > > before, a sysfs entry - especially for list,  is randomly "cat" by users.
> > > Many of them won't be read forever. And after they are read once, they
> > > will probably never be read again. The operations to read ABI could be
> > > random and rare. Performance wouldn't be a concern.
> > >
> > > To avoid holding the memory which might never be used, it is better to
> > > allocate and free the memory during runtime. I mean to allocate in show()
> > > and free in show(), aka, to do it on demand.
> > >
> > > For example, for a server with 256CPU and each cpu has dozens of sysfs ABI
> > > entries, only a few of sysfs list entries might be randomly "cat" by users.
> > > Holding 256*entries memory doesn't look good.
> >
> > Ok, makes sense.
> >
> > > > This would require to add bitmap_max_string_size(list, bitmap, nbits),
> > > > but it's O(1), and I think, others will find it helpful.
> > >
> > > What about getting size and memory at the same time?
> >
> > 1. We already have kasprintf()
> > 2. It breaks coding style.
> >
> > Documentation/process/coding-style.rst:
> >         Functions should be short and sweet, and do just one thing.
> >
> > From practical point of view, there should be some balance between
> > granularity and ease-of-use. But in this case, bitmap_list cries for
> > a function that will help to estimate size of output buffer.
> 
> According to the vsnprintf() logic the estimated size is what it returns. If
> user supplies too few bytes available, the comparison with the returned value
> can tell caller that space wasn't big enough.

As far as my understanding, for estimated size in bitmap_max_string_size()
Yury might mean something as below?

* For bitmask:
Each 32bit needs 9 bytes "11111111,", so the max size of mask would be:
9*nr_cpus / 32 ?

* For list:
Maximally  cpu0-9 need 2bytes, like "1,"
Maximally cpu10-99 need 3bytes, like "50,"
Maximally cpu100-999 need 4bytes, like "101,"
Maximally cpu1000-9999 need 5 bytes..
 
So once we know the size of the bitmap, we can figure out the maximum
size of its string for mask and list?

Pls correct me if you don't mean this, yury.


> 
> > And it's
> > easy to imagine a case where the estimated length of bitmap is needed
> > explicitly:
> >
> >         size_t max_size = bitmap_max_string_size(nbits);
> >         char *buf = kmalloc(PAGE_ALIGN(max_size) * nr_cpus);
> >
> > Thought, I don't insist. In your driver you can do:
> >
> >         size_t size = snprintf(NULL, 0, ...);
> >         void *buf = kmalloc(size);
> >
> > It will be fully correct, and you already have everything you need.
> >
> > > ssize_t bitmap_get_print_buf(bool list, char **buf, const unsigned long
> > >  *maskp, int nmaskbits)
> > >
> > > ssize_t cpumap_get_print_buf(bool list, char **buf, const struct cpumask
> *mask);
> > >
> > > This API returns the size of printed buffer, and it also gets the
> > > printed result saved in *buf. Then drivers don't need to do three
> > > steps:
> > >
> > > 1. get cpumap buffer size which is your cpumap_max_string_size()
> > > 2. allocate memory for buffer according to size got in step 1
> > > 3. print bitmap(cpumap) to buffer by "pbl"
> > >
> > > It will only need to call bitmap_get_print_buf() and all three
> > > things are done inside bitmap_get_print_buf().
> > >
> > > How to use the size and memory allocated in cpumap_get_print_buf
> > > will be totally up to users.
> > >
> > > The other benefit for this is that if we get string size during initialization,
> > > and then we print in show() entries, the size got at the beginning might
> be not
> > > enough as system topology might have changed. Sysfs ABI reflects the status
> of
> > > system at this moment.
> 
> --
> With Best Regards,
> Andy Shevchenko
> 

Thanks
Barry


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v7 2/4] topology: use bin_attribute to break the size limitation of cpumap ABI
  2021-07-19 11:10             ` Song Bao Hua (Barry Song)
@ 2021-07-19 17:10               ` Yury Norov
  2021-07-21  9:30                 ` Song Bao Hua (Barry Song)
  0 siblings, 1 reply; 35+ messages in thread
From: Yury Norov @ 2021-07-19 17:10 UTC (permalink / raw)
  To: Song Bao Hua (Barry Song)
  Cc: andriy.shevchenko, gregkh, akpm, linux-kernel, dave.hansen,
	linux, rafael, rdunlap, agordeev, sbrivio, jianpeng.ma,
	valentin.schneider, peterz, bristot, guodong.xu, tangchengchang,
	Zengtao (B), yangyicong, tim.c.chen, Linuxarm, tiantao (H)

On Mon, Jul 19, 2021 at 11:10:45AM +0000, Song Bao Hua (Barry Song) wrote:
> 
> 
> > -----Original Message-----
> > From: andriy.shevchenko@linux.intel.com
> > [mailto:andriy.shevchenko@linux.intel.com]
> > Sent: Monday, July 19, 2021 9:07 PM
> > To: Yury Norov <yury.norov@gmail.com>
> > Cc: Song Bao Hua (Barry Song) <song.bao.hua@hisilicon.com>;
> > gregkh@linuxfoundation.org; akpm@linux-foundation.org;
> > linux-kernel@vger.kernel.org; dave.hansen@intel.com;
> > linux@rasmusvillemoes.dk; rafael@kernel.org; rdunlap@infradead.org;
> > agordeev@linux.ibm.com; sbrivio@redhat.com; jianpeng.ma@intel.com;
> > valentin.schneider@arm.com; peterz@infradead.org; bristot@redhat.com;
> > guodong.xu@linaro.org; tangchengchang <tangchengchang@huawei.com>; Zengtao (B)
> > <prime.zeng@hisilicon.com>; yangyicong <yangyicong@huawei.com>;
> > tim.c.chen@linux.intel.com; Linuxarm <linuxarm@huawei.com>; tiantao (H)
> > <tiantao6@hisilicon.com>
> > Subject: Re: [PATCH v7 2/4] topology: use bin_attribute to break the size
> > limitation of cpumap ABI
> > 
> > On Fri, Jul 16, 2021 at 06:12:21PM -0700, Yury Norov wrote:
> > > On Sat, Jul 17, 2021 at 12:16:48AM +0000, Song Bao Hua (Barry Song) wrote:
> > > > > From: Yury Norov [mailto:yury.norov@gmail.com]
> > > > > Sent: Saturday, July 17, 2021 8:04 AM
> > > > > On Fri, Jul 16, 2021 at 08:49:58AM +0000, Song Bao Hua (Barry Song) wrote:
> > 
> > ...
> > 
> > > > Generally good idea. However, for sysfs ABI entries, it might not be
> > > > that true.
> > > >
> > > > A sysfs entry might never be read for its whole life. As I explained
> > > > before, a sysfs entry - especially for list,  is randomly "cat" by users.
> > > > Many of them won't be read forever. And after they are read once, they
> > > > will probably never be read again. The operations to read ABI could be
> > > > random and rare. Performance wouldn't be a concern.
> > > >
> > > > To avoid holding the memory which might never be used, it is better to
> > > > allocate and free the memory during runtime. I mean to allocate in show()
> > > > and free in show(), aka, to do it on demand.
> > > >
> > > > For example, for a server with 256CPU and each cpu has dozens of sysfs ABI
> > > > entries, only a few of sysfs list entries might be randomly "cat" by users.
> > > > Holding 256*entries memory doesn't look good.
> > >
> > > Ok, makes sense.
> > >
> > > > > This would require to add bitmap_max_string_size(list, bitmap, nbits),
> > > > > but it's O(1), and I think, others will find it helpful.
> > > >
> > > > What about getting size and memory at the same time?
> > >
> > > 1. We already have kasprintf()
> > > 2. It breaks coding style.
> > >
> > > Documentation/process/coding-style.rst:
> > >         Functions should be short and sweet, and do just one thing.
> > >
> > > From practical point of view, there should be some balance between
> > > granularity and ease-of-use. But in this case, bitmap_list cries for
> > > a function that will help to estimate size of output buffer.
> > 
> > According to the vsnprintf() logic the estimated size is what it returns. If
> > user supplies too few bytes available, the comparison with the returned value
> > can tell caller that space wasn't big enough.

snprintf(NULL, 0, "pbl", ...) also works, but it's O(nbits), and user is not
guaranteed that allocated memory would be always sufficient. I mean max possible
length for given nbits, not a length of a specific string.

In case of lists, the length may grow. Consider:
        0-8 -> 0-3,5-8 -> 0,2,4,6,8

If we want to allocate a storage for strings that may change, it would be
helpful to allocate memory for the lengthiest string in advance.

So, bitmap_max_string_len() may be a convenient O(1) alternative for
those who interested in printing the same bitmap in the same buffer.

> As far as my understanding, for estimated size in bitmap_max_string_size()
> Yury might mean something as below?
> 
> * For bitmask:
> Each 32bit needs 9 bytes "11111111,", so the max size of mask would be:
> 9*nr_cpus / 32 ?

11111 -> "f1", but your formula gives 1. 
I think it should be like this (not tested):
        DIV_ROUND_UP(nbits, 4) + nbits < 32 ? 0 : nbits / 32 - 1

> * For list:
> Maximally  cpu0-9 need 2bytes, like "1,"
> Maximally cpu10-99 need 3bytes, like "50,"
> Maximally cpu100-999 need 4bytes, like "101,"
> Maximally cpu1000-9999 need 5 bytes..
>  
> So once we know the size of the bitmap, we can figure out the maximum
> size of its string for mask and list?
> 
> Pls correct me if you don't mean this, yury.
 
Assuming that longest possible strings are of the form 0,2,4,6,... I
think it's correct except for the last comma, so substract 1.

If we decide to go on with this bitmap_max_strlen(), the list part
should be tested extensively.
 
> > > And it's
> > > easy to imagine a case where the estimated length of bitmap is needed
> > > explicitly:
> > >
> > >         size_t max_size = bitmap_max_string_size(nbits);
> > >         char *buf = kmalloc(PAGE_ALIGN(max_size) * nr_cpus);
> > >
> > > Thought, I don't insist. In your driver you can do:
> > >
> > >         size_t size = snprintf(NULL, 0, ...);
> > >         void *buf = kmalloc(size);
> > >
> > > It will be fully correct, and you already have everything you need.
> > >
> > > > ssize_t bitmap_get_print_buf(bool list, char **buf, const unsigned long
> > > >  *maskp, int nmaskbits)
> > > >
> > > > ssize_t cpumap_get_print_buf(bool list, char **buf, const struct cpumask
> > *mask);
> > > >
> > > > This API returns the size of printed buffer, and it also gets the
> > > > printed result saved in *buf. Then drivers don't need to do three
> > > > steps:
> > > >
> > > > 1. get cpumap buffer size which is your cpumap_max_string_size()
> > > > 2. allocate memory for buffer according to size got in step 1
> > > > 3. print bitmap(cpumap) to buffer by "pbl"
> > > >
> > > > It will only need to call bitmap_get_print_buf() and all three
> > > > things are done inside bitmap_get_print_buf().
> > > >
> > > > How to use the size and memory allocated in cpumap_get_print_buf
> > > > will be totally up to users.
> > > >
> > > > The other benefit for this is that if we get string size during initialization,
> > > > and then we print in show() entries, the size got at the beginning might
> > be not
> > > > enough as system topology might have changed. Sysfs ABI reflects the status
> > of
> > > > system at this moment.
> > 
> > --
> > With Best Regards,
> > Andy Shevchenko
> > 
> 
> Thanks
> Barry

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH v7 2/4] topology: use bin_attribute to break the size limitation of cpumap ABI
  2021-07-19 17:10               ` Yury Norov
@ 2021-07-21  9:30                 ` Song Bao Hua (Barry Song)
  0 siblings, 0 replies; 35+ messages in thread
From: Song Bao Hua (Barry Song) @ 2021-07-21  9:30 UTC (permalink / raw)
  To: Yury Norov
  Cc: andriy.shevchenko, gregkh, akpm, linux-kernel, dave.hansen,
	linux, rafael, rdunlap, agordeev, sbrivio, jianpeng.ma,
	valentin.schneider, peterz, bristot, guodong.xu, tangchengchang,
	Zengtao (B), yangyicong, tim.c.chen, Linuxarm, tiantao (H)



> -----Original Message-----
> From: Yury Norov [mailto:yury.norov@gmail.com]
> Sent: Tuesday, July 20, 2021 5:10 AM
> To: Song Bao Hua (Barry Song) <song.bao.hua@hisilicon.com>
> Cc: andriy.shevchenko@linux.intel.com; gregkh@linuxfoundation.org;
> akpm@linux-foundation.org; linux-kernel@vger.kernel.org;
> dave.hansen@intel.com; linux@rasmusvillemoes.dk; rafael@kernel.org;
> rdunlap@infradead.org; agordeev@linux.ibm.com; sbrivio@redhat.com;
> jianpeng.ma@intel.com; valentin.schneider@arm.com; peterz@infradead.org;
> bristot@redhat.com; guodong.xu@linaro.org; tangchengchang
> <tangchengchang@huawei.com>; Zengtao (B) <prime.zeng@hisilicon.com>;
> yangyicong <yangyicong@huawei.com>; tim.c.chen@linux.intel.com; Linuxarm
> <linuxarm@huawei.com>; tiantao (H) <tiantao6@hisilicon.com>
> Subject: Re: [PATCH v7 2/4] topology: use bin_attribute to break the size
> limitation of cpumap ABI
> 
> On Mon, Jul 19, 2021 at 11:10:45AM +0000, Song Bao Hua (Barry Song) wrote:
> >
> >
> > > -----Original Message-----
> > > From: andriy.shevchenko@linux.intel.com
> > > [mailto:andriy.shevchenko@linux.intel.com]
> > > Sent: Monday, July 19, 2021 9:07 PM
> > > To: Yury Norov <yury.norov@gmail.com>
> > > Cc: Song Bao Hua (Barry Song) <song.bao.hua@hisilicon.com>;
> > > gregkh@linuxfoundation.org; akpm@linux-foundation.org;
> > > linux-kernel@vger.kernel.org; dave.hansen@intel.com;
> > > linux@rasmusvillemoes.dk; rafael@kernel.org; rdunlap@infradead.org;
> > > agordeev@linux.ibm.com; sbrivio@redhat.com; jianpeng.ma@intel.com;
> > > valentin.schneider@arm.com; peterz@infradead.org; bristot@redhat.com;
> > > guodong.xu@linaro.org; tangchengchang <tangchengchang@huawei.com>;
> Zengtao (B)
> > > <prime.zeng@hisilicon.com>; yangyicong <yangyicong@huawei.com>;
> > > tim.c.chen@linux.intel.com; Linuxarm <linuxarm@huawei.com>; tiantao (H)
> > > <tiantao6@hisilicon.com>
> > > Subject: Re: [PATCH v7 2/4] topology: use bin_attribute to break the size
> > > limitation of cpumap ABI
> > >
> > > On Fri, Jul 16, 2021 at 06:12:21PM -0700, Yury Norov wrote:
> > > > On Sat, Jul 17, 2021 at 12:16:48AM +0000, Song Bao Hua (Barry Song) wrote:
> > > > > > From: Yury Norov [mailto:yury.norov@gmail.com]
> > > > > > Sent: Saturday, July 17, 2021 8:04 AM
> > > > > > On Fri, Jul 16, 2021 at 08:49:58AM +0000, Song Bao Hua (Barry Song)
> wrote:
> > >
> > > ...
> > >
> > > > > Generally good idea. However, for sysfs ABI entries, it might not be
> > > > > that true.
> > > > >
> > > > > A sysfs entry might never be read for its whole life. As I explained
> > > > > before, a sysfs entry - especially for list,  is randomly "cat" by users.
> > > > > Many of them won't be read forever. And after they are read once, they
> > > > > will probably never be read again. The operations to read ABI could
> be
> > > > > random and rare. Performance wouldn't be a concern.
> > > > >
> > > > > To avoid holding the memory which might never be used, it is better
> to
> > > > > allocate and free the memory during runtime. I mean to allocate in show()
> > > > > and free in show(), aka, to do it on demand.
> > > > >
> > > > > For example, for a server with 256CPU and each cpu has dozens of sysfs
> ABI
> > > > > entries, only a few of sysfs list entries might be randomly "cat" by
> users.
> > > > > Holding 256*entries memory doesn't look good.
> > > >
> > > > Ok, makes sense.
> > > >
> > > > > > This would require to add bitmap_max_string_size(list, bitmap, nbits),
> > > > > > but it's O(1), and I think, others will find it helpful.
> > > > >
> > > > > What about getting size and memory at the same time?
> > > >
> > > > 1. We already have kasprintf()
> > > > 2. It breaks coding style.
> > > >
> > > > Documentation/process/coding-style.rst:
> > > >         Functions should be short and sweet, and do just one thing.
> > > >
> > > > From practical point of view, there should be some balance between
> > > > granularity and ease-of-use. But in this case, bitmap_list cries for
> > > > a function that will help to estimate size of output buffer.
> > >
> > > According to the vsnprintf() logic the estimated size is what it returns.
> If
> > > user supplies too few bytes available, the comparison with the returned
> value
> > > can tell caller that space wasn't big enough.
> 
> snprintf(NULL, 0, "pbl", ...) also works, but it's O(nbits), and user is not
> guaranteed that allocated memory would be always sufficient. I mean max possible
> length for given nbits, not a length of a specific string.
> 
> In case of lists, the length may grow. Consider:
>         0-8 -> 0-3,5-8 -> 0,2,4,6,8
> 
> If we want to allocate a storage for strings that may change, it would be
> helpful to allocate memory for the lengthiest string in advance.
> 
> So, bitmap_max_string_len() may be a convenient O(1) alternative for
> those who interested in printing the same bitmap in the same buffer.
> 
> > As far as my understanding, for estimated size in bitmap_max_string_size()
> > Yury might mean something as below?
> >
> > * For bitmask:
> > Each 32bit needs 9 bytes "11111111,", so the max size of mask would be:
> > 9*nr_cpus / 32 ?
> 
> 11111 -> "f1", but your formula gives 1.
> I think it should be like this (not tested):
>         DIV_ROUND_UP(nbits, 4) + nbits < 32 ? 0 : nbits / 32 - 1

I am not quite sure this is correct, in case nbits=64,
we will need:
ffffffff,ffffffff   9+8=17

your formula is ignoring the "," for each 32bits?

> 
> > * For list:
> > Maximally  cpu0-9 need 2bytes, like "1,"
> > Maximally cpu10-99 need 3bytes, like "50,"
> > Maximally cpu100-999 need 4bytes, like "101,"
> > Maximally cpu1000-9999 need 5 bytes..
> >
> > So once we know the size of the bitmap, we can figure out the maximum
> > size of its string for mask and list?
> >
> > Pls correct me if you don't mean this, yury.
> 
> Assuming that longest possible strings are of the form 0,2,4,6,... I
> think it's correct except for the last comma, so substract 1.

I don't think 0,2,4,6,8 takes maximum length,
0-1,3-4,6-7 are longer.

It seems maximum bitmap pattern is(binary):
110110110110...

In this case, each 3bits takes 4 bytes.

For cpu 10-99, it would be: 10-11,13-14,16-17,19-20,

Each 3bits takes "6 bytes".

For cpu100-999, it would be: 100-101, 103-104, 106-107,....
Each 3bits takes "8 bytes"?

Anyway, It seems be quite tricky :-)

> 
> If we decide to go on with this bitmap_max_strlen(), the list part
> should be tested extensively.
> 

Btw, sysfs core code needs changes to support pre-allocated
memory in customized size and seq read on it. This needs a lot
of efforts I am looking at.

> > > > And it's
> > > > easy to imagine a case where the estimated length of bitmap is needed
> > > > explicitly:
> > > >
> > > >         size_t max_size = bitmap_max_string_size(nbits);
> > > >         char *buf = kmalloc(PAGE_ALIGN(max_size) * nr_cpus);
> > > >
> > > > Thought, I don't insist. In your driver you can do:
> > > >
> > > >         size_t size = snprintf(NULL, 0, ...);
> > > >         void *buf = kmalloc(size);
> > > >
> > > > It will be fully correct, and you already have everything you need.
> > > >
> > > > > ssize_t bitmap_get_print_buf(bool list, char **buf, const unsigned long
> > > > >  *maskp, int nmaskbits)
> > > > >
> > > > > ssize_t cpumap_get_print_buf(bool list, char **buf, const struct cpumask
> > > *mask);
> > > > >
> > > > > This API returns the size of printed buffer, and it also gets the
> > > > > printed result saved in *buf. Then drivers don't need to do three
> > > > > steps:
> > > > >
> > > > > 1. get cpumap buffer size which is your cpumap_max_string_size()
> > > > > 2. allocate memory for buffer according to size got in step 1
> > > > > 3. print bitmap(cpumap) to buffer by "pbl"
> > > > >
> > > > > It will only need to call bitmap_get_print_buf() and all three
> > > > > things are done inside bitmap_get_print_buf().
> > > > >
> > > > > How to use the size and memory allocated in cpumap_get_print_buf
> > > > > will be totally up to users.
> > > > >
> > > > > The other benefit for this is that if we get string size during
> initialization,
> > > > > and then we print in show() entries, the size got at the beginning might
> > > be not
> > > > > enough as system topology might have changed. Sysfs ABI reflects the
> status
> > > of
> > > > > system at this moment.
> > >
> > > --
> > > With Best Regards,
> > > Andy Shevchenko
> > >
> >


Thanks
Barry

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v7 4/4] lib: test_bitmap: add bitmap_print_to_buf test cases
  2021-07-15 23:23         ` Yury Norov
  2021-07-16  0:41           ` Song Bao Hua (Barry Song)
@ 2021-07-21 11:40           ` Greg Kroah-Hartman
  2021-07-22 17:09             ` Yury Norov
  1 sibling, 1 reply; 35+ messages in thread
From: Greg Kroah-Hartman @ 2021-07-21 11:40 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Andy Shevchenko, Barry Song, Andrew Morton,
	Linux Kernel Mailing List, Dave Hansen, Rasmus Villemoes,
	Rafael J. Wysocki, Randy Dunlap, Alexander Gordeev,
	Stefano Brivio, Ma, Jianpeng, Valentin Schneider,
	Peter Zijlstra (Intel),
	Daniel Bristot de Oliveira, Guodong Xu, tangchengchang,
	Zengtao (B),
	yangyicong, tim.c.chen, Linuxarm

On Thu, Jul 15, 2021 at 04:23:32PM -0700, Yury Norov wrote:
> On Fri, Jul 16, 2021 at 12:32:45AM +0300, Andy Shevchenko wrote:
> > On Thu, Jul 15, 2021 at 11:48 PM Yury Norov <yury.norov@gmail.com> wrote:
> > > On Thu, Jul 15, 2021 at 03:09:39PM +0300, Andy Shevchenko wrote:
> > > > On Thu, Jul 15, 2021 at 11:58:56PM +1200, Barry Song wrote:
> > > > > The added test items cover both cases where bitmap buf of the printed
> > > > > result is greater than and less than 4KB.
> > > > > And it also covers the case where offset for bitmap_print_to_buf is
> > > > > non-zero which will happen when printed buf is larger than one page
> > > > > in sysfs bin_attribute.
> > > >
> > > > More test cases is always a good thing, thanks!
> > >
> > > Generally yes. But in this case... I believe, Barry didn't write that
> > > huge line below by himself. Most probably he copy-pasted the output of
> > > his bitmap_print_buf() into the test. If so, this code tests nothing,
> > > and just enforces current behavior of snprintf.
> > 
> > I'm not sure I got what you are telling me. The big line is to test
> > strings that are bigger than 4k.
> 
> I'm trying to say that human are not able to verify correctness of
> this line. The test is supposed to check bitmap_print_to_buf(), but
> reference output itself is generated by bitmap_print_to_buf(). This
> test will always pass by design, even if there's an error somewhere
> in the middle, isn't it?

Then please manually check it to verify it is correct or not.  Once we
have it verified, that's fine, it will remain static in this test for
always going forward.

That's what "oracles" are for, there is nothing wrong with this test
case or "proof" that I can see.

> > 
> > ...
> > 
> > > > > +static const char large_list[] __initconst = /* more than 4KB */
> > > > > +   "0,4,8,12,16,20,24,28,32-33,36-37,40-41,44-45,48-49,52-53,56-57,60-61,64,68,72,76,80,84,88,92,96-97,100-101,104-1"
> > > > > +   "05,108-109,112-113,116-117,120-121,124-125,128,132,136,140,144,148,152,156,160-161,164-165,168-169,172-173,176-1"
> > > > > +   "77,180-181,184-185,188-189,192,196,200,204,208,212,216,220,224-225,228-229,232-233,236-237,240-241,244-245,248-2"
> > >
> > > I don't like this behavior of the code: each individual line is not a
> > > valid bitmap_list. I would prefer to split original bitmap and print
> > > list representation of parts in a compatible format; considering a
> > > receiving part of this splitting machinery.
> > 
> > I agree that split is not the best here, but after all it's only 1
> > line and this is on purpose.
> 
> What I see is that bitmap_print_to_buf() is called many times,

That is not what the above list shows at all, it's one long string all
together, only split up to make it easier for us to work with.

> and
> each time it returns something that is not a valid bitmap list string.
> If the caller was be able to concatenate all the lines returned by
> bitmap_print_to_buf(), he'd probably get correct result. But in such
> case, why don't he use scnprintf("pbl") directly?

I do not understand the objection here at all.  This series is fixing a
real problem that people are having and your complaining about test
strings is _VERY_ odd.

If you have an alternate solution, please propose it, otherwise I will
be taking this series in the next few days.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v7 4/4] lib: test_bitmap: add bitmap_print_to_buf test cases
  2021-07-21 11:40           ` Greg Kroah-Hartman
@ 2021-07-22 17:09             ` Yury Norov
  2021-07-22 17:47               ` Greg Kroah-Hartman
  0 siblings, 1 reply; 35+ messages in thread
From: Yury Norov @ 2021-07-22 17:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Andy Shevchenko, Andy Shevchenko, Barry Song, Andrew Morton,
	Linux Kernel Mailing List, Dave Hansen, Rasmus Villemoes,
	Rafael J. Wysocki, Randy Dunlap, Alexander Gordeev,
	Stefano Brivio, Ma, Jianpeng, Valentin Schneider,
	Peter Zijlstra (Intel),
	Daniel Bristot de Oliveira, Guodong Xu, tangchengchang,
	Zengtao (B),
	yangyicong, tim.c.chen, Linuxarm

On Wed, Jul 21, 2021 at 01:40:36PM +0200, Greg Kroah-Hartman wrote:
> On Thu, Jul 15, 2021 at 04:23:32PM -0700, Yury Norov wrote:
> > On Fri, Jul 16, 2021 at 12:32:45AM +0300, Andy Shevchenko wrote:
> > > On Thu, Jul 15, 2021 at 11:48 PM Yury Norov <yury.norov@gmail.com> wrote:
> > > > On Thu, Jul 15, 2021 at 03:09:39PM +0300, Andy Shevchenko wrote:
> > > > > On Thu, Jul 15, 2021 at 11:58:56PM +1200, Barry Song wrote:
> > > > > > The added test items cover both cases where bitmap buf of the printed
> > > > > > result is greater than and less than 4KB.
> > > > > > And it also covers the case where offset for bitmap_print_to_buf is
> > > > > > non-zero which will happen when printed buf is larger than one page
> > > > > > in sysfs bin_attribute.
> > > > >
> > > > > More test cases is always a good thing, thanks!
> > > >
> > > > Generally yes. But in this case... I believe, Barry didn't write that
> > > > huge line below by himself. Most probably he copy-pasted the output of
> > > > his bitmap_print_buf() into the test. If so, this code tests nothing,
> > > > and just enforces current behavior of snprintf.
> > > 
> > > I'm not sure I got what you are telling me. The big line is to test
> > > strings that are bigger than 4k.
> > 
> > I'm trying to say that human are not able to verify correctness of
> > this line. The test is supposed to check bitmap_print_to_buf(), but
> > reference output itself is generated by bitmap_print_to_buf(). This
> > test will always pass by design, even if there's an error somewhere
> > in the middle, isn't it?
> 
> Then please manually check it to verify it is correct or not.  Once we
> have it verified, that's fine, it will remain static in this test for
> always going forward.
> 
> That's what "oracles" are for, there is nothing wrong with this test
> case or "proof" that I can see.
> 
> > > 
> > > ...
> > > 
> > > > > > +static const char large_list[] __initconst = /* more than 4KB */
> > > > > > +   "0,4,8,12,16,20,24,28,32-33,36-37,40-41,44-45,48-49,52-53,56-57,60-61,64,68,72,76,80,84,88,92,96-97,100-101,104-1"
> > > > > > +   "05,108-109,112-113,116-117,120-121,124-125,128,132,136,140,144,148,152,156,160-161,164-165,168-169,172-173,176-1"
> > > > > > +   "77,180-181,184-185,188-189,192,196,200,204,208,212,216,220,224-225,228-229,232-233,236-237,240-241,244-245,248-2"
> > > >
> > > > I don't like this behavior of the code: each individual line is not a
> > > > valid bitmap_list. I would prefer to split original bitmap and print
> > > > list representation of parts in a compatible format; considering a
> > > > receiving part of this splitting machinery.
> > > 
> > > I agree that split is not the best here, but after all it's only 1
> > > line and this is on purpose.
> > 
> > What I see is that bitmap_print_to_buf() is called many times,
> 
> That is not what the above list shows at all, it's one long string all
> together, only split up to make it easier for us to work with.
> 
> > and
> > each time it returns something that is not a valid bitmap list string.
> > If the caller was be able to concatenate all the lines returned by
> > bitmap_print_to_buf(), he'd probably get correct result. But in such
> > case, why don't he use scnprintf("pbl") directly?
> 
> I do not understand the objection here at all.  This series is fixing a
> real problem that eeople are having

I explicitly asked about an example of this problem. Barry answered in
a great length, but the key points are:

        https://lore.kernel.org/lkml/4ab928f1fb3e4420974dfafe4b32f5b7@hisilicon.com/

        > > So, the root problem here is that some machines have so many CPUs that
        > > their cpumask text representation may not fit into the full page in the
        > > worst case. Is my understanding correct? Can you share an example of
        > > such configuration?
        > 
        > in my understanding, I have not found any machine which has really
        > caused the problem till now.

        > [...]
        > 
        > This doesn't really happen nowadays as the maximum
        > NR_CPUS is 8196 for X86_64 and 4096 for ARM64 since 8196 * 9 / 32 = 2305
        > is still smaller than 4KB page size.


If it's not true, can you or Barry please share such an example?

> and your complaining about test
> strings is _VERY_ odd.

The test itself is bad, but it's a minor issue.

My main complain is that the bitmap part of this series introduces a
function that requires O(N^2) of CPU time and O(N) of memory to just
print a string. The existing snprintf does this in O(N) and O(1)
respectively. Additionally to that, the proposed function has some
flaws in design.
 
> If you have an alternate solution, please propose it, otherwise I will
> be taking this series in the next few days.

I think I suggested a better solution in the thread for v4:

https://lore.kernel.org/lkml/YMu2amhrdGZHJ5mY@yury-ThinkPad/

        > kasprintf() does everything you need. Why don't you use it directly in
        > your driver?

I'm not that familiar to sysfs internals to submit a patch, but the
idea in more details is like this:

        cpulist_read(...)
        {
               if (bitmap_string == NULL)
                        bitmap_string = kasprintf(bitmap, ...);

        }

Where bitmap_string pointer may be stored in struct file, struct kobject,
struct bit_attribute or where it's convenient. 

If it's completely impossible to store a pointer, and the bug is real
and urgent, then the best solution I see is to move bitmap_get_print_buf()
to the driver/base/node.c code, because it's not optimal for general use.
Which I already suggested here:

https://lkml.org/lkml/2021/7/16/763

Can you or Barry tell, would one of those alternative solutions work for you?

Thanks,
Yury

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v7 4/4] lib: test_bitmap: add bitmap_print_to_buf test cases
  2021-07-22 17:09             ` Yury Norov
@ 2021-07-22 17:47               ` Greg Kroah-Hartman
  2021-07-22 18:27                 ` Yury Norov
  0 siblings, 1 reply; 35+ messages in thread
From: Greg Kroah-Hartman @ 2021-07-22 17:47 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Andy Shevchenko, Barry Song, Andrew Morton,
	Linux Kernel Mailing List, Dave Hansen, Rasmus Villemoes,
	Rafael J. Wysocki, Randy Dunlap, Alexander Gordeev,
	Stefano Brivio, Ma, Jianpeng, Valentin Schneider,
	Peter Zijlstra (Intel),
	Daniel Bristot de Oliveira, Guodong Xu, tangchengchang,
	Zengtao (B),
	yangyicong, tim.c.chen, Linuxarm

On Thu, Jul 22, 2021 at 10:09:27AM -0700, Yury Norov wrote:
> On Wed, Jul 21, 2021 at 01:40:36PM +0200, Greg Kroah-Hartman wrote:
> > On Thu, Jul 15, 2021 at 04:23:32PM -0700, Yury Norov wrote:
> > > On Fri, Jul 16, 2021 at 12:32:45AM +0300, Andy Shevchenko wrote:
> > > > On Thu, Jul 15, 2021 at 11:48 PM Yury Norov <yury.norov@gmail.com> wrote:
> > > > > On Thu, Jul 15, 2021 at 03:09:39PM +0300, Andy Shevchenko wrote:
> > > > > > On Thu, Jul 15, 2021 at 11:58:56PM +1200, Barry Song wrote:
> > > > > > > The added test items cover both cases where bitmap buf of the printed
> > > > > > > result is greater than and less than 4KB.
> > > > > > > And it also covers the case where offset for bitmap_print_to_buf is
> > > > > > > non-zero which will happen when printed buf is larger than one page
> > > > > > > in sysfs bin_attribute.
> > > > > >
> > > > > > More test cases is always a good thing, thanks!
> > > > >
> > > > > Generally yes. But in this case... I believe, Barry didn't write that
> > > > > huge line below by himself. Most probably he copy-pasted the output of
> > > > > his bitmap_print_buf() into the test. If so, this code tests nothing,
> > > > > and just enforces current behavior of snprintf.
> > > > 
> > > > I'm not sure I got what you are telling me. The big line is to test
> > > > strings that are bigger than 4k.
> > > 
> > > I'm trying to say that human are not able to verify correctness of
> > > this line. The test is supposed to check bitmap_print_to_buf(), but
> > > reference output itself is generated by bitmap_print_to_buf(). This
> > > test will always pass by design, even if there's an error somewhere
> > > in the middle, isn't it?
> > 
> > Then please manually check it to verify it is correct or not.  Once we
> > have it verified, that's fine, it will remain static in this test for
> > always going forward.
> > 
> > That's what "oracles" are for, there is nothing wrong with this test
> > case or "proof" that I can see.
> > 
> > > > 
> > > > ...
> > > > 
> > > > > > > +static const char large_list[] __initconst = /* more than 4KB */
> > > > > > > +   "0,4,8,12,16,20,24,28,32-33,36-37,40-41,44-45,48-49,52-53,56-57,60-61,64,68,72,76,80,84,88,92,96-97,100-101,104-1"
> > > > > > > +   "05,108-109,112-113,116-117,120-121,124-125,128,132,136,140,144,148,152,156,160-161,164-165,168-169,172-173,176-1"
> > > > > > > +   "77,180-181,184-185,188-189,192,196,200,204,208,212,216,220,224-225,228-229,232-233,236-237,240-241,244-245,248-2"
> > > > >
> > > > > I don't like this behavior of the code: each individual line is not a
> > > > > valid bitmap_list. I would prefer to split original bitmap and print
> > > > > list representation of parts in a compatible format; considering a
> > > > > receiving part of this splitting machinery.
> > > > 
> > > > I agree that split is not the best here, but after all it's only 1
> > > > line and this is on purpose.
> > > 
> > > What I see is that bitmap_print_to_buf() is called many times,
> > 
> > That is not what the above list shows at all, it's one long string all
> > together, only split up to make it easier for us to work with.
> > 
> > > and
> > > each time it returns something that is not a valid bitmap list string.
> > > If the caller was be able to concatenate all the lines returned by
> > > bitmap_print_to_buf(), he'd probably get correct result. But in such
> > > case, why don't he use scnprintf("pbl") directly?
> > 
> > I do not understand the objection here at all.  This series is fixing a
> > real problem that eeople are having
> 
> I explicitly asked about an example of this problem. Barry answered in
> a great length, but the key points are:
> 
>         https://lore.kernel.org/lkml/4ab928f1fb3e4420974dfafe4b32f5b7@hisilicon.com/
> 
>         > > So, the root problem here is that some machines have so many CPUs that
>         > > their cpumask text representation may not fit into the full page in the
>         > > worst case. Is my understanding correct? Can you share an example of
>         > > such configuration?
>         > 
>         > in my understanding, I have not found any machine which has really
>         > caused the problem till now.
> 
>         > [...]
>         > 
>         > This doesn't really happen nowadays as the maximum
>         > NR_CPUS is 8196 for X86_64 and 4096 for ARM64 since 8196 * 9 / 32 = 2305
>         > is still smaller than 4KB page size.
> 
> 
> If it's not true, can you or Barry please share such an example?

So for a 4k page size, if you have every-other-cpu-enabled on x86, it
will overflow this, right?

And I have heard of systems much bigger than this as well.  Why do you
not think that large number of CPUs are not around?

> > and your complaining about test
> > strings is _VERY_ odd.
> 
> The test itself is bad, but it's a minor issue.
> 
> My main complain is that the bitmap part of this series introduces a
> function that requires O(N^2) of CPU time and O(N) of memory to just
> print a string. The existing snprintf does this in O(N) and O(1)
> respectively. Additionally to that, the proposed function has some
> flaws in design.

Can you propose a better solution?

And is O(N^2) even an issue for this?  Have you run it to determine the
cpu load for such a tiny thing?  Why optimize something that no one has
even tried yet?

> > If you have an alternate solution, please propose it, otherwise I will
> > be taking this series in the next few days.
> 
> I think I suggested a better solution in the thread for v4:
> 
> https://lore.kernel.org/lkml/YMu2amhrdGZHJ5mY@yury-ThinkPad/
> 
>         > kasprintf() does everything you need. Why don't you use it directly in
>         > your driver?
> 
> I'm not that familiar to sysfs internals to submit a patch, but the
> idea in more details is like this:
> 
>         cpulist_read(...)
>         {
>                if (bitmap_string == NULL)
>                         bitmap_string = kasprintf(bitmap, ...);
> 
>         }
> 
> Where bitmap_string pointer may be stored in struct file, struct kobject,
> struct bit_attribute or where it's convenient. 

No, we are not storing strings in a kobject, sorry.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v7 4/4] lib: test_bitmap: add bitmap_print_to_buf test cases
  2021-07-22 17:47               ` Greg Kroah-Hartman
@ 2021-07-22 18:27                 ` Yury Norov
  2021-07-22 18:36                   ` Andy Shevchenko
  2021-07-22 18:45                   ` Greg Kroah-Hartman
  0 siblings, 2 replies; 35+ messages in thread
From: Yury Norov @ 2021-07-22 18:27 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Andy Shevchenko, Andy Shevchenko, Barry Song, Andrew Morton,
	Linux Kernel Mailing List, Dave Hansen, Rasmus Villemoes,
	Rafael J. Wysocki, Randy Dunlap, Alexander Gordeev,
	Stefano Brivio, Ma, Jianpeng, Valentin Schneider,
	Peter Zijlstra (Intel),
	Daniel Bristot de Oliveira, Guodong Xu, tangchengchang,
	Zengtao (B),
	yangyicong, tim.c.chen, Linuxarm

On Thu, Jul 22, 2021 at 07:47:28PM +0200, Greg Kroah-Hartman wrote:
> On Thu, Jul 22, 2021 at 10:09:27AM -0700, Yury Norov wrote:
> > On Wed, Jul 21, 2021 at 01:40:36PM +0200, Greg Kroah-Hartman wrote:
> > > On Thu, Jul 15, 2021 at 04:23:32PM -0700, Yury Norov wrote:
> > > > On Fri, Jul 16, 2021 at 12:32:45AM +0300, Andy Shevchenko wrote:
> > > > > On Thu, Jul 15, 2021 at 11:48 PM Yury Norov <yury.norov@gmail.com> wrote:
> > > > > > On Thu, Jul 15, 2021 at 03:09:39PM +0300, Andy Shevchenko wrote:
> > > > > > > On Thu, Jul 15, 2021 at 11:58:56PM +1200, Barry Song wrote:
> > > > > > > > The added test items cover both cases where bitmap buf of the printed
> > > > > > > > result is greater than and less than 4KB.
> > > > > > > > And it also covers the case where offset for bitmap_print_to_buf is
> > > > > > > > non-zero which will happen when printed buf is larger than one page
> > > > > > > > in sysfs bin_attribute.
> > > > > > >
> > > > > > > More test cases is always a good thing, thanks!
> > > > > >
> > > > > > Generally yes. But in this case... I believe, Barry didn't write that
> > > > > > huge line below by himself. Most probably he copy-pasted the output of
> > > > > > his bitmap_print_buf() into the test. If so, this code tests nothing,
> > > > > > and just enforces current behavior of snprintf.
> > > > > 
> > > > > I'm not sure I got what you are telling me. The big line is to test
> > > > > strings that are bigger than 4k.
> > > > 
> > > > I'm trying to say that human are not able to verify correctness of
> > > > this line. The test is supposed to check bitmap_print_to_buf(), but
> > > > reference output itself is generated by bitmap_print_to_buf(). This
> > > > test will always pass by design, even if there's an error somewhere
> > > > in the middle, isn't it?
> > > 
> > > Then please manually check it to verify it is correct or not.  Once we
> > > have it verified, that's fine, it will remain static in this test for
> > > always going forward.
> > > 
> > > That's what "oracles" are for, there is nothing wrong with this test
> > > case or "proof" that I can see.
> > > 
> > > > > 
> > > > > ...
> > > > > 
> > > > > > > > +static const char large_list[] __initconst = /* more than 4KB */
> > > > > > > > +   "0,4,8,12,16,20,24,28,32-33,36-37,40-41,44-45,48-49,52-53,56-57,60-61,64,68,72,76,80,84,88,92,96-97,100-101,104-1"
> > > > > > > > +   "05,108-109,112-113,116-117,120-121,124-125,128,132,136,140,144,148,152,156,160-161,164-165,168-169,172-173,176-1"
> > > > > > > > +   "77,180-181,184-185,188-189,192,196,200,204,208,212,216,220,224-225,228-229,232-233,236-237,240-241,244-245,248-2"
> > > > > >
> > > > > > I don't like this behavior of the code: each individual line is not a
> > > > > > valid bitmap_list. I would prefer to split original bitmap and print
> > > > > > list representation of parts in a compatible format; considering a
> > > > > > receiving part of this splitting machinery.
> > > > > 
> > > > > I agree that split is not the best here, but after all it's only 1
> > > > > line and this is on purpose.
> > > > 
> > > > What I see is that bitmap_print_to_buf() is called many times,
> > > 
> > > That is not what the above list shows at all, it's one long string all
> > > together, only split up to make it easier for us to work with.
> > > 
> > > > and
> > > > each time it returns something that is not a valid bitmap list string.
> > > > If the caller was be able to concatenate all the lines returned by
> > > > bitmap_print_to_buf(), he'd probably get correct result. But in such
> > > > case, why don't he use scnprintf("pbl") directly?
> > > 
> > > I do not understand the objection here at all.  This series is fixing a
> > > real problem that eeople are having
> > 
> > I explicitly asked about an example of this problem. Barry answered in
> > a great length, but the key points are:
> > 
> >         https://lore.kernel.org/lkml/4ab928f1fb3e4420974dfafe4b32f5b7@hisilicon.com/
> > 
> >         > > So, the root problem here is that some machines have so many CPUs that
> >         > > their cpumask text representation may not fit into the full page in the
> >         > > worst case. Is my understanding correct? Can you share an example of
> >         > > such configuration?
> >         > 
> >         > in my understanding, I have not found any machine which has really
> >         > caused the problem till now.
> > 
> >         > [...]
> >         > 
> >         > This doesn't really happen nowadays as the maximum
> >         > NR_CPUS is 8196 for X86_64 and 4096 for ARM64 since 8196 * 9 / 32 = 2305
> >         > is still smaller than 4KB page size.
> > 
> > 
> > If it's not true, can you or Barry please share such an example?
> 
> So for a 4k page size, if you have every-other-cpu-enabled on x86, it
> will overflow this, right?
> 
> And I have heard of systems much bigger than this as well.  Why do you
> not think that large number of CPUs are not around?

I asked a question: is it urgent?, and I've got an answer: not urgent.
 
> > > and your complaining about test
> > > strings is _VERY_ odd.
> > 
> > The test itself is bad, but it's a minor issue.
> > 
> > My main complain is that the bitmap part of this series introduces a
> > function that requires O(N^2) of CPU time and O(N) of memory to just
> > print a string. The existing snprintf does this in O(N) and O(1)
> > respectively. Additionally to that, the proposed function has some
> > flaws in design.
> 
> Can you propose a better solution?

Yes. Fix sysfs to let bin_attr store a pointer to relevant data.
Meanwhile, use this O(N^2) hack locally.

> And is O(N^2) even an issue for this?

If it's in lib/bitmap than yes, because it's exposed to the whole
kernel.

Thanks,
Yury

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v7 4/4] lib: test_bitmap: add bitmap_print_to_buf test cases
  2021-07-22 18:27                 ` Yury Norov
@ 2021-07-22 18:36                   ` Andy Shevchenko
  2021-07-22 18:45                   ` Greg Kroah-Hartman
  1 sibling, 0 replies; 35+ messages in thread
From: Andy Shevchenko @ 2021-07-22 18:36 UTC (permalink / raw)
  To: Yury Norov
  Cc: Greg Kroah-Hartman, Barry Song, Andrew Morton,
	Linux Kernel Mailing List, Dave Hansen, Rasmus Villemoes,
	Rafael J. Wysocki, Randy Dunlap, Alexander Gordeev,
	Stefano Brivio, Ma, Jianpeng, Valentin Schneider,
	Peter Zijlstra (Intel),
	Daniel Bristot de Oliveira, Guodong Xu, tangchengchang,
	Zengtao (B),
	yangyicong, tim.c.chen, Linuxarm

On Thu, Jul 22, 2021 at 11:27:05AM -0700, Yury Norov wrote:
> On Thu, Jul 22, 2021 at 07:47:28PM +0200, Greg Kroah-Hartman wrote:
> > On Thu, Jul 22, 2021 at 10:09:27AM -0700, Yury Norov wrote:

...

> > And I have heard of systems much bigger than this as well.  Why do you
> > not think that large number of CPUs are not around?
> 
> I asked a question: is it urgent?, and I've got an answer: not urgent.

Sometimes that kind of answers may include grain of politeness.

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v7 4/4] lib: test_bitmap: add bitmap_print_to_buf test cases
  2021-07-22 18:27                 ` Yury Norov
  2021-07-22 18:36                   ` Andy Shevchenko
@ 2021-07-22 18:45                   ` Greg Kroah-Hartman
  2021-07-28  9:08                     ` Song Bao Hua (Barry Song)
  1 sibling, 1 reply; 35+ messages in thread
From: Greg Kroah-Hartman @ 2021-07-22 18:45 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Andy Shevchenko, Barry Song, Andrew Morton,
	Linux Kernel Mailing List, Dave Hansen, Rasmus Villemoes,
	Rafael J. Wysocki, Randy Dunlap, Alexander Gordeev,
	Stefano Brivio, Ma, Jianpeng, Valentin Schneider,
	Peter Zijlstra (Intel),
	Daniel Bristot de Oliveira, Guodong Xu, tangchengchang,
	Zengtao (B),
	yangyicong, tim.c.chen, Linuxarm

On Thu, Jul 22, 2021 at 11:27:05AM -0700, Yury Norov wrote:
> On Thu, Jul 22, 2021 at 07:47:28PM +0200, Greg Kroah-Hartman wrote:
> > On Thu, Jul 22, 2021 at 10:09:27AM -0700, Yury Norov wrote:
> > > On Wed, Jul 21, 2021 at 01:40:36PM +0200, Greg Kroah-Hartman wrote:
> > > > On Thu, Jul 15, 2021 at 04:23:32PM -0700, Yury Norov wrote:
> > > > > On Fri, Jul 16, 2021 at 12:32:45AM +0300, Andy Shevchenko wrote:
> > > > > > On Thu, Jul 15, 2021 at 11:48 PM Yury Norov <yury.norov@gmail.com> wrote:
> > > > > > > On Thu, Jul 15, 2021 at 03:09:39PM +0300, Andy Shevchenko wrote:
> > > > > > > > On Thu, Jul 15, 2021 at 11:58:56PM +1200, Barry Song wrote:
> > > > > > > > > The added test items cover both cases where bitmap buf of the printed
> > > > > > > > > result is greater than and less than 4KB.
> > > > > > > > > And it also covers the case where offset for bitmap_print_to_buf is
> > > > > > > > > non-zero which will happen when printed buf is larger than one page
> > > > > > > > > in sysfs bin_attribute.
> > > > > > > >
> > > > > > > > More test cases is always a good thing, thanks!
> > > > > > >
> > > > > > > Generally yes. But in this case... I believe, Barry didn't write that
> > > > > > > huge line below by himself. Most probably he copy-pasted the output of
> > > > > > > his bitmap_print_buf() into the test. If so, this code tests nothing,
> > > > > > > and just enforces current behavior of snprintf.
> > > > > > 
> > > > > > I'm not sure I got what you are telling me. The big line is to test
> > > > > > strings that are bigger than 4k.
> > > > > 
> > > > > I'm trying to say that human are not able to verify correctness of
> > > > > this line. The test is supposed to check bitmap_print_to_buf(), but
> > > > > reference output itself is generated by bitmap_print_to_buf(). This
> > > > > test will always pass by design, even if there's an error somewhere
> > > > > in the middle, isn't it?
> > > > 
> > > > Then please manually check it to verify it is correct or not.  Once we
> > > > have it verified, that's fine, it will remain static in this test for
> > > > always going forward.
> > > > 
> > > > That's what "oracles" are for, there is nothing wrong with this test
> > > > case or "proof" that I can see.
> > > > 
> > > > > > 
> > > > > > ...
> > > > > > 
> > > > > > > > > +static const char large_list[] __initconst = /* more than 4KB */
> > > > > > > > > +   "0,4,8,12,16,20,24,28,32-33,36-37,40-41,44-45,48-49,52-53,56-57,60-61,64,68,72,76,80,84,88,92,96-97,100-101,104-1"
> > > > > > > > > +   "05,108-109,112-113,116-117,120-121,124-125,128,132,136,140,144,148,152,156,160-161,164-165,168-169,172-173,176-1"
> > > > > > > > > +   "77,180-181,184-185,188-189,192,196,200,204,208,212,216,220,224-225,228-229,232-233,236-237,240-241,244-245,248-2"
> > > > > > >
> > > > > > > I don't like this behavior of the code: each individual line is not a
> > > > > > > valid bitmap_list. I would prefer to split original bitmap and print
> > > > > > > list representation of parts in a compatible format; considering a
> > > > > > > receiving part of this splitting machinery.
> > > > > > 
> > > > > > I agree that split is not the best here, but after all it's only 1
> > > > > > line and this is on purpose.
> > > > > 
> > > > > What I see is that bitmap_print_to_buf() is called many times,
> > > > 
> > > > That is not what the above list shows at all, it's one long string all
> > > > together, only split up to make it easier for us to work with.
> > > > 
> > > > > and
> > > > > each time it returns something that is not a valid bitmap list string.
> > > > > If the caller was be able to concatenate all the lines returned by
> > > > > bitmap_print_to_buf(), he'd probably get correct result. But in such
> > > > > case, why don't he use scnprintf("pbl") directly?
> > > > 
> > > > I do not understand the objection here at all.  This series is fixing a
> > > > real problem that eeople are having
> > > 
> > > I explicitly asked about an example of this problem. Barry answered in
> > > a great length, but the key points are:
> > > 
> > >         https://lore.kernel.org/lkml/4ab928f1fb3e4420974dfafe4b32f5b7@hisilicon.com/
> > > 
> > >         > > So, the root problem here is that some machines have so many CPUs that
> > >         > > their cpumask text representation may not fit into the full page in the
> > >         > > worst case. Is my understanding correct? Can you share an example of
> > >         > > such configuration?
> > >         > 
> > >         > in my understanding, I have not found any machine which has really
> > >         > caused the problem till now.
> > > 
> > >         > [...]
> > >         > 
> > >         > This doesn't really happen nowadays as the maximum
> > >         > NR_CPUS is 8196 for X86_64 and 4096 for ARM64 since 8196 * 9 / 32 = 2305
> > >         > is still smaller than 4KB page size.
> > > 
> > > 
> > > If it's not true, can you or Barry please share such an example?
> > 
> > So for a 4k page size, if you have every-other-cpu-enabled on x86, it
> > will overflow this, right?
> > 
> > And I have heard of systems much bigger than this as well.  Why do you
> > not think that large number of CPUs are not around?
> 
> I asked a question: is it urgent?, and I've got an answer: not urgent.

Just because people can not publically state that they are running Linux
on bigger boxes than this, does NOT mean that they are not running Linux
on bigger boxes than this.

So sometimes, you just have to trust that if someone says "hey, this is
a problem over here", that maybe it really is a problem.

> > > > and your complaining about test
> > > > strings is _VERY_ odd.
> > > 
> > > The test itself is bad, but it's a minor issue.
> > > 
> > > My main complain is that the bitmap part of this series introduces a
> > > function that requires O(N^2) of CPU time and O(N) of memory to just
> > > print a string. The existing snprintf does this in O(N) and O(1)
> > > respectively. Additionally to that, the proposed function has some
> > > flaws in design.
> > 
> > Can you propose a better solution?
> 
> Yes. Fix sysfs to let bin_attr store a pointer to relevant data.

Why?  It changes all the time and should be generated dynamically.

> Meanwhile, use this O(N^2) hack locally.

Who else uses this that it matters?

> > And is O(N^2) even an issue for this?
> 
> If it's in lib/bitmap than yes, because it's exposed to the whole
> kernel.

Who else will use it that it matters for speed?

And did you measure the speed of this?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH v7 4/4] lib: test_bitmap: add bitmap_print_to_buf test cases
  2021-07-22 18:45                   ` Greg Kroah-Hartman
@ 2021-07-28  9:08                     ` Song Bao Hua (Barry Song)
  2021-07-28  9:24                       ` Andy Shevchenko
  0 siblings, 1 reply; 35+ messages in thread
From: Song Bao Hua (Barry Song) @ 2021-07-28  9:08 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Yury Norov
  Cc: Andy Shevchenko, Andy Shevchenko, Andrew Morton,
	Linux Kernel Mailing List, Dave Hansen, Rasmus Villemoes,
	Rafael J. Wysocki, Randy Dunlap, Alexander Gordeev,
	Stefano Brivio, Ma, Jianpeng, Valentin Schneider,
	Peter Zijlstra (Intel),
	Daniel Bristot de Oliveira, Guodong Xu, tangchengchang,
	Zengtao (B),
	yangyicong, tim.c.chen, Linuxarm



> -----Original Message-----
> From: Greg Kroah-Hartman [mailto:gregkh@linuxfoundation.org]
> Sent: Friday, July 23, 2021 6:45 AM
> To: Yury Norov <yury.norov@gmail.com>
> Cc: Andy Shevchenko <andy.shevchenko@gmail.com>; Andy Shevchenko
> <andriy.shevchenko@linux.intel.com>; Song Bao Hua (Barry Song)
> <song.bao.hua@hisilicon.com>; Andrew Morton <akpm@linux-foundation.org>;
> Linux Kernel Mailing List <linux-kernel@vger.kernel.org>; Dave Hansen
> <dave.hansen@intel.com>; Rasmus Villemoes <linux@rasmusvillemoes.dk>; Rafael
> J. Wysocki <rafael@kernel.org>; Randy Dunlap <rdunlap@infradead.org>;
> Alexander Gordeev <agordeev@linux.ibm.com>; Stefano Brivio
> <sbrivio@redhat.com>; Ma, Jianpeng <jianpeng.ma@intel.com>; Valentin
> Schneider <valentin.schneider@arm.com>; Peter Zijlstra (Intel)
> <peterz@infradead.org>; Daniel Bristot de Oliveira <bristot@redhat.com>;
> Guodong Xu <guodong.xu@linaro.org>; tangchengchang
> <tangchengchang@huawei.com>; Zengtao (B) <prime.zeng@hisilicon.com>;
> yangyicong <yangyicong@huawei.com>; tim.c.chen@linux.intel.com; Linuxarm
> <linuxarm@huawei.com>
> Subject: Re: [PATCH v7 4/4] lib: test_bitmap: add bitmap_print_to_buf test cases
> 
> On Thu, Jul 22, 2021 at 11:27:05AM -0700, Yury Norov wrote:
> > On Thu, Jul 22, 2021 at 07:47:28PM +0200, Greg Kroah-Hartman wrote:
> > > On Thu, Jul 22, 2021 at 10:09:27AM -0700, Yury Norov wrote:
> > > > On Wed, Jul 21, 2021 at 01:40:36PM +0200, Greg Kroah-Hartman wrote:
> > > > > On Thu, Jul 15, 2021 at 04:23:32PM -0700, Yury Norov wrote:
> > > > > > On Fri, Jul 16, 2021 at 12:32:45AM +0300, Andy Shevchenko wrote:
> > > > > > > On Thu, Jul 15, 2021 at 11:48 PM Yury Norov <yury.norov@gmail.com>
> wrote:
> > > > > > > > On Thu, Jul 15, 2021 at 03:09:39PM +0300, Andy Shevchenko wrote:
> > > > > > > > > On Thu, Jul 15, 2021 at 11:58:56PM +1200, Barry Song wrote:
> > > > > > > > > > The added test items cover both cases where bitmap buf of
> the printed
> > > > > > > > > > result is greater than and less than 4KB.
> > > > > > > > > > And it also covers the case where offset for bitmap_print_to_buf
> is
> > > > > > > > > > non-zero which will happen when printed buf is larger than
> one page
> > > > > > > > > > in sysfs bin_attribute.
> > > > > > > > >
> > > > > > > > > More test cases is always a good thing, thanks!
> > > > > > > >
> > > > > > > > Generally yes. But in this case... I believe, Barry didn't write
> that
> > > > > > > > huge line below by himself. Most probably he copy-pasted the output
> of
> > > > > > > > his bitmap_print_buf() into the test. If so, this code tests nothing,
> > > > > > > > and just enforces current behavior of snprintf.
> > > > > > >
> > > > > > > I'm not sure I got what you are telling me. The big line is to test
> > > > > > > strings that are bigger than 4k.
> > > > > >
> > > > > > I'm trying to say that human are not able to verify correctness of
> > > > > > this line. The test is supposed to check bitmap_print_to_buf(), but
> > > > > > reference output itself is generated by bitmap_print_to_buf(). This
> > > > > > test will always pass by design, even if there's an error somewhere
> > > > > > in the middle, isn't it?
> > > > >
> > > > > Then please manually check it to verify it is correct or not.  Once we
> > > > > have it verified, that's fine, it will remain static in this test for
> > > > > always going forward.
> > > > >
> > > > > That's what "oracles" are for, there is nothing wrong with this test
> > > > > case or "proof" that I can see.
> > > > >

I have manually verified the test case from multiple 
different sides. The purpose is verifying large_bitmap,
large_mask and large_list in the test case are consistent.

What I have done includes:
1. snprintf(%*pbl) from large_bitmap to an intermediate buffer - vbf, 
and strcmp(vbf, large_list)
2. snprintf(%*pb) from large_bitmap to an intermediate buffer - vbf, 
and strcmp(vbf, large_mask)
3. bitmap_parselist from large_list to an intermediate bitmap - vb, 
and bitmap_equal(vb, large_bitmap)
4. bitmap_parse from large_mask to an intermediate bitmap - vb, 
and bitmap_equal(vb, large_bitmap)

diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index eb8ebaf12865..3efedc86a1b9 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -781,10 +781,45 @@ static const struct test_bitmap_print
test_print[] __initconst = {
        { large_bitmap, sizeof(large_bitmap) * BITS_PER_BYTE, large_mask, large_list },
 };

+#define NEW_API_VERIFIED
+
+#ifdef  NEW_API_VERIFIED
+#define VBF_SIZE (2 * PAGE_SIZE)
+static char vbf[VBF_SIZE];
+static int vb_bits = sizeof(large_bitmap) * BITS_PER_BYTE;
+DECLARE_BITMAP(vbmap, sizeof(large_bitmap) * BITS_PER_BYTE);
+#endif
+
 static void __init test_bitmap_print_buf(void)
 {
        int i;

+#ifdef NEW_API_VERIFIED
+               snprintf(vbf, VBF_SIZE, "%*pbl\n", vb_bits, large_bitmap);
+               if (strcmp(vbf, large_list))
+                       printk("%s WRONG large bitmap list, print pbl verified\n", __func__);
+               else
+                       printk("%s CORRECT large bitmap list, print pbl verified\n", __func__);
+
+               snprintf(vbf, VBF_SIZE, "%*pb\n", vb_bits, large_bitmap);
+               if (strcmp(vbf, large_mask))
+                       printk("%s WRONG large bitmap mask, print pb verified\n", __func__);
+               else
+                       printk("%s CORRECT large bitmap mask, print pb verified\n", __func__);
+
+               bitmap_parselist(large_list, vbmap, vb_bits);
+               if (bitmap_equal(vbmap, large_bitmap, vb_bits))
+                       printk("%s CORRECT large bitmap mask, parselist verified\n", __func__);
+               else
+                       printk("%s WRONG large bitmap mask, parselist verified\n", __func__);
+
+               bitmap_parse(large_mask, sizeof(large_mask), vbmap, vb_bits);
+               if (bitmap_equal(vbmap, large_bitmap, vb_bits))
+                       printk("%s CORRECT large bitmap mask, parselist verified\n", __func__);
+               else
+                       printk("%s WRONG large bitmap mask, parselist verified\n", __func__);
+#endif
+
        for (i = 0; i < ARRAY_SIZE(test_print); i++) {
                const struct test_bitmap_print *t = &test_print[i];
                int n;

Those cases are all good:
[    1.490355] test_bitmap: loaded.
[    1.494449] test_bitmap: parselist: 14: input is '0-2047:128/256'
OK, Time: 8384
[    1.507611] test_bitmap_print_buf CORRECT large bitmap list, print pbl verified
[    1.508415] test_bitmap_print_buf CORRECT large bitmap mask, print pb verified
[    1.510337] test_bitmap_print_buf CORRECT large bitmap mask, parselist verified
[    1.510770] test_bitmap_print_buf CORRECT large bitmap mask, parse verified
[    1.512833] test_bitmap: all 1945 tests passed

> > > > > > >
> > > > > > > ...
> > > > > > >
> > > > > > > > > > +static const char large_list[] __initconst = /* more than
> 4KB */
> > > > > > > > > > +
> "0,4,8,12,16,20,24,28,32-33,36-37,40-41,44-45,48-49,52-53,56-57,60-61,64,6
> 8,72,76,80,84,88,92,96-97,100-101,104-1"
> > > > > > > > > > +
> "05,108-109,112-113,116-117,120-121,124-125,128,132,136,140,144,148,152,15
> 6,160-161,164-165,168-169,172-173,176-1"
> > > > > > > > > > +
> "77,180-181,184-185,188-189,192,196,200,204,208,212,216,220,224-225,228-22
> 9,232-233,236-237,240-241,244-245,248-2"
> > > > > > > >
> > > > > > > > I don't like this behavior of the code: each individual line is
> not a
> > > > > > > > valid bitmap_list. I would prefer to split original bitmap and
> print
> > > > > > > > list representation of parts in a compatible format; considering
> a
> > > > > > > > receiving part of this splitting machinery.
> > > > > > >
> > > > > > > I agree that split is not the best here, but after all it's only
> 1
> > > > > > > line and this is on purpose.
> > > > > >
> > > > > > What I see is that bitmap_print_to_buf() is called many times,
> > > > >
> > > > > That is not what the above list shows at all, it's one long string all
> > > > > together, only split up to make it easier for us to work with.
> > > > >
> > > > > > and
> > > > > > each time it returns something that is not a valid bitmap list string.
> > > > > > If the caller was be able to concatenate all the lines returned by
> > > > > > bitmap_print_to_buf(), he'd probably get correct result. But in such
> > > > > > case, why don't he use scnprintf("pbl") directly?
> > > > >
> > > > > I do not understand the objection here at all.  This series is fixing
> a
> > > > > real problem that eeople are having
> > > >
> > > > I explicitly asked about an example of this problem. Barry answered in
> > > > a great length, but the key points are:
> > > >
> > > >
> https://lore.kernel.org/lkml/4ab928f1fb3e4420974dfafe4b32f5b7@hisilicon.co
> m/
> > > >
> > > >         > > So, the root problem here is that some machines have so many
> CPUs that
> > > >         > > their cpumask text representation may not fit into the full
> page in the
> > > >         > > worst case. Is my understanding correct? Can you share an example
> of
> > > >         > > such configuration?
> > > >         >
> > > >         > in my understanding, I have not found any machine which has really
> > > >         > caused the problem till now.
> > > >
> > > >         > [...]
> > > >         >
> > > >         > This doesn't really happen nowadays as the maximum
> > > >         > NR_CPUS is 8196 for X86_64 and 4096 for ARM64 since 8196 * 9 /
> 32 = 2305
> > > >         > is still smaller than 4KB page size.
> > > >
> > > >
> > > > If it's not true, can you or Barry please share such an example?
> > >
> > > So for a 4k page size, if you have every-other-cpu-enabled on x86, it
> > > will overflow this, right?
> > >
> > > And I have heard of systems much bigger than this as well.  Why do you
> > > not think that large number of CPUs are not around?
> >
> > I asked a question: is it urgent?, and I've got an answer: not urgent.
> 
> Just because people can not publically state that they are running Linux
> on bigger boxes than this, does NOT mean that they are not running Linux
> on bigger boxes than this.
> 
> So sometimes, you just have to trust that if someone says "hey, this is
> a problem over here", that maybe it really is a problem.
> 
> > > > > and your complaining about test
> > > > > strings is _VERY_ odd.
> > > >
> > > > The test itself is bad, but it's a minor issue.
> > > >
> > > > My main complain is that the bitmap part of this series introduces a
> > > > function that requires O(N^2) of CPU time and O(N) of memory to just
> > > > print a string. The existing snprintf does this in O(N) and O(1)
> > > > respectively. Additionally to that, the proposed function has some
> > > > flaws in design.
> > >
> > > Can you propose a better solution?
> >
> > Yes. Fix sysfs to let bin_attr store a pointer to relevant data.
> 
> Why?  It changes all the time and should be generated dynamically.
> 
> > Meanwhile, use this O(N^2) hack locally.
> 
> Who else uses this that it matters?
> 
> > > And is O(N^2) even an issue for this?
> >
> > If it's in lib/bitmap than yes, because it's exposed to the whole
> > kernel.
> 
> Who else will use it that it matters for speed?
> 

My understanding is that nobody else will print bitmap to a human-readable
mask or list except things like sysfs ABI for userspace. So I have been
arguing performance wouldn't be a concern here.
Those kernel modules who care about performance will directly run bit
operations like and/or/andnot etc on bitmap binaries rather than on a
list or a mask.

On the other hand, Yury's comment on providing a bitmap_max_string(),
which can estimate the max size of the mask or the list according to
bitmap size, might be able to help set a relatively more precise size
for the ABI sysfs file if people care about the file size of the sysfs
entry.

int bitmap_max_string_mask(int nbits)
{
	/* each 32bits need 9 bytes like "ffffffff,"
	return DIV_ROUND_UP(nbits, 32) * 9;
}
int bitmap_max_string_list(int nbits)
{
	...
}

Perhaps this could be an incremental patch after the current patchset
settle down.
Though I'm not quite sure it can really apply to bin_attribute, maybe
we can set the ret value to bin_attribute->bsize in some way?
But, not quite sure bin_attribute->bsize can use a dynamic value since
the size is needed while sysfs file is created:
int sysfs_add_file_mode_ns(struct kernfs_node *parent,
			   const struct attribute *attr, bool is_bin,
			   umode_t mode, kuid_t uid, kgid_t gid, const void *ns)
{
	struct lock_class_key *key = NULL;
	const struct kernfs_ops *ops;
	struct kernfs_node *kn;
	loff_t size;

	if (!is_bin) {
		...

		size = PAGE_SIZE;
	} else {
		struct bin_attribute *battr = (void *)attr;

		if (battr->mmap)
			ops = &sysfs_bin_kfops_mmap;
		...

		size = battr->size;
	}


	kn = __kernfs_create_file(parent, attr->name, mode & 0777, uid, gid,
				  size, ops, (void *)attr, ns, key);
	...
	return 0;
}

> And did you measure the speed of this?
> 
> thanks,
> 
> greg k-h

Thanks
Barry

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH v7 4/4] lib: test_bitmap: add bitmap_print_to_buf test cases
  2021-07-28  9:08                     ` Song Bao Hua (Barry Song)
@ 2021-07-28  9:24                       ` Andy Shevchenko
  0 siblings, 0 replies; 35+ messages in thread
From: Andy Shevchenko @ 2021-07-28  9:24 UTC (permalink / raw)
  To: Song Bao Hua (Barry Song)
  Cc: Greg Kroah-Hartman, Yury Norov, Andy Shevchenko, Andrew Morton,
	Linux Kernel Mailing List, Dave Hansen, Rasmus Villemoes,
	Rafael J. Wysocki, Randy Dunlap, Alexander Gordeev,
	Stefano Brivio, Ma, Jianpeng, Valentin Schneider,
	Peter Zijlstra (Intel),
	Daniel Bristot de Oliveira, Guodong Xu, tangchengchang,
	Zengtao (B),
	yangyicong, tim.c.chen, Linuxarm

On Wed, Jul 28, 2021 at 12:08 PM Song Bao Hua (Barry Song)
<song.bao.hua@hisilicon.com> wrote:
> > From: Greg Kroah-Hartman [mailto:gregkh@linuxfoundation.org]
> > Sent: Friday, July 23, 2021 6:45 AM

...

> +static int vb_bits = sizeof(large_bitmap) * BITS_PER_BYTE;
> +DECLARE_BITMAP(vbmap, sizeof(large_bitmap) * BITS_PER_BYTE);

Just side note: we have BITS_PER_TYPE() macro :-)

-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v7 0/4] use bin_attribute to break the size limitation of cpumap ABI
  2021-07-15 11:58 [PATCH v7 0/4] use bin_attribute to break the size limitation of cpumap ABI Barry Song
                   ` (3 preceding siblings ...)
  2021-07-15 11:58 ` [PATCH v7 4/4] lib: test_bitmap: add bitmap_print_to_buf test cases Barry Song
@ 2021-07-28 13:41 ` Greg KH
  2021-07-28 14:53   ` Yury Norov
  2021-07-28 18:58 ` [PATCH] bitmap: extend comment to bitmap_print_to_buf Yury Norov
  5 siblings, 1 reply; 35+ messages in thread
From: Greg KH @ 2021-07-28 13:41 UTC (permalink / raw)
  To: Barry Song
  Cc: akpm, andriy.shevchenko, yury.norov, linux-kernel, dave.hansen,
	linux, rafael, rdunlap, agordeev, sbrivio, jianpeng.ma,
	valentin.schneider, peterz, bristot, guodong.xu, tangchengchang,
	prime.zeng, yangyicong, tim.c.chen, linuxarm

On Thu, Jul 15, 2021 at 11:58:52PM +1200, Barry Song wrote:
> v7:
>   - update doc in code for new APIs according to the comments of
>     Andy Shevchenko;
>   - other minor cleanup and commit log fix according to the comments
>     of Andy Shevchenko

I'm lost to tell if this is the latest version or if there are more
changes?  Can you send this again with the latest changes so I can
review it?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v7 0/4] use bin_attribute to break the size limitation of cpumap ABI
  2021-07-28 13:41 ` [PATCH v7 0/4] use bin_attribute to break the size limitation of cpumap ABI Greg KH
@ 2021-07-28 14:53   ` Yury Norov
  2021-07-28 15:25     ` Greg KH
  0 siblings, 1 reply; 35+ messages in thread
From: Yury Norov @ 2021-07-28 14:53 UTC (permalink / raw)
  To: Greg KH
  Cc: Barry Song, akpm, andriy.shevchenko, linux-kernel, dave.hansen,
	linux, rafael, rdunlap, agordeev, sbrivio, jianpeng.ma,
	valentin.schneider, peterz, bristot, guodong.xu, tangchengchang,
	prime.zeng, yangyicong, tim.c.chen, linuxarm

On Wed, Jul 28, 2021 at 03:41:00PM +0200, Greg KH wrote:
> On Thu, Jul 15, 2021 at 11:58:52PM +1200, Barry Song wrote:
> > v7:
> >   - update doc in code for new APIs according to the comments of
> >     Andy Shevchenko;
> >   - other minor cleanup and commit log fix according to the comments
> >     of Andy Shevchenko
> 
> I'm lost to tell if this is the latest version or if there are more
> changes?  Can you send this again with the latest changes so I can
> review it?

Barry, Greg,

If you decide to keep bitmap_print_to_buf in lib/bitmap.c, could you
please add the following patch to the series.

Thanks,
Yury


From 58602766dc2877d2103a334db6c2c2e1e6b8c89b Mon Sep 17 00:00:00 2001
From: Yury Norov <yury.norov@gmail.com>
Date: Wed, 28 Jul 2021 07:39:30 -0700
Subject: [PATCH] bitmap: extend comment to bitmap_print_to_buf

Extend comment to new function to warn potential users about caveats.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 lib/bitmap.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/lib/bitmap.c b/lib/bitmap.c
index 56bcffe2fa8c..b9f557ca668c 100644
--- a/lib/bitmap.c
+++ b/lib/bitmap.c
@@ -545,6 +545,24 @@ EXPORT_SYMBOL(bitmap_print_to_pagebuf);
  * mainly serves bin_attribute which doesn't work with exact one page, and it
  * can break the size limit of converted decimal list and hexadecimal bitmask.
  *
+ * WARNING!
+ *
+ * This function is not a replacement for sprintf() or bitmap_print_to_pagebuf().
+ * It is intended to workaround sysfs limitations discussed above and should be
+ * used carefully in general case for the following reasons:
+ *  - Time complexity is O(nbits^2/count), comparing to O(nbits) for snprintf().
+ *  - Memory complexity is O(nbits), comparing to O(1) for snprintf().
+ *  - @off and @count are NOT offset and number of bits to print.
+ *  - If printing part of bitmap as list, the resulting string is not a correct
+ *    list representation of bitmap. Particularly, some bits within or out of
+ *    related interval may be erroneously set or unset. The format of the string
+ *    may be broken, so bitmap_parselist() may fail parsing it.
+ *  - If printing the whole bitmap as list by parts, user must ensure the order
+ *    of calls of the function such that the offset is incremented linearly.
+ *  - If printing the whole bitmap as list by parts, user must keep bitmap
+ *    unchanged between the very first and very last call. Otherwise concatenated
+ *    result may be incorrect, and format may be broken.
+ *
  * Returns the number of characters actually printed to @buf
  */
 int bitmap_print_to_buf(bool list, char *buf, const unsigned long *maskp,
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH v7 0/4] use bin_attribute to break the size limitation of cpumap ABI
  2021-07-28 14:53   ` Yury Norov
@ 2021-07-28 15:25     ` Greg KH
  0 siblings, 0 replies; 35+ messages in thread
From: Greg KH @ 2021-07-28 15:25 UTC (permalink / raw)
  To: Yury Norov
  Cc: Barry Song, akpm, andriy.shevchenko, linux-kernel, dave.hansen,
	linux, rafael, rdunlap, agordeev, sbrivio, jianpeng.ma,
	valentin.schneider, peterz, bristot, guodong.xu, tangchengchang,
	prime.zeng, yangyicong, tim.c.chen, linuxarm

On Wed, Jul 28, 2021 at 07:53:59AM -0700, Yury Norov wrote:
> On Wed, Jul 28, 2021 at 03:41:00PM +0200, Greg KH wrote:
> > On Thu, Jul 15, 2021 at 11:58:52PM +1200, Barry Song wrote:
> > > v7:
> > >   - update doc in code for new APIs according to the comments of
> > >     Andy Shevchenko;
> > >   - other minor cleanup and commit log fix according to the comments
> > >     of Andy Shevchenko
> > 
> > I'm lost to tell if this is the latest version or if there are more
> > changes?  Can you send this again with the latest changes so I can
> > review it?
> 
> Barry, Greg,
> 
> If you decide to keep bitmap_print_to_buf in lib/bitmap.c, could you
> please add the following patch to the series.

Feel free to submit this as an add-on patch in the proper way so that it
could be applied.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH] bitmap: extend comment to bitmap_print_to_buf
  2021-07-15 11:58 [PATCH v7 0/4] use bin_attribute to break the size limitation of cpumap ABI Barry Song
                   ` (4 preceding siblings ...)
  2021-07-28 13:41 ` [PATCH v7 0/4] use bin_attribute to break the size limitation of cpumap ABI Greg KH
@ 2021-07-28 18:58 ` Yury Norov
  2021-07-29  6:04   ` Song Bao Hua (Barry Song)
  5 siblings, 1 reply; 35+ messages in thread
From: Yury Norov @ 2021-07-28 18:58 UTC (permalink / raw)
  To: linux-kernel, gregkh, Barry Song
  Cc: Yury Norov, dave.hansen, linux, rafael, rdunlap, agordeev,
	sbrivio, jianpeng.ma, valentin.schneider, peterz, bristot,
	guodong.xu, tangchengchang, prime.zeng, yangyicong, tim.c.chen,
	linuxarm, akpm, andriy.shevchenko

Extend comment to new function to warn potential users about caveats.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 lib/bitmap.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/lib/bitmap.c b/lib/bitmap.c
index 56bcffe2fa8c..b9f557ca668c 100644
--- a/lib/bitmap.c
+++ b/lib/bitmap.c
@@ -545,6 +545,24 @@ EXPORT_SYMBOL(bitmap_print_to_pagebuf);
  * mainly serves bin_attribute which doesn't work with exact one page, and it
  * can break the size limit of converted decimal list and hexadecimal bitmask.
  *
+ * WARNING!
+ *
+ * This function is not a replacement for sprintf() or bitmap_print_to_pagebuf().
+ * It is intended to workaround sysfs limitations discussed above and should be
+ * used carefully in general case for the following reasons:
+ *  - Time complexity is O(nbits^2/count), comparing to O(nbits) for snprintf().
+ *  - Memory complexity is O(nbits), comparing to O(1) for snprintf().
+ *  - @off and @count are NOT offset and number of bits to print.
+ *  - If printing part of bitmap as list, the resulting string is not a correct
+ *    list representation of bitmap. Particularly, some bits within or out of
+ *    related interval may be erroneously set or unset. The format of the string
+ *    may be broken, so bitmap_parselist-like parser may fail parsing it.
+ *  - If printing the whole bitmap as list by parts, user must ensure the order
+ *    of calls of the function such that the offset is incremented linearly.
+ *  - If printing the whole bitmap as list by parts, user must keep bitmap
+ *    unchanged between the very first and very last call. Otherwise concatenated
+ *    result may be incorrect, and format may be broken.
+ *
  * Returns the number of characters actually printed to @buf
  */
 int bitmap_print_to_buf(bool list, char *buf, const unsigned long *maskp,
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* RE: [PATCH] bitmap: extend comment to bitmap_print_to_buf
  2021-07-28 18:58 ` [PATCH] bitmap: extend comment to bitmap_print_to_buf Yury Norov
@ 2021-07-29  6:04   ` Song Bao Hua (Barry Song)
  0 siblings, 0 replies; 35+ messages in thread
From: Song Bao Hua (Barry Song) @ 2021-07-29  6:04 UTC (permalink / raw)
  To: Yury Norov, linux-kernel, gregkh
  Cc: dave.hansen, linux, rafael, rdunlap, agordeev, sbrivio,
	jianpeng.ma, valentin.schneider, peterz, bristot, guodong.xu,
	tangchengchang, Zengtao (B),
	yangyicong, tim.c.chen, Linuxarm, akpm, andriy.shevchenko



> -----Original Message-----
> From: Yury Norov [mailto:yury.norov@gmail.com]
> Sent: Thursday, July 29, 2021 6:59 AM
> To: linux-kernel@vger.kernel.org; gregkh@linuxfoundation.org; Song Bao Hua
> (Barry Song) <song.bao.hua@hisilicon.com>
> Cc: Yury Norov <yury.norov@gmail.com>; dave.hansen@intel.com;
> linux@rasmusvillemoes.dk; rafael@kernel.org; rdunlap@infradead.org;
> agordeev@linux.ibm.com; sbrivio@redhat.com; jianpeng.ma@intel.com;
> valentin.schneider@arm.com; peterz@infradead.org; bristot@redhat.com;
> guodong.xu@linaro.org; tangchengchang <tangchengchang@huawei.com>; Zengtao (B)
> <prime.zeng@hisilicon.com>; yangyicong <yangyicong@huawei.com>;
> tim.c.chen@linux.intel.com; Linuxarm <linuxarm@huawei.com>;
> akpm@linux-foundation.org; andriy.shevchenko@linux.intel.com
> Subject: [PATCH] bitmap: extend comment to bitmap_print_to_buf
> 
> Extend comment to new function to warn potential users about caveats.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---

Looks awesome. Thanks, Yury. I have integrated your patch into
the latest series v8.

>  lib/bitmap.c | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
> 
> diff --git a/lib/bitmap.c b/lib/bitmap.c
> index 56bcffe2fa8c..b9f557ca668c 100644
> --- a/lib/bitmap.c
> +++ b/lib/bitmap.c
> @@ -545,6 +545,24 @@ EXPORT_SYMBOL(bitmap_print_to_pagebuf);
>   * mainly serves bin_attribute which doesn't work with exact one page, and it
>   * can break the size limit of converted decimal list and hexadecimal bitmask.
>   *
> + * WARNING!
> + *
> + * This function is not a replacement for sprintf() or
> bitmap_print_to_pagebuf().
> + * It is intended to workaround sysfs limitations discussed above and should
> be
> + * used carefully in general case for the following reasons:
> + *  - Time complexity is O(nbits^2/count), comparing to O(nbits) for snprintf().
> + *  - Memory complexity is O(nbits), comparing to O(1) for snprintf().
> + *  - @off and @count are NOT offset and number of bits to print.
> + *  - If printing part of bitmap as list, the resulting string is not a correct
> + *    list representation of bitmap. Particularly, some bits within or out of
> + *    related interval may be erroneously set or unset. The format of the string
> + *    may be broken, so bitmap_parselist-like parser may fail parsing it.
> + *  - If printing the whole bitmap as list by parts, user must ensure the order
> + *    of calls of the function such that the offset is incremented linearly.
> + *  - If printing the whole bitmap as list by parts, user must keep bitmap
> + *    unchanged between the very first and very last call. Otherwise concatenated
> + *    result may be incorrect, and format may be broken.
> + *
>   * Returns the number of characters actually printed to @buf
>   */
>  int bitmap_print_to_buf(bool list, char *buf, const unsigned long *maskp,
> --
> 2.30.2

Thanks
Barry


^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2021-07-29  6:04 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-15 11:58 [PATCH v7 0/4] use bin_attribute to break the size limitation of cpumap ABI Barry Song
2021-07-15 11:58 ` [PATCH v7 1/4] cpumask: introduce cpumap_print_to_buf to support large bitmask and list Barry Song
2021-07-15 15:28   ` Yury Norov
2021-07-15 21:08     ` Song Bao Hua (Barry Song)
2021-07-16  0:57     ` Song Bao Hua (Barry Song)
2021-07-15 11:58 ` [PATCH v7 2/4] topology: use bin_attribute to break the size limitation of cpumap ABI Barry Song
2021-07-16  8:49   ` Song Bao Hua (Barry Song)
2021-07-16 20:04     ` Yury Norov
2021-07-17  0:16       ` Song Bao Hua (Barry Song)
2021-07-17  1:12         ` Yury Norov
2021-07-19  9:07           ` andriy.shevchenko
2021-07-19 11:10             ` Song Bao Hua (Barry Song)
2021-07-19 17:10               ` Yury Norov
2021-07-21  9:30                 ` Song Bao Hua (Barry Song)
2021-07-15 11:58 ` [PATCH v7 3/4] drivers/base/node.c: " Barry Song
2021-07-15 11:58 ` [PATCH v7 4/4] lib: test_bitmap: add bitmap_print_to_buf test cases Barry Song
2021-07-15 12:09   ` Andy Shevchenko
2021-07-15 20:47     ` Yury Norov
2021-07-15 21:32       ` Andy Shevchenko
2021-07-15 23:23         ` Yury Norov
2021-07-16  0:41           ` Song Bao Hua (Barry Song)
2021-07-21 11:40           ` Greg Kroah-Hartman
2021-07-22 17:09             ` Yury Norov
2021-07-22 17:47               ` Greg Kroah-Hartman
2021-07-22 18:27                 ` Yury Norov
2021-07-22 18:36                   ` Andy Shevchenko
2021-07-22 18:45                   ` Greg Kroah-Hartman
2021-07-28  9:08                     ` Song Bao Hua (Barry Song)
2021-07-28  9:24                       ` Andy Shevchenko
2021-07-15 15:36   ` Yury Norov
2021-07-28 13:41 ` [PATCH v7 0/4] use bin_attribute to break the size limitation of cpumap ABI Greg KH
2021-07-28 14:53   ` Yury Norov
2021-07-28 15:25     ` Greg KH
2021-07-28 18:58 ` [PATCH] bitmap: extend comment to bitmap_print_to_buf Yury Norov
2021-07-29  6:04   ` Song Bao Hua (Barry Song)

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).