All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] workqueue: reduce the sizeof pool_workqueue
@ 2020-06-01  8:44 Lai Jiangshan
  2020-06-01  8:44 ` [PATCH 1/4] workqueue: fix a piece of comment about reserved bits for work flags Lai Jiangshan
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Lai Jiangshan @ 2020-06-01  8:44 UTC (permalink / raw)
  To: linux-kernel; +Cc: Lai Jiangshan, Tejun Heo

The major memory ussage in workqueue is on the pool_workqueue.
The pool_workqueue has alignment requirement which often leads
to padding.

Reducing the memory usage for the pool_workqueue is valuable.

And 32bit system often has less memory, less workqueues,
less works, less concurrent flush_workqueue()s, so we can
slash the flush color on 32bit system to reduce memory usage

Before patch:
The sizeof the struct pool_workqueue is 256 bytes,
only 136 bytes is in use in 32bit system

After patch:
The sizeof the struct pool_workqueue is 128 bytes,
only 104 bytes is in use in 32bit system, there is still
room for future usage.

Setting WORK_STRUCT_COLOR_BITS to 3 can't reduce the sizeof
the struct pool_workqueue in 64bit system, unless combined
with big refactor for unbound pwq.

Lai Jiangshan (4):
  workqueue: fix a piece of comment about reserved bits for work flags
  workqueue: use BUILD_BUG_ON() for compile time test instead of
    WARN_ON()
  workqueue: add a BUILD_BUG_ON() to check the size of struct
    pool_workqueue
  workqueue: slash half memory usage in 32bit system

Cc: Tejun Heo <tj@kernel.org>

 include/linux/workqueue.h |  8 +++++++-
 kernel/workqueue.c        | 10 +++++++++-
 2 files changed, 16 insertions(+), 2 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/4] workqueue: fix a piece of comment about reserved bits for work flags
  2020-06-01  8:44 [PATCH 0/4] workqueue: reduce the sizeof pool_workqueue Lai Jiangshan
@ 2020-06-01  8:44 ` Lai Jiangshan
  2020-06-01  8:44 ` [PATCH 2/4] workqueue: use BUILD_BUG_ON() for compile time test instead of WARN_ON() Lai Jiangshan
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 9+ messages in thread
From: Lai Jiangshan @ 2020-06-01  8:44 UTC (permalink / raw)
  To: linux-kernel; +Cc: Lai Jiangshan, Tejun Heo, Lai Jiangshan

8a2e8e5dec7e("workqueue: fix cwq->nr_active underflow")
allocated one more bit from the work flags, and it updated
partial of the comments (128 bytes -> 256 bytes), but it
failed to update the info about the number of reserved bits.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 include/linux/workqueue.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 8b505d22fc0e..26de0cae2a0a 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -62,7 +62,7 @@ enum {
 	WORK_CPU_UNBOUND	= NR_CPUS,
 
 	/*
-	 * Reserve 7 bits off of pwq pointer w/ debugobjects turned off.
+	 * Reserve 8 bits off of pwq pointer w/ debugobjects turned off.
 	 * This makes pwqs aligned to 256 bytes and allows 15 workqueue
 	 * flush colors.
 	 */
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/4] workqueue: use BUILD_BUG_ON() for compile time test instead of WARN_ON()
  2020-06-01  8:44 [PATCH 0/4] workqueue: reduce the sizeof pool_workqueue Lai Jiangshan
  2020-06-01  8:44 ` [PATCH 1/4] workqueue: fix a piece of comment about reserved bits for work flags Lai Jiangshan
@ 2020-06-01  8:44 ` Lai Jiangshan
  2020-06-01 15:08   ` Tejun Heo
  2020-06-01  8:44 ` [PATCH 3/4] workqueue: add a BUILD_BUG_ON() to check the size of struct pool_workqueue Lai Jiangshan
  2020-06-01  8:44 ` [PATCH 4/4] workqueue: slash half memory usage in 32bit system Lai Jiangshan
  3 siblings, 1 reply; 9+ messages in thread
From: Lai Jiangshan @ 2020-06-01  8:44 UTC (permalink / raw)
  To: linux-kernel; +Cc: Lai Jiangshan, Tejun Heo, Lai Jiangshan

Any runtime WARN_ON() has to be fixed, and BUILD_BUG_ON() can
help you nitice it earlier.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 kernel/workqueue.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 7a1fc9fe6314..35120b909234 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -5905,7 +5905,7 @@ void __init workqueue_init_early(void)
 	int hk_flags = HK_FLAG_DOMAIN | HK_FLAG_WQ;
 	int i, cpu;
 
-	WARN_ON(__alignof__(struct pool_workqueue) < __alignof__(long long));
+	BUILD_BUG_ON(__alignof__(struct pool_workqueue) < __alignof__(long long));
 
 	BUG_ON(!alloc_cpumask_var(&wq_unbound_cpumask, GFP_KERNEL));
 	cpumask_copy(wq_unbound_cpumask, housekeeping_cpumask(hk_flags));
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 3/4] workqueue: add a BUILD_BUG_ON() to check the size of struct pool_workqueue
  2020-06-01  8:44 [PATCH 0/4] workqueue: reduce the sizeof pool_workqueue Lai Jiangshan
  2020-06-01  8:44 ` [PATCH 1/4] workqueue: fix a piece of comment about reserved bits for work flags Lai Jiangshan
  2020-06-01  8:44 ` [PATCH 2/4] workqueue: use BUILD_BUG_ON() for compile time test instead of WARN_ON() Lai Jiangshan
@ 2020-06-01  8:44 ` Lai Jiangshan
  2020-06-01  8:44 ` [PATCH 4/4] workqueue: slash half memory usage in 32bit system Lai Jiangshan
  3 siblings, 0 replies; 9+ messages in thread
From: Lai Jiangshan @ 2020-06-01  8:44 UTC (permalink / raw)
  To: linux-kernel; +Cc: Lai Jiangshan, Tejun Heo, Lai Jiangshan

Currently, the size of struct pool_workqueue is smaller than the
smellest requirement (1 << WORK_STRUCT_FLAG_BITS). When the size
of struct pool_workqueue is increasing in future development and
exceed the smellest requirement a little, it will be a big waste
due to alignment.

Add a check in case we don't skimp enough in future.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 kernel/workqueue.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 35120b909234..1921c982f920 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -5907,6 +5907,14 @@ void __init workqueue_init_early(void)
 
 	BUILD_BUG_ON(__alignof__(struct pool_workqueue) < __alignof__(long long));
 
+	/*
+	 * If it complains when being built, there might be considerable
+	 * memory waste in struct pool_workqueue. Whatever the reason is,
+	 * an updated check here or a re-arrangement in
+	 * struct pool_workqueue is required.
+	 */
+	BUILD_BUG_ON(sizeof(struct pool_workqueue) > (1 << WORK_STRUCT_FLAG_BITS));
+
 	BUG_ON(!alloc_cpumask_var(&wq_unbound_cpumask, GFP_KERNEL));
 	cpumask_copy(wq_unbound_cpumask, housekeeping_cpumask(hk_flags));
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 4/4] workqueue: slash half memory usage in 32bit system
  2020-06-01  8:44 [PATCH 0/4] workqueue: reduce the sizeof pool_workqueue Lai Jiangshan
                   ` (2 preceding siblings ...)
  2020-06-01  8:44 ` [PATCH 3/4] workqueue: add a BUILD_BUG_ON() to check the size of struct pool_workqueue Lai Jiangshan
@ 2020-06-01  8:44 ` Lai Jiangshan
  2020-06-01 15:07   ` Tejun Heo
  3 siblings, 1 reply; 9+ messages in thread
From: Lai Jiangshan @ 2020-06-01  8:44 UTC (permalink / raw)
  To: linux-kernel; +Cc: Lai Jiangshan, Tejun Heo, Lai Jiangshan

The major memory ussage in workqueue is on the pool_workqueue.
The pool_workqueue has alignment requirement which often leads
to padding.

Reducing the memory usage for the pool_workqueue is valuable.

And 32bit system often has less memory, less workqueues,
less works, less concurrent flush_workqueue()s, so we can
slash the flush color on 32bit system to reduce memory usage

Before patch:
The sizeof the struct pool_workqueue is 256 bytes,
only 136 bytes is in use in 32bit system

After patch:
The sizeof the struct pool_workqueue is 128 bytes,
only 104 bytes is in use in 32bit system, there is still
room for future usage.

Setting WORK_STRUCT_COLOR_BITS to 3 can't reduce the sizeof
the struct pool_workqueue in 64bit system, unless combined
with big refactor for unbound pwq.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 include/linux/workqueue.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 26de0cae2a0a..c0f311926d01 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -39,7 +39,11 @@ enum {
 	WORK_STRUCT_COLOR_SHIFT	= 4,	/* color for workqueue flushing */
 #endif
 
+#if BITS_PER_LONG == 32
+	WORK_STRUCT_COLOR_BITS	= 3,
+#else
 	WORK_STRUCT_COLOR_BITS	= 4,
+#endif
 
 	WORK_STRUCT_PENDING	= 1 << WORK_STRUCT_PENDING_BIT,
 	WORK_STRUCT_DELAYED	= 1 << WORK_STRUCT_DELAYED_BIT,
@@ -65,6 +69,8 @@ enum {
 	 * Reserve 8 bits off of pwq pointer w/ debugobjects turned off.
 	 * This makes pwqs aligned to 256 bytes and allows 15 workqueue
 	 * flush colors.
+	 * For 32 bit system, the numbers are 7 bits, 128 bytes, 7 colors
+	 * respectively.
 	 */
 	WORK_STRUCT_FLAG_BITS	= WORK_STRUCT_COLOR_SHIFT +
 				  WORK_STRUCT_COLOR_BITS,
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 4/4] workqueue: slash half memory usage in 32bit system
  2020-06-01  8:44 ` [PATCH 4/4] workqueue: slash half memory usage in 32bit system Lai Jiangshan
@ 2020-06-01 15:07   ` Tejun Heo
  2020-06-02  0:08     ` Lai Jiangshan
  0 siblings, 1 reply; 9+ messages in thread
From: Tejun Heo @ 2020-06-01 15:07 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: linux-kernel, Lai Jiangshan

On Mon, Jun 01, 2020 at 08:44:42AM +0000, Lai Jiangshan wrote:
> The major memory ussage in workqueue is on the pool_workqueue.
> The pool_workqueue has alignment requirement which often leads
> to padding.
> 
> Reducing the memory usage for the pool_workqueue is valuable.
> 
> And 32bit system often has less memory, less workqueues,
> less works, less concurrent flush_workqueue()s, so we can
> slash the flush color on 32bit system to reduce memory usage
> 
> Before patch:
> The sizeof the struct pool_workqueue is 256 bytes,
> only 136 bytes is in use in 32bit system
> 
> After patch:
> The sizeof the struct pool_workqueue is 128 bytes,
> only 104 bytes is in use in 32bit system, there is still
> room for future usage.
> 
> Setting WORK_STRUCT_COLOR_BITS to 3 can't reduce the sizeof
> the struct pool_workqueue in 64bit system, unless combined
> with big refactor for unbound pwq.

Have you calculated how much memory is actually saved this way on a typical
system?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/4] workqueue: use BUILD_BUG_ON() for compile time test instead of WARN_ON()
  2020-06-01  8:44 ` [PATCH 2/4] workqueue: use BUILD_BUG_ON() for compile time test instead of WARN_ON() Lai Jiangshan
@ 2020-06-01 15:08   ` Tejun Heo
  0 siblings, 0 replies; 9+ messages in thread
From: Tejun Heo @ 2020-06-01 15:08 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: linux-kernel, Lai Jiangshan

On Mon, Jun 01, 2020 at 08:44:40AM +0000, Lai Jiangshan wrote:
> Any runtime WARN_ON() has to be fixed, and BUILD_BUG_ON() can
> help you nitice it earlier.
> 
> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>

Applied 1-2 to wq/for-5.8.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 4/4] workqueue: slash half memory usage in 32bit system
  2020-06-01 15:07   ` Tejun Heo
@ 2020-06-02  0:08     ` Lai Jiangshan
  2020-06-02 16:18       ` Tejun Heo
  0 siblings, 1 reply; 9+ messages in thread
From: Lai Jiangshan @ 2020-06-02  0:08 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Lai Jiangshan, LKML

On Mon, Jun 1, 2020 at 11:07 PM Tejun Heo <tj@kernel.org> wrote:
>
> On Mon, Jun 01, 2020 at 08:44:42AM +0000, Lai Jiangshan wrote:
> > The major memory ussage in workqueue is on the pool_workqueue.
> > The pool_workqueue has alignment requirement which often leads
> > to padding.
> >
> > Reducing the memory usage for the pool_workqueue is valuable.
> >
> > And 32bit system often has less memory, less workqueues,
> > less works, less concurrent flush_workqueue()s, so we can
> > slash the flush color on 32bit system to reduce memory usage
> >
> > Before patch:
> > The sizeof the struct pool_workqueue is 256 bytes,
> > only 136 bytes is in use in 32bit system
> >
> > After patch:
> > The sizeof the struct pool_workqueue is 128 bytes,
> > only 104 bytes is in use in 32bit system, there is still
> > room for future usage.
> >
> > Setting WORK_STRUCT_COLOR_BITS to 3 can't reduce the sizeof
> > the struct pool_workqueue in 64bit system, unless combined
> > with big refactor for unbound pwq.
>
> Have you calculated how much memory is actually saved this way on a typical
> system?

It is not noticable from the "free" command.
By counting the number of allocated pwq (mainly percpu pwq),
it saves 20k in my simple kvm guest (4cpu).
I guess it highly various in different boxes with various
kernel modules loaded.

>
> Thanks.
>
> --
> tejun

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 4/4] workqueue: slash half memory usage in 32bit system
  2020-06-02  0:08     ` Lai Jiangshan
@ 2020-06-02 16:18       ` Tejun Heo
  0 siblings, 0 replies; 9+ messages in thread
From: Tejun Heo @ 2020-06-02 16:18 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: Lai Jiangshan, LKML

Hello,

On Tue, Jun 02, 2020 at 08:08:10AM +0800, Lai Jiangshan wrote:
> It is not noticable from the "free" command.
> By counting the number of allocated pwq (mainly percpu pwq),
> it saves 20k in my simple kvm guest (4cpu).
> I guess it highly various in different boxes with various
> kernel modules loaded.

Hmm... I find it difficult to judge in any direction. 32 bit machines are
smaller and have less of everything - including CPUs and workqueues
themselves, so while changing configuration for 32bit systems would reduce
memory usage the impact isn't gonna be that big either and I have no data
supporting whether the reduction is gonna help or hurt.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-06-02 16:18 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-01  8:44 [PATCH 0/4] workqueue: reduce the sizeof pool_workqueue Lai Jiangshan
2020-06-01  8:44 ` [PATCH 1/4] workqueue: fix a piece of comment about reserved bits for work flags Lai Jiangshan
2020-06-01  8:44 ` [PATCH 2/4] workqueue: use BUILD_BUG_ON() for compile time test instead of WARN_ON() Lai Jiangshan
2020-06-01 15:08   ` Tejun Heo
2020-06-01  8:44 ` [PATCH 3/4] workqueue: add a BUILD_BUG_ON() to check the size of struct pool_workqueue Lai Jiangshan
2020-06-01  8:44 ` [PATCH 4/4] workqueue: slash half memory usage in 32bit system Lai Jiangshan
2020-06-01 15:07   ` Tejun Heo
2020-06-02  0:08     ` Lai Jiangshan
2020-06-02 16:18       ` Tejun Heo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.