[PATCH V3 0/2] mm: compaction: proactive compaction trigger by user

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH V3 0/2] mm: compaction: proactive compaction trigger by user
@ 2021-05-31 10:54 Charan Teja Reddy
  2021-05-31 10:54 ` [PATCH v3 1/2] mm: compaction: support triggering of proactive compaction " Charan Teja Reddy
  2021-05-31 10:54 ` [PATCH v3 2/2] mm: compaction: fix wakeup logic of proactive compaction Charan Teja Reddy
  0 siblings, 2 replies; 10+ messages in thread
From: Charan Teja Reddy @ 2021-05-31 10:54 UTC (permalink / raw)
  To: akpm, vbabka, nigupta, hannes, corbet, mcgrof, keescook, yzaikin,
	aarcange, cl, xi.fengfei, mchehab+huawei, andrew.a.klychkov,
	dave.hansen, bhe, iamjoonsoo.kim, mateusznosek0, sh_def,
	vinmenon
  Cc: linux-doc, linux-kernel, linux-mm, linux-fsdevel, Charan Teja Reddy

These patches support triggering of proactive compaction by user on write
to the /proc/sys/vm/compaction_proactiveness. 

Changes in V3:
 - Incorporated review comments.
Changes in V2:
 - https://lore.kernel.org/patchwork/patch/1431283/
Changes in V1:
 - https://lore.kernel.org/patchwork/patch/1417064/

Charan Teja Reddy (2):
  mm: compaction: support triggering of proactive compaction by user
  mm: compaction: fix wakeup logic of proactive compaction

 Documentation/admin-guide/sysctl/vm.rst |  3 +-
 include/linux/compaction.h              |  2 ++
 include/linux/mmzone.h                  |  1 +
 kernel/sysctl.c                         |  2 +-
 mm/compaction.c                         | 49 ++++++++++++++++++++++++++++++---
 5 files changed, 51 insertions(+), 6 deletions(-)

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member of the Code Aurora Forum, hosted by The Linux Foundation


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v3 1/2] mm: compaction: support triggering of proactive compaction by user
  2021-05-31 10:54 [PATCH V3 0/2] mm: compaction: proactive compaction trigger by user Charan Teja Reddy
@ 2021-05-31 10:54 ` Charan Teja Reddy
  2021-06-07 10:38   ` Charan Teja Kalla
  2021-06-16 11:59   ` Vlastimil Babka
  2021-05-31 10:54 ` [PATCH v3 2/2] mm: compaction: fix wakeup logic of proactive compaction Charan Teja Reddy
  1 sibling, 2 replies; 10+ messages in thread
From: Charan Teja Reddy @ 2021-05-31 10:54 UTC (permalink / raw)
  To: akpm, vbabka, nigupta, hannes, corbet, mcgrof, keescook, yzaikin,
	aarcange, cl, xi.fengfei, mchehab+huawei, andrew.a.klychkov,
	dave.hansen, bhe, iamjoonsoo.kim, mateusznosek0, sh_def,
	vinmenon
  Cc: linux-doc, linux-kernel, linux-mm, linux-fsdevel, Charan Teja Reddy

The proactive compaction[1] gets triggered for every 500msec and run
compaction on the node for COMPACTION_HPAGE_ORDER (usually order-9)
pages based on the value set to sysctl.compaction_proactiveness.
Triggering the compaction for every 500msec in search of
COMPACTION_HPAGE_ORDER pages is not needed for all applications,
especially on the embedded system usecases which may have few MB's of
RAM. Enabling the proactive compaction in its state will endup in
running almost always on such systems.

Other side, proactive compaction can still be very much useful for
getting a set of higher order pages in some controllable
manner(controlled by using the sysctl.compaction_proactiveness). Thus on
systems where enabling the proactive compaction always may proove not
required, can trigger the same from user space on write to its sysctl
interface. As an example, say app launcher decide to launch the memory
heavy application which can be launched fast if it gets more higher
order pages thus launcher can prepare the system in advance by
triggering the proactive compaction from userspace.

This triggering of proactive compaction is done on a write to
sysctl.compaction_proactiveness by user.

[1]https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=facdaa917c4d5a376d09d25865f5a863f906234a

Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
---
changes in V2:
 - https://lore.kernel.org/patchwork/patch/1431283/

changes in V1:
 -  https://lore.kernel.org/lkml/1619098678-8501-1-git-send-email-charante@codeaurora.org/

 Documentation/admin-guide/sysctl/vm.rst |  3 ++-
 include/linux/compaction.h              |  2 ++
 include/linux/mmzone.h                  |  1 +
 kernel/sysctl.c                         |  2 +-
 mm/compaction.c                         | 44 ++++++++++++++++++++++++++++++---
 5 files changed, 47 insertions(+), 5 deletions(-)

diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
index 586cd4b..5e8097d 100644
--- a/Documentation/admin-guide/sysctl/vm.rst
+++ b/Documentation/admin-guide/sysctl/vm.rst
@@ -126,7 +126,8 @@ compaction_proactiveness
 
 This tunable takes a value in the range [0, 100] with a default value of
 20. This tunable determines how aggressively compaction is done in the
-background. Setting it to 0 disables proactive compaction.
+background. On write of non zero value to this tunable will immediately
+trigger the proactive compaction. Setting it to 0 disables proactive compaction.
 
 Note that compaction has a non-trivial system-wide impact as pages
 belonging to different processes are moved around, which could also lead
diff --git a/include/linux/compaction.h b/include/linux/compaction.h
index 4221888..04d5d9f 100644
--- a/include/linux/compaction.h
+++ b/include/linux/compaction.h
@@ -84,6 +84,8 @@ static inline unsigned long compact_gap(unsigned int order)
 extern unsigned int sysctl_compaction_proactiveness;
 extern int sysctl_compaction_handler(struct ctl_table *table, int write,
 			void *buffer, size_t *length, loff_t *ppos);
+extern int compaction_proactiveness_sysctl_handler(struct ctl_table *table,
+		int write, void *buffer, size_t *length, loff_t *ppos);
 extern int sysctl_extfrag_threshold;
 extern int sysctl_compact_unevictable_allowed;
 
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 0d53eba..9455809 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -815,6 +815,7 @@ typedef struct pglist_data {
 	enum zone_type kcompactd_highest_zoneidx;
 	wait_queue_head_t kcompactd_wait;
 	struct task_struct *kcompactd;
+	bool proactive_compact_trigger;
 #endif
 	/*
 	 * This is a per-node reserve of pages that are not available
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 14edf84..bed2fad 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -2840,7 +2840,7 @@ static struct ctl_table vm_table[] = {
 		.data		= &sysctl_compaction_proactiveness,
 		.maxlen		= sizeof(sysctl_compaction_proactiveness),
 		.mode		= 0644,
-		.proc_handler	= proc_dointvec_minmax,
+		.proc_handler	= compaction_proactiveness_sysctl_handler,
 		.extra1		= SYSCTL_ZERO,
 		.extra2		= &one_hundred,
 	},
diff --git a/mm/compaction.c b/mm/compaction.c
index 84fde27..197e203 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -2708,6 +2708,30 @@ static void compact_nodes(void)
  */
 unsigned int __read_mostly sysctl_compaction_proactiveness = 20;
 
+int compaction_proactiveness_sysctl_handler(struct ctl_table *table, int write,
+		void *buffer, size_t *length, loff_t *ppos)
+{
+	int rc, nid;
+
+	rc = proc_dointvec_minmax(table, write, buffer, length, ppos);
+	if (rc)
+		return rc;
+
+	if (write && sysctl_compaction_proactiveness) {
+		for_each_online_node(nid) {
+			pg_data_t *pgdat = NODE_DATA(nid);
+
+			if (pgdat->proactive_compact_trigger)
+				continue;
+
+			pgdat->proactive_compact_trigger = true;
+			wake_up_interruptible(&pgdat->kcompactd_wait);
+		}
+	}
+
+	return 0;
+}
+
 /*
  * This is the entry point for compacting all nodes via
  * /proc/sys/vm/compact_memory
@@ -2752,7 +2776,8 @@ void compaction_unregister_node(struct node *node)
 
 static inline bool kcompactd_work_requested(pg_data_t *pgdat)
 {
-	return pgdat->kcompactd_max_order > 0 || kthread_should_stop();
+	return pgdat->kcompactd_max_order > 0 || kthread_should_stop() ||
+		pgdat->proactive_compact_trigger;
 }
 
 static bool kcompactd_node_suitable(pg_data_t *pgdat)
@@ -2905,7 +2930,8 @@ static int kcompactd(void *p)
 		trace_mm_compaction_kcompactd_sleep(pgdat->node_id);
 		if (wait_event_freezable_timeout(pgdat->kcompactd_wait,
 			kcompactd_work_requested(pgdat),
-			msecs_to_jiffies(HPAGE_FRAG_CHECK_INTERVAL_MSEC))) {
+			msecs_to_jiffies(HPAGE_FRAG_CHECK_INTERVAL_MSEC)) &&
+			!pgdat->proactive_compact_trigger) {
 
 			psi_memstall_enter(&pflags);
 			kcompactd_do_work(pgdat);
@@ -2917,10 +2943,20 @@ static int kcompactd(void *p)
 		if (should_proactive_compact_node(pgdat)) {
 			unsigned int prev_score, score;
 
-			if (proactive_defer) {
+			/*
+			 * On wakeup of proactive compaction by sysctl
+			 * write, ignore the accumulated defer score.
+			 * Anyway, if the proactive compaction didn't
+			 * make any progress for the new value, it will
+			 * be further deferred by 2^COMPACT_MAX_DEFER_SHIFT
+			 * times.
+			 */
+			if (proactive_defer &&
+				!pgdat->proactive_compact_trigger) {
 				proactive_defer--;
 				continue;
 			}
+
 			prev_score = fragmentation_score_node(pgdat);
 			proactive_compact_node(pgdat);
 			score = fragmentation_score_node(pgdat);
@@ -2931,6 +2967,8 @@ static int kcompactd(void *p)
 			proactive_defer = score < prev_score ?
 					0 : 1 << COMPACT_MAX_DEFER_SHIFT;
 		}
+		if (pgdat->proactive_compact_trigger)
+			pgdat->proactive_compact_trigger = false;
 	}
 
 	return 0;
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member of the Code Aurora Forum, hosted by The Linux Foundation


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v3 2/2] mm: compaction: fix wakeup logic of proactive compaction
  2021-05-31 10:54 [PATCH V3 0/2] mm: compaction: proactive compaction trigger by user Charan Teja Reddy
  2021-05-31 10:54 ` [PATCH v3 1/2] mm: compaction: support triggering of proactive compaction " Charan Teja Reddy
@ 2021-05-31 10:54 ` Charan Teja Reddy
  1 sibling, 0 replies; 10+ messages in thread
From: Charan Teja Reddy @ 2021-05-31 10:54 UTC (permalink / raw)
  To: akpm, vbabka, nigupta, hannes, corbet, mcgrof, keescook, yzaikin,
	aarcange, cl, xi.fengfei, mchehab+huawei, andrew.a.klychkov,
	dave.hansen, bhe, iamjoonsoo.kim, mateusznosek0, sh_def,
	vinmenon
  Cc: linux-doc, linux-kernel, linux-mm, linux-fsdevel, Charan Teja Reddy

Currently, proactive compaction tries to get triggered for every
HPAGE_FRAG_CHECK_INTERVAL_MSEC(=500msec) even when proactive compaction
is disabled with sysctl.compaction_proactiveness = 0. This results in
kcompactd thread wakes up and goes to sleep for every 500msec with out
the need of doing proactive compaction. Though this doesn't have any
overhead, few cpu cycles can be saved by avoid of waking up kcompactd
thread for proactive compaction when it is disabled.

Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
---

 - This patch is newly raised in V3, thus no changes exist in V1 and V2 

 mm/compaction.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 197e203..0edcd0f 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -2926,11 +2926,14 @@ static int kcompactd(void *p)
 
 	while (!kthread_should_stop()) {
 		unsigned long pflags;
+		long timeout;
 
+		timeout = sysctl_compaction_proactiveness ?
+			msecs_to_jiffies(HPAGE_FRAG_CHECK_INTERVAL_MSEC) :
+			MAX_SCHEDULE_TIMEOUT;
 		trace_mm_compaction_kcompactd_sleep(pgdat->node_id);
 		if (wait_event_freezable_timeout(pgdat->kcompactd_wait,
-			kcompactd_work_requested(pgdat),
-			msecs_to_jiffies(HPAGE_FRAG_CHECK_INTERVAL_MSEC)) &&
+			kcompactd_work_requested(pgdat), timeout) &&
 			!pgdat->proactive_compact_trigger) {
 
 			psi_memstall_enter(&pflags);
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member of the Code Aurora Forum, hosted by The Linux Foundation


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 1/2] mm: compaction: support triggering of proactive compaction by user
  2021-05-31 10:54 ` [PATCH v3 1/2] mm: compaction: support triggering of proactive compaction " Charan Teja Reddy
@ 2021-06-07 10:38   ` Charan Teja Kalla
  2021-06-14 14:57     ` Charan Teja Kalla
  2021-06-16 11:59   ` Vlastimil Babka
  1 sibling, 1 reply; 10+ messages in thread
From: Charan Teja Kalla @ 2021-06-07 10:38 UTC (permalink / raw)
  To: akpm, vbabka, nigupta, hannes, corbet, mcgrof, keescook, yzaikin,
	aarcange, cl, xi.fengfei, mchehab+huawei, andrew.a.klychkov,
	dave.hansen, bhe, iamjoonsoo.kim, mateusznosek0, sh_def,
	vinmenon
  Cc: linux-doc, linux-kernel, linux-mm, linux-fsdevel

A gentle ping.

--Charan

On 5/31/2021 4:24 PM, Charan Teja Reddy wrote:
> The proactive compaction[1] gets triggered for every 500msec and run
> compaction on the node for COMPACTION_HPAGE_ORDER (usually order-9)
> pages based on the value set to sysctl.compaction_proactiveness.
> Triggering the compaction for every 500msec in search of
> COMPACTION_HPAGE_ORDER pages is not needed for all applications,
> especially on the embedded system usecases which may have few MB's of
> RAM. Enabling the proactive compaction in its state will endup in
> running almost always on such systems.
> 
> Other side, proactive compaction can still be very much useful for
> getting a set of higher order pages in some controllable
> manner(controlled by using the sysctl.compaction_proactiveness). Thus on
> systems where enabling the proactive compaction always may proove not
> required, can trigger the same from user space on write to its sysctl
> interface. As an example, say app launcher decide to launch the memory
> heavy application which can be launched fast if it gets more higher
> order pages thus launcher can prepare the system in advance by
> triggering the proactive compaction from userspace.
> 
> This triggering of proactive compaction is done on a write to
> sysctl.compaction_proactiveness by user.
> 
> [1]https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=facdaa917c4d5a376d09d25865f5a863f906234a
> 
> Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
> ---
> changes in V2:
>  - https://lore.kernel.org/patchwork/patch/1431283/
> 
> changes in V1:
>  -  https://lore.kernel.org/lkml/1619098678-8501-1-git-send-email-charante@codeaurora.org/
> 
>  Documentation/admin-guide/sysctl/vm.rst |  3 ++-
>  include/linux/compaction.h              |  2 ++
>  include/linux/mmzone.h                  |  1 +
>  kernel/sysctl.c                         |  2 +-
>  mm/compaction.c                         | 44 ++++++++++++++++++++++++++++++---
>  5 files changed, 47 insertions(+), 5 deletions(-)
> 
> diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
> index 586cd4b..5e8097d 100644
> --- a/Documentation/admin-guide/sysctl/vm.rst
> +++ b/Documentation/admin-guide/sysctl/vm.rst
> @@ -126,7 +126,8 @@ compaction_proactiveness
>  
>  This tunable takes a value in the range [0, 100] with a default value of
>  20. This tunable determines how aggressively compaction is done in the
> -background. Setting it to 0 disables proactive compaction.
> +background. On write of non zero value to this tunable will immediately
> +trigger the proactive compaction. Setting it to 0 disables proactive compaction.
>  
>  Note that compaction has a non-trivial system-wide impact as pages
>  belonging to different processes are moved around, which could also lead
> diff --git a/include/linux/compaction.h b/include/linux/compaction.h
> index 4221888..04d5d9f 100644
> --- a/include/linux/compaction.h
> +++ b/include/linux/compaction.h
> @@ -84,6 +84,8 @@ static inline unsigned long compact_gap(unsigned int order)
>  extern unsigned int sysctl_compaction_proactiveness;
>  extern int sysctl_compaction_handler(struct ctl_table *table, int write,
>  			void *buffer, size_t *length, loff_t *ppos);
> +extern int compaction_proactiveness_sysctl_handler(struct ctl_table *table,
> +		int write, void *buffer, size_t *length, loff_t *ppos);
>  extern int sysctl_extfrag_threshold;
>  extern int sysctl_compact_unevictable_allowed;
>  
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 0d53eba..9455809 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -815,6 +815,7 @@ typedef struct pglist_data {
>  	enum zone_type kcompactd_highest_zoneidx;
>  	wait_queue_head_t kcompactd_wait;
>  	struct task_struct *kcompactd;
> +	bool proactive_compact_trigger;
>  #endif
>  	/*
>  	 * This is a per-node reserve of pages that are not available
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index 14edf84..bed2fad 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -2840,7 +2840,7 @@ static struct ctl_table vm_table[] = {
>  		.data		= &sysctl_compaction_proactiveness,
>  		.maxlen		= sizeof(sysctl_compaction_proactiveness),
>  		.mode		= 0644,
> -		.proc_handler	= proc_dointvec_minmax,
> +		.proc_handler	= compaction_proactiveness_sysctl_handler,
>  		.extra1		= SYSCTL_ZERO,
>  		.extra2		= &one_hundred,
>  	},
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 84fde27..197e203 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -2708,6 +2708,30 @@ static void compact_nodes(void)
>   */
>  unsigned int __read_mostly sysctl_compaction_proactiveness = 20;
>  
> +int compaction_proactiveness_sysctl_handler(struct ctl_table *table, int write,
> +		void *buffer, size_t *length, loff_t *ppos)
> +{
> +	int rc, nid;
> +
> +	rc = proc_dointvec_minmax(table, write, buffer, length, ppos);
> +	if (rc)
> +		return rc;
> +
> +	if (write && sysctl_compaction_proactiveness) {
> +		for_each_online_node(nid) {
> +			pg_data_t *pgdat = NODE_DATA(nid);
> +
> +			if (pgdat->proactive_compact_trigger)
> +				continue;
> +
> +			pgdat->proactive_compact_trigger = true;
> +			wake_up_interruptible(&pgdat->kcompactd_wait);
> +		}
> +	}
> +
> +	return 0;
> +}
> +
>  /*
>   * This is the entry point for compacting all nodes via
>   * /proc/sys/vm/compact_memory
> @@ -2752,7 +2776,8 @@ void compaction_unregister_node(struct node *node)
>  
>  static inline bool kcompactd_work_requested(pg_data_t *pgdat)
>  {
> -	return pgdat->kcompactd_max_order > 0 || kthread_should_stop();
> +	return pgdat->kcompactd_max_order > 0 || kthread_should_stop() ||
> +		pgdat->proactive_compact_trigger;
>  }
>  
>  static bool kcompactd_node_suitable(pg_data_t *pgdat)
> @@ -2905,7 +2930,8 @@ static int kcompactd(void *p)
>  		trace_mm_compaction_kcompactd_sleep(pgdat->node_id);
>  		if (wait_event_freezable_timeout(pgdat->kcompactd_wait,
>  			kcompactd_work_requested(pgdat),
> -			msecs_to_jiffies(HPAGE_FRAG_CHECK_INTERVAL_MSEC))) {
> +			msecs_to_jiffies(HPAGE_FRAG_CHECK_INTERVAL_MSEC)) &&
> +			!pgdat->proactive_compact_trigger) {
>  
>  			psi_memstall_enter(&pflags);
>  			kcompactd_do_work(pgdat);
> @@ -2917,10 +2943,20 @@ static int kcompactd(void *p)
>  		if (should_proactive_compact_node(pgdat)) {
>  			unsigned int prev_score, score;
>  
> -			if (proactive_defer) {
> +			/*
> +			 * On wakeup of proactive compaction by sysctl
> +			 * write, ignore the accumulated defer score.
> +			 * Anyway, if the proactive compaction didn't
> +			 * make any progress for the new value, it will
> +			 * be further deferred by 2^COMPACT_MAX_DEFER_SHIFT
> +			 * times.
> +			 */
> +			if (proactive_defer &&
> +				!pgdat->proactive_compact_trigger) {
>  				proactive_defer--;
>  				continue;
>  			}
> +
>  			prev_score = fragmentation_score_node(pgdat);
>  			proactive_compact_node(pgdat);
>  			score = fragmentation_score_node(pgdat);
> @@ -2931,6 +2967,8 @@ static int kcompactd(void *p)
>  			proactive_defer = score < prev_score ?
>  					0 : 1 << COMPACT_MAX_DEFER_SHIFT;
>  		}
> +		if (pgdat->proactive_compact_trigger)
> +			pgdat->proactive_compact_trigger = false;
>  	}
>  
>  	return 0;
> 

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora
Forum, a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 1/2] mm: compaction: support triggering of proactive compaction by user
  2021-06-07 10:38   ` Charan Teja Kalla
@ 2021-06-14 14:57     ` Charan Teja Kalla
  0 siblings, 0 replies; 10+ messages in thread
From: Charan Teja Kalla @ 2021-06-14 14:57 UTC (permalink / raw)
  To: akpm, vbabka, nigupta, hannes, corbet, mcgrof, keescook, yzaikin,
	aarcange, cl, xi.fengfei, mchehab+huawei, andrew.a.klychkov,
	dave.hansen, bhe, iamjoonsoo.kim, mateusznosek0, sh_def,
	vinmenon
  Cc: linux-doc, linux-kernel, linux-mm, linux-fsdevel

Just another ping to have your comments.

Thanks in advance!!

On 6/7/2021 4:08 PM, Charan Teja Kalla wrote:
> A gentle ping.
> 
> --Charan
> 
> On 5/31/2021 4:24 PM, Charan Teja Reddy wrote:
>> The proactive compaction[1] gets triggered for every 500msec and run
>> compaction on the node for COMPACTION_HPAGE_ORDER (usually order-9)
>> pages based on the value set to sysctl.compaction_proactiveness.
>> Triggering the compaction for every 500msec in search of
>> COMPACTION_HPAGE_ORDER pages is not needed for all applications,
>> especially on the embedded system usecases which may have few MB's of
>> RAM. Enabling the proactive compaction in its state will endup in
>> running almost always on such systems.
>>
>> Other side, proactive compaction can still be very much useful for
>> getting a set of higher order pages in some controllable
>> manner(controlled by using the sysctl.compaction_proactiveness). Thus on
>> systems where enabling the proactive compaction always may proove not
>> required, can trigger the same from user space on write to its sysctl
>> interface. As an example, say app launcher decide to launch the memory
>> heavy application which can be launched fast if it gets more higher
>> order pages thus launcher can prepare the system in advance by
>> triggering the proactive compaction from userspace.
>>
>> This triggering of proactive compaction is done on a write to
>> sysctl.compaction_proactiveness by user.
>>
>> [1]https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=facdaa917c4d5a376d09d25865f5a863f906234a
>>
>> Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
>> ---
>> changes in V2:
>>  - https://lore.kernel.org/patchwork/patch/1431283/
>>
>> changes in V1:
>>  -  https://lore.kernel.org/lkml/1619098678-8501-1-git-send-email-charante@codeaurora.org/
>>
>>  Documentation/admin-guide/sysctl/vm.rst |  3 ++-
>>  include/linux/compaction.h              |  2 ++
>>  include/linux/mmzone.h                  |  1 +
>>  kernel/sysctl.c                         |  2 +-
>>  mm/compaction.c                         | 44 ++++++++++++++++++++++++++++++---
>>  5 files changed, 47 insertions(+), 5 deletions(-)
>>
>> diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
>> index 586cd4b..5e8097d 100644
>> --- a/Documentation/admin-guide/sysctl/vm.rst
>> +++ b/Documentation/admin-guide/sysctl/vm.rst
>> @@ -126,7 +126,8 @@ compaction_proactiveness
>>  
>>  This tunable takes a value in the range [0, 100] with a default value of
>>  20. This tunable determines how aggressively compaction is done in the
>> -background. Setting it to 0 disables proactive compaction.
>> +background. On write of non zero value to this tunable will immediately
>> +trigger the proactive compaction. Setting it to 0 disables proactive compaction.
>>  
>>  Note that compaction has a non-trivial system-wide impact as pages
>>  belonging to different processes are moved around, which could also lead
>> diff --git a/include/linux/compaction.h b/include/linux/compaction.h
>> index 4221888..04d5d9f 100644
>> --- a/include/linux/compaction.h
>> +++ b/include/linux/compaction.h
>> @@ -84,6 +84,8 @@ static inline unsigned long compact_gap(unsigned int order)
>>  extern unsigned int sysctl_compaction_proactiveness;
>>  extern int sysctl_compaction_handler(struct ctl_table *table, int write,
>>  			void *buffer, size_t *length, loff_t *ppos);
>> +extern int compaction_proactiveness_sysctl_handler(struct ctl_table *table,
>> +		int write, void *buffer, size_t *length, loff_t *ppos);
>>  extern int sysctl_extfrag_threshold;
>>  extern int sysctl_compact_unevictable_allowed;
>>  
>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>> index 0d53eba..9455809 100644
>> --- a/include/linux/mmzone.h
>> +++ b/include/linux/mmzone.h
>> @@ -815,6 +815,7 @@ typedef struct pglist_data {
>>  	enum zone_type kcompactd_highest_zoneidx;
>>  	wait_queue_head_t kcompactd_wait;
>>  	struct task_struct *kcompactd;
>> +	bool proactive_compact_trigger;
>>  #endif
>>  	/*
>>  	 * This is a per-node reserve of pages that are not available
>> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
>> index 14edf84..bed2fad 100644
>> --- a/kernel/sysctl.c
>> +++ b/kernel/sysctl.c
>> @@ -2840,7 +2840,7 @@ static struct ctl_table vm_table[] = {
>>  		.data		= &sysctl_compaction_proactiveness,
>>  		.maxlen		= sizeof(sysctl_compaction_proactiveness),
>>  		.mode		= 0644,
>> -		.proc_handler	= proc_dointvec_minmax,
>> +		.proc_handler	= compaction_proactiveness_sysctl_handler,
>>  		.extra1		= SYSCTL_ZERO,
>>  		.extra2		= &one_hundred,
>>  	},
>> diff --git a/mm/compaction.c b/mm/compaction.c
>> index 84fde27..197e203 100644
>> --- a/mm/compaction.c
>> +++ b/mm/compaction.c
>> @@ -2708,6 +2708,30 @@ static void compact_nodes(void)
>>   */
>>  unsigned int __read_mostly sysctl_compaction_proactiveness = 20;
>>  
>> +int compaction_proactiveness_sysctl_handler(struct ctl_table *table, int write,
>> +		void *buffer, size_t *length, loff_t *ppos)
>> +{
>> +	int rc, nid;
>> +
>> +	rc = proc_dointvec_minmax(table, write, buffer, length, ppos);
>> +	if (rc)
>> +		return rc;
>> +
>> +	if (write && sysctl_compaction_proactiveness) {
>> +		for_each_online_node(nid) {
>> +			pg_data_t *pgdat = NODE_DATA(nid);
>> +
>> +			if (pgdat->proactive_compact_trigger)
>> +				continue;
>> +
>> +			pgdat->proactive_compact_trigger = true;
>> +			wake_up_interruptible(&pgdat->kcompactd_wait);
>> +		}
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>>  /*
>>   * This is the entry point for compacting all nodes via
>>   * /proc/sys/vm/compact_memory
>> @@ -2752,7 +2776,8 @@ void compaction_unregister_node(struct node *node)
>>  
>>  static inline bool kcompactd_work_requested(pg_data_t *pgdat)
>>  {
>> -	return pgdat->kcompactd_max_order > 0 || kthread_should_stop();
>> +	return pgdat->kcompactd_max_order > 0 || kthread_should_stop() ||
>> +		pgdat->proactive_compact_trigger;
>>  }
>>  
>>  static bool kcompactd_node_suitable(pg_data_t *pgdat)
>> @@ -2905,7 +2930,8 @@ static int kcompactd(void *p)
>>  		trace_mm_compaction_kcompactd_sleep(pgdat->node_id);
>>  		if (wait_event_freezable_timeout(pgdat->kcompactd_wait,
>>  			kcompactd_work_requested(pgdat),
>> -			msecs_to_jiffies(HPAGE_FRAG_CHECK_INTERVAL_MSEC))) {
>> +			msecs_to_jiffies(HPAGE_FRAG_CHECK_INTERVAL_MSEC)) &&
>> +			!pgdat->proactive_compact_trigger) {
>>  
>>  			psi_memstall_enter(&pflags);
>>  			kcompactd_do_work(pgdat);
>> @@ -2917,10 +2943,20 @@ static int kcompactd(void *p)
>>  		if (should_proactive_compact_node(pgdat)) {
>>  			unsigned int prev_score, score;
>>  
>> -			if (proactive_defer) {
>> +			/*
>> +			 * On wakeup of proactive compaction by sysctl
>> +			 * write, ignore the accumulated defer score.
>> +			 * Anyway, if the proactive compaction didn't
>> +			 * make any progress for the new value, it will
>> +			 * be further deferred by 2^COMPACT_MAX_DEFER_SHIFT
>> +			 * times.
>> +			 */
>> +			if (proactive_defer &&
>> +				!pgdat->proactive_compact_trigger) {
>>  				proactive_defer--;
>>  				continue;
>>  			}
>> +
>>  			prev_score = fragmentation_score_node(pgdat);
>>  			proactive_compact_node(pgdat);
>>  			score = fragmentation_score_node(pgdat);
>> @@ -2931,6 +2967,8 @@ static int kcompactd(void *p)
>>  			proactive_defer = score < prev_score ?
>>  					0 : 1 << COMPACT_MAX_DEFER_SHIFT;
>>  		}
>> +		if (pgdat->proactive_compact_trigger)
>> +			pgdat->proactive_compact_trigger = false;
>>  	}
>>  
>>  	return 0;
>>
> 

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora
Forum, a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 1/2] mm: compaction: support triggering of proactive compaction by user
  2021-05-31 10:54 ` [PATCH v3 1/2] mm: compaction: support triggering of proactive compaction " Charan Teja Reddy
  2021-06-07 10:38   ` Charan Teja Kalla
@ 2021-06-16 11:59   ` Vlastimil Babka
  2021-06-17  7:30     ` Charan Teja Kalla
  1 sibling, 1 reply; 10+ messages in thread
From: Vlastimil Babka @ 2021-06-16 11:59 UTC (permalink / raw)
  To: Charan Teja Reddy, akpm, nigupta, hannes, corbet, mcgrof,
	keescook, yzaikin, aarcange, cl, xi.fengfei, mchehab+huawei,
	andrew.a.klychkov, dave.hansen, bhe, iamjoonsoo.kim,
	mateusznosek0, sh_def, vinmenon
  Cc: linux-doc, linux-kernel, linux-mm, linux-fsdevel

On 5/31/21 12:54 PM, Charan Teja Reddy wrote:
> The proactive compaction[1] gets triggered for every 500msec and run
> compaction on the node for COMPACTION_HPAGE_ORDER (usually order-9)
> pages based on the value set to sysctl.compaction_proactiveness.
> Triggering the compaction for every 500msec in search of
> COMPACTION_HPAGE_ORDER pages is not needed for all applications,
> especially on the embedded system usecases which may have few MB's of
> RAM. Enabling the proactive compaction in its state will endup in
> running almost always on such systems.
> 
> Other side, proactive compaction can still be very much useful for
> getting a set of higher order pages in some controllable
> manner(controlled by using the sysctl.compaction_proactiveness). Thus on
> systems where enabling the proactive compaction always may proove not
> required, can trigger the same from user space on write to its sysctl
> interface. As an example, say app launcher decide to launch the memory
> heavy application which can be launched fast if it gets more higher
> order pages thus launcher can prepare the system in advance by
> triggering the proactive compaction from userspace.
> 
> This triggering of proactive compaction is done on a write to
> sysctl.compaction_proactiveness by user.
> 
> [1]https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=facdaa917c4d5a376d09d25865f5a863f906234a
> 
> Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
> ---
> changes in V2:

You forgot to also summarize the changes. Please do in next version.

>   */
>  unsigned int __read_mostly sysctl_compaction_proactiveness = 20;
>  
> +int compaction_proactiveness_sysctl_handler(struct ctl_table *table, int write,
> +		void *buffer, size_t *length, loff_t *ppos)
> +{
> +	int rc, nid;
> +
> +	rc = proc_dointvec_minmax(table, write, buffer, length, ppos);
> +	if (rc)
> +		return rc;
> +
> +	if (write && sysctl_compaction_proactiveness) {
> +		for_each_online_node(nid) {
> +			pg_data_t *pgdat = NODE_DATA(nid);
> +
> +			if (pgdat->proactive_compact_trigger)
> +				continue;
> +
> +			pgdat->proactive_compact_trigger = true;

I don't like the new variable. I wish we could do without it. I understand this
is added to ignore proactive_defer.
We could instead expose proactive_defer in pgdat and reset it to 0 before wakeup
(instead being a thread variable in kcompactd). But that would be racy with the
decreases done by kcompactd.
But I like the patch 2/2 and the idea could be extended to proactive_defer
handling. If there's no proactive_defer, timeout is
HPAGE_FRAG_CHECK_INTERVAL_MSEC. If kcompactd decides to defer, timeout would be
HPAGE_FRAG_CHECK_INTERVAL_MSEC << COMPACT_MAX_DEFER_SHIFT. Thus, no more waking
up just to decrease proactive_defer, we can then get rid of the counter. On
writing new proactiveness just wake up and that's it, regardless of which
timeout there was at the moment.
The only change is, if we get woken up to do non-proactive work, by
wakeup_kcompactd(), the proactive_defer value would be now be effectively lost.
I think it's OK as wakeup_kcompactd() means the condition of the zone changed
substantionally anyway and carrying on with previous defer makes not much sense.
What do you think?

> +			wake_up_interruptible(&pgdat->kcompactd_wait);
> +		}
> +	}
> +
> +	return 0;
> +}
> +
>  /*
>   * This is the entry point for compacting all nodes via
>   * /proc/sys/vm/compact_memory
> @@ -2752,7 +2776,8 @@ void compaction_unregister_node(struct node *node)
>  
>  static inline bool kcompactd_work_requested(pg_data_t *pgdat)
>  {
> -	return pgdat->kcompactd_max_order > 0 || kthread_should_stop();
> +	return pgdat->kcompactd_max_order > 0 || kthread_should_stop() ||
> +		pgdat->proactive_compact_trigger;
>  }
>  
>  static bool kcompactd_node_suitable(pg_data_t *pgdat)
> @@ -2905,7 +2930,8 @@ static int kcompactd(void *p)
>  		trace_mm_compaction_kcompactd_sleep(pgdat->node_id);
>  		if (wait_event_freezable_timeout(pgdat->kcompactd_wait,
>  			kcompactd_work_requested(pgdat),
> -			msecs_to_jiffies(HPAGE_FRAG_CHECK_INTERVAL_MSEC))) {
> +			msecs_to_jiffies(HPAGE_FRAG_CHECK_INTERVAL_MSEC)) &&
> +			!pgdat->proactive_compact_trigger) {
>  
>  			psi_memstall_enter(&pflags);
>  			kcompactd_do_work(pgdat);
> @@ -2917,10 +2943,20 @@ static int kcompactd(void *p)
>  		if (should_proactive_compact_node(pgdat)) {
>  			unsigned int prev_score, score;
>  
> -			if (proactive_defer) {
> +			/*
> +			 * On wakeup of proactive compaction by sysctl
> +			 * write, ignore the accumulated defer score.
> +			 * Anyway, if the proactive compaction didn't
> +			 * make any progress for the new value, it will
> +			 * be further deferred by 2^COMPACT_MAX_DEFER_SHIFT
> +			 * times.
> +			 */
> +			if (proactive_defer &&
> +				!pgdat->proactive_compact_trigger) {
>  				proactive_defer--;
>  				continue;
>  			}
> +
>  			prev_score = fragmentation_score_node(pgdat);
>  			proactive_compact_node(pgdat);
>  			score = fragmentation_score_node(pgdat);
> @@ -2931,6 +2967,8 @@ static int kcompactd(void *p)
>  			proactive_defer = score < prev_score ?
>  					0 : 1 << COMPACT_MAX_DEFER_SHIFT;
>  		}
> +		if (pgdat->proactive_compact_trigger)
> +			pgdat->proactive_compact_trigger = false;
>  	}
>  
>  	return 0;
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 1/2] mm: compaction: support triggering of proactive compaction by user
  2021-06-16 11:59   ` Vlastimil Babka
@ 2021-06-17  7:30     ` Charan Teja Kalla
  2021-06-17 14:37       ` Vlastimil Babka
  0 siblings, 1 reply; 10+ messages in thread
From: Charan Teja Kalla @ 2021-06-17  7:30 UTC (permalink / raw)
  To: Vlastimil Babka, akpm, nigupta, hannes, corbet, mcgrof, keescook,
	yzaikin, aarcange, cl, xi.fengfei, mchehab+huawei,
	andrew.a.klychkov, dave.hansen, bhe, iamjoonsoo.kim,
	mateusznosek0, sh_def, vinmenon
  Cc: linux-doc, linux-kernel, linux-mm, linux-fsdevel

Thanks Vlastimil for your inputs!!

On 6/16/2021 5:29 PM, Vlastimil Babka wrote:
>> This triggering of proactive compaction is done on a write to
>> sysctl.compaction_proactiveness by user.
>>
>> [1]https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=facdaa917c4d5a376d09d25865f5a863f906234a
>>
>> Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
>> ---
>> changes in V2:
> You forgot to also summarize the changes. Please do in next version.

Sure. Will take care this in the next version.

> 
>>   */
>>  unsigned int __read_mostly sysctl_compaction_proactiveness = 20;
>>  
>> +int compaction_proactiveness_sysctl_handler(struct ctl_table *table, int write,
>> +		void *buffer, size_t *length, loff_t *ppos)
>> +{
>> +	int rc, nid;
>> +
>> +	rc = proc_dointvec_minmax(table, write, buffer, length, ppos);
>> +	if (rc)
>> +		return rc;
>> +
>> +	if (write && sysctl_compaction_proactiveness) {
>> +		for_each_online_node(nid) {
>> +			pg_data_t *pgdat = NODE_DATA(nid);
>> +
>> +			if (pgdat->proactive_compact_trigger)
>> +				continue;
>> +
>> +			pgdat->proactive_compact_trigger = true;
> I don't like the new variable. I wish we could do without it. I understand this
> is added to ignore proactive_defer.
> We could instead expose proactive_defer in pgdat and reset it to 0 before wakeup
> (instead being a thread variable in kcompactd). But that would be racy with the
> decreases done by kcompactd.
> But I like the patch 2/2 and the idea could be extended to proactive_defer
> handling. If there's no proactive_defer, timeout is
> HPAGE_FRAG_CHECK_INTERVAL_MSEC. If kcompactd decides to defer, timeout would be
> HPAGE_FRAG_CHECK_INTERVAL_MSEC << COMPACT_MAX_DEFER_SHIFT. Thus, no more waking
> up just to decrease proactive_defer, we can then get rid of the counter. On
> writing new proactiveness just wake up and that's it, regardless of which
> timeout there was at the moment.

I think we can get rid off 'proactive_defer' thread variable with the
timeout approach you suggested. But it is still requires to have one
additional variable 'proactive_compact_trigger', which main purpose is
to decide if the kcompactd wakeup is for proactive compaction or not.
Please see below code:
   if (wait_event_freezable_timeout() && !proactive_compact_trigger) {
	// do the non-proactive work
	continue
   }
   // do the proactive work
     .................

Thus I feel that on writing new proactiveness, it is required to do
wakeup_kcomppactd() + set a flag that this wakeup is for proactive work.

Am I failed to get your point here?


> The only change is, if we get woken up to do non-proactive work, by
> wakeup_kcompactd(), the proactive_defer value would be now be effectively lost.
> I think it's OK as wakeup_kcompactd() means the condition of the zone changed
> substantionally anyway and carrying on with previous defer makes not much sense.
> What do you think?

Agree.

> 
>> +			wake_up_interruptible(&pgdat->kcompactd_wait);
>> +		}
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>>  /*
>>   * This is the entry point for compacting all nodes via
>>   * /proc/sys/vm/compact_memory
>> @@ -2752,7 +2776,8 @@ void compaction_unregister_node(struct node *node)
>>  
>>  static inline bool kcompactd_work_requested(pg_data_t *pgdat)
>>  {
>> -	return pgdat->kcompactd_max_order > 0 || kthread_should_stop();
>> +	return pgdat->kcompactd_max_order > 0 || kthread_should_stop() ||
>> +		pgdat->proactive_compact_trigger;
>>  }
>>  
>>  static bool kcompactd_node_suitable(pg_data_t *pgdat)
>> @@ -2905,7 +2930,8 @@ static int kcompactd(void *p)
>>  		trace_mm_compaction_kcompactd_sleep(pgdat->node_id);
>>  		if (wait_event_freezable_timeout(pgdat->kcompactd_wait,
>>  			kcompactd_work_requested(pgdat),
>> -			msecs_to_jiffies(HPAGE_FRAG_CHECK_INTERVAL_MSEC))) {
>> +			msecs_to_jiffies(HPAGE_FRAG_CHECK_INTERVAL_MSEC)) &&
>> +			!pgdat->proactive_compact_trigger) {
>>  
>>  			psi_memstall_enter(&pflags);
>>  			kcompactd_do_work(pgdat);
>> @@ -2917,10 +2943,20 @@ static int kcompactd(void *p)
>>  		if (should_proactive_compact_node(pgdat)) {
>>  			unsigned int prev_score, score;
>>  
>> -			if (proactive_defer) {
>> +			/*
>> +			 * On wakeup of proactive compaction by sysctl
>> +			 * write, ignore the accumulated defer score.
>> +			 * Anyway, if the proactive compaction didn't
>> +			 * make any progress for the new value, it will
>> +			 * be further deferred by 2^COMPACT_MAX_DEFER_SHIFT
>> +			 * times.
>> +			 */
>> +			if (proactive_defer &&
>> +				!pgdat->proactive_compact_trigger) {
>>  				proactive_defer--;
>>  				continue;
>>  			}
>> +
>>  			prev_score = fragmentation_score_node(pgdat);
>>  			proactive_compact_node(pgdat);
>>  			score = fragmentation_score_node(pgdat);
>> @@ -2931,6 +2967,8 @@ static int kcompactd(void *p)
>>  			proactive_defer = score < prev_score ?
>>  					0 : 1 << COMPACT_MAX_DEFER_SHIFT;
>>  		}
>> +		if (pgdat->proactive_compact_trigger)
>> +			pgdat->proactive_compact_trigger = false;
>>  	}
>>  
>>  	return 0;

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora
Forum, a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 1/2] mm: compaction: support triggering of proactive compaction by user
  2021-06-17  7:30     ` Charan Teja Kalla
@ 2021-06-17 14:37       ` Vlastimil Babka
  2021-06-17 16:05         ` Charan Teja Kalla
  0 siblings, 1 reply; 10+ messages in thread
From: Vlastimil Babka @ 2021-06-17 14:37 UTC (permalink / raw)
  To: Charan Teja Kalla, akpm, nigupta, hannes, corbet, mcgrof,
	keescook, yzaikin, aarcange, cl, xi.fengfei, mchehab+huawei,
	andrew.a.klychkov, dave.hansen, bhe, iamjoonsoo.kim,
	mateusznosek0, sh_def, vinmenon
  Cc: linux-doc, linux-kernel, linux-mm, linux-fsdevel

On 6/17/21 9:30 AM, Charan Teja Kalla wrote:
> Thanks Vlastimil for your inputs!!
> 
> On 6/16/2021 5:29 PM, Vlastimil Babka wrote:
>>> This triggering of proactive compaction is done on a write to
>>> sysctl.compaction_proactiveness by user.
>>>
>>> [1]https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=facdaa917c4d5a376d09d25865f5a863f906234a
>>>
>>> Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
>>> ---
>>> changes in V2:
>> You forgot to also summarize the changes. Please do in next version.
> 
> I think we can get rid off 'proactive_defer' thread variable with the
> timeout approach you suggested. But it is still requires to have one
> additional variable 'proactive_compact_trigger', which main purpose is
> to decide if the kcompactd wakeup is for proactive compaction or not.
> Please see below code:
>    if (wait_event_freezable_timeout() && !proactive_compact_trigger) {
> 	// do the non-proactive work
> 	continue
>    }
>    // do the proactive work
>      .................
> 
> Thus I feel that on writing new proactiveness, it is required to do
> wakeup_kcomppactd() + set a flag that this wakeup is for proactive work.
> 
> Am I failed to get your point here?

The check whether to do non-proactive work is already guarded by
kcompactd_work_requested(), which looks at pgdat->kcompactd_max_order and this
is set by wakeup_kcompactd().

So with a plain wakeup where we don't set pgdat->kcompactd_max_order will make
it consider proactive work instead and we don't need another trigger variable
AFAICS.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 1/2] mm: compaction: support triggering of proactive compaction by user
  2021-06-17 14:37       ` Vlastimil Babka
@ 2021-06-17 16:05         ` Charan Teja Kalla
  2021-06-17 16:17           ` Vlastimil Babka
  0 siblings, 1 reply; 10+ messages in thread
From: Charan Teja Kalla @ 2021-06-17 16:05 UTC (permalink / raw)
  To: Vlastimil Babka, akpm, nigupta, hannes, corbet, mcgrof, keescook,
	yzaikin, aarcange, cl, xi.fengfei, mchehab+huawei,
	andrew.a.klychkov, dave.hansen, bhe, iamjoonsoo.kim,
	mateusznosek0, sh_def, vinmenon
  Cc: linux-doc, linux-kernel, linux-mm, linux-fsdevel

Thanks Vlastimil !!

On 6/17/2021 8:07 PM, Vlastimil Babka wrote:
> On 6/17/21 9:30 AM, Charan Teja Kalla wrote:
>> Thanks Vlastimil for your inputs!!
>>
>> On 6/16/2021 5:29 PM, Vlastimil Babka wrote:
>>>> This triggering of proactive compaction is done on a write to
>>>> sysctl.compaction_proactiveness by user.
>>>>
>>>> [1]https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=facdaa917c4d5a376d09d25865f5a863f906234a
>>>>
>>>> Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
>>>> ---
>>>> changes in V2:
>>> You forgot to also summarize the changes. Please do in next version.
>>
>> I think we can get rid off 'proactive_defer' thread variable with the
>> timeout approach you suggested. But it is still requires to have one
>> additional variable 'proactive_compact_trigger', which main purpose is
>> to decide if the kcompactd wakeup is for proactive compaction or not.
>> Please see below code:
>>    if (wait_event_freezable_timeout() && !proactive_compact_trigger) {
>> 	// do the non-proactive work
>> 	continue
>>    }
>>    // do the proactive work
>>      .................
>>
>> Thus I feel that on writing new proactiveness, it is required to do
>> wakeup_kcomppactd() + set a flag that this wakeup is for proactive work.
>>
>> Am I failed to get your point here?
> 
> The check whether to do non-proactive work is already guarded by
> kcompactd_work_requested(), which looks at pgdat->kcompactd_max_order and this
> is set by wakeup_kcompactd().
> 
> So with a plain wakeup where we don't set pgdat->kcompactd_max_order will make
> it consider proactive work instead and we don't need another trigger variable
> AFAICS.

The wait_event/freezable_timeout() documentation says that:
 * Returns:
 * 0 if the @condition evaluated to %false after the @timeout elapsed,
			or
 * 1 if the @condition evaluated to %true after the @timeout elapsed,
 * or the remaining jiffies (at least 1) if the @condition evaluated
 * to %true before the @timeout elapsed.

which means the condition must be evaluated to true or timeout should be
elapsed for the function wait_event_freezable_timeout() to return.

Please check the macro implementation of __wait_event, where it will be
in for(;;) till the condition is evaluated to true or timeout happens.
#define __wait_event_freezable_timeout(wq_head, condition, timeout)

        ___wait_event(wq_head, ___wait_cond_timeout(condition),

                      TASK_INTERRUPTIBLE, 0, timeout,

                      __ret = freezable_schedule_timeout(__ret))

Thus the plain wakeup of kcompactd don't do the proactive compact work.
And so we should identify its wakeup for proactive work with a separate
flag.
> 

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora
Forum, a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 1/2] mm: compaction: support triggering of proactive compaction by user
  2021-06-17 16:05         ` Charan Teja Kalla
@ 2021-06-17 16:17           ` Vlastimil Babka
  0 siblings, 0 replies; 10+ messages in thread
From: Vlastimil Babka @ 2021-06-17 16:17 UTC (permalink / raw)
  To: Charan Teja Kalla, akpm, nigupta, hannes, corbet, mcgrof,
	keescook, yzaikin, aarcange, cl, xi.fengfei, mchehab+huawei,
	andrew.a.klychkov, dave.hansen, bhe, iamjoonsoo.kim,
	mateusznosek0, sh_def, vinmenon
  Cc: linux-doc, linux-kernel, linux-mm, linux-fsdevel

On 6/17/21 6:05 PM, Charan Teja Kalla wrote:
> The wait_event/freezable_timeout() documentation says that:
>  * Returns:
>  * 0 if the @condition evaluated to %false after the @timeout elapsed,
> 			or
>  * 1 if the @condition evaluated to %true after the @timeout elapsed,
>  * or the remaining jiffies (at least 1) if the @condition evaluated
>  * to %true before the @timeout elapsed.
> 
> which means the condition must be evaluated to true or timeout should be
> elapsed for the function wait_event_freezable_timeout() to return.
> 
> Please check the macro implementation of __wait_event, where it will be
> in for(;;) till the condition is evaluated to true or timeout happens.
> #define __wait_event_freezable_timeout(wq_head, condition, timeout)
> 
>         ___wait_event(wq_head, ___wait_cond_timeout(condition),
> 
>                       TASK_INTERRUPTIBLE, 0, timeout,
> 
>                       __ret = freezable_schedule_timeout(__ret))
> 
> Thus the plain wakeup of kcompactd don't do the proactive compact work.
> And so we should identify its wakeup for proactive work with a separate
> flag.

OK, you're right, I forgot that the macro has the for loop to guard against
spurious wakeups.


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-06-17 16:18 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-31 10:54 [PATCH V3 0/2] mm: compaction: proactive compaction trigger by user Charan Teja Reddy
2021-05-31 10:54 ` [PATCH v3 1/2] mm: compaction: support triggering of proactive compaction " Charan Teja Reddy
2021-06-07 10:38   ` Charan Teja Kalla
2021-06-14 14:57     ` Charan Teja Kalla
2021-06-16 11:59   ` Vlastimil Babka
2021-06-17  7:30     ` Charan Teja Kalla
2021-06-17 14:37       ` Vlastimil Babka
2021-06-17 16:05         ` Charan Teja Kalla
2021-06-17 16:17           ` Vlastimil Babka
2021-05-31 10:54 ` [PATCH v3 2/2] mm: compaction: fix wakeup logic of proactive compaction Charan Teja Reddy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).