All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vlastimil Babka <vbabka@suse.cz>
To: Charan Teja Reddy <charante@codeaurora.org>,
	akpm@linux-foundation.org, nigupta@nvidia.com,
	hannes@cmpxchg.org, corbet@lwn.net, mcgrof@kernel.org,
	keescook@chromium.org, yzaikin@google.com, aarcange@redhat.com,
	cl@linux.com, xi.fengfei@h3c.com, mchehab+huawei@kernel.org,
	andrew.a.klychkov@gmail.com, dave.hansen@linux.intel.com,
	bhe@redhat.com, iamjoonsoo.kim@lge.com, mateusznosek0@gmail.com,
	sh_def@163.com, vinmenon@codeaurora.org
Cc: linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH v3 1/2] mm: compaction: support triggering of proactive compaction by user
Date: Wed, 16 Jun 2021 13:59:40 +0200	[thread overview]
Message-ID: <88abfdb6-2c13-b5a6-5b46-742d12d1c910@suse.cz> (raw)
In-Reply-To: <7db6a29a64b29d56cde46c713204428a4b95f0ab.1622454385.git.charante@codeaurora.org>

On 5/31/21 12:54 PM, Charan Teja Reddy wrote:
> The proactive compaction[1] gets triggered for every 500msec and run
> compaction on the node for COMPACTION_HPAGE_ORDER (usually order-9)
> pages based on the value set to sysctl.compaction_proactiveness.
> Triggering the compaction for every 500msec in search of
> COMPACTION_HPAGE_ORDER pages is not needed for all applications,
> especially on the embedded system usecases which may have few MB's of
> RAM. Enabling the proactive compaction in its state will endup in
> running almost always on such systems.
> 
> Other side, proactive compaction can still be very much useful for
> getting a set of higher order pages in some controllable
> manner(controlled by using the sysctl.compaction_proactiveness). Thus on
> systems where enabling the proactive compaction always may proove not
> required, can trigger the same from user space on write to its sysctl
> interface. As an example, say app launcher decide to launch the memory
> heavy application which can be launched fast if it gets more higher
> order pages thus launcher can prepare the system in advance by
> triggering the proactive compaction from userspace.
> 
> This triggering of proactive compaction is done on a write to
> sysctl.compaction_proactiveness by user.
> 
> [1]https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=facdaa917c4d5a376d09d25865f5a863f906234a
> 
> Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
> ---
> changes in V2:

You forgot to also summarize the changes. Please do in next version.

>   */
>  unsigned int __read_mostly sysctl_compaction_proactiveness = 20;
>  
> +int compaction_proactiveness_sysctl_handler(struct ctl_table *table, int write,
> +		void *buffer, size_t *length, loff_t *ppos)
> +{
> +	int rc, nid;
> +
> +	rc = proc_dointvec_minmax(table, write, buffer, length, ppos);
> +	if (rc)
> +		return rc;
> +
> +	if (write && sysctl_compaction_proactiveness) {
> +		for_each_online_node(nid) {
> +			pg_data_t *pgdat = NODE_DATA(nid);
> +
> +			if (pgdat->proactive_compact_trigger)
> +				continue;
> +
> +			pgdat->proactive_compact_trigger = true;

I don't like the new variable. I wish we could do without it. I understand this
is added to ignore proactive_defer.
We could instead expose proactive_defer in pgdat and reset it to 0 before wakeup
(instead being a thread variable in kcompactd). But that would be racy with the
decreases done by kcompactd.
But I like the patch 2/2 and the idea could be extended to proactive_defer
handling. If there's no proactive_defer, timeout is
HPAGE_FRAG_CHECK_INTERVAL_MSEC. If kcompactd decides to defer, timeout would be
HPAGE_FRAG_CHECK_INTERVAL_MSEC << COMPACT_MAX_DEFER_SHIFT. Thus, no more waking
up just to decrease proactive_defer, we can then get rid of the counter. On
writing new proactiveness just wake up and that's it, regardless of which
timeout there was at the moment.
The only change is, if we get woken up to do non-proactive work, by
wakeup_kcompactd(), the proactive_defer value would be now be effectively lost.
I think it's OK as wakeup_kcompactd() means the condition of the zone changed
substantionally anyway and carrying on with previous defer makes not much sense.
What do you think?

> +			wake_up_interruptible(&pgdat->kcompactd_wait);
> +		}
> +	}
> +
> +	return 0;
> +}
> +
>  /*
>   * This is the entry point for compacting all nodes via
>   * /proc/sys/vm/compact_memory
> @@ -2752,7 +2776,8 @@ void compaction_unregister_node(struct node *node)
>  
>  static inline bool kcompactd_work_requested(pg_data_t *pgdat)
>  {
> -	return pgdat->kcompactd_max_order > 0 || kthread_should_stop();
> +	return pgdat->kcompactd_max_order > 0 || kthread_should_stop() ||
> +		pgdat->proactive_compact_trigger;
>  }
>  
>  static bool kcompactd_node_suitable(pg_data_t *pgdat)
> @@ -2905,7 +2930,8 @@ static int kcompactd(void *p)
>  		trace_mm_compaction_kcompactd_sleep(pgdat->node_id);
>  		if (wait_event_freezable_timeout(pgdat->kcompactd_wait,
>  			kcompactd_work_requested(pgdat),
> -			msecs_to_jiffies(HPAGE_FRAG_CHECK_INTERVAL_MSEC))) {
> +			msecs_to_jiffies(HPAGE_FRAG_CHECK_INTERVAL_MSEC)) &&
> +			!pgdat->proactive_compact_trigger) {
>  
>  			psi_memstall_enter(&pflags);
>  			kcompactd_do_work(pgdat);
> @@ -2917,10 +2943,20 @@ static int kcompactd(void *p)
>  		if (should_proactive_compact_node(pgdat)) {
>  			unsigned int prev_score, score;
>  
> -			if (proactive_defer) {
> +			/*
> +			 * On wakeup of proactive compaction by sysctl
> +			 * write, ignore the accumulated defer score.
> +			 * Anyway, if the proactive compaction didn't
> +			 * make any progress for the new value, it will
> +			 * be further deferred by 2^COMPACT_MAX_DEFER_SHIFT
> +			 * times.
> +			 */
> +			if (proactive_defer &&
> +				!pgdat->proactive_compact_trigger) {
>  				proactive_defer--;
>  				continue;
>  			}
> +
>  			prev_score = fragmentation_score_node(pgdat);
>  			proactive_compact_node(pgdat);
>  			score = fragmentation_score_node(pgdat);
> @@ -2931,6 +2967,8 @@ static int kcompactd(void *p)
>  			proactive_defer = score < prev_score ?
>  					0 : 1 << COMPACT_MAX_DEFER_SHIFT;
>  		}
> +		if (pgdat->proactive_compact_trigger)
> +			pgdat->proactive_compact_trigger = false;
>  	}
>  
>  	return 0;
> 


  parent reply	other threads:[~2021-06-16 11:59 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-31 10:54 [PATCH V3 0/2] mm: compaction: proactive compaction trigger by user Charan Teja Reddy
2021-05-31 10:54 ` [PATCH v3 1/2] mm: compaction: support triggering of proactive compaction " Charan Teja Reddy
2021-06-07 10:38   ` Charan Teja Kalla
2021-06-14 14:57     ` Charan Teja Kalla
2021-06-16 11:59   ` Vlastimil Babka [this message]
2021-06-17  7:30     ` Charan Teja Kalla
2021-06-17 14:37       ` Vlastimil Babka
2021-06-17 16:05         ` Charan Teja Kalla
2021-06-17 16:17           ` Vlastimil Babka
2021-05-31 10:54 ` [PATCH v3 2/2] mm: compaction: fix wakeup logic of proactive compaction Charan Teja Reddy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=88abfdb6-2c13-b5a6-5b46-742d12d1c910@suse.cz \
    --to=vbabka@suse.cz \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=andrew.a.klychkov@gmail.com \
    --cc=bhe@redhat.com \
    --cc=charante@codeaurora.org \
    --cc=cl@linux.com \
    --cc=corbet@lwn.net \
    --cc=dave.hansen@linux.intel.com \
    --cc=hannes@cmpxchg.org \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=keescook@chromium.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mateusznosek0@gmail.com \
    --cc=mcgrof@kernel.org \
    --cc=mchehab+huawei@kernel.org \
    --cc=nigupta@nvidia.com \
    --cc=sh_def@163.com \
    --cc=vinmenon@codeaurora.org \
    --cc=xi.fengfei@h3c.com \
    --cc=yzaikin@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.