From: Ryan Roberts <ryan.roberts@arm.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: Zi Yan <ziy@nvidia.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, Yang Shi <shy828301@gmail.com>,
	Huang Ying <ying.huang@intel.com>
Subject: Re: [PATCH v3 10/18] mm: Allow non-hugetlb large folios to be batch processed
Date: Fri, 8 Mar 2024 11:44:35 +0000
Message-ID: <e8911b30-e96b-486c-a92a-c3513facc12e@arm.com>
In-Reply-To: <Zen6VDC5B_SN4zpR@casper.infradead.org>

> The thought occurs that we don't need to take the folios off the list.
> I don't know that will fix anything, but this will fix your "running out
> of memory" problem -- I forgot to drop the reference if folio_trylock()
> failed.  Of course, I can't call folio_put() inside the lock, so may
> as well move the trylock back to the second loop.
> 
> Again, compile-tested only.
> 
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index fd745bcc97ff..4a2ab17f802d 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -3312,7 +3312,7 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
>  	struct pglist_data *pgdata = NODE_DATA(sc->nid);
>  	struct deferred_split *ds_queue = &pgdata->deferred_split_queue;
>  	unsigned long flags;
> -	LIST_HEAD(list);
> +	struct folio_batch batch;
>  	struct folio *folio, *next;
>  	int split = 0;
>  
> @@ -3321,36 +3321,31 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
>  		ds_queue = &sc->memcg->deferred_split_queue;
>  #endif
>  
> +	folio_batch_init(&batch);
>  	spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
> -	/* Take pin on all head pages to avoid freeing them under us */
> +	/* Take ref on all folios to avoid freeing them under us */
>  	list_for_each_entry_safe(folio, next, &ds_queue->split_queue,
>  							_deferred_list) {
> -		if (folio_try_get(folio)) {
> -			list_move(&folio->_deferred_list, &list);
> -		} else {
> -			/* We lost race with folio_put() */
> -			list_del_init(&folio->_deferred_list);
> -			ds_queue->split_queue_len--;
> +		if (!folio_try_get(folio))
> +			continue;
> +		if (folio_batch_add(&batch, folio) == 0) {
> +			--sc->nr_to_scan;
> +			break;
>  		}
>  		if (!--sc->nr_to_scan)
>  			break;
>  	}
>  	spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
>  
> -	list_for_each_entry_safe(folio, next, &list, _deferred_list) {
> +	while ((folio = folio_batch_next(&batch)) != NULL) {
>  		if (!folio_trylock(folio))
> -			goto next;
> -		/* split_huge_page() removes page from list on success */
> +			continue;
>  		if (!split_folio(folio))
>  			split++;
>  		folio_unlock(folio);
> -next:
> -		folio_put(folio);
>  	}
>  
> -	spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
> -	list_splice_tail(&list, &ds_queue->split_queue);
> -	spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
> +	folios_put(&batch);
>  
>  	/*
>  	 * Stop shrinker if we didn't split any page, but the queue is empty.


OK I've tested this; the good news is that I haven't seen any oopses or memory
leaks. The bad news is that it still takes an absolute age (hours) to complete
the same test that without "mm: Allow non-hugetlb large folios to be batch
processed" took a couple of mins. And during that time, the system is completely
unresponsive - serial terminal doesn't work - can't even break in with sysrq.
And sometimes I see RCU stall warnings.

Dumping all the CPU back traces with gdb, all the cores (except one) are
contending on the deferred split lock.

A couple of thoughts:

- Since we are now taking a maximum of 15 folios into a batch,
deferred_split_scan() is called much more often (in a tight loop from
do_shrink_slab()). Could it be that we are just trying to take the lock so much
more often now? I don't think it's quite that simple because we take the lock
for every single folio when adding it to the queue, so dequeuing should still
take the lock a factor of 15 less often than enqueuing does.

- do_shrink_slab() might be calling deferred_split_scan() in a tight loop with
deferred_split_scan() returning 0 most of the time. If there are still folios on
the deferred split list but deferred_split_scan() was unable to lock any folios
then it will return 0, not SHRINK_STOP, so do_shrink_slab() will keep calling
it, essentially livelocking. Has your patch changed the duration of the folio
being locked? I don't think so...

- Ahh, perhaps it's as simple as your fix has removed the code that removed the
folio from the deferred split queue if it fails to get a reference? That could
mean we end up returning 0 instead of SHRINK_STOP too. I'll have a play.
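
To make the last two points concrete, here is a toy userspace model (not kernel
code). It assumes a caller that keeps invoking the scan callback until it
returns SHRINK_STOP or a call budget runs out, and a scan that takes at most 15
folios per call and returns SHRINK_STOP only when it split nothing and the
queue is empty, as the comment at the end of the diff describes. struct
sim_queue, sim_scan(), sim_shrink() and the max_calls budget are all made up
for illustration; the real do_shrink_slab() accounting is more involved.

```c
#include <assert.h>

#define SHRINK_STOP (~0UL)  /* same value the kernel uses */
#define BATCH_SIZE  15UL    /* folios taken per scan call, per the email */

/* Hypothetical stand-in for the deferred split queue. */
struct sim_queue {
	unsigned long len;        /* folios still on the queue */
	unsigned long lockable;   /* how many of them can be locked and split */
	unsigned long lock_taken; /* times the queue lock was acquired */
};

/* Toy model of one deferred_split_scan() call. */
static unsigned long sim_scan(struct sim_queue *q)
{
	unsigned long batch, split;

	assert(q->lockable <= q->len);

	q->lock_taken++;	/* one queue-lock acquisition per call */
	batch = q->len < BATCH_SIZE ? q->len : BATCH_SIZE;
	split = batch < q->lockable ? batch : q->lockable;

	/* Only folios that were actually split leave the queue. */
	q->len -= split;
	q->lockable -= split;

	/* Stop only if nothing was split AND the queue is empty. */
	if (!split && q->len == 0)
		return SHRINK_STOP;
	return split;
}

/* Toy model of the caller: loop until SHRINK_STOP or budget exhausted. */
static unsigned long sim_shrink(struct sim_queue *q, unsigned long max_calls)
{
	unsigned long calls = 0;

	while (calls < max_calls) {
		calls++;
		if (sim_scan(q) == SHRINK_STOP)
			break;
	}
	return calls;
}
```

With 30 folios that can all be locked, sim_shrink() finishes in three calls
(two that split 15 each, then one that finds the queue empty). With 30 folios
that can never be locked, every call returns 0, so the loop burns its whole
budget, taking the queue lock each time and leaving all 30 folios queued.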


