* [PATCH] mm/huge_memory.c: __split_huge_page() use atomic ClearPageDirty()
From: Hugh Dickins @ 2018-05-30 1:50 UTC
To: Andrew Morton
Cc: Konstantin Khlebnikov, Kirill A. Shutemov, Nicholas Piggin,
linux-kernel, linux-mm
Swapping load on huge=always tmpfs (with khugepaged tuned up to be very
eager, but I'm not sure that is relevant) soon hung uninterruptibly,
waiting for page lock in shmem_getpage_gfp()'s find_lock_entry(), most
often when "cp -a" was trying to write to a smallish file. Debug showed
that the page in question was not locked, and page->mapping NULL by now,
but page->index consistent with having been in a huge page before.

Reproduced in minutes on a 4.15 kernel, even with 4.17's 605ca5ede764
("mm/huge_memory.c: reorder operations in __split_huge_page_tail()")
added in; but took hours to reproduce on a 4.17 kernel (no idea why).

The culprit proved to be the __ClearPageDirty() on tails beyond i_size
in __split_huge_page(): the non-atomic __bitoperation may have been safe
when 4.8's baa355fd3314 ("thp: file pages support for split_huge_page()")
introduced it, but liable to erase PageWaiters after 4.10's 62906027091f
("mm: add PageWaiters indicating tasks are waiting for a page bit").

Fixes: 62906027091f ("mm: add PageWaiters indicating tasks are waiting for a page bit")
Signed-off-by: Hugh Dickins <hughd@google.com>
---
It's not a 4.17-rc regression that this fixes, so no great need to slip
this into 4.17 at the last moment - though it makes a good companion to
Konstantin's 605ca5ede764. I think they both should go to stable, but
since Konstantin's already went into rc1 without that tag, we shall
have to recommend Konstantin's to GregKH out-of-band.
mm/huge_memory.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- 4.17-rc7/mm/huge_memory.c 2018-04-26 10:48:36.019288258 -0700
+++ linux/mm/huge_memory.c 2018-05-29 18:14:52.095512715 -0700
@@ -2431,7 +2431,7 @@ static void __split_huge_page(struct pag
__split_huge_page_tail(head, i, lruvec, list);
/* Some pages can be beyond i_size: drop them from page cache */
if (head[i].index >= end) {
- __ClearPageDirty(head + i);
+ ClearPageDirty(head + i);
__delete_from_page_cache(head + i, NULL);
if (IS_ENABLED(CONFIG_SHMEM) && PageSwapBacked(head))
shmem_uncharge(head->mapping->host, 1);
* Re: [PATCH] mm/huge_memory.c: __split_huge_page() use atomic ClearPageDirty()
From: Konstantin Khlebnikov @ 2018-05-30 7:59 UTC
To: Hugh Dickins, Andrew Morton
Cc: Kirill A. Shutemov, Nicholas Piggin, linux-kernel, linux-mm
On 30.05.2018 04:50, Hugh Dickins wrote:
> Swapping load on huge=always tmpfs (with khugepaged tuned up to be very
> eager, but I'm not sure that is relevant) soon hung uninterruptibly,
> waiting for page lock in shmem_getpage_gfp()'s find_lock_entry(), most
> often when "cp -a" was trying to write to a smallish file. Debug showed
> that the page in question was not locked, and page->mapping NULL by now,
> but page->index consistent with having been in a huge page before.
>
> Reproduced in minutes on a 4.15 kernel, even with 4.17's 605ca5ede764
> ("mm/huge_memory.c: reorder operations in __split_huge_page_tail()")
> added in; but took hours to reproduce on a 4.17 kernel (no idea why).
>
> The culprit proved to be the __ClearPageDirty() on tails beyond i_size
> in __split_huge_page(): the non-atomic __bitoperation may have been safe
> when 4.8's baa355fd3314 ("thp: file pages support for split_huge_page()")
> introduced it, but liable to erase PageWaiters after 4.10's 62906027091f
> ("mm: add PageWaiters indicating tasks are waiting for a page bit").
>
> Fixes: 62906027091f ("mm: add PageWaiters indicating tasks are waiting for a page bit")
> Signed-off-by: Hugh Dickins <hughd@google.com>
> ---
>
> It's not a 4.17-rc regression that this fixes, so no great need to slip
> this into 4.17 at the last moment - though it makes a good companion to
> Konstantin's 605ca5ede764. I think they both should go to stable, but
> since Konstantin's already went into rc1 without that tag, we shall
> have to recommend Konstantin's to GregKH out-of-band.
Good catch.

This is the same issue, so all 4.10+ kernels need them both.
Preserving known regressions in core pieces like lock_page() is a bad idea.
>
> mm/huge_memory.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> --- 4.17-rc7/mm/huge_memory.c 2018-04-26 10:48:36.019288258 -0700
> +++ linux/mm/huge_memory.c 2018-05-29 18:14:52.095512715 -0700
> @@ -2431,7 +2431,7 @@ static void __split_huge_page(struct pag
> __split_huge_page_tail(head, i, lruvec, list);
> /* Some pages can be beyond i_size: drop them from page cache */
> if (head[i].index >= end) {
> - __ClearPageDirty(head + i);
> + ClearPageDirty(head + i);
> __delete_from_page_cache(head + i, NULL);
> if (IS_ENABLED(CONFIG_SHMEM) && PageSwapBacked(head))
> shmem_uncharge(head->mapping->host, 1);
>
* Re: [PATCH] mm/huge_memory.c: __split_huge_page() use atomic ClearPageDirty()
From: Kirill A. Shutemov @ 2018-05-30 10:27 UTC
To: Hugh Dickins
Cc: Andrew Morton, Konstantin Khlebnikov, Nicholas Piggin,
linux-kernel, linux-mm
On Wed, May 30, 2018 at 01:50:22AM +0000, Hugh Dickins wrote:
> Swapping load on huge=always tmpfs (with khugepaged tuned up to be very
> eager, but I'm not sure that is relevant) soon hung uninterruptibly,
> waiting for page lock in shmem_getpage_gfp()'s find_lock_entry(), most
> often when "cp -a" was trying to write to a smallish file. Debug showed
> that the page in question was not locked, and page->mapping NULL by now,
> but page->index consistent with having been in a huge page before.
>
> Reproduced in minutes on a 4.15 kernel, even with 4.17's 605ca5ede764
> ("mm/huge_memory.c: reorder operations in __split_huge_page_tail()")
> added in; but took hours to reproduce on a 4.17 kernel (no idea why).
>
> The culprit proved to be the __ClearPageDirty() on tails beyond i_size
> in __split_huge_page(): the non-atomic __bitoperation may have been safe
> when 4.8's baa355fd3314 ("thp: file pages support for split_huge_page()")
> introduced it, but liable to erase PageWaiters after 4.10's 62906027091f
> ("mm: add PageWaiters indicating tasks are waiting for a page bit").
>
> Fixes: 62906027091f ("mm: add PageWaiters indicating tasks are waiting for a page bit")
> Signed-off-by: Hugh Dickins <hughd@google.com>
Thanks for catching this.
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
--
Kirill A. Shutemov