linux-kernel.vger.kernel.org archive mirror
* Re: [PATCH 5.10 00/21] 5.10.79-rc1 review
@ 2021-11-13 13:19 Tim Lewis
  2021-11-13 13:36 ` Greg Kroah-Hartman
  0 siblings, 1 reply; 16+ messages in thread
From: Tim Lewis @ 2021-11-13 13:19 UTC (permalink / raw)
  To: Yang Shi
  Cc: Greg Kroah-Hartman, Naresh Kamboju, Sudip Mukherjee, f.fainelli,
	torvalds, open list, lkft-triage, patches, stable, pavel, akpm,
	jonathanh, shuah, linux, Naoya Horiguchi, Kirill A. Shutemov,
	Hugh Dickins, Matthew Wilcox, Oscar Salvador, Peter Xu

> commit 8615ff6dd1ac9e01b6fcf0fc0652353f79f524ed
> Author: Yang Shi <shy828301@gmail.com>
> Date:   Thu Oct 28 14:36:11 2021 -0700
>
>     mm: filemap: check if THP has hwpoisoned subpage for PMD page fault
>
>     commit eac96c3efdb593df1a57bb5b95dbe037bfa9a522 upstream.

For the sake of testing,
other than this breaking systemd-journal,
postgresql is another service that would hang forever with 100% CPU,
on arm64 (odroid-c4) using Ubuntu 20.04.

* Re: [PATCH 5.10 00/21] 5.10.79-rc1 review
  2021-11-13 13:19 [PATCH 5.10 00/21] 5.10.79-rc1 review Tim Lewis
@ 2021-11-13 13:36 ` Greg Kroah-Hartman
  0 siblings, 0 replies; 16+ messages in thread
From: Greg Kroah-Hartman @ 2021-11-13 13:36 UTC (permalink / raw)
  To: Tim Lewis
  Cc: Yang Shi, Naresh Kamboju, Sudip Mukherjee, f.fainelli, torvalds,
	open list, lkft-triage, patches, stable, pavel, akpm, jonathanh,
	shuah, linux, Naoya Horiguchi, Kirill A. Shutemov, Hugh Dickins,
	Matthew Wilcox, Oscar Salvador, Peter Xu

On Sat, Nov 13, 2021 at 08:19:12AM -0500, Tim Lewis wrote:
> > commit 8615ff6dd1ac9e01b6fcf0fc0652353f79f524ed
> > Author: Yang Shi <shy828301@gmail.com>
> > Date:   Thu Oct 28 14:36:11 2021 -0700
> >
> >     mm: filemap: check if THP has hwpoisoned subpage for PMD page fault
> >
> >     commit eac96c3efdb593df1a57bb5b95dbe037bfa9a522 upstream.
> 
> For the sake of testing,
> other than this breaking systemd-journal,
> postgresql is another service that would hang forever with 100% CPU,
> on arm64 (odroid-c4) using Ubuntu 20.04.

Thanks, this commit was dropped from this release.

greg k-h

* Re: [PATCH 5.10 00/21] 5.10.79-rc1 review
  2021-11-11 14:54   ` Naresh Kamboju
@ 2021-11-12 13:47     ` Greg Kroah-Hartman
  0 siblings, 0 replies; 16+ messages in thread
From: Greg Kroah-Hartman @ 2021-11-12 13:47 UTC (permalink / raw)
  To: Naresh Kamboju
  Cc: Sudip Mukherjee, f.fainelli, torvalds, linux-kernel, lkft-triage,
	patches, stable, pavel, akpm, jonathanh, shuah, linux, Yang Shi,
	Naoya Horiguchi, Kirill A. Shutemov, Hugh Dickins,
	Matthew Wilcox, Oscar Salvador, Peter Xu

On Thu, Nov 11, 2021 at 08:24:42PM +0530, Naresh Kamboju wrote:
> On Thu, 11 Nov 2021 at 18:32, Sudip Mukherjee
> <sudipm.mukherjee@gmail.com> wrote:
> >
> > Hi Greg,
> >
> > On Wed, Nov 10, 2021 at 07:43:46PM +0100, Greg Kroah-Hartman wrote:
> > > This is the start of the stable review cycle for the 5.10.79 release.
> > > There are 21 patches in this series, all will be posted as a response
> > > to this one.  If anyone has any issues with these being applied, please
> > > let me know.
> > >
> > > Responses should be made by Fri, 12 Nov 2021 18:19:54 +0000.
> > > Anything received after that time might be too late.
> >
> > systemd-journal-flush.service failed due to a timeout resulting in a very very
> > slow boot on my test laptop. qemu test on openqa failed due to the same problem.
> >
> > https://openqa.qa.codethink.co.uk/tests/365
> >
> > A bisect showed the problem to be 8615ff6dd1ac ("mm: filemap: check if THP has
> > hwpoisoned subpage for PMD page fault"). Reverting it on top of 5.10.79-rc1
> > fixed the problem.
> > Incidentally, I have been having a similar problem with Linus's tree
> > for the last few days; it has been failing since 20211106 (I did not get the
> > time to check).
> 
> I have also noticed this problem and Anders bisected and found this
> first bad commit.
> 
> Failed test log link,
> A start job is running for Journal Service (5s / 1min 27s)
> https://lkft.validation.linaro.org/scheduler/job/3901980#L2234
> 
> Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
> 
> Bisect log:
> 
> # bad: [b85617a6291f710807d0cd078c230626dee60b16] Linux 5.10.79-rc1
> # good: [5040520482a594e92d4f69141229a6dd26173511] Linux 5.10.78
> git bisect start 'b85617a6291f710807d0cd078c230626dee60b16' '5040520482a594e92d4f69141229a6dd26173511'
> # bad: [7ceeda856035991a6c9804916987a03759745fb0] staging: rtl8712: fix use-after-free in rtl8712_dl_fw
> git bisect bad 7ceeda856035991a6c9804916987a03759745fb0
> # bad: [8615ff6dd1ac9e01b6fcf0fc0652353f79f524ed] mm: filemap: check if THP has hwpoisoned subpage for PMD page fault
> git bisect bad 8615ff6dd1ac9e01b6fcf0fc0652353f79f524ed
> # good: [e9cb6ce4690749d42013f1d56874c624d7241740] Revert "x86/kvm: fix vcpu-id indexed array sizes"
> git bisect good e9cb6ce4690749d42013f1d56874c624d7241740
> # good: [dc385dfc126d51d7a93db694f8e151afe60eb06a] mm: hwpoison: remove the unnecessary THP check
> git bisect good dc385dfc126d51d7a93db694f8e151afe60eb06a
> # first bad commit: [8615ff6dd1ac9e01b6fcf0fc0652353f79f524ed] mm: filemap: check if THP has hwpoisoned subpage for PMD page fault
> commit 8615ff6dd1ac9e01b6fcf0fc0652353f79f524ed
> Author: Yang Shi <shy828301@gmail.com>
> Date:   Thu Oct 28 14:36:11 2021 -0700
> 
>     mm: filemap: check if THP has hwpoisoned subpage for PMD page fault
> 
>     commit eac96c3efdb593df1a57bb5b95dbe037bfa9a522 upstream.
> 
>     When handling shmem page fault the THP with corrupted subpage could be
>     PMD mapped if certain conditions are satisfied.  But kernel is supposed
>     to send SIGBUS when trying to map hwpoisoned page.
> 
>     There are two paths which may do PMD map: fault around and regular
>     fault.
> 
>     Before commit f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault()
>     codepaths") the thing was even worse in fault around path.  The THP
>     could be PMD mapped as long as the VMA fits regardless what subpage is
>     accessed and corrupted.  After this commit as long as head page is not
>     corrupted the THP could be PMD mapped.
> 
>     In the regular fault path the THP could be PMD mapped as long as the
>     corrupted page is not accessed and the VMA fits.
> 
>     This loophole could be fixed by iterating every subpage to check if any
>     of them is hwpoisoned or not, but it is somewhat costly in page fault
>     path.
> 
>     So introduce a new page flag called HasHWPoisoned on the first tail
>     page.  It indicates the THP has hwpoisoned subpage(s).  It is set if any
>     subpage of THP is found hwpoisoned by memory failure and after the
>     refcount is bumped successfully, then cleared when the THP is freed or
>     split.
> 
>     The soft offline path doesn't need this since soft offline handler just
>     marks a subpage hwpoisoned when the subpage is migrated successfully.
>     But shmem THP didn't get split then migrated at all.
> 
>     Link: https://lkml.kernel.org/r/20211020210755.23964-3-shy828301@gmail.com
>     Fixes: 800d8c63b2e9 ("shmem: add huge pages support")
>     Signed-off-by: Yang Shi <shy828301@gmail.com>
>     Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
>     Suggested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>     Cc: Hugh Dickins <hughd@google.com>
>     Cc: Matthew Wilcox <willy@infradead.org>
>     Cc: Oscar Salvador <osalvador@suse.de>
>     Cc: Peter Xu <peterx@redhat.com>
>     Cc: <stable@vger.kernel.org>
>     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
>     Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> 
>  include/linux/page-flags.h | 23 +++++++++++++++++++++++
>  mm/huge_memory.c           |  2 ++
>  mm/memory-failure.c        | 14 ++++++++++++++
>  mm/memory.c                |  9 +++++++++
>  mm/page_alloc.c            |  4 +++-
>  5 files changed, 51 insertions(+), 1 deletion(-)
> 

Thanks, I'm going to go drop this patch again.

This has been the second time we have tried to add it.  Yang, are you
_SURE_ it needs to be in the 5.10.y tree?  So far it's been nothing but
build and boot failures :(

thanks,

greg k-h

* Re: [PATCH 5.10 00/21] 5.10.79-rc1 review
  2021-11-11 19:45   ` Sudip Mukherjee
@ 2021-11-12 13:46     ` Greg Kroah-Hartman
  0 siblings, 0 replies; 16+ messages in thread
From: Greg Kroah-Hartman @ 2021-11-12 13:46 UTC (permalink / raw)
  To: Sudip Mukherjee
  Cc: linux-kernel, Linus Torvalds, Andrew Morton, Guenter Roeck,
	Shuah Khan, patches, lkft-triage, Pavel Machek, Jonathan Hunter,
	Florian Fainelli, Stable

On Thu, Nov 11, 2021 at 07:45:09PM +0000, Sudip Mukherjee wrote:
> On Thu, Nov 11, 2021 at 1:01 PM Sudip Mukherjee
> <sudipm.mukherjee@gmail.com> wrote:
> >
> > Hi Greg,
> >
> > On Wed, Nov 10, 2021 at 07:43:46PM +0100, Greg Kroah-Hartman wrote:
> > > This is the start of the stable review cycle for the 5.10.79 release.
> > > There are 21 patches in this series, all will be posted as a response
> > > to this one.  If anyone has any issues with these being applied, please
> > > let me know.
> > >
> > > Responses should be made by Fri, 12 Nov 2021 18:19:54 +0000.
> > > Anything received after that time might be too late.
> >
> > systemd-journal-flush.service failed due to a timeout resulting in a very very
> > slow boot on my test laptop. qemu test on openqa failed due to the same problem.
> 
> Build test:
> mips (gcc version 11.2.1 20211104): 63 configs -> no new failure
> arm (gcc version 11.2.1 20211104): 105 configs -> no new failure
> arm64 (gcc version 11.2.1 20211104): 3 configs -> no failure
> x86_64 (gcc version 11.2.1 20211104): 4 configs -> no failure
> 
> Boot test:
> x86_64: Regression mail sent earlier.  Caused by 8615ff6dd1ac ("mm: filemap:
> check if THP has hwpoisoned subpage for PMD page fault").
> 
> arm64: Booted on rpi4b (4GB model). No regression. [1]
> 
> [1]. https://openqa.qa.codethink.co.uk/tests/362
> 
> 
> Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>

Will go drop the offending patch, thanks.

greg k-h

* Re: [PATCH 5.10 00/21] 5.10.79-rc1 review
  2021-11-11 21:36   ` Shuah Khan
@ 2021-11-12 13:46     ` Greg Kroah-Hartman
  0 siblings, 0 replies; 16+ messages in thread
From: Greg Kroah-Hartman @ 2021-11-12 13:46 UTC (permalink / raw)
  To: Shuah Khan
  Cc: Sudip Mukherjee, linux-kernel, torvalds, akpm, linux, shuah,
	patches, lkft-triage, pavel, jonathanh, f.fainelli, stable

On Thu, Nov 11, 2021 at 02:36:08PM -0700, Shuah Khan wrote:
> On 11/11/21 6:01 AM, Sudip Mukherjee wrote:
> > Hi Greg,
> > 
> > On Wed, Nov 10, 2021 at 07:43:46PM +0100, Greg Kroah-Hartman wrote:
> > > This is the start of the stable review cycle for the 5.10.79 release.
> > > There are 21 patches in this series, all will be posted as a response
> > > to this one.  If anyone has any issues with these being applied, please
> > > let me know.
> > > 
> > > Responses should be made by Fri, 12 Nov 2021 18:19:54 +0000.
> > > Anything received after that time might be too late.
> > 
> > systemd-journal-flush.service failed due to a timeout resulting in a very very
> > slow boot on my test laptop. qemu test on openqa failed due to the same problem.
> > 
> > https://openqa.qa.codethink.co.uk/tests/365
> > 
> > A bisect showed the problem to be 8615ff6dd1ac ("mm: filemap: check if THP has
> > hwpoisoned subpage for PMD page fault"). Reverting it on top of 5.10.79-rc1
> > fixed the problem.
> > Incidentally, I have been having a similar problem with Linus's tree
> > for the last few days; it has been failing since 20211106 (I did not get the
> > time to check).
> > 
> > 
> 
> Reverting "mm: filemap: check if THP has hwpoisoned subpage for PMD page fault"
> worked for me on my test system.
> 
> With this commit boots are long and shutdown was at the 20+ minute mark when
> I powered it down. This commit isn't in any of the other release candidates.

Thanks, will go drop this commit.

greg k-h

* Re: [PATCH 5.10 00/21] 5.10.79-rc1 review
  2021-11-12  1:15 ` Guenter Roeck
@ 2021-11-12 13:45   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 16+ messages in thread
From: Greg Kroah-Hartman @ 2021-11-12 13:45 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: linux-kernel, torvalds, akpm, shuah, patches, lkft-triage, pavel,
	jonathanh, f.fainelli, stable

On Thu, Nov 11, 2021 at 05:15:01PM -0800, Guenter Roeck wrote:
> On Wed, Nov 10, 2021 at 07:43:46PM +0100, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 5.10.79 release.
> > There are 21 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> > 
> > Responses should be made by Fri, 12 Nov 2021 18:19:54 +0000.
> > Anything received after that time might be too late.
> > 
> 
> Build results:
> 	total: 159 pass: 159 fail: 0
> Qemu test results:
> 	total: 474 pass: 469 fail: 5
> Failed tests:
> 	ppc64:powernv:powernv_defconfig:smp2:nvme:net,i82559a:rootfs
> 	ppc64:powernv:powernv_defconfig:usb-xhci:net,i82562:rootfs
> 	ppc64:powernv:powernv_defconfig:scsi[MEGASAS]:net,i82557a:rootfs
> 	ppc64:powernv:powernv_defconfig:smp2:sdhci:mmc:net,i82801:rootfs
> 	ppc64:powernv:powernv_defconfig:mtd32:net,rtl8139:rootfs
> 
> Reverting commit 8615ff6dd1ac ("mm: filemap: check if THP has hwpoisoned
> subpage for PMD page fault") fixes the problem.

Ugh, ok, I'm going to drop this patch (and the one before it) again.

thanks for the testing.

greg k-h

* Re: [PATCH 5.10 00/21] 5.10.79-rc1 review
  2021-11-10 18:43 Greg Kroah-Hartman
                   ` (4 preceding siblings ...)
  2021-11-11 16:42 ` Pavel Machek
@ 2021-11-12  1:15 ` Guenter Roeck
  2021-11-12 13:45   ` Greg Kroah-Hartman
  5 siblings, 1 reply; 16+ messages in thread
From: Guenter Roeck @ 2021-11-12  1:15 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, torvalds, akpm, shuah, patches, lkft-triage, pavel,
	jonathanh, f.fainelli, stable

On Wed, Nov 10, 2021 at 07:43:46PM +0100, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.10.79 release.
> There are 21 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Fri, 12 Nov 2021 18:19:54 +0000.
> Anything received after that time might be too late.
> 

Build results:
	total: 159 pass: 159 fail: 0
Qemu test results:
	total: 474 pass: 469 fail: 5
Failed tests:
	ppc64:powernv:powernv_defconfig:smp2:nvme:net,i82559a:rootfs
	ppc64:powernv:powernv_defconfig:usb-xhci:net,i82562:rootfs
	ppc64:powernv:powernv_defconfig:scsi[MEGASAS]:net,i82557a:rootfs
	ppc64:powernv:powernv_defconfig:smp2:sdhci:mmc:net,i82801:rootfs
	ppc64:powernv:powernv_defconfig:mtd32:net,rtl8139:rootfs

Reverting commit 8615ff6dd1ac ("mm: filemap: check if THP has hwpoisoned
subpage for PMD page fault") fixes the problem.

Guenter

* Re: [PATCH 5.10 00/21] 5.10.79-rc1 review
  2021-11-11 13:01 ` Sudip Mukherjee
  2021-11-11 14:54   ` Naresh Kamboju
  2021-11-11 19:45   ` Sudip Mukherjee
@ 2021-11-11 21:36   ` Shuah Khan
  2021-11-12 13:46     ` Greg Kroah-Hartman
  2 siblings, 1 reply; 16+ messages in thread
From: Shuah Khan @ 2021-11-11 21:36 UTC (permalink / raw)
  To: Sudip Mukherjee, Greg Kroah-Hartman
  Cc: linux-kernel, torvalds, akpm, linux, shuah, patches, lkft-triage,
	pavel, jonathanh, f.fainelli, stable, Shuah Khan

On 11/11/21 6:01 AM, Sudip Mukherjee wrote:
> Hi Greg,
> 
> On Wed, Nov 10, 2021 at 07:43:46PM +0100, Greg Kroah-Hartman wrote:
>> This is the start of the stable review cycle for the 5.10.79 release.
>> There are 21 patches in this series, all will be posted as a response
>> to this one.  If anyone has any issues with these being applied, please
>> let me know.
>>
>> Responses should be made by Fri, 12 Nov 2021 18:19:54 +0000.
>> Anything received after that time might be too late.
> 
> systemd-journal-flush.service failed due to a timeout resulting in a very very
> slow boot on my test laptop. qemu test on openqa failed due to the same problem.
> 
> https://openqa.qa.codethink.co.uk/tests/365
> 
> A bisect showed the problem to be 8615ff6dd1ac ("mm: filemap: check if THP has
> hwpoisoned subpage for PMD page fault"). Reverting it on top of 5.10.79-rc1
> fixed the problem.
> Incidentally, I have been having a similar problem with Linus's tree
> for the last few days; it has been failing since 20211106 (I did not get the
> time to check).
> 
> 

Reverting "mm: filemap: check if THP has hwpoisoned subpage for PMD page fault"
worked for me on my test system.

With this commit boots are long and shutdown was at the 20+ minute mark when
I powered it down. This commit isn't in any of the other release candidates.

thanks,
-- Shuah

* Re: [PATCH 5.10 00/21] 5.10.79-rc1 review
  2021-11-11 13:01 ` Sudip Mukherjee
  2021-11-11 14:54   ` Naresh Kamboju
@ 2021-11-11 19:45   ` Sudip Mukherjee
  2021-11-12 13:46     ` Greg Kroah-Hartman
  2021-11-11 21:36   ` Shuah Khan
  2 siblings, 1 reply; 16+ messages in thread
From: Sudip Mukherjee @ 2021-11-11 19:45 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, Linus Torvalds, Andrew Morton, Guenter Roeck,
	Shuah Khan, patches, lkft-triage, Pavel Machek, Jonathan Hunter,
	Florian Fainelli, Stable

On Thu, Nov 11, 2021 at 1:01 PM Sudip Mukherjee
<sudipm.mukherjee@gmail.com> wrote:
>
> Hi Greg,
>
> On Wed, Nov 10, 2021 at 07:43:46PM +0100, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 5.10.79 release.
> > There are 21 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Fri, 12 Nov 2021 18:19:54 +0000.
> > Anything received after that time might be too late.
>
> systemd-journal-flush.service failed due to a timeout resulting in a very very
> slow boot on my test laptop. qemu test on openqa failed due to the same problem.

Build test:
mips (gcc version 11.2.1 20211104): 63 configs -> no new failure
arm (gcc version 11.2.1 20211104): 105 configs -> no new failure
arm64 (gcc version 11.2.1 20211104): 3 configs -> no failure
x86_64 (gcc version 11.2.1 20211104): 4 configs -> no failure

Boot test:
x86_64: Regression mail sent earlier.  Caused by 8615ff6dd1ac ("mm: filemap:
check if THP has hwpoisoned subpage for PMD page fault").

arm64: Booted on rpi4b (4GB model). No regression. [1]

[1]. https://openqa.qa.codethink.co.uk/tests/362


Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>

--
Regards
Sudip

* Re: [PATCH 5.10 00/21] 5.10.79-rc1 review
  2021-11-10 18:43 Greg Kroah-Hartman
                   ` (3 preceding siblings ...)
  2021-11-11 16:20 ` Shuah Khan
@ 2021-11-11 16:42 ` Pavel Machek
  2021-11-12  1:15 ` Guenter Roeck
  5 siblings, 0 replies; 16+ messages in thread
From: Pavel Machek @ 2021-11-11 16:42 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, torvalds, akpm, linux, shuah, patches, lkft-triage,
	pavel, jonathanh, f.fainelli, stable

Hi!

> This is the start of the stable review cycle for the 5.10.79 release.
> There are 21 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.

CIP testing did not find any problems here:

https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/tree/linux-5.10.y

Tested-by: Pavel Machek (CIP) <pavel@denx.de>

Best regards,
                                                                Pavel
-- 
DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany


* Re: [PATCH 5.10 00/21] 5.10.79-rc1 review
  2021-11-10 18:43 Greg Kroah-Hartman
                   ` (2 preceding siblings ...)
  2021-11-11 13:01 ` Sudip Mukherjee
@ 2021-11-11 16:20 ` Shuah Khan
  2021-11-11 16:42 ` Pavel Machek
  2021-11-12  1:15 ` Guenter Roeck
  5 siblings, 0 replies; 16+ messages in thread
From: Shuah Khan @ 2021-11-11 16:20 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-kernel
  Cc: torvalds, akpm, linux, shuah, patches, lkft-triage, pavel,
	jonathanh, f.fainelli, stable, Shuah Khan

On 11/10/21 11:43 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.10.79 release.
> There are 21 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Fri, 12 Nov 2021 18:19:54 +0000.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.79-rc1.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h
> 

Compiled and booted on my test system.

dmesg regressions. It took a very long time in trying to start
Journal services and finally timed out. Previous boot was on
5.14.18-rc1; both boot and shutdown were clean.

> systemd[1]: systemd-journald.service: Failed with result 'timeout'.
> systemd[1]: Failed to start Journal Service.
> systemd[1]: systemd-journald.service: Consumed 3min 490ms CPU time.
> systemd[1]: systemd-journald.service: Scheduled restart job, restart counter is at 6.
> systemd[1]: Stopped Journal Service.
> systemd[1]: systemd-journald.service: Consumed 3min 490ms CPU time.
> systemd[1]: Starting Journal Service...
> systemd-journald[913]: File /run/log/journal/351d6659a0b4490baeff8ad3c4704a35/system.journal corrupted or uncleanly shut down, renaming and replacing.
> systemd[1]: Started Journal Service.


Tested-by: Shuah Khan <skhan@linuxfoundation.org>

thanks,
-- Shuah

* Re: [PATCH 5.10 00/21] 5.10.79-rc1 review
  2021-11-11 13:01 ` Sudip Mukherjee
@ 2021-11-11 14:54   ` Naresh Kamboju
  2021-11-12 13:47     ` Greg Kroah-Hartman
  2021-11-11 19:45   ` Sudip Mukherjee
  2021-11-11 21:36   ` Shuah Khan
  2 siblings, 1 reply; 16+ messages in thread
From: Naresh Kamboju @ 2021-11-11 14:54 UTC (permalink / raw)
  To: Sudip Mukherjee
  Cc: Greg Kroah-Hartman, f.fainelli, torvalds, linux-kernel,
	lkft-triage, patches, stable, pavel, akpm, jonathanh, shuah,
	linux, Yang Shi, Naoya Horiguchi, Kirill A. Shutemov,
	Hugh Dickins, Matthew Wilcox, Oscar Salvador, Peter Xu

On Thu, 11 Nov 2021 at 18:32, Sudip Mukherjee
<sudipm.mukherjee@gmail.com> wrote:
>
> Hi Greg,
>
> On Wed, Nov 10, 2021 at 07:43:46PM +0100, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 5.10.79 release.
> > There are 21 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Fri, 12 Nov 2021 18:19:54 +0000.
> > Anything received after that time might be too late.
>
> systemd-journal-flush.service failed due to a timeout resulting in a very very
> slow boot on my test laptop. qemu test on openqa failed due to the same problem.
>
> https://openqa.qa.codethink.co.uk/tests/365
>
> A bisect showed the problem to be 8615ff6dd1ac ("mm: filemap: check if THP has
> hwpoisoned subpage for PMD page fault"). Reverting it on top of 5.10.79-rc1
> fixed the problem.
> Incidentally, I have been having a similar problem with Linus's tree
> for the last few days; it has been failing since 20211106 (I did not get the
> time to check).

I have also noticed this problem and Anders bisected and found this
first bad commit.

Failed test log link,
A start job is running for Journal Service (5s / 1min 27s)
https://lkft.validation.linaro.org/scheduler/job/3901980#L2234

Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>

Bisect log:

# bad: [b85617a6291f710807d0cd078c230626dee60b16] Linux 5.10.79-rc1
# good: [5040520482a594e92d4f69141229a6dd26173511] Linux 5.10.78
git bisect start 'b85617a6291f710807d0cd078c230626dee60b16' '5040520482a594e92d4f69141229a6dd26173511'
# bad: [7ceeda856035991a6c9804916987a03759745fb0] staging: rtl8712: fix use-after-free in rtl8712_dl_fw
git bisect bad 7ceeda856035991a6c9804916987a03759745fb0
# bad: [8615ff6dd1ac9e01b6fcf0fc0652353f79f524ed] mm: filemap: check if THP has hwpoisoned subpage for PMD page fault
git bisect bad 8615ff6dd1ac9e01b6fcf0fc0652353f79f524ed
# good: [e9cb6ce4690749d42013f1d56874c624d7241740] Revert "x86/kvm: fix vcpu-id indexed array sizes"
git bisect good e9cb6ce4690749d42013f1d56874c624d7241740
# good: [dc385dfc126d51d7a93db694f8e151afe60eb06a] mm: hwpoison: remove the unnecessary THP check
git bisect good dc385dfc126d51d7a93db694f8e151afe60eb06a
# first bad commit: [8615ff6dd1ac9e01b6fcf0fc0652353f79f524ed] mm: filemap: check if THP has hwpoisoned subpage for PMD page fault
commit 8615ff6dd1ac9e01b6fcf0fc0652353f79f524ed
Author: Yang Shi <shy828301@gmail.com>
Date:   Thu Oct 28 14:36:11 2021 -0700

    mm: filemap: check if THP has hwpoisoned subpage for PMD page fault

    commit eac96c3efdb593df1a57bb5b95dbe037bfa9a522 upstream.

    When handling shmem page fault the THP with corrupted subpage could be
    PMD mapped if certain conditions are satisfied.  But kernel is supposed
    to send SIGBUS when trying to map hwpoisoned page.

    There are two paths which may do PMD map: fault around and regular
    fault.

    Before commit f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault()
    codepaths") the thing was even worse in fault around path.  The THP
    could be PMD mapped as long as the VMA fits regardless what subpage is
    accessed and corrupted.  After this commit as long as head page is not
    corrupted the THP could be PMD mapped.

    In the regular fault path the THP could be PMD mapped as long as the
    corrupted page is not accessed and the VMA fits.

    This loophole could be fixed by iterating every subpage to check if any
    of them is hwpoisoned or not, but it is somewhat costly in page fault
    path.

    So introduce a new page flag called HasHWPoisoned on the first tail
    page.  It indicates the THP has hwpoisoned subpage(s).  It is set if any
    subpage of THP is found hwpoisoned by memory failure and after the
    refcount is bumped successfully, then cleared when the THP is freed or
    split.

    The soft offline path doesn't need this since soft offline handler just
    marks a subpage hwpoisoned when the subpage is migrated successfully.
    But shmem THP didn't get split then migrated at all.

    Link: https://lkml.kernel.org/r/20211020210755.23964-3-shy828301@gmail.com
    Fixes: 800d8c63b2e9 ("shmem: add huge pages support")
    Signed-off-by: Yang Shi <shy828301@gmail.com>
    Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
    Suggested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Oscar Salvador <osalvador@suse.de>
    Cc: Peter Xu <peterx@redhat.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 include/linux/page-flags.h | 23 +++++++++++++++++++++++
 mm/huge_memory.c           |  2 ++
 mm/memory-failure.c        | 14 ++++++++++++++
 mm/memory.c                |  9 +++++++++
 mm/page_alloc.c            |  4 +++-
 5 files changed, 51 insertions(+), 1 deletion(-)
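
[Editorial sketch, not part of the patch] The trade-off described in the
commit message above (walking every subpage versus testing one per-THP flag)
can be modelled in a few lines of userspace Python. Everything here is
hypothetical: ToyTHP, can_pmd_map_slow() and can_pmd_map_fast() are
illustration names, not kernel symbols, and the 512-subpage figure assumes a
2 MiB huge page built from 4 KiB base pages.

```python
from dataclasses import dataclass, field

# Assumption: 2 MiB THP made of 4 KiB base pages (typical, not universal).
SUBPAGES_PER_THP = 512


@dataclass
class ToyTHP:
    """Toy stand-in for a transparent huge page; not a kernel structure."""
    poisoned: list = field(default_factory=lambda: [False] * SUBPAGES_PER_THP)
    # Models the HasHWPoisoned flag the commit message keeps on the first tail page.
    has_hwpoisoned: bool = False

    def memory_failure(self, index):
        """Mark one subpage as corrupted and raise the per-THP flag."""
        self.poisoned[index] = True
        self.has_hwpoisoned = True

    def split(self):
        """Per the commit message, the flag is cleared when the THP is split or freed."""
        self.has_hwpoisoned = False


def can_pmd_map_slow(thp):
    """Check every subpage: correct, but costly in the page fault path."""
    return not any(thp.poisoned)


def can_pmd_map_fast(thp):
    """Single flag test, the approach the commit message describes."""
    return not thp.has_hwpoisoned
```

While the THP is intact the two checks agree, but the fast one does constant
work per fault. This model says nothing about why the backport regressed on
5.10.y; it only illustrates the intent stated in the changelog.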


--
Linaro LKFT
https://lkft.linaro.org
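
[Editorial sketch] For readers who want to check a bisect log like the one in
this message mechanically, the comment lines that git bisect writes can be
parsed with a short script. This is a minimal sketch: summarize() is a
hypothetical helper, and the log text is copied from the message above with
the mail-wrapped lines rejoined.

```python
import re

BISECT_LOG = """\
# bad: [b85617a6291f710807d0cd078c230626dee60b16] Linux 5.10.79-rc1
# good: [5040520482a594e92d4f69141229a6dd26173511] Linux 5.10.78
# bad: [7ceeda856035991a6c9804916987a03759745fb0] staging: rtl8712: fix use-after-free in rtl8712_dl_fw
# bad: [8615ff6dd1ac9e01b6fcf0fc0652353f79f524ed] mm: filemap: check if THP has hwpoisoned subpage for PMD page fault
# good: [e9cb6ce4690749d42013f1d56874c624d7241740] Revert "x86/kvm: fix vcpu-id indexed array sizes"
# good: [dc385dfc126d51d7a93db694f8e151afe60eb06a] mm: hwpoison: remove the unnecessary THP check
# first bad commit: [8615ff6dd1ac9e01b6fcf0fc0652353f79f524ed] mm: filemap: check if THP has hwpoisoned subpage for PMD page fault
"""

# Matches the "# <verdict>: [<sha>] <subject>" comment lines of a bisect log.
LINE = re.compile(r"^# (bad|good|first bad commit): \[([0-9a-f]+)\] (.*)$")


def summarize(log):
    """Return (steps, first_bad) where steps is a list of (verdict, short_sha, subject)."""
    steps = []
    first_bad = None
    for line in log.splitlines():
        match = LINE.match(line)
        if not match:
            continue  # skip "git bisect ..." command lines and anything else
        verdict, sha, subject = match.groups()
        if verdict == "first bad commit":
            first_bad = (sha, subject)
        else:
            steps.append((verdict, sha[:12], subject))
    return steps, first_bad


if __name__ == "__main__":
    steps, first_bad = summarize(BISECT_LOG)
    for verdict, sha, subject in steps:
        print(f"{verdict:4} {sha} {subject}")
    print("first bad:", first_bad[0][:12], first_bad[1])
```

The abbreviated hash it reports for the first bad commit is the same
8615ff6dd1ac that the other testers in this thread name.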

* Re: [PATCH 5.10 00/21] 5.10.79-rc1 review
  2021-11-10 18:43 Greg Kroah-Hartman
  2021-11-10 20:09 ` Florian Fainelli
  2021-11-10 21:42 ` Fox Chen
@ 2021-11-11 13:01 ` Sudip Mukherjee
  2021-11-11 14:54   ` Naresh Kamboju
                     ` (2 more replies)
  2021-11-11 16:20 ` Shuah Khan
                   ` (2 subsequent siblings)
  5 siblings, 3 replies; 16+ messages in thread
From: Sudip Mukherjee @ 2021-11-11 13:01 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-kernel, torvalds, akpm, linux, shuah, patches, lkft-triage,
	pavel, jonathanh, f.fainelli, stable

Hi Greg,

On Wed, Nov 10, 2021 at 07:43:46PM +0100, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.10.79 release.
> There are 21 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Fri, 12 Nov 2021 18:19:54 +0000.
> Anything received after that time might be too late.

systemd-journal-flush.service failed due to a timeout resulting in a very very
slow boot on my test laptop. qemu test on openqa failed due to the same problem.

https://openqa.qa.codethink.co.uk/tests/365

A bisect showed the problem to be 8615ff6dd1ac ("mm: filemap: check if THP has
hwpoisoned subpage for PMD page fault"). Reverting it on top of 5.10.79-rc1
fixed the problem.
Incidentally, I have been having a similar problem with Linus's tree
for the last few days; it has been failing since 20211106 (I did not get the
time to check).
I will test mainline again with this commit reverted.


--
Regards
Sudip

* RE: [PATCH 5.10 00/21] 5.10.79-rc1 review
  2021-11-10 18:43 Greg Kroah-Hartman
  2021-11-10 20:09 ` Florian Fainelli
@ 2021-11-10 21:42 ` Fox Chen
  2021-11-11 13:01 ` Sudip Mukherjee
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 16+ messages in thread
From: Fox Chen @ 2021-11-10 21:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, torvalds, akpm, linux, shuah, patches,
	lkft-triage, pavel, jonathanh, f.fainelli, stable, Fox Chen

On Wed, 10 Nov 2021 19:43:46 +0100, Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> This is the start of the stable review cycle for the 5.10.79 release.
> There are 21 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Fri, 12 Nov 2021 18:19:54 +0000.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.79-rc1.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h
> 

5.10.79-rc1 successfully compiled and booted on my Raspberry Pi 4b (8 GB) (bcm2711)

Tested-by: Fox Chen <foxhlchen@gmail.com>


* Re: [PATCH 5.10 00/21] 5.10.79-rc1 review
  2021-11-10 18:43 Greg Kroah-Hartman
@ 2021-11-10 20:09 ` Florian Fainelli
  2021-11-10 21:42 ` Fox Chen
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 16+ messages in thread
From: Florian Fainelli @ 2021-11-10 20:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-kernel
  Cc: torvalds, akpm, linux, shuah, patches, lkft-triage, pavel,
	jonathanh, stable

On 11/10/21 10:43 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.10.79 release.
> There are 21 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Fri, 12 Nov 2021 18:19:54 +0000.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.79-rc1.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h

On ARCH_BRCMSTB, using 32-bit and 64-bit ARM kernels:

Tested-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian


* [PATCH 5.10 00/21] 5.10.79-rc1 review
@ 2021-11-10 18:43 Greg Kroah-Hartman
  2021-11-10 20:09 ` Florian Fainelli
                   ` (5 more replies)
  0 siblings, 6 replies; 16+ messages in thread
From: Greg Kroah-Hartman @ 2021-11-10 18:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, torvalds, akpm, linux, shuah, patches,
	lkft-triage, pavel, jonathanh, f.fainelli, stable

This is the start of the stable review cycle for the 5.10.79 release.
There are 21 patches in this series, all will be posted as a response
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Fri, 12 Nov 2021 18:19:54 +0000.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
	https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.79-rc1.gz
or in the git tree and branch at:
	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Linux 5.10.79-rc1

Johan Hovold <johan@kernel.org>
    rsi: fix control-message timeout

Gustavo A. R. Silva <gustavoars@kernel.org>
    media: staging/intel-ipu3: css: Fix wrong size comparison imgu_css_fw_init

Johan Hovold <johan@kernel.org>
    staging: rtl8192u: fix control-message timeouts

Johan Hovold <johan@kernel.org>
    staging: r8712u: fix control-message timeout

Johan Hovold <johan@kernel.org>
    comedi: vmk80xx: fix bulk and interrupt message timeouts

Johan Hovold <johan@kernel.org>
    comedi: vmk80xx: fix bulk-buffer overflow

Johan Hovold <johan@kernel.org>
    comedi: vmk80xx: fix transfer-buffer overflows

Johan Hovold <johan@kernel.org>
    comedi: ni_usb6501: fix NULL-deref in command paths

Johan Hovold <johan@kernel.org>
    comedi: dt9812: fix DMA buffers on stack

Jan Kara <jack@suse.cz>
    isofs: Fix out of bound access for corrupted isofs image

Pavel Skripkin <paskripkin@gmail.com>
    staging: rtl8712: fix use-after-free in rtl8712_dl_fw

Petr Mladek <pmladek@suse.com>
    printk/console: Allow to disable console output by using console="" or console=null

Todd Kjos <tkjos@google.com>
    binder: don't detect sender/target during buffer cleanup

James Buren <braewoods+lkml@braewoods.net>
    usb-storage: Add compatibility quirk flags for iODD 2531/2541

Viraj Shah <viraj.shah@linutronix.de>
    usb: musb: Balance list entry in musb_gadget_queue

Geert Uytterhoeven <geert@linux-m68k.org>
    usb: gadget: Mark USB_FSL_QE broken on 64-bit

Yang Shi <shy828301@gmail.com>
    mm: filemap: check if THP has hwpoisoned subpage for PMD page fault

Yang Shi <shy828301@gmail.com>
    mm: hwpoison: remove the unnecessary THP check

Neal Liu <neal_liu@aspeedtech.com>
    usb: ehci: handshake CMD_RUN instead of STS_HALT

Juergen Gross <jgross@suse.com>
    Revert "x86/kvm: fix vcpu-id indexed array sizes"

Paolo Bonzini <pbonzini@redhat.com>
    KVM: x86: avoid warning with -Wbitwise-instead-of-logical


-------------

Diffstat:

 Makefile                                    |   4 +-
 arch/x86/kvm/ioapic.c                       |   2 +-
 arch/x86/kvm/ioapic.h                       |   4 +-
 arch/x86/kvm/mmu/mmu.c                      |   2 +-
 drivers/android/binder.c                    |  14 ++--
 drivers/net/wireless/rsi/rsi_91x_usb.c      |   2 +-
 drivers/staging/comedi/drivers/dt9812.c     | 115 +++++++++++++++++++++-------
 drivers/staging/comedi/drivers/ni_usb6501.c |  10 +++
 drivers/staging/comedi/drivers/vmk80xx.c    |  28 +++----
 drivers/staging/media/ipu3/ipu3-css-fw.c    |   7 +-
 drivers/staging/media/ipu3/ipu3-css-fw.h    |   2 +-
 drivers/staging/rtl8192u/r8192U_core.c      |  18 ++---
 drivers/staging/rtl8712/usb_intf.c          |   4 +-
 drivers/staging/rtl8712/usb_ops_linux.c     |   2 +-
 drivers/usb/gadget/udc/Kconfig              |   1 +
 drivers/usb/host/ehci-hcd.c                 |  11 ++-
 drivers/usb/host/ehci-platform.c            |   6 ++
 drivers/usb/host/ehci.h                     |   1 +
 drivers/usb/musb/musb_gadget.c              |   4 +-
 drivers/usb/storage/unusual_devs.h          |  10 +++
 fs/isofs/inode.c                            |   2 +
 include/linux/page-flags.h                  |  23 ++++++
 kernel/printk/printk.c                      |   9 ++-
 mm/huge_memory.c                            |   2 +
 mm/memory-failure.c                         |  28 +++----
 mm/memory.c                                 |   9 +++
 mm/page_alloc.c                             |   4 +-
 27 files changed, 233 insertions(+), 91 deletions(-)




Thread overview: 16+ messages
2021-11-13 13:19 [PATCH 5.10 00/21] 5.10.79-rc1 review Tim Lewis
2021-11-13 13:36 ` Greg Kroah-Hartman
  -- strict thread matches above, loose matches on Subject: below --
2021-11-10 18:43 Greg Kroah-Hartman
2021-11-10 20:09 ` Florian Fainelli
2021-11-10 21:42 ` Fox Chen
2021-11-11 13:01 ` Sudip Mukherjee
2021-11-11 14:54   ` Naresh Kamboju
2021-11-12 13:47     ` Greg Kroah-Hartman
2021-11-11 19:45   ` Sudip Mukherjee
2021-11-12 13:46     ` Greg Kroah-Hartman
2021-11-11 21:36   ` Shuah Khan
2021-11-12 13:46     ` Greg Kroah-Hartman
2021-11-11 16:20 ` Shuah Khan
2021-11-11 16:42 ` Pavel Machek
2021-11-12  1:15 ` Guenter Roeck
2021-11-12 13:45   ` Greg Kroah-Hartman
