Subject: Re: next/master boot bisection: next-20190215 on beaglebone-black
To: Andrew Morton
Cc: Dan Williams, Michal Hocko, Mark Brown, Tomeu Vizoso, Matt Hart,
 Stephen Rothwell, khilman@baylibre.com, enric.balletbo@collabora.com,
 Nicholas Piggin, Dominik Brodowski, Masahiro Yamada, Kees Cook,
 Adrian Reber, Linux Kernel Mailing List, Johannes Weiner, Linux MM,
 Mathieu Desnoyers, Richard Guy Briggs, "Peter Zijlstra (Intel)",
 info@kernelci.org
References: <5c6702da.1c69fb81.12a14.4ece@mx.google.com>
 <20190215104325.039dbbd9c3bfb35b95f9247b@linux-foundation.org>
 <20190215185151.GG7897@sirena.org.uk>
 <20190226155948.299aa894a5576e61dda3e5aa@linux-foundation.org>
 <20190228151438.fc44921e66f2f5d393c8d7b4@linux-foundation.org>
 <026b5082-32f2-e813-5396-e4a148c813ea@collabora.com>
 <20190301124100.62a02e2f622ff6b5f178a7c3@linux-foundation.org>
From: Guillaume Tucker
Message-ID: <3fafb552-ae75-6f63-453c-0d0e57d818f3@collabora.com>
Date: Fri, 1 Mar 2019 21:04:46 +0000
In-Reply-To: <20190301124100.62a02e2f622ff6b5f178a7c3@linux-foundation.org>

On 01/03/2019 20:41, Andrew Morton wrote:
> On Fri, 1 Mar 2019 09:25:24 +0100 Guillaume Tucker wrote:
>
>>>>> Michal had asked if the free space accounting fix up addressed this
>>>>> boot regression? I was awaiting word on that.
>>>>
>>>> hm, does bot@kernelci.org actually read emails? Let's try info@ as well..
>>
>> bot@kernelci.org is not a person, it's a send-only account for
>> automated reports. So no, it doesn't read emails.
>>
>> I guess the tricky point here is that the authors of the commits
>> found by bisections may not always have the hardware needed to
>> reproduce the problem. So it needs to be dealt with on a
>> case-by-case basis: sometimes they do have the hardware,
>> sometimes someone else on the list or on CC does, and sometimes
>> it's better for the people who have access to the test lab which
>> ran the KernelCI test to deal with it.
>>
>> This case seems to fall into the last category. As I have access
>> to the Collabora lab, I can do some quick checks to confirm
>> whether the proposed patch does fix the issue. I hadn't realised
>> that someone was waiting for this to happen, especially as the
>> BeagleBone Black is a very common platform. Sorry about that,
>> I'll take a look today.
>>
>> It may be a nice feature to be able to give access to the
>> KernelCI test infrastructure to anyone who wants to debug an
>> issue reported by KernelCI or verify a fix, so they won't need to
>> have the hardware locally. Something to think about for the
>> future.
>
> Thanks, that all sounds good.
>
>>>> Is it possible to determine whether this regression is still present in
>>>> current linux-next?
>>
>> I'll try to re-apply the patch that caused the issue, then see if
>> the suggested change fixes it. As far as the current linux-next
>> master branch is concerned, KernelCI boot tests are passing fine
>> on that platform.
>
> They would, because I dropped
> mm-shuffle-default-enable-all-shuffling.patch, so your tests presumably
> now have shuffling disabled.
>
> Is it possible to add the below to linux-next and try again?

I've actually already done that, and essentially the issue can
still be reproduced by applying that patch. See this branch:

  https://gitlab.collabora.com/gtucker/linux/commits/next-20190301-beaglebone-black-debug

next-20190301 boots fine but the head of that branch fails, using
multi_v7_defconfig + SMP=n in both cases, with
SHUFFLE_PAGE_ALLOCATOR=y in the second case as a result of the
change in the default value.

The change suggested by Michal Hocko on Feb 15th has now been
applied in linux-next as part of the commit below but, as
explained above, it does not actually resolve the boot failure:

  98cf198ee8ce mm: move buddy list manipulations into helpers

I can send more details on Monday and do a bit of debugging to
help narrow down the problem. Please let me know if there's
anything in particular that seems worth trying.

> Or I can re-add this to linux-next. Where should we go to determine
> the results of such a change? There are a heck of a lot of results on
> https://kernelci.org/boot/ and entering "beaglebone-black" doesn't get
> me anything.

The BeagleBone Black board was offline for a few days in our lab,
which probably explains why you're not getting many results from
the web interface. Hopefully we'll see passing boot results in
linux-next tomorrow now that the board is back online.

It's quite easy for me to submit test jobs with kernels I've
built myself instead of going through the full linux-next and
KernelCI loop. So that's the best way to try things out; then,
when a fix has been found, it can be applied in linux-next on top
of the mm/shuffle change to verify it in KernelCI.

Guillaume

> From: Dan Williams
> Subject: mm/shuffle: default enable all shuffling
>
> Per Andrew's request arrange for all memory allocation shuffling code to
> be enabled by default.
>
> The page_alloc.shuffle command line parameter can still be used to disable
> shuffling at boot, but the kernel will default enable the shuffling if the
> command line option is not specified.
>
> Link: http://lkml.kernel.org/r/154943713572.3858443.11206307988382889377.stgit@dwillia2-desk3.amr.corp.intel.com
> Signed-off-by: Dan Williams
> Cc: Kees Cook
> Cc: Michal Hocko
> Cc: Dave Hansen
> Cc: Keith Busch
>
> Signed-off-by: Andrew Morton
> ---
>
>  init/Kconfig |    4 ++--
>  mm/shuffle.c |    4 ++--
>  mm/shuffle.h |    2 +-
>  3 files changed, 5 insertions(+), 5 deletions(-)
>
> --- a/init/Kconfig~mm-shuffle-default-enable-all-shuffling
> +++ a/init/Kconfig
> @@ -1709,7 +1709,7 @@ config SLAB_MERGE_DEFAULT
>  	  command line.
>
>  config SLAB_FREELIST_RANDOM
> -	default n
> +	default y
>  	depends on SLAB || SLUB
>  	bool "SLAB freelist randomization"
>  	help
> @@ -1728,7 +1728,7 @@ config SLAB_FREELIST_HARDENED
>
>  config SHUFFLE_PAGE_ALLOCATOR
>  	bool "Page allocator randomization"
> -	default SLAB_FREELIST_RANDOM && ACPI_NUMA
> +	default y
>  	help
>  	  Randomization of the page allocator improves the average
>  	  utilization of a direct-mapped memory-side-cache. See section
> --- a/mm/shuffle.c~mm-shuffle-default-enable-all-shuffling
> +++ a/mm/shuffle.c
> @@ -9,8 +9,8 @@
>  #include "internal.h"
>  #include "shuffle.h"
>
> -DEFINE_STATIC_KEY_FALSE(page_alloc_shuffle_key);
> -static unsigned long shuffle_state __ro_after_init;
> +DEFINE_STATIC_KEY_TRUE(page_alloc_shuffle_key);
> +static unsigned long shuffle_state __ro_after_init = 1 << SHUFFLE_ENABLE;
>
>  /*
>   * Depending on the architecture, module parameter parsing may run
> --- a/mm/shuffle.h~mm-shuffle-default-enable-all-shuffling
> +++ a/mm/shuffle.h
> @@ -19,7 +19,7 @@ enum mm_shuffle_ctl {
>  #define SHUFFLE_ORDER (MAX_ORDER-1)
>
>  #ifdef CONFIG_SHUFFLE_PAGE_ALLOCATOR
> -DECLARE_STATIC_KEY_FALSE(page_alloc_shuffle_key);
> +DECLARE_STATIC_KEY_TRUE(page_alloc_shuffle_key);
>  extern void page_alloc_shuffle(enum mm_shuffle_ctl ctl);
>  extern void __shuffle_free_memory(pg_data_t *pgdat);
>  static inline void shuffle_free_memory(pg_data_t *pgdat)
> _
>
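
For anyone trying to reason about the quoted patch without the tree at
hand, here is a minimal userspace sketch (not kernel code) of what the
new initial value of shuffle_state means in combination with a forced
disable, such as the one the commit message says page_alloc.shuffle can
still request. Only shuffle_state, SHUFFLE_ENABLE and enum mm_shuffle_ctl
appear in the diff; the SHUFFLE_FORCE_DISABLE bit, the STATE_* macros and
the model_* helpers are assumptions made for illustration, and the static
key is left out entirely.

```c
#include <stdbool.h>

/*
 * Control bits. Only SHUFFLE_ENABLE is visible in the quoted diff;
 * SHUFFLE_FORCE_DISABLE is assumed here for illustration.
 */
enum mm_shuffle_ctl {
	SHUFFLE_ENABLE,
	SHUFFLE_FORCE_DISABLE,
};

/* Initial value of shuffle_state before and after the quoted patch. */
#define STATE_BEFORE_PATCH 0UL
#define STATE_AFTER_PATCH  (1UL << SHUFFLE_ENABLE)

/*
 * Hypothetical model of applying one control request to the state
 * word: a forced disable is sticky and always clears the enable bit.
 */
static unsigned long model_shuffle_ctl(unsigned long state,
				       enum mm_shuffle_ctl ctl)
{
	if (ctl == SHUFFLE_FORCE_DISABLE)
		state |= 1UL << SHUFFLE_FORCE_DISABLE;

	if (state & (1UL << SHUFFLE_FORCE_DISABLE))
		state &= ~(1UL << SHUFFLE_ENABLE);
	else if (ctl == SHUFFLE_ENABLE)
		state |= 1UL << SHUFFLE_ENABLE;

	return state;
}

/* Hypothetical helper: is the enable bit set in this state word? */
static bool model_shuffle_enabled(unsigned long state)
{
	return (state & (1UL << SHUFFLE_ENABLE)) != 0;
}
```

Under these assumptions, model_shuffle_enabled(STATE_AFTER_PATCH) is true
with no command-line input at all, whereas STATE_BEFORE_PATCH needs an
explicit SHUFFLE_ENABLE request first. Passing SHUFFLE_FORCE_DISABLE
through model_shuffle_ctl() clears the bit in both cases, which matches
the behaviour the commit message describes.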