From: Daniel Kiper
To: Daniel Axtens
Cc: Patrick Steinhardt, grub-devel@gnu.org, Leif Lindholm, Stefan Berger
Subject: Re: [PATCH v3 2/6] mm: Allow dynamically requesting additional memory regions
Date: Fri, 10 Sep 2021 02:03:45 +0200
Message-ID: <20210910000345.oyrb4g3vqv7c43h7@tomti.i.net-space.pl>
In-Reply-To: <87pmtm3yfc.fsf@dja-thinkpad.axtens.net>

On Mon, Sep 06, 2021 at 06:23:19PM +1000, Daniel Axtens wrote:
> >> I think you get away with this on EFI because you use BYTES_TO_PAGES
> >> and get page-aligned memory, but I think you should probably round up
> >> to the next power of 2 for smaller allocations or to the next page or
> >> so for larger allocations.
> >
> > I think we could allocate at least e.g. 128 MiB from firmware if there is
> > not enough memory available in the GRUB mm. This way we would avoid frequent
> > calls to firmware and could satisfy requests for larger alignments.
>
> 128 MiB and 64 MiB cause some tests to fail (cannot allocate memory in echo1
> or compression tests because there isn't enough free memory to get a
> 64MiB chunk). 32 MiB chunk size seems to work and seems fast enough.

Nice...

> [It's a bit hard to tell because at some point in time time the powerpc
> machine stopped shutting down when we got to the end of the tests. oh
> well.]

Ohhh... :-(

> >> - After fixing that in the ieee1275 code, all_functional_test
> >>   hangs trying to run the cmdline_cat test. I think this is from a slow
> >>   algorithm somewhere - the grub allocator isn't exactly optimised for
> >>   a proliferation of regions.
> >
> > Could you try the solution proposed above? Maybe it will solve problem of
> > frequent additions of memory to the GRUB mm.
> >
> >> - I noticed that nearly all the allocations were under 1MB. This seems
> >>   inefficient for a trip out to firmware. So I made the ieee1275 code
> >>   allocate at least max(4MB, (size of allocation rounded up nearest
> >>   1MB) + 4kB). This makes the tests run with only the usual failures,
> >>   at least on pseries with debug on... still chasing some bugs beyond
> >>   that.
> >
> > Yeah, this is similar to what I proposed above.
> > Though I would want to see larger numbers tested as I said earlier.
> >
> >> - The speed impact depends on the allocation size. I'll post something
> >>   on that tomorrow, hopefully, but larger minimum allocations work
> >>   noticably better.
> >>
> >> - We only have 4GB max to play with because (at least) powerpc-ieee1275
> >>   is technically defined to be 32 bit. So I'm a bit nervous about
> >>   further large allocations unless we have a way to release them back
> >>   to _firmware_, not just to grub.
> >
> > Ugh... This can be difficult. I am not sure the GRUB mm is smart enough
> > to release memory regions if they are not used anymore by it.
> >
> >> I would think a better overall approach would be to allocate the 1/4 of
> >> ram when grub starts, and create a whole new interface for large slabs
> >
> > I am not very happy with allocating 1/4 of memory at start of the day.
> > I think allocating larger chunks of memory from firmware should be
> > enough to make things working as expected.
>
> Maybe the per-platform memory chunk allocator just needs to be smart
> enough to make sure that there is enough memory left over to load a
> "normal sized" kernel and initrd... although the sizes of distro images
> keep going up so that's going to be a bit fraught.
>
> >> of memory that are directly allocated from, and directly returned to,
> >> the firmware.
>
> I still would really prefer to bypass grub mm completely as described in
> my other mail. If we are able to give memory back to fw, we can claim
> 1GB chunks (on SLOF, PFW is going to be another issue) without having to
> worry about where we put them and if we have enough memory to load a
> kernel or initrd. It makes it much harder accidentally render your
> system unbootable.

I like your approach because you can return memory to the firmware. (The
situation in UEFI is simpler: all memory allocated by GRUB, and in general
by UEFI applications, should usually be marked as "loader data" in the
UEFI memory map. After ExitBootServices() (EBS), "loader data" should be
treated as free memory, so we do not need to return allocated memory
regions to UEFI explicitly. I am not sure this is possible in IEEE 1275
firmware.) However, I think your solution has at least two problems:
  - if you skip the GRUB mm you cannot use the relocator, which means you
    cannot load and use big initrds,
  - you add a third family of memory management functions to GRUB.

So, I would still try to use mm. However, maybe we should improve the
algorithm which allocates memory at GRUB init. In general, after init we
should have enough memory in GRUB to store the mm structures which
describe the whole, or most, of system memory, plus the amount of RAM
needed to run the loaded core.img. Then I think we should stop seeing
"firmware allocation" failures in later phases. The tricky part here is
an algorithm which properly estimates the initial memory requirements.

Daniel
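
For reference, the sizing rule quoted earlier in this message, "at least
max(4MB, (size of allocation rounded up nearest 1MB) + 4kB)", could be
sketched roughly as below. This is only an illustration, not the actual
ieee1275 code: the function name firmware_chunk_size() and the macro names
are made up here, and the overflow check is an added assumption. Only the
three numbers come from the quoted text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KIB ((size_t) 1024)
#define MIB (1024 * KIB)

/* Hypothetical tunables; only the numbers come from the quoted rule. */
#define MIN_CHUNK (4 * MIB)  /* never ask firmware for less than this */
#define GRANULE   (1 * MIB)  /* round each request up to this granularity */
#define EXTRA     (4 * KIB)  /* added on top of the rounded size */

/*
 * Return the number of bytes to request from firmware for an allocation
 * of `requested` bytes, or 0 if the computation would overflow size_t.
 */
size_t
firmware_chunk_size (size_t requested)
{
  size_t rounded;
  size_t chunk;

  /* Guard both the round-up and the addition of EXTRA against overflow. */
  if (requested > SIZE_MAX - (GRANULE - 1) - EXTRA)
    return 0;

  /* Round up to the next multiple of GRANULE, then add the extra 4 kB. */
  rounded = (requested + GRANULE - 1) / GRANULE * GRANULE;
  rounded += EXTRA;

  /* Apply the 4 MB floor. */
  chunk = rounded > MIN_CHUNK ? rounded : MIN_CHUNK;

  assert (chunk >= requested);
  return chunk;
}
```

With these numbers, any request of up to 3 MiB ends up as a 4 MiB chunk,
which fits the observation that nearly all allocations were under 1MB.
Trying a larger minimum, such as the 32 MiB that seemed to work in the
tests, would only mean changing MIN_CHUNK.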