From mboxrd@z Thu Jan 1 00:00:00 1970
From: Nitin Gupta
Date: Thu, 30 Mar 2017 20:47:11 +0000
Subject: Re: tlb_batch_add_one()
Message-Id: <064d7fb5-2a61-bfa6-3870-1dc57d0cd65a@oracle.com>
List-Id:
References: <20170328.175226.210187301635964014.davem@davemloft.net>
In-Reply-To: <20170328.175226.210187301635964014.davem@davemloft.net>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: sparclinux@vger.kernel.org

On 3/30/17 1:22 PM, David Miller wrote:
> From: David Miller
> Date: Tue, 28 Mar 2017 17:52:26 -0700 (PDT)
>
>>
>> There seems to be some disagreement about how the hugepage state is
>> passed into tlb_batch_add(). It's declared as an integer shift, but
>> there are call sites that pass it in the old way, as a boolean.
>>
>> For example, all of the call sites in tlb_batch_pmd_scan(), which
>> likely should be passing PAGE_SHIFT. Passing true or false in these
>> spots can't be right.
>
> And this appears to be causing regressions, gcc bootstraps fail with
> all kinds of memory corruption, including in the libc malloc arena.
>
> I did a full git bisect and it showed the multipage size support
> commit as the culprit.

The wrong calls to tlb_batch_add_one(), which pass a boolean as the
hugepage_shift argument, are all under CONFIG_TRANSPARENT_HUGEPAGE.
So are you getting these corruptions only when THP is enabled?
I will be sending a fix for these call sites today.

There's another issue I found with 64K page size support in
hugetlb_free_pgd_range(). The fix is currently undergoing more testing.
This bug affects the 64K page size only.

I'm still trying to understand how __tlb_remove_page_size() can be used
instead of the special page size change handling in tlb_batch_add_one().

Thanks,
Nitin
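
To illustrate the boolean-versus-shift mismatch discussed above, here is a
minimal user-space sketch. It is not the kernel code: every sketch_* name is
hypothetical, the shift values (13 for an 8K base page, 23 for an 8M huge
page) are assumptions chosen for illustration, and the callee only derives a
byte size from the shift, which may not match what tlb_batch_add_one()
actually does with it.

```c
#include <stdbool.h>

/* Assumed values for illustration only. */
#define SKETCH_PAGE_SHIFT   13u   /* assumed 8K base page */
#define SKETCH_HPAGE_SHIFT  23u   /* assumed 8M huge page */

/*
 * Hypothetical callee whose parameter is a page-size shift.
 * It returns the number of bytes one entry would cover.
 */
static unsigned long sketch_entry_bytes(unsigned int hugepage_shift)
{
	return 1UL << hugepage_shift;
}

/*
 * Old-style caller: passes a boolean where a shift is expected.
 * The bool converts implicitly to 0 or 1, so this compiles without
 * complaint and yields a nonsensical size of 1 or 2 bytes.
 */
static unsigned long sketch_old_style_call(bool huge)
{
	return sketch_entry_bytes(huge);
}

/* Shift-style caller: translates the boolean into a real shift first. */
static unsigned long sketch_shift_style_call(bool huge)
{
	return sketch_entry_bytes(huge ? SKETCH_HPAGE_SHIFT
				       : SKETCH_PAGE_SHIFT);
}
```

With these assumed values, sketch_old_style_call(true) returns 2 while
sketch_shift_style_call(true) returns 8388608, which shows how an
unconverted call site can silently describe the wrong page size.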