From: Laurent Dufour <ldufour@linux.ibm.com>
To: Michael Ellerman <mpe@ellerman.id.au>,
benh@kernel.crashing.org, paulus@samba.org,
aneesh.kumar@linux.ibm.com, npiggin@gmail.com,
linuxppc-dev@lists.ozlabs.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 2/2] powerpc/mm: call H_BLOCK_REMOVE when supported
Date: Thu, 19 Sep 2019 17:17:35 +0200 [thread overview]
Message-ID: <bd973652-5dc2-a359-fcd5-4a61243721bb@linux.ibm.com> (raw)
In-Reply-To: <87y2ylvhyo.fsf@mpe.ellerman.id.au>
Le 18/09/2019 à 15:42, Michael Ellerman a écrit :
> Hi Laurent,
>
> Few comments ...
Hi Michael,
Thanks for the review and the nitpicking ;)
>
> Laurent Dufour <ldufour@linux.ibm.com> writes:
>> Now we do not call _BLOCK_REMOVE all the time when the feature is
>> exhibited.
>
> This isn't true until after the patch is applied, ie. the tense is
> wrong. The rest of the change log explains things fine, so just drop
> that sentence I think.
>
> Can you include the info about the oops in here.
>
>> Depending on the hardware and the hypervisor, the hcall H_BLOCK_REMOVE may
>> not be able to process all the page size for a segment base page size, as
> ^
> sizes
>> reported by the TLB Invalidate Characteristics.o
> ^
> stray "o"
>>
>> For each couple base segment page size and actual page size, this
> ^
> "pair of"
>> characteristic is telling the size of the block the hcall is supporting.
> ^ ^
> "tells us" supports
>>
>> Due to the involve complexity in do_block_remove() and call_block_remove(),
> ^
> "required" is better I think
>> and the fact currently a 8 size block is returned by the hypervisor, we
> ^ ^
> that "block of size 8"
>> are only supporting 8 size block to the H_BLOCK_REMOVE hcall.
>>
>> Furthermore a warning message is displayed at boot time in the case of an
>> unsupported block size.
>
> I'm not sure we should be doing that? It could be unnecessarily spammy.
>
>> In order to identify this limitation easily in the code,a local define
>> HBLKR_SUPPORTED_SIZE defining the currently supported block size, and a
>> dedicated checking helper is_supported_hlbkr() are introduced.
>>
>> For regular pages and hugetlb, the assumption is made that the page size is
>> equal to the base page size. For THP the page size is assumed to be 16M.
>>
>> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
>> ---
>> arch/powerpc/platforms/pseries/lpar.c | 35 +++++++++++++++++++++++++--
>> 1 file changed, 33 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
>> index 98a5c2ff9a0b..e2ad9b3b1097 100644
>> --- a/arch/powerpc/platforms/pseries/lpar.c
>> +++ b/arch/powerpc/platforms/pseries/lpar.c
>> @@ -65,6 +65,13 @@ EXPORT_SYMBOL(plpar_hcall_norets);
>> */
>> static int hblkr_size[MMU_PAGE_COUNT][MMU_PAGE_COUNT];
>>
>> +/*
>> + * Due to the involved complexity, and that the current hypervisor is only
>> + * returning this value or 0, we are limiting the support of the H_BLOCK_REMOVE
>> + * buffer size to 8 size block.
>> + */
>> +#define HBLKR_SUPPORTED_BLOCK_SIZE 8
>> +
>> #ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
>> static u8 dtl_mask = DTL_LOG_PREEMPT;
>> #else
>> @@ -993,6 +1000,15 @@ static void pSeries_lpar_hpte_invalidate(unsigned long slot, unsigned long vpn,
>> #define HBLKR_CTRL_ERRNOTFOUND 0x8800000000000000UL
>> #define HBLKR_CTRL_ERRBUSY 0xa000000000000000UL
>>
>> +/*
>> + * Returned true if we are supporting this block size for the specified segment
>> + * base page size and actual page size.
>> + */
>> +static inline bool is_supported_hlbkr(int bpsize, int psize)
>> +{
>> + return (hblkr_size[bpsize][psize] == HBLKR_SUPPORTED_BLOCK_SIZE);
>> +}
>> +
>> /**
>> * H_BLOCK_REMOVE caller.
>> * @idx should point to the latest @param entry set with a PTEX.
>> @@ -1152,7 +1168,11 @@ static inline void __pSeries_lpar_hugepage_invalidate(unsigned long *slot,
>> if (lock_tlbie)
>> spin_lock_irqsave(&pSeries_lpar_tlbie_lock, flags);
>>
>> - if (firmware_has_feature(FW_FEATURE_BLOCK_REMOVE))
>> + /*
>> + * Assuming THP size is 16M, and we only support 8 bytes size buffer
>> + * for the momment.
>> + */
>> + if (is_supported_hlbkr(psize, MMU_PAGE_16M))
>
> It's not very clear that this is correct in all cases. ie. how do we
> know we're being called for THP and not regular huge page?
>
> I think we're only called via:
> flush_hash_hugepage()
> -> mmu_hash_ops.hugepage_invalidate()
> pSeries_lpar_hugepage_invalidate()
> -> __pSeries_lpar_hugepage_invalidate()
>
> And flush_hash_hugepage() is called via:
> __hash_page_thp()
> and
> hpte_do_hugepage_flush()
>
> The first is presumably fine, the 2nd is called in a few places:
> __flush_hash_table_range() under if (is_thp)
> hash__pmd_hugepage_update()
>
>
> But it's a little bit fragile if the code ever evolves. Not sure if
> there's a better solution, other than just documenting it.
Indeed __pSeries_lpar_hugepage_invalidate() can only be called for THP.
flush_hash_hugepage() and hpte_do_hugepage_flush() are only defined (or
valid) with CONFIG_TRANSPARENT_HUGEPAGE.
As Aneesh remind me, "hugepage" stands for THP.
>
>> hugepage_block_invalidate(slot, vpn, count, psize, ssize);
>> else
>> hugepage_bulk_invalidate(slot, vpn, count, psize, ssize);
>> @@ -1437,6 +1457,14 @@ void __init pseries_lpar_read_hblkr_characteristics(void)
>>
>> block_size = 1 << block_size;
>>
>> + /*
>> + * If the block size is not supported by the kernel, report it,
>> + * but continue reading the values, and the following blocks.
>> + */
>> + if (block_size != HBLKR_SUPPORTED_BLOCK_SIZE)
>> + pr_warn("Unsupported H_BLOCK_REMOVE block size : %d\n",
>> + block_size);
>
> Does this need a printk? I'm worried it could end up triggering and
> scaring people unnecessarily.
I agree, will remove.
>
>> +
>> for (npsize = local_buffer[idx++]; npsize > 0; npsize--)
>> check_lp_set_hblk((unsigned int) local_buffer[idx++],
>> block_size);
>> @@ -1468,7 +1496,10 @@ static void pSeries_lpar_flush_hash_range(unsigned long number, int local)
>> if (lock_tlbie)
>> spin_lock_irqsave(&pSeries_lpar_tlbie_lock, flags);
>>
>> - if (firmware_has_feature(FW_FEATURE_BLOCK_REMOVE)) {
>> + /*
>> + * Currently, we only support 8 bytes size buffer in do_block_remove().
>> + */
>> + if (is_supported_hlbkr(batch->psize, batch->psize)) {
>> do_block_remove(number, batch, param);
>> goto out;
>> }
>> --
>> 2.23.0
>
> cheers
>
next prev parent reply other threads:[~2019-09-19 15:17 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-09-16 9:55 [PATCH v2 0/2] powerpc/mm: Conditionally call H_BLOCK_REMOVE Laurent Dufour
2019-09-16 9:55 ` [PATCH v2 1/2] powperc/mm: read TLB Block Invalidate Characteristics Laurent Dufour
2019-09-18 13:42 ` Michael Ellerman
2019-09-19 15:59 ` Laurent Dufour
2019-09-16 9:55 ` [PATCH v2 2/2] powerpc/mm: call H_BLOCK_REMOVE when supported Laurent Dufour
2019-09-18 13:42 ` Michael Ellerman
2019-09-19 15:17 ` Laurent Dufour [this message]
2019-09-17 16:12 ` [PATCH v2 0/2] powerpc/mm: Conditionally call H_BLOCK_REMOVE Aneesh Kumar K.V
2019-09-18 13:41 ` Michael Ellerman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=bd973652-5dc2-a359-fcd5-4a61243721bb@linux.ibm.com \
--to=ldufour@linux.ibm.com \
--cc=aneesh.kumar@linux.ibm.com \
--cc=benh@kernel.crashing.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=mpe@ellerman.id.au \
--cc=npiggin@gmail.com \
--cc=paulus@samba.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).