linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Michael Ellerman <mpe@ellerman.id.au>
To: Laurent Dufour <ldufour@linux.ibm.com>,
	benh@kernel.crashing.org, paulus@samba.org,
	aneesh.kumar@linux.ibm.com, npiggin@gmail.com,
	linuxppc-dev@lists.ozlabs.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 1/2] powperc/mm: read TLB Block Invalidate Characteristics
Date: Wed, 18 Sep 2019 23:42:21 +1000	[thread overview]
Message-ID: <87zhj1vhz6.fsf@mpe.ellerman.id.au> (raw)
In-Reply-To: <20190916095543.17496-2-ldufour@linux.ibm.com>

Hi Laurent,

Comments below ...

Laurent Dufour <ldufour@linux.ibm.com> writes:
> The PAPR document specifies the TLB Block Invalidate Characteristics which
> tells for each couple segment base page size, actual page size, the size of
                 ^
                 "pair of" again

> the block the hcall H_BLOCK_REMOVE is supporting.
                                     ^
                                     "supports" is fine.

> These characteristics are loaded at boot time in a new table hblkr_size.
> The table is appart the mmu_psize_def because this is specific to the
               ^
               "separate from"

> pseries architecture.
          ^
          platform
>
> A new init service, pseries_lpar_read_hblkr_characteristics() is added to
             ^
             function

> read the characteristics. In that function, the size of the buffer is set
> to twice the number of known page size, plus 10 bytes to ensure we have
> enough place. This new init function is called from pSeries_setup_arch().
         ^
         space
>
> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
> ---
>  .../include/asm/book3s/64/tlbflush-hash.h     |   1 +
>  arch/powerpc/platforms/pseries/lpar.c         | 138 ++++++++++++++++++
>  arch/powerpc/platforms/pseries/setup.c        |   1 +
>  3 files changed, 140 insertions(+)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h b/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h
> index 64d02a704bcb..74155cc8cf89 100644
> --- a/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h
> +++ b/arch/powerpc/include/asm/book3s/64/tlbflush-hash.h
> @@ -117,4 +117,5 @@ extern void __flush_hash_table_range(struct mm_struct *mm, unsigned long start,
>  				     unsigned long end);
>  extern void flush_tlb_pmd_range(struct mm_struct *mm, pmd_t *pmd,
>  				unsigned long addr);
> +extern void pseries_lpar_read_hblkr_characteristics(void);

This doesn't need "extern", and also should go in
arch/powerpc/platforms/pseries/pseries.h as it's local to that directory.

You're using "hblkr" in a few places, can we instead make it "hblkrm" -
"rm" is a well known abbreviation for "remove".


> diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
> index 36b846f6e74e..98a5c2ff9a0b 100644
> --- a/arch/powerpc/platforms/pseries/lpar.c
> +++ b/arch/powerpc/platforms/pseries/lpar.c
> @@ -56,6 +56,15 @@ EXPORT_SYMBOL(plpar_hcall);
>  EXPORT_SYMBOL(plpar_hcall9);
>  EXPORT_SYMBOL(plpar_hcall_norets);
>  
> +/*
> + * H_BLOCK_REMOVE supported block size for this page size in segment who's base
> + * page size is that page size.
> + *
> + * The first index is the segment base page size, the second one is the actual
> + * page size.
> + */
> +static int hblkr_size[MMU_PAGE_COUNT][MMU_PAGE_COUNT];

Can you make that __ro_after_init, it goes at the end, eg:

static int hblkr_size[MMU_PAGE_COUNT][MMU_PAGE_COUNT] __ro_after_init;

> @@ -1311,6 +1320,135 @@ static void do_block_remove(unsigned long number, struct ppc64_tlb_batch *batch,
>  		(void)call_block_remove(pix, param, true);
>  }
>  
> +/*
> + * TLB Block Invalidate Characteristics These characteristics define the size of
                                           ^
                                           newline before here?

> + * the block the hcall H_BLOCK_REMOVE is able to process for each couple segment
> + * base page size, actual page size.
> + *
> + * The ibm,get-system-parameter properties is returning a buffer with the
> + * following layout:
> + *
> + * [ 2 bytes size of the RTAS buffer (without these 2 bytes) ]
                                         ^
                                         "excluding"

> + * -----------------
> + * TLB Block Invalidate Specifiers:
> + * [ 1 byte LOG base 2 of the TLB invalidate block size being specified ]
> + * [ 1 byte Number of page sizes (N) that are supported for the specified
> + *          TLB invalidate block size ]
> + * [ 1 byte Encoded segment base page size and actual page size
> + *          MSB=0 means 4k segment base page size and actual page size
> + *          MSB=1 the penc value in mmu_psize_def ]
> + * ...
> + * -----------------
> + * Next TLB Block Invalidate Specifiers...
> + * -----------------
> + * [ 0 ]
> + */
> +static inline void __init set_hblk_bloc_size(int bpsize, int psize,
> +					     unsigned int block_size)

"static inline" and __init are sort of contradictory.

Just make it "static void __init" and the compiler will inline it
anyway, but if it decides not to the section will be correct.

This one uses "hblk"? Should it be set_hblkrm_block_size() ?

> +{
> +	if (block_size > hblkr_size[bpsize][psize])
> +		hblkr_size[bpsize][psize] = block_size;
> +}
> +
> +/*
> + * Decode the Encoded segment base page size and actual page size.
> + * PAPR specifies with bit 0 as MSB:
> + *   - bit 0 is the L bit
> + *   - bits 2-7 are the penc value

Can we just convert those to normal bit ordering for the comment, eg:

 PAPR specifies:
   - bit 7 is the L bit
   - bits 0-5 are the penc value

> + * If the L bit is 0, this means 4K segment base page size and actual page size
> + * otherwise the penc value should be readed.
                                         ^
                                         "read" ?
> + */
> +#define HBLKR_L_BIT_MASK	0x80

"HBLKRM_L_MASK" would do I think?

> +#define HBLKR_PENC_MASK		0x3f
> +static inline void __init check_lp_set_hblk(unsigned int lp,
> +					    unsigned int block_size)
> +{
> +	unsigned int bpsize, psize;
> +

One blank line is sufficient :)

> +
> +	/* First, check the L bit, if not set, this means 4K */
> +	if ((lp & HBLKR_L_BIT_MASK) == 0) {
> +		set_hblk_bloc_size(MMU_PAGE_4K, MMU_PAGE_4K, block_size);
> +		return;
> +	}
> +
> +	lp &= HBLKR_PENC_MASK;
> +	for (bpsize = 0; bpsize < MMU_PAGE_COUNT; bpsize++) {
> +		struct mmu_psize_def *def =  &mmu_psize_defs[bpsize];
                                            ^
                                            stray space there
> +
> +		for (psize = 0; psize < MMU_PAGE_COUNT; psize++) {
> +			if (def->penc[psize] == lp) {
> +				set_hblk_bloc_size(bpsize, psize, block_size);
> +				return;
> +			}
> +		}
> +	}
> +}
> +
> +#define SPLPAR_TLB_BIC_TOKEN		50
> +#define SPLPAR_TLB_BIC_MAXLENGTH	(MMU_PAGE_COUNT*2 + 10)

The +10 is just a guess I think?

If I'm counting right that ends up as 42 bytes, which is not much at
all. You could probably just make it a cache line, 128 bytes, which
should be plenty of space?

> +void __init pseries_lpar_read_hblkr_characteristics(void)
> +{
> +	int call_status;

This should be grouped with the other ints below on one line.

> +	unsigned char local_buffer[SPLPAR_TLB_BIC_MAXLENGTH];
> +	int len, idx, bpsize;
> +
> +	if (!firmware_has_feature(FW_FEATURE_BLOCK_REMOVE)) {
> +		pr_info("H_BLOCK_REMOVE is not supported");

That's going to trigger on a lot of older machines, and all KVM VMs, so
just return silently.

> +		return;
> +	}
> +
> +	memset(local_buffer, 0, SPLPAR_TLB_BIC_MAXLENGTH);

Here you memset the whole buffer ...

> +	spin_lock(&rtas_data_buf_lock);
> +	memset(rtas_data_buf, 0, RTAS_DATA_BUF_SIZE);
> +	call_status = rtas_call(rtas_token("ibm,get-system-parameter"), 3, 1,
> +				NULL,
> +				SPLPAR_TLB_BIC_TOKEN,
> +				__pa(rtas_data_buf),
> +				RTAS_DATA_BUF_SIZE);
> +	memcpy(local_buffer, rtas_data_buf, SPLPAR_TLB_BIC_MAXLENGTH);

But then here you memcpy over the entire buffer, making the memset above
unnecessary.

> +	local_buffer[SPLPAR_TLB_BIC_MAXLENGTH - 1] = '\0';
> +	spin_unlock(&rtas_data_buf_lock);
> +
> +	if (call_status != 0) {
> +		pr_warn("%s %s Error calling get-system-parameter (0x%x)\n",
> +			__FILE__, __func__, call_status);

__FILE__ and __func__ is pretty verbose for what should be a rare case
(I realise you copied that from existing code).

		pr_warn("Error calling get-system-parameter(%d, ...) (0x%x)\n",
                        SPLPAR_TLB_BIC_TOKEN, call_status);

Should do I think?

> +		return;
> +	}
> +
> +	/*
> +	 * The first two (2) bytes of the data in the buffer are the length of
> +	 * the returned data, not counting these first two (2) bytes.
> +	 */
> +	len = local_buffer[0] * 256 + local_buffer[1] + 2;

This took me a minute to parse. I guess I was expecting something more
like:
	len = be16_to_cpu(local_buffer) + 2;

??

> +	if (len >= SPLPAR_TLB_BIC_MAXLENGTH) {

I think it's allowed for len to be == SPLPAR_TLB_BIC_MAXLENGTH isn't it?

> +		pr_warn("%s too large returned buffer %d", __func__, len);
> +		return;
> +	}
> +
> +	idx = 2;
> +	while (idx < len) {
> +		unsigned int block_size = local_buffer[idx++];

This is a little subtle, local_buffer is char, so no endian issue, but
you're loading that into an unsigned int which makes it look like there
should be an endian swap.

Maybe instead do:
		u8 block_shift = local_buffer[idx++];
                u32 block_size;

		if (!block_shift)
                	break;

		block_size = 1 << block_shift;

??

> +		unsigned int npsize;
> +
> +		if (!block_size)
> +			break;
> +
> +		block_size = 1 << block_size;
> +
> +		for (npsize = local_buffer[idx++];  npsize > 0; npsize--)
> +			check_lp_set_hblk((unsigned int) local_buffer[idx++],
> +					  block_size);

This could overflow if npsize is too big. I think you can just add
"&& idx < len" to the for condition to make it safe?

> +	}
> +
> +	for (bpsize = 0; bpsize < MMU_PAGE_COUNT; bpsize++)
> +		for (idx = 0; idx < MMU_PAGE_COUNT; idx++)
> +			if (hblkr_size[bpsize][idx])
> +				pr_info("H_BLOCK_REMOVE supports base psize:%d psize:%d block size:%d",
> +					bpsize, idx, hblkr_size[bpsize][idx]);

I think this is probably too verbose, except for debugging. Probably
just remove it?

cheers

  reply	other threads:[~2019-09-18 13:42 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-16  9:55 [PATCH v2 0/2] powerpc/mm: Conditionally call H_BLOCK_REMOVE Laurent Dufour
2019-09-16  9:55 ` [PATCH v2 1/2] powperc/mm: read TLB Block Invalidate Characteristics Laurent Dufour
2019-09-18 13:42   ` Michael Ellerman [this message]
2019-09-19 15:59     ` Laurent Dufour
2019-09-16  9:55 ` [PATCH v2 2/2] powerpc/mm: call H_BLOCK_REMOVE when supported Laurent Dufour
2019-09-18 13:42   ` Michael Ellerman
2019-09-19 15:17     ` Laurent Dufour
2019-09-17 16:12 ` [PATCH v2 0/2] powerpc/mm: Conditionally call H_BLOCK_REMOVE Aneesh Kumar K.V
2019-09-18 13:41 ` Michael Ellerman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87zhj1vhz6.fsf@mpe.ellerman.id.au \
    --to=mpe@ellerman.id.au \
    --cc=aneesh.kumar@linux.ibm.com \
    --cc=benh@kernel.crashing.org \
    --cc=ldufour@linux.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=npiggin@gmail.com \
    --cc=paulus@samba.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).