linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Tejun Heo <tj@kernel.org>
To: Dennis Zhou <dennisz@fb.com>
Cc: Christoph Lameter <cl@linux.com>,
	kernel-team@fb.com, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, Dennis Zhou <dennisszhou@gmail.com>
Subject: Re: [PATCH 06/10] percpu: modify base_addr to be region specific
Date: Mon, 17 Jul 2017 14:57:06 -0400	[thread overview]
Message-ID: <20170717185705.GK3519177@devbig577.frc2.facebook.com> (raw)
In-Reply-To: <20170716022315.19892-7-dennisz@fb.com>

Hello,

On Sat, Jul 15, 2017 at 10:23:11PM -0400, Dennis Zhou wrote:
> From: "Dennis Zhou (Facebook)" <dennisszhou@gmail.com>
> 
> Originally, the first chunk is served by up to three chunks, each given
> a region they are responsible for. Despite this, the arithmetic was based
> off of the base_addr making it require offsets or be overly inclusive.
> This patch changes percpu checks for first chunk to consider the only
> the dynamic region and the reserved check to be only the reserved
> region. There is no impact here besides making these checks a little
> more accurate.
> 
> This patch also adds the ground work increasing the minimum allocation
> size to 4 bytes. The new field nr_pages in pcpu_chunk will be used to
> keep track of the number of pages the bitmap serves. The arithmetic for
> identifying first chunk and reserved chunk reflect this change.

However small the patch might end up being, I'd much prefer changing
the minimum alloc size to be a separate patch with rationale.

> diff --git a/include/linux/percpu.h b/include/linux/percpu.h
> index 98a371c..a5cedcd 100644
> --- a/include/linux/percpu.h
> +++ b/include/linux/percpu.h
> @@ -21,6 +21,10 @@
>  /* minimum unit size, also is the maximum supported allocation size */
>  #define PCPU_MIN_UNIT_SIZE		PFN_ALIGN(32 << 10)
>  
> +/* minimum allocation size and shift in bytes */
> +#define PCPU_MIN_ALLOC_SIZE		(1 << PCPU_MIN_ALLOC_SHIFT)
> +#define PCPU_MIN_ALLOC_SHIFT		2

nitpick: Put SHIFT def above SIZE def?

> +/*
> + * Static addresses should never be passed into the allocator.  They
> + * are accessed using the group_offsets and therefore do not rely on
> + * chunk->base_addr.
> + */
>  static bool pcpu_addr_in_first_chunk(void *addr)
>  {
>  	void *first_start = pcpu_first_chunk->base_addr;
>  
> -	return addr >= first_start && addr < first_start + pcpu_unit_size;
> +	return addr >= first_start &&
> +	       addr < first_start +
> +	       pcpu_first_chunk->nr_pages * PAGE_SIZE;

Does the above line need line break?  If so, it'd probably be easier
to read if the broken line is indented (preferably to align with the
start of the sub expression).  e.g.

	return addr < first_start +
		      pcpu_first_chunk->nr_pages * PAGE_SIZE;

>  static bool pcpu_addr_in_reserved_chunk(void *addr)
>  {
> -	void *first_start = pcpu_first_chunk->base_addr;
> +	void *first_start;
>  
> -	return addr >= first_start &&
> -		addr < first_start + pcpu_reserved_chunk_limit;
> +	if (!pcpu_reserved_chunk)
> +		return false;
> +
> +	first_start = pcpu_reserved_chunk->base_addr;
> +
> +	return addr >= first_start + pcpu_reserved_offset &&
> +	       addr < first_start +
> +	       pcpu_reserved_chunk->nr_pages * PAGE_SIZE;

Ditto on indentation.

> @@ -1366,10 +1388,17 @@ phys_addr_t per_cpu_ptr_to_phys(void *addr)
>  	 * The following test on unit_low/high isn't strictly
>  	 * necessary but will speed up lookups of addresses which
>  	 * aren't in the first chunk.
> +	 *
> +	 * The address check is of high granularity checking against full
> +	 * chunk sizes.  pcpu_base_addr points to the beginning of the first
> +	 * chunk including the static region.  This allows us to examine all
> +	 * regions of the first chunk. Assumes good intent as the first
> +	 * chunk may not be full (ie. < pcpu_unit_pages in size).
>  	 */
> -	first_low = pcpu_chunk_addr(pcpu_first_chunk, pcpu_low_unit_cpu, 0);
> -	first_high = pcpu_chunk_addr(pcpu_first_chunk, pcpu_high_unit_cpu,
> -				     pcpu_unit_pages);
> +	first_low = (unsigned long) pcpu_base_addr +
                                   ^
			 no space for type casts

> @@ -1575,6 +1604,8 @@ int __init pcpu_setup_first_chunk(const struct pcpu_alloc_info *ai,
>  	unsigned int cpu;
>  	int *unit_map;
>  	int group, unit, i;
> +	unsigned long tmp_addr, aligned_addr;
> +	unsigned long map_size_bytes;

How about just map_size or map_bytes?

> @@ -1678,46 +1709,66 @@ int __init pcpu_setup_first_chunk(const struct pcpu_alloc_info *ai,
>  		INIT_LIST_HEAD(&pcpu_slot[i]);
>  
>  	/*
> -	 * Initialize static chunk.  If reserved_size is zero, the
> -	 * static chunk covers static area + dynamic allocation area
> -	 * in the first chunk.  If reserved_size is not zero, it
> -	 * covers static area + reserved area (mostly used for module
> -	 * static percpu allocation).
> +	 * Initialize static chunk.
> +	 * The static region is dropped as those addresses are already
> +	 * allocated and do not rely on chunk->base_addr.
> +	 * reserved_size == 0:
> +	 *      the static chunk covers the dynamic area
> +	 * reserved_size > 0:
> +	 *      the static chunk covers the reserved area
> +	 *
> +	 * If the static area is not page aligned, the region adjacent
> +	 * to the static area must have its base_addr be offset into
> +	 * the static area to have it be page aligned.  The overlap is
> +	 * then allocated preserving the alignment in the metadata for
> +	 * the actual region.

We can address this later but static chunk not covering static area is
kinda confusing.  The original complication came from trying to make
the static chunk service either reserved or first dynamic chunk.  We
don't need that anymore and might as well use separate rchunk and
dchunk.

Thanks.

-- 
tejun

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2017-07-17 18:57 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-07-16  2:23 [PATCH 00/10] percpu: replace percpu area map allocator with bitmap allocator Dennis Zhou
2017-07-16  2:23 ` [PATCH 01/10] percpu: pcpu-stats change void buffer to int buffer Dennis Zhou
2017-07-17 14:44   ` Tejun Heo
2017-07-16  2:23 ` [PATCH 02/10] percpu: change the format for percpu_stats output Dennis Zhou
2017-07-17 14:46   ` Tejun Heo
2017-07-16  2:23 ` [PATCH 03/10] percpu: expose pcpu_nr_empty_pop_pages in pcpu_stats Dennis Zhou
2017-07-17 14:47   ` Tejun Heo
2017-07-16  2:23 ` [PATCH 04/10] percpu: update the header comment and pcpu_build_alloc_info comments Dennis Zhou
2017-07-17 14:53   ` Tejun Heo
2017-07-16  2:23 ` [PATCH 05/10] percpu: change reserved_size to end page aligned Dennis Zhou
2017-07-16  4:01   ` kbuild test robot
2017-07-16  5:11   ` kbuild test robot
2017-07-17 16:46   ` Tejun Heo
2017-07-17 19:10     ` Dennis Zhou
2017-07-24 20:04     ` Dennis Zhou
2017-07-16  2:23 ` [PATCH 06/10] percpu: modify base_addr to be region specific Dennis Zhou
2017-07-17 18:57   ` Tejun Heo [this message]
2017-07-18 19:26   ` Josef Bacik
2017-07-18 19:36     ` Matthew Wilcox
2017-07-19 14:20       ` Josef Bacik
2017-07-16  2:23 ` [PATCH 07/10] percpu: fix misnomer in schunk/dchunk variable names Dennis Zhou
2017-07-17 19:10   ` Tejun Heo
2017-07-24 20:07     ` Dennis Zhou
2017-07-16  2:23 ` [PATCH 08/10] percpu: change the number of pages marked in the first_chunk bitmaps Dennis Zhou
2017-07-17 19:26   ` Tejun Heo
2017-07-24 20:13     ` Dennis Zhou
2017-07-16  2:23 ` [PATCH 09/10] percpu: replace area map allocator with bitmap allocator Dennis Zhou
2017-07-17 23:27   ` Tejun Heo
2017-07-24 21:37     ` Dennis Zhou
2017-07-19 19:11   ` Josef Bacik
2017-07-19 22:19     ` Dennis Zhou
2017-07-19 19:16   ` Josef Bacik
2017-07-19 22:13     ` Dennis Zhou
2017-07-16  2:23 ` [PATCH 10/10] percpu: add optimizations on allocation path for the " Dennis Zhou
2017-07-17 23:32   ` Tejun Heo
2017-07-18 19:15 ` [PATCH 00/10] percpu: replace percpu area map allocator with " Josef Bacik
2017-07-24 21:14   ` Dennis Zhou

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170717185705.GK3519177@devbig577.frc2.facebook.com \
    --to=tj@kernel.org \
    --cc=cl@linux.com \
    --cc=dennisszhou@gmail.com \
    --cc=dennisz@fb.com \
    --cc=kernel-team@fb.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).