nvdimm.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: "Verma, Vishal L" <vishal.l.verma@intel.com>
To: "linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
	"vaibhav@linux.ibm.com" <vaibhav@linux.ibm.com>
Cc: "harish@linux.ibm.com" <harish@linux.ibm.com>,
	"aneesh.kumar@linux.ibm.com" <aneesh.kumar@linux.ibm.com>
Subject: Re: [PATCH v2] ndctl, check: Ensure mmap of BTT sections work with 64K page-sizes
Date: Tue, 6 Aug 2019 05:19:51 +0000	[thread overview]
Message-ID: <26253c7e5fa424fcaa832ef7a2a50ea94d181acf.camel@intel.com> (raw)
In-Reply-To: <20190806050334.2267-1-vaibhav@linux.ibm.com>

On Tue, 2019-08-06 at 10:33 +0530, Vaibhav Jain wrote:
> On PPC64 which uses a 64K page-size, ndtl-check command fails on a BTT
> namespace with following error:
> 
> $ sudo ndctl check-namespace namespace0.0 -vv
>   namespace0.0: namespace_check: checking namespace0.0
>   namespace0.0: btt_discover_arenas: found 1 BTT arena
>   namespace0.0: btt_create_mappings: mmap arena[0].info [sz = 0x1000,
>   	off = 0x1000] failed: Invalid argument
>   error checking namespaces: Invalid argument
>   checked 0 namespaces
> 
> An error happens when btt_create_mappings() tries to mmap the sections
> of the BTT device which are usually 4K offset aligned. However the
> mmap() syscall expects the 'offset' argument to be in multiples of
> page-size, hence it returns EINVAL causing the btt_create_mappings()
> to error out.
> 
> As a fix for the issue this patch proposes addition of two new
> functions to 'check.c' namely btt_mmap/btt_unmap that can be used to
> map/unmap parts of BTT device to ndctl process address-space. The
> functions tweaks the requested 'offset' argument to mmap() ensuring
> that its page-size aligned and then fix-ups the returned pointer such
> that it points to the requested offset within mmapped region.
> 
> With these newly introduced functions the patch updates the call-sites
> in 'check.c' to use these functions instead of mmap/unmap syscalls.
> Also since btt_mmap returns NULL if mmap operation fails, the
> error check call-sites can be made against NULL instead of MAP_FAILED.
> 
> Reported-by: Harish Sriram <harish@linux.ibm.com>
> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
> ---
> Changelog:
> 
> v2:
> * Updated patch description to include canonical email address of bug
>   reporter. [Vishal]
> * Improved the comment describing function btt_mmap() in 'check.c'
>   [Vishal]
> * Simplified the argument list for btt_mmap() to just accept bttc,
>   length and offset. Other arguments for mmap() are derived from these
>   args. [Vishal]
> * Renamed 'shift' variable in btt_mmap()/unmap() to 'page_offset'
>   [Vishal]
> * Improved the dbg() messages logged inside
>   btt_mmap()/unmap(). [Vishal]
> * Return NULL from btt_mmap() in case of an error. [Vishal]
> * Changed the all sites to btt_mmap() to perform error check against
>   NULL return value instead of MAP_FAILED. [Vishal]

Hi Vaibhav,

Thanks for the turnaround on these - just one minor nit below, but other
than that this looks good to me.

> ---
>  ndctl/check.c | 93 +++++++++++++++++++++++++++++++++++++--------------
>  1 file changed, 67 insertions(+), 26 deletions(-)
> 
> diff --git a/ndctl/check.c b/ndctl/check.c
> index 8a7125053865..e72b498f1ce1 100644
> --- a/ndctl/check.c
> +++ b/ndctl/check.c
> @@ -907,59 +907,100 @@ static int btt_discover_arenas(struct btt_chk *bttc)
>  	return ret;
>  }
>  
> -static int btt_create_mappings(struct btt_chk *bttc)
> +/*
> + * Wrap call to mmap(2) to work with btt device offsets that are not aligned
> + * to system page boundary. It works by rounding down the requested offset
> + * to sys_page_size when calling mmap(2) and then returning a fixed-up pointer
> + * to the correct offset in the mmaped region.
> + */
> +static void *btt_mmap(struct btt_chk *bttc, size_t length, off_t offset)
>  {
> -	struct arena_info *a;
> -	int mmap_flags;
> -	int i;
> +	off_t page_offset;
> +	int prot_flags;
> +	uint8_t *addr;
>  
>  	if (!bttc->opts->repair)
> -		mmap_flags = PROT_READ;
> +		prot_flags = PROT_READ;
>  	else
> -		mmap_flags = PROT_READ|PROT_WRITE;
> +		prot_flags = PROT_READ|PROT_WRITE;
> +
> +	/* Calculate the page_offset from the system page boundary */
> +	page_offset = offset - rounddown(offset, bttc->sys_page_size);
> +
> +	/* Update the offset and length with the page_offset calculated above */
> +	offset -= page_offset;
> +	length += page_offset;
> +
> +	addr = mmap(NULL, length, prot_flags, MAP_SHARED, bttc->fd, offset);
> +
> +	/* If needed fixup the return pointer to correct offset requested */
> +	if (addr != MAP_FAILED)
> +		addr += page_offset;
> +
> +	dbg(bttc, "addr = %p length = %#lx offset = %#lx"
> +	    "page_offset = %#lx\n", (void *) addr, length, offset, page_offset);

The string portion of any print shouldn't ever be split across lines to
preserve greppability (the kernel CodingStyle document we follow
mentions that, and 'checkpatch.pl' should warn abnout it too. It is ok
in this case if the line ends up over 80 columns wide. Also maybe add
commas between the elements - so "addr = %p, length = %#lx, ..."

Same applies to the print in the unmap function below.

> +
> +	return addr == MAP_FAILED ? NULL : addr;
> +}
> +
> +static void btt_unmap(struct btt_chk *bttc, void *ptr, size_t length)
> +{
> +	off_t page_offset;
> +	uintptr_t addr = (uintptr_t) ptr;
> +
> +	/* Calculate the page_offset from system page boundary */
> +	page_offset = addr - rounddown(addr, bttc->sys_page_size);
> +
> +	addr -= page_offset;
> +	length += page_offset;
> +
> +	munmap((void *) addr, length);
> +	dbg(bttc, "addr = %p length = %#lx page_offset = %#lx\n",
> +	    (void *) addr, length, page_offset);
> +}
> +
> +static int btt_create_mappings(struct btt_chk *bttc)
> +{
> +	struct arena_info *a;
> +	int i;
>  
>  	for (i = 0; i < bttc->num_arenas; i++) {
>  		a = &bttc->arena[i];
>  		a->map.info_len = BTT_INFO_SIZE;
> -		a->map.info = mmap(NULL, a->map.info_len, mmap_flags,
> -			MAP_SHARED, bttc->fd, a->infooff);
> -		if (a->map.info == MAP_FAILED) {
> +		a->map.info = btt_mmap(bttc, a->map.info_len, a->infooff);
> +		if (!a->map.info) {
>  			err(bttc, "mmap arena[%d].info [sz = %#lx, off = %#lx] failed: %s\n",
>  				i, a->map.info_len, a->infooff, strerror(errno));
>  			return -errno;
>  		}
>  
>  		a->map.data_len = a->mapoff - a->dataoff;
> -		a->map.data = mmap(NULL, a->map.data_len, mmap_flags,
> -			MAP_SHARED, bttc->fd, a->dataoff);
> -		if (a->map.data == MAP_FAILED) {
> +		a->map.data = btt_mmap(bttc, a->map.data_len, a->dataoff);
> +		if (!a->map.data) {
>  			err(bttc, "mmap arena[%d].data [sz = %#lx, off = %#lx] failed: %s\n",
>  				i, a->map.data_len, a->dataoff, strerror(errno));
>  			return -errno;
>  		}
>  
>  		a->map.map_len = a->logoff - a->mapoff;
> -		a->map.map = mmap(NULL, a->map.map_len, mmap_flags,
> -			MAP_SHARED, bttc->fd, a->mapoff);
> -		if (a->map.map == MAP_FAILED) {
> +		a->map.map = btt_mmap(bttc, a->map.map_len, a->mapoff);
> +		if (!a->map.map) {
>  			err(bttc, "mmap arena[%d].map [sz = %#lx, off = %#lx] failed: %s\n",
>  				i, a->map.map_len, a->mapoff, strerror(errno));
>  			return -errno;
>  		}
>  
>  		a->map.log_len = a->info2off - a->logoff;
> -		a->map.log = mmap(NULL, a->map.log_len, mmap_flags,
> -			MAP_SHARED, bttc->fd, a->logoff);
> -		if (a->map.log == MAP_FAILED) {
> +		a->map.log = btt_mmap(bttc, a->map.log_len, a->logoff);
> +		if (!a->map.log) {
>  			err(bttc, "mmap arena[%d].log [sz = %#lx, off = %#lx] failed: %s\n",
>  				i, a->map.log_len, a->logoff, strerror(errno));
>  			return -errno;
>  		}
>  
>  		a->map.info2_len = BTT_INFO_SIZE;
> -		a->map.info2 = mmap(NULL, a->map.info2_len, mmap_flags,
> -			MAP_SHARED, bttc->fd, a->info2off);
> -		if (a->map.info2 == MAP_FAILED) {
> +		a->map.info2 = btt_mmap(bttc, a->map.info2_len, a->info2off);
> +		if (!a->map.info2) {
>  			err(bttc, "mmap arena[%d].info2 [sz = %#lx, off = %#lx] failed: %s\n",
>  				i, a->map.info2_len, a->info2off, strerror(errno));
>  			return -errno;
> @@ -977,15 +1018,15 @@ static void btt_remove_mappings(struct btt_chk *bttc)
>  	for (i = 0; i < bttc->num_arenas; i++) {
>  		a = &bttc->arena[i];
>  		if (a->map.info)
> -			munmap(a->map.info, a->map.info_len);
> +			btt_unmap(bttc, a->map.info, a->map.info_len);
>  		if (a->map.data)
> -			munmap(a->map.data, a->map.data_len);
> +			btt_unmap(bttc, a->map.data, a->map.data_len);
>  		if (a->map.map)
> -			munmap(a->map.map, a->map.map_len);
> +			btt_unmap(bttc, a->map.map, a->map.map_len);
>  		if (a->map.log)
> -			munmap(a->map.log, a->map.log_len);
> +			btt_unmap(bttc, a->map.log, a->map.log_len);
>  		if (a->map.info2)
> -			munmap(a->map.info2, a->map.info2_len);
> +			btt_unmap(bttc, a->map.info2, a->map.info2_len);
>  	}
>  }
>  
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

      reply	other threads:[~2019-08-06  5:22 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-06  5:03 [PATCH v2] ndctl, check: Ensure mmap of BTT sections work with 64K page-sizes Vaibhav Jain
2019-08-06  5:19 ` Verma, Vishal L [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=26253c7e5fa424fcaa832ef7a2a50ea94d181acf.camel@intel.com \
    --to=vishal.l.verma@intel.com \
    --cc=aneesh.kumar@linux.ibm.com \
    --cc=harish@linux.ibm.com \
    --cc=linux-nvdimm@lists.01.org \
    --cc=vaibhav@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).