linux-bcache.vger.kernel.org archive mirror
From: Coly Li <colyli@suse.de>
To: Dan Williams <dan.j.williams@intel.com>
Cc: linux-bcache@vger.kernel.org, linux-block@vger.kernel.org,
	linux-nvdimm@lists.linux.dev, Jens Axboe <axboe@kernel.dk>,
	Hannes Reinecke <hare@suse.com>, Jan Kara <jack@suse.cz>,
	Christoph Hellwig <hch@lst.de>,
	"Huang, Ying" <ying.huang@intel.com>,
	Jianpeng Ma <jianpeng.ma@intel.com>,
	Randy Dunlap <rdunlap@infradead.org>,
	Qiaowei Ren <qiaowei.ren@intel.com>,
	Hannes Reinecke <hare@suse.de>
Subject: Re: [PATCH v12 02/12] bcache: initialize the nvm pages allocator
Date: Thu, 12 Aug 2021 16:26:26 +0800
Message-ID: <dffaf01e-9ac4-e71b-3e38-1f2b0bfc5aed@suse.de>
In-Reply-To: <CAPcyv4hhfg=mgN4AW8T2VWGVbKsQZkpPwpU5yVAVh2nFOxCBcg@mail.gmail.com>

On 8/12/21 1:43 PM, Dan Williams wrote:
> On Wed, Aug 11, 2021 at 10:04 AM Coly Li <colyli@suse.de> wrote:
>> From: Jianpeng Ma <jianpeng.ma@intel.com>
>>
>> This patch defines the prototype data structures in memory and
>> initializes the nvm pages allocator.
>>
>> The nvm address space which is managed by this allocator can consist of
>> many nvm namespaces, and some namespaces can compose into one nvm set,
>> like cache set. For this initial implementation, only one set can be
>> supported.
>>
>> The users of this nvm pages allocator need to call register_namespace()
>> to register the nvdimm device (like /dev/pmemX) into this allocator as
>> the instance of struct nvm_namespace.
>>
>> Reported-by: Randy Dunlap <rdunlap@infradead.org>
>> Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
>> Co-developed-by: Qiaowei Ren <qiaowei.ren@intel.com>
>> Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
>> Cc: Christoph Hellwig <hch@lst.de>
>> Cc: Dan Williams <dan.j.williams@intel.com>
>> Cc: Hannes Reinecke <hare@suse.de>
>> Cc: Jens Axboe <axboe@kernel.dk>
>> ---
>>  drivers/md/bcache/Kconfig     |  10 +
>>  drivers/md/bcache/Makefile    |   1 +
>>  drivers/md/bcache/nvm-pages.c | 339 ++++++++++++++++++++++++++++++++++
>>  drivers/md/bcache/nvm-pages.h |  96 ++++++++++
>>  drivers/md/bcache/super.c     |   3 +
>>  5 files changed, 449 insertions(+)
>>  create mode 100644 drivers/md/bcache/nvm-pages.c
>>  create mode 100644 drivers/md/bcache/nvm-pages.h
>>
[snipped]
>> +
>> +       err = -EOPNOTSUPP;
>> +       if (!bdev_dax_supported(bdev, ns->page_size)) {
>> +               pr_err("%s don't support DAX\n", bdevname(bdev, buf));
>> +               goto free_ns;
>> +       }
>> +
>> +       err = -EINVAL;
>> +       if (bdev_dax_pgoff(bdev, 0, ns->page_size, &pgoff)) {
>> +               pr_err("invalid offset of %s\n", bdevname(bdev, buf));
>> +               goto free_ns;
>> +       }
>> +
>> +       err = -ENOMEM;
>> +       ns->dax_dev = fs_dax_get_by_bdev(bdev);
>> +       if (!ns->dax_dev) {
>> +               pr_err("can't by dax device by %s\n", bdevname(bdev, buf));
>> +               goto free_ns;
>> +       }
>> +
>> +       err = -EINVAL;
>> +       id = dax_read_lock();
>> +       dax_ret = dax_direct_access(ns->dax_dev, pgoff, ns->pages_total,
>> +                                   &ns->base_addr, &ns->start_pfn);
>> +       if (dax_ret <= 0) {
>> +               pr_err("dax_direct_access error\n");
>> +               dax_read_unlock(id);
>> +               goto free_ns;
>> +       }
>> +
>> +       if (dax_ret < ns->pages_total) {
>> +               pr_warn("mapped range %ld is less than ns->pages_total %lu\n",
>> +                       dax_ret, ns->pages_total);

Hi Dan,

Many thanks for the information.

> This failure will become a common occurrence with CXL namespaces that
> will have discontiguous range support. It's already the case for
> dax-devices for soft-reserved memory [1]. In the CXL case the
> discontinuity will be 256MB aligned, for the soft-reserved dax-devices
> the discontinuity granularity can be as small as 4K.
>
> [1]: https://elixir.bootlin.com/linux/v5.14-rc5/source/drivers/dax/device.c#L414

Fortunately the on-media allocation list format works with multiple
ranges in a namespace. The in-memory struct bch_nvmpg_ns, however,
currently assumes the namespace is one flat contiguous range. So yes,
we need to support multiple ranges in struct bch_nvmpg_ns, so that
buddy allocator initialization can skip the discontiguous gaps.
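
To make the idea concrete, here is a rough user-space sketch of seeding
a buddy allocator from several contiguous runs instead of one flat
range. It is only an illustration under assumptions, not the bcache
code: struct pg_range, seed_range(), seed_namespace(), count_pages()
and the MAX_ORDER value are all made-up names for this example. Each
run is cut into naturally aligned power-of-two blocks, and no block is
allowed to cross the end of its run, so the gaps are never handed to
the allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one contiguous run of pages inside a namespace. */
struct pg_range {
	uint64_t start;	/* first page offset of the run */
	uint64_t nr;	/* number of pages in the run */
};

/* Hypothetical upper bound: the largest block is 2^MAX_ORDER pages. */
#define MAX_ORDER 9

/* Called once per free block that would be added to the buddy lists. */
typedef void (*block_fn)(uint64_t pgoff, unsigned int order, void *ctx);

/*
 * Cut one contiguous run into naturally aligned power-of-two blocks.
 * A block never extends past the end of the run, so a following gap
 * is never covered. Returns the number of blocks emitted.
 */
static uint64_t seed_range(const struct pg_range *r, block_fn fn, void *ctx)
{
	uint64_t pg = r->start, end = r->start + r->nr, blocks = 0;

	assert(fn != NULL);

	while (pg < end) {
		unsigned int order = MAX_ORDER;

		/* Shrink until the block is aligned and fits in the run. */
		while (order > 0 &&
		       ((pg & ((1ULL << order) - 1)) ||
			pg + (1ULL << order) > end))
			order--;

		fn(pg, order, ctx);
		pg += 1ULL << order;
		blocks++;
	}

	return blocks;
}

/* Seed from every run in turn; the gaps between runs are skipped. */
static uint64_t seed_namespace(const struct pg_range *ranges, size_t n,
			       block_fn fn, void *ctx)
{
	uint64_t blocks = 0;

	for (size_t i = 0; i < n; i++)
		blocks += seed_range(&ranges[i], fn, ctx);

	return blocks;
}

/* Example callback: sum up the pages handed to the allocator. */
static void count_pages(uint64_t pgoff, unsigned int order, void *ctx)
{
	(void)pgoff;
	*(uint64_t *)ctx += 1ULL << order;
}
```

For example, two runs of 6 pages at page offsets 0 and 10 would be
split into blocks at offsets 0 (order 2), 4 (order 1), 10 (order 1)
and 12 (order 2), and pages 6-9 are never touched. In the real code
the callback would link each block into the per-order free list.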

This will go on the to-do list for follow-up work. Thanks for your
comments and hints.

Coly Li
