From: Jeff Moyer <jmoyer@redhat.com>
To: Vivek Goyal <vgoyal@redhat.com>
Cc: linux-fsdevel@vger.kernel.org, linux-nvdimm@lists.01.org,
hch@infradead.org, dm-devel@redhat.com
Subject: Re: [PATCH v5 2/8] drivers/pmem: Allow pmem_clear_poison() to accept arbitrary offset and len
Date: Fri, 21 Feb 2020 13:32:48 -0500
Message-ID: <x498skv3i5r.fsf@segfault.boston.devel.redhat.com>
In-Reply-To: 20200220215707.GC10816@redhat.com

Vivek Goyal <vgoyal@redhat.com> writes:

> On Thu, Feb 20, 2020 at 04:35:17PM -0500, Jeff Moyer wrote:
>> Vivek Goyal <vgoyal@redhat.com> writes:
>>
>> > Currently pmem_clear_poison() expects offset and len to be sector aligned.
>> > At least that seems to be the assumption with which code has been written.
>> > It is called only from pmem_do_bvec() which is called only from pmem_rw_page()
>> > and pmem_make_request() which will only pass sector aligned offset and len.
>> >
>> > Soon we want to use this function from dax_zero_page_range() code path which
>> > can try to zero arbitrary range of memory within a page. So update this
>> > function to assume that offset and length can be arbitrary and do the
>> > necessary alignments as needed.
>>
>> What caller will try to zero a range that is smaller than a sector?
>
> Hi Jeff,
>
> New dax zeroing interface (dax_zero_page_range()) can technically pass
> a range which is less than a sector. Or which is bigger than a sector
> but start and end are not aligned on sector boundaries.

Sure, but who will call it with misaligned ranges?

> At this point of time, all I care about is that case of an arbitrary
> range is handled well. So if a caller passes a range in, we figure
> out subrange which is sector aligned in terms of start and end, and
> clear poison on those sectors and ignore rest of the range. And
> this itself will be an improvement over current behavior where
> nothing is cleared if I/O is not sector aligned.

I don't think this makes sense. The caller needs to know about the
blast radius of errors. This is why I asked for a concrete example.
It might make more sense, for example, to return an error if not all of
the errors could be cleared.
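
To make the two options concrete, here is a minimal userspace sketch in
C. It is not the pmem driver code: sector_subrange(), struct
aligned_subrange and clear_range_strict() are hypothetical names made up
for illustration, and SECTOR_SIZE is assumed to be 512 bytes. The first
helper computes the sector-aligned subrange that would be cleared and
reports whether it covers the whole request; the second shows the
stricter alternative of returning an error when it does not.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SIZE 512u	/* assumption: 512-byte sectors */

/* Hypothetical result type: the sector-aligned part of a byte range. */
struct aligned_subrange {
	uint64_t start;		/* first byte of the aligned subrange */
	uint64_t len;		/* bytes, a multiple of SECTOR_SIZE, may be 0 */
	bool covers_all;	/* true if nothing in the request was left out */
};

/*
 * Round the start up and the end down to sector boundaries.
 * Overflow of offset + len is not handled in this sketch.
 */
static struct aligned_subrange sector_subrange(uint64_t offset, uint64_t len)
{
	struct aligned_subrange r = { 0, 0, false };
	uint64_t end = offset + len;
	uint64_t first = (offset + SECTOR_SIZE - 1) / SECTOR_SIZE * SECTOR_SIZE;
	uint64_t last = end / SECTOR_SIZE * SECTOR_SIZE;

	if (last > first) {
		r.start = first;
		r.len = last - first;
	}
	/* Equal lengths imply first == offset and last == end. */
	r.covers_all = (r.len == len);
	return r;
}

/* Stricter contract: refuse a request that cannot be fully covered. */
static int clear_range_strict(uint64_t offset, uint64_t len)
{
	struct aligned_subrange r = sector_subrange(offset, len);

	return r.covers_all ? 0 : -EINVAL;
}
```

For example, a request for offset 100, len 1000 yields the single
sector [512, 1024) and leaves the head and tail untouched, so the strict
variant rejects it, while offset 512, len 1024 is accepted as is.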

>> > nvdimm_clear_poison() seems to assume offset and len to be aligned to
>> > clear_err_unit boundary. But this is currently internal detail and is
>> > not exported for others to use. So for now, continue to align offset and
>> > length to SECTOR_SIZE boundary. Improving it further and to align it
>> > to clear_err_unit boundary is a TODO item for future.
>>
>> When there is a poisoned range of persistent memory, it is recorded by
>> the badblocks infrastructure, which currently operates on sectors. So,
>> no matter what the error unit is for the hardware, we currently can't
>> record/report to userspace anything smaller than a sector, and so that
>> is what we expect when clearing errors.
>>
>> Continuing on for completeness, we will currently not map a page with
>> badblocks into a process' address space. So, let's say you have 256
>> bytes of bad pmem, we will tell you we've lost 512 bytes, and even if
>> you access a valid mmap()d address in the same page as the poisoned
>> memory, you will get a segfault.
>>
>> Userspace can fix up the error by calling write(2) and friends to
>> provide new data, or by punching a hole and writing new data to the hole
>> (which may result in getting a new block, or reallocating the old block
>> and zeroing it, which will clear the error).
>
> Fair enough. I do not need poison clearing at finer granularity. It might
> be needed once dev_dax path wants to clear poison. Not sure how exactly
> that works.

It doesn't. :)
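
To put numbers on the sector granularity described in the quoted text
above, here is a small standalone C sketch. It is not the kernel's
badblocks code: bad_range_to_sectors() and struct bad_sectors are
hypothetical names, and a 512-byte sector is assumed. It rounds a
poisoned byte range outward to whole sectors, which is why 256 bad bytes
end up reported as 512 lost.

```c
#include <stdint.h>

#define SECTOR_SIZE 512u	/* assumption: 512-byte sectors */

/* Hypothetical type: a run of whole sectors covering a bad byte range. */
struct bad_sectors {
	uint64_t first;		/* index of the first affected sector */
	uint64_t count;		/* number of affected sectors, 0 if len == 0 */
};

/* Round a poisoned byte range outward to sector granularity. */
static struct bad_sectors bad_range_to_sectors(uint64_t offset, uint64_t len)
{
	struct bad_sectors b = { 0, 0 };

	if (len == 0)
		return b;

	b.first = offset / SECTOR_SIZE;
	/* Sector holding the last bad byte, minus the first, plus one. */
	b.count = (offset + len - 1) / SECTOR_SIZE - b.first + 1;
	return b;
}
```

With offset 1152 and len 256 the result is one sector (sector 2), so
512 bytes are reported lost. The same 256 bytes at offset 400 straddle a
sector boundary and cost two sectors.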

>> > + /*
>> > + * Callers can pass arbitrary offset and len. But nvdimm_clear_poison()
>> > + * expects memory offset and length to meet certain alignment
>> > + * restriction (clear_err_unit). Currently nvdimm does not export
>> ^^^^^^^^^^^^^^^^^^^^^^
>> > + * required alignment. So align offset and length to sector boundary
>>
>> What is "nvdimm" in that sentence? Because the nvdimm most certainly
>> does export the required alignment. Perhaps you meant libnvdimm?
>
> I meant nvdimm_clear_poison() function in drivers/nvdimm/bus.c. Whatever
> it is called. It first queries alignment required (clear_err_unit) and
> then makes sure range passed in meets that alignment requirement.

My point was your comment is misleading.
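
For reference, the check being described amounts to something like the
following. This is a standalone sketch and not a copy of
drivers/nvdimm/bus.c: range_meets_unit() is a hypothetical name, and the
unit is assumed to be a power of two supplied by the caller rather than
queried from the hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if both the start address and the length are multiples
 * of `unit`. `unit` must be a non-zero power of two; anything else is
 * rejected in this sketch.
 */
static bool range_meets_unit(uint64_t phys, uint64_t len, uint64_t unit)
{
	if (unit == 0 || (unit & (unit - 1)) != 0)
		return false;

	/* A set low bit in either value means misalignment. */
	return ((phys | len) & (unit - 1)) == 0;
}
```

With an illustrative 256-byte unit, any sector-aligned range passes.
With an illustrative 4096-byte unit, a single 512-byte sector would not,
which is why the alignment the comment talks about needs to be stated
precisely.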

>> We could potentially support clearing less than a sector, but I'd have
>> to understand the use cases better before offering implementation
>> suggestions.
>
> I don't need clearing less than a sector. Once somebody needs it they
> can implement it. All I am doing is making sure current logic is not
> broken when dax_zero_page_range() starts using this logic and passes
> an arbitrary range. We need to make sure we internally align I/O

An arbitrary range is the same thing as less than a sector. :) Do you
know of an instance where the range will not be sector-aligned and sized?

> and carve out an aligned sub-range and pass that subrange to
> nvdimm_clear_poison().

And what happens to the rest? The caller is left to trip over the
errors? That sounds pretty terrible. I really think there needs to be
an explicit contract here.

> So if you can make sure I am not breaking things and new interface
> will continue to clear poison on sector boundary, that will be great.

I think allowing arbitrary ranges /could/ break things. How it breaks
things depends on what the caller is doing.

If there are no callers using the interface in this way, then I see no
need to relax the restriction. I do think we could document it better.

-Jeff