From: Jan Kara <jack@suse.cz>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>, Jan Kara <jack@suse.cz>,
"darrick.wong@oracle.com" <darrick.wong@oracle.com>,
"Vyacheslav.Dubeyko@wdc.com" <Vyacheslav.Dubeyko@wdc.com>,
"linux-nvdimm@ml01.01.org" <linux-nvdimm@ml01.01.org>,
"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
"slava@dubeyko.com" <slava@dubeyko.com>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
"lsf-pc@lists.linux-foundation.org"
<lsf-pc@lists.linux-foundation.org>
Subject: Re: [Lsf-pc] [LSF/MM TOPIC] Badblocks checking/representation in filesystems
Date: Fri, 20 Jan 2017 10:03:57 +0100
Message-ID: <20170120090357.GD14115@quack2.suse.cz>
In-Reply-To: <CAPcyv4jZz_iqLutd0gPEL3udqbFxvBH8CZY5oDgUjG5dGbC2gg@mail.gmail.com>
On Thu 19-01-17 11:03:12, Dan Williams wrote:
> On Thu, Jan 19, 2017 at 10:59 AM, Vishal Verma <vishal.l.verma@intel.com> wrote:
> > On 01/19, Jan Kara wrote:
> >> On Wed 18-01-17 21:56:58, Verma, Vishal L wrote:
> >> > On Wed, 2017-01-18 at 13:32 -0800, Dan Williams wrote:
> >> > > On Wed, Jan 18, 2017 at 1:02 PM, Darrick J. Wong
> >> > > <darrick.wong@oracle.com> wrote:
> >> > > > On Wed, Jan 18, 2017 at 03:39:17PM -0500, Jeff Moyer wrote:
> >> > > > > Jan Kara <jack@suse.cz> writes:
> >> > > > >
> >> > > > > > On Tue 17-01-17 15:14:21, Vishal Verma wrote:
> >> > > > > > > Your note on the online repair does raise another tangentially related
> >> > > > > > > topic. Currently, if there are badblocks, writes via the bio submission
> >> > > > > > > path will clear the error (if the hardware is able to remap the bad
> >> > > > > > > locations). However, if the filesystem is mounted with DAX, even
> >> > > > > > > non-mmap operations - read() and write() will go through the dax paths
> >> > > > > > > (dax_do_io()). We haven't found a good/agreeable way to perform
> >> > > > > > > error-clearing in this case. So currently, if a dax mounted filesystem
> >> > > > > > > has badblocks, the only way to clear those badblocks is to mount it
> >> > > > > > > without DAX, and overwrite/zero the bad locations. This is a pretty
> >> > > > > > > terrible user experience, and I'm hoping this can be solved in a better
> >> > > > > > > way.
> >> > > > > >
> >> > > > > > Please remind me, what is the problem with DAX code doing necessary work to
> >> > > > > > clear the error when it gets EIO from memcpy on write?
> >> > > > >
> >> > > > > You won't get an MCE for a store; only loads generate them.
> >> > > > >
> >> > > > > Won't fallocate FALLOC_FL_ZERO_RANGE clear bad blocks when mounted with
> >> > > > > -o dax?
> >> > > >
> >> > > > Not necessarily; XFS usually implements this by punching out the range
> >> > > > and then reallocating it as unwritten blocks.
> >> > > >
> >> > >
> >> > > That does clear the error because the unwritten blocks are zeroed and
> >> > > errors cleared when they become allocated again.
> >> >
> >> > Yes, the problem was that writes won't clear errors. Zeroing through
> >> > hole-punch, truncate, or unlinking the file should all work
> >> > (assuming the hole-punch or truncate ranges wholly contain the
> >> > 'badblock' sector).
> >>
> >> Let me repeat my question: You have mentioned that if we do IO through DAX,
> >> writes won't clear errors and we should fall back to the normal block path
> >> to do the write that clears the error. What prevents us from clearing the
> >> error directly from the DAX path?
> >>
> > With DAX, all IO goes through DAX paths. There are two cases:
> > 1. mmap and loads/stores: Obviously there is no kernel intervention
> > here, and no badblocks handling is possible.
> > 2. read() or write() IO: In the absence of dax, this would go through
> > the bio submission path, through the pmem driver, and that would handle
> > error clearing. With DAX, this goes through dax_iomap_actor, which also
> > doesn't go through the pmem driver (it does a dax mapping, followed by
> > essentially memcpy), and hence cannot handle badblocks.
>
> Hmm, that may no longer be true after my changes to push dax flushing
> to the driver. I.e. we could have a copy_from_iter() implementation
> that attempts to clear errors... I'll get that series out and we can
> discuss there.
Yeah, that was precisely my point - doing copy_from_iter() that clears
errors should be possible...
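To illustrate the idea, here is a minimal userspace sketch in C, not kernel
code. It assumes a toy device with a per-sector poison flag; struct toy_pmem,
toy_read() and toy_write_clearing() are hypothetical names made up for this
example and are neither the pmem driver's API nor the copy_from_iter() series
mentioned above. The write path copies the data and then clears the flag only
for sectors the write wholly covers, in line with the earlier remark that the
cleared range must wholly contain the bad sector.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SECTOR_SIZE 512u
#define NR_SECTORS  8u
#define DEV_SIZE    (SECTOR_SIZE * NR_SECTORS)

/* Hypothetical stand-in for a pmem device: media plus one poison flag per sector. */
struct toy_pmem {
	unsigned char media[DEV_SIZE];
	bool bad[NR_SECTORS];
};

/* True if [off, off + len) lies within the toy device. */
static bool range_ok(size_t off, size_t len)
{
	return off <= DEV_SIZE && len <= DEV_SIZE - off;
}

/* Read path: returns -1 (standing in for -EIO) if the range touches a bad sector. */
int toy_read(const struct toy_pmem *dev, size_t off, void *dst, size_t len)
{
	assert(dev != NULL && dst != NULL);
	if (!range_ok(off, len))
		return -1;
	if (len == 0)
		return 0;
	for (size_t s = off / SECTOR_SIZE; s <= (off + len - 1) / SECTOR_SIZE; s++)
		if (dev->bad[s])
			return -1;
	memcpy(dst, dev->media + off, len);
	return 0;
}

/*
 * Write path that also clears errors: copy the data, then drop the poison
 * flag for every sector the write overwrote completely. A sector that was
 * only partially overwritten keeps its flag.
 */
int toy_write_clearing(struct toy_pmem *dev, size_t off, const void *src, size_t len)
{
	assert(dev != NULL && src != NULL);
	if (!range_ok(off, len))
		return -1;
	memcpy(dev->media + off, src, len);

	size_t first_full = (off + SECTOR_SIZE - 1) / SECTOR_SIZE; /* round up */
	size_t end_full = (off + len) / SECTOR_SIZE;               /* round down */
	for (size_t s = first_full; s < end_full; s++)
		dev->bad[s] = false;
	return 0;
}
```

In this model a 100-byte write into the middle of a bad sector leaves the
sector marked bad, while a sector-aligned 512-byte write over it clears the
flag and makes a subsequent toy_read() succeed.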
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR