linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>, Jan Kara <jack@suse.cz>,
	"darrick.wong@oracle.com" <darrick.wong@oracle.com>,
	"Vyacheslav.Dubeyko@wdc.com" <Vyacheslav.Dubeyko@wdc.com>,
	"linux-nvdimm@ml01.01.org" <linux-nvdimm@ml01.01.org>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	"slava@dubeyko.com" <slava@dubeyko.com>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"lsf-pc@lists.linux-foundation.org"
	<lsf-pc@lists.linux-foundation.org>
Subject: Re: [Lsf-pc] [LSF/MM TOPIC] Badblocks checking/representation in filesystems
Date: Fri, 20 Jan 2017 10:03:57 +0100	[thread overview]
Message-ID: <20170120090357.GD14115@quack2.suse.cz> (raw)
In-Reply-To: <CAPcyv4jZz_iqLutd0gPEL3udqbFxvBH8CZY5oDgUjG5dGbC2gg@mail.gmail.com>

On Thu 19-01-17 11:03:12, Dan Williams wrote:
> On Thu, Jan 19, 2017 at 10:59 AM, Vishal Verma <vishal.l.verma@intel.com> wrote:
> > On 01/19, Jan Kara wrote:
> >> On Wed 18-01-17 21:56:58, Verma, Vishal L wrote:
> >> > On Wed, 2017-01-18 at 13:32 -0800, Dan Williams wrote:
> >> > > On Wed, Jan 18, 2017 at 1:02 PM, Darrick J. Wong
> >> > > <darrick.wong@oracle.com> wrote:
> >> > > > On Wed, Jan 18, 2017 at 03:39:17PM -0500, Jeff Moyer wrote:
> >> > > > > Jan Kara <jack@suse.cz> writes:
> >> > > > >
> >> > > > > > On Tue 17-01-17 15:14:21, Vishal Verma wrote:
> >> > > > > > > Your note on the online repair does raise another tangentially
> >> > > > > > > related
> >> > > > > > > topic. Currently, if there are badblocks, writes via the bio
> >> > > > > > > submission
> >> > > > > > > path will clear the error (if the hardware is able to remap
> >> > > > > > > the bad
> >> > > > > > > locations). However, if the filesystem is mounted eith DAX,
> >> > > > > > > even
> >> > > > > > > non-mmap operations - read() and write() will go through the
> >> > > > > > > dax paths
> >> > > > > > > (dax_do_io()). We haven't found a good/agreeable way to
> >> > > > > > > perform
> >> > > > > > > error-clearing in this case. So currently, if a dax mounted
> >> > > > > > > filesystem
> >> > > > > > > has badblocks, the only way to clear those badblocks is to
> >> > > > > > > mount it
> >> > > > > > > without DAX, and overwrite/zero the bad locations. This is a
> >> > > > > > > pretty
> >> > > > > > > terrible user experience, and I'm hoping this can be solved in
> >> > > > > > > a better
> >> > > > > > > way.
> >> > > > > >
> >> > > > > > Please remind me, what is the problem with DAX code doing
> >> > > > > > necessary work to
> >> > > > > > clear the error when it gets EIO from memcpy on write?
> >> > > > >
> >> > > > > You won't get an MCE for a store;  only loads generate them.
> >> > > > >
> >> > > > > Won't fallocate FL_ZERO_RANGE clear bad blocks when mounted with
> >> > > > > -o dax?
> >> > > >
> >> > > > Not necessarily; XFS usually implements this by punching out the
> >> > > > range
> >> > > > and then reallocating it as unwritten blocks.
> >> > > >
> >> > >
> >> > > That does clear the error because the unwritten blocks are zeroed and
> >> > > errors cleared when they become allocated again.
> >> >
> >> > Yes, the problem was that writes won't clear errors. zeroing through
> >> > either hole-punch, truncate, unlinking the file should all work
> >> > (assuming the hole-punch or truncate ranges wholly contain the
> >> > 'badblock' sector).
> >>
> >> Let me repeat my question: You have mentioned that if we do IO through DAX,
> >> writes won't clear errors and we should fall back to normal block path to
> >> do write to clear the error. What does prevent us from directly clearing
> >> the error from DAX path?
> >>
> > With DAX, all IO goes through DAX paths. There are two cases:
> > 1. mmap and loads/stores: Obviously there is no kernel intervention
> > here, and no badblocks handling is possible.
> > 2. read() or write() IO: In the absence of dax, this would go through
> > the bio submission path, through the pmem driver, and that would handle
> > error clearing. With DAX, this goes through dax_iomap_actor, which also
> > doesn't go through the pmem driver (it does a dax mapping, followed by
> > essentially memcpy), and hence cannot handle badblocks.
> 
> Hmm, that may no longer be true after my changes to push dax flushing
> to the driver. I.e. we could have a copy_from_iter() implementation
> that attempts to clear errors... I'll get that series out and we can
> discuss there.

Yeah, that was precisely my point - doing copy_from_iter() that clears
errors should be possible...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

  reply	other threads:[~2017-01-20  9:24 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <at1mp6pou4lenesjdgh22k4p.1484345585589@email.android.com>
     [not found] ` <b9rbflutjt10mb4ofherta8j.1484345610771@email.android.com>
2017-01-14  0:00   ` [LSF/MM TOPIC] Badblocks checking/representation in filesystems Slava Dubeyko
2017-01-14  0:49     ` Vishal Verma
2017-01-16  2:27       ` Slava Dubeyko
2017-01-17 14:37         ` [Lsf-pc] " Jan Kara
2017-01-17 15:08           ` Christoph Hellwig
2017-01-17 22:14           ` Vishal Verma
2017-01-18 10:16             ` Jan Kara
2017-01-18 20:39               ` Jeff Moyer
2017-01-18 21:02                 ` Darrick J. Wong
2017-01-18 21:32                   ` Dan Williams
2017-01-18 21:56                     ` Verma, Vishal L
2017-01-19  8:10                       ` Jan Kara
2017-01-19 18:59                         ` Vishal Verma
2017-01-19 19:03                           ` Dan Williams
2017-01-20  9:03                             ` Jan Kara [this message]
2017-01-17 23:15           ` Slava Dubeyko
2017-01-18 20:47             ` Jeff Moyer
2017-01-19  2:56               ` Slava Dubeyko
2017-01-19 19:33                 ` Jeff Moyer
2017-01-17  6:33       ` Darrick J. Wong
2017-01-17 21:35         ` Vishal Verma
2017-01-17 22:15           ` Andiry Xu
2017-01-17 22:37             ` Vishal Verma
2017-01-17 23:20               ` Andiry Xu
2017-01-17 23:51                 ` Vishal Verma
2017-01-18  1:58                   ` Andiry Xu
2017-01-20  0:32                     ` Verma, Vishal L
2017-01-18  9:38               ` [Lsf-pc] " Jan Kara
2017-01-19 21:17                 ` Vishal Verma
2017-01-20  9:47                   ` Jan Kara
2017-01-20 15:42                     ` Dan Williams
2017-01-24  7:46                       ` Jan Kara
2017-01-24 19:59                         ` Vishal Verma
2017-01-18  0:16             ` Andreas Dilger
2017-01-18  2:01               ` Andiry Xu
     [not found]                 ` <CAOvWMLZA092iUCnFxCxPZmDNX-hH08xbSnweBhK-E-m9Ko0yuw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-01-18  3:08                   ` Lu Zhang
2017-01-20  0:46                     ` Vishal Verma
2017-01-20  9:24                       ` Yasunori Goto
2017-01-21  0:23                         ` Kani, Toshimitsu
2017-01-20  0:55                 ` Verma, Vishal L

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170120090357.GD14115@quack2.suse.cz \
    --to=jack@suse.cz \
    --cc=Vyacheslav.Dubeyko@wdc.com \
    --cc=dan.j.williams@intel.com \
    --cc=darrick.wong@oracle.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-nvdimm@ml01.01.org \
    --cc=lsf-pc@lists.linux-foundation.org \
    --cc=slava@dubeyko.com \
    --cc=vishal.l.verma@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).