From: greg@enjellic.com
To: Neil Brown <neilb@suse.de>, greg@enjellic.com
Cc: Eyal Lebedinsky <eyal@eyal.emu.id.au>,
	linux-raid list <linux-raid@vger.kernel.org>
Subject: Re: mismatch_cnt again
Date: Mon, 16 Nov 2009 15:36:55 -0600
Message-ID: <200911162136.nAGLatPb028438@wind.enjellic.com>
In-Reply-To: Neil Brown <neilb@suse.de> "Re: mismatch_cnt again" (Nov 13,  1:28pm)

On Nov 13,  1:28pm, Neil Brown wrote:
} Subject: Re: mismatch_cnt again

Good afternoon to everyone, hope your week is starting well.

> On Thursday November 12, greg@enjellic.com wrote:
> > 
> > Neil/Martin what do you think?

> I think that if you found out which blocks were different and mapped
> that back through the filesystem, you would find that those blocks
> are not a part of any file, or possibly are part of a file that is
> currently being written.

I can buy the issue of the mismatches being part of a file being
written but that doesn't explain machines where the RAID1 array was
initialized and allowed to synchronize and which now show persistent
counts of mismatched sectors.

I can certainly buy the issue of the mismatches not being part of an
active file.  I still think this leaves the issue of why the
mismatches were generated unless we want to assume that whatever
causes the mismatch only affects areas of the filesystem which don't
have useful files.  Not a reassuring assumption.

> I guess I need to start logging the error address so people can
> start dealing with facts rather than fears.

I think that would be a good starting point, if for no other reason
than to allow people to easily figure out the possible ramifications
of a mismatch count.

One other issue to consider.  We have RAID1 volumes with mismatch
counts over a wide variety of hardware platforms and Linux kernels.
In all cases the number of mismatched blocks is an exact multiple of
128.  That doesn't seem to suggest some type of random corruption.
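
For anyone who wants to test the multiple-of-128 observation on their
own arrays, here is a minimal Python sketch.  It assumes the count is
read from /sys/block/<dev>/md/mismatch_cnt; the helper names and the
sysfs_root parameter are made up for illustration and are not part of
any md tool.

```python
from pathlib import Path

GRANULE = 128  # the multiple observed in the reports above


def read_mismatch_cnt(md_name, sysfs_root="/sys/block"):
    """Return the integer stored in <sysfs_root>/<md_name>/md/mismatch_cnt."""
    path = Path(sysfs_root, md_name, "md", "mismatch_cnt")
    return int(path.read_text().strip())


def describe(count, granule=GRANULE):
    """Split a mismatch count into whole granules plus a remainder."""
    whole, rest = divmod(count, granule)
    return {
        "count": count,
        "granules": whole,
        "remainder": rest,
        "aligned": rest == 0,  # True when count is an exact multiple
    }
```

Calling describe(read_mismatch_cnt("md0")) on each array would show
whether a non-zero remainder ever turns up.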

This issue may all be innocuous but we have about the worst situation
we could have: an issue which may be generating false positives for
potential corruption, amplified by the fact that major distributions
are generating what will be interpreted as warning e-mails about the
mismatches.  So even if the problem is innocuous the list is
guaranteed to be spammed with these reports, let alone your
inbox.... :-)

Just a thought in moving forward.

The 'check' option is primarily useful for its role in scrubbing RAID*
volumes with an eye toward making sure that silent corruption
scenarios don't arise which would thwart a resync, particularly since
you implemented the ability to attempt a sector re-write to trigger
block re-allocations.  This is a nice deterministic repair mechanism
which has fixed problems for us on a number of occasions.

I think what is needed is a 'scrub' directive which carries out this
function without incrementing mismatch counts and the like.  That
would leave a possibly enhanced 'check' command to report on
mismatches and carry out whatever remedial action, if any, the group
can think of.

If a scrub directive were to be implemented it would be beneficial to
make it interruptible.  A 'halt' or similar directive would shut down
the scrub and latch the last block number which had been examined.
That would allow a scrub to be resumed from that point in a subsequent
session.
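
To make the halt-and-resume idea concrete, here is a rough Python
model of the behavior I have in mind.  It is a simulation only: the
class ResumableScrub, its methods and the 'examine' callback are all
hypothetical names, and nothing here reflects an existing md
interface.

```python
class ResumableScrub:
    """Toy model of a scrub that latches its position when halted."""

    def __init__(self, total_blocks, start=0):
        self.total_blocks = total_blocks
        self.next_block = start   # latched position: first block not yet examined
        self.halted = False

    def halt(self):
        """Ask the running scrub to stop after the current block."""
        self.halted = True

    @property
    def finished(self):
        return self.next_block >= self.total_blocks

    def run(self, examine, budget=None):
        """Examine blocks from the latched position onward.

        Stops when the device is finished, halt() is called, or 'budget'
        blocks have been examined.  Returns the latched position so a
        later session can resume from it.
        """
        self.halted = False
        done = 0
        while not self.finished and not self.halted:
            if budget is not None and done >= budget:
                break
            examine(self.next_block)
            self.next_block += 1
            done += 1
        return self.next_block
```

A later session would construct the object with start set to the
value returned by the previous run, or simply call run() again on the
same object.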

With some of these large block devices it is difficult to get through
an entire 'check/scrub' in whatever late night window is left after
backups have run.  The above infra-structure would allow userspace to
gate the checking into whatever windows are available for these types
of activities.
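
The userspace side of that gating could be as simple as a
time-of-day test made before each resume.  A small Python sketch,
where in_window and the example window times are illustrative
assumptions:

```python
from datetime import time


def in_window(now, start, end):
    """Return True if 'now' falls inside the window [start, end).

    Handles windows that wrap past midnight, e.g. 23:00 to 04:00.
    """
    if start <= end:
        return start <= now < end
    # Window wraps midnight: valid late in the evening or early morning.
    return now >= start or now < end


# Example: a window that opens after backups finish at 01:00 and
# closes at 05:00.
SCRUB_START = time(1, 0)
SCRUB_END = time(5, 0)
```

A cron job or daemon would resume the scrub while in_window() is
true and issue the halt once it turns false.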

> NeilBrown

Hope the above comments are helpful.

Best wishes for a productive week.

}-- End of excerpt from Neil Brown

As always,
Dr. G.W. Wettstein, Ph.D.   Enjellic Systems Development, LLC.
4206 N. 19th Ave.           Specializing in information infra-structure
Fargo, ND  58102            development.
PH: 701-281-1686
FAX: 701-281-3949           EMAIL: greg@enjellic.com
------------------------------------------------------------------------------
"When I am working on a problem I never think about beauty.  I only
 think about how to solve the problem.  But when I have finished, if
 the solution is not beautiful, I know it is wrong."
                                -- Buckminster Fuller
