linux-kernel.vger.kernel.org archive mirror
* 3.12: raid-1 mismatch_cnt question
@ 2013-11-04 10:25 Justin Piszcz
  2013-11-07 10:54 ` Justin Piszcz
  0 siblings, 1 reply; 4+ messages in thread
From: Justin Piszcz @ 2013-11-04 10:25 UTC
  To: linux-kernel, linux-raid

Hi,

I run two SSDs in a RAID-1 configuration and I have a swap partition on a
third SSD.  Over time, the mismatch_cnt between the two devices grows higher
and higher.

Once a week, I run a check and repair against the md devices to help bring
the mismatch_cnt down.  When I run the check and repair, the system is live
so there are various logs/processes writing to disk.  The system also has
ECC memory and there are no errors reported.
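
For reference, the check and repair passes described above are driven
through the md sysfs files sync_action and mismatch_cnt. The following is
a minimal Python sketch, not the exact procedure used on this system. It
assumes the array is called md0 (the device name is not given here) and
that it runs as root. The function names are illustrative, and the
sysfs_root parameter exists only so the code can be pointed at a scratch
directory.

```python
from pathlib import Path


def md_dir(array: str, sysfs_root: str = "/sys") -> Path:
    """Return the md attribute directory for an array such as 'md0'."""
    return Path(sysfs_root) / "block" / array / "md"


def read_mismatch_cnt(array: str, sysfs_root: str = "/sys") -> int:
    """Read mismatch_cnt, which md reports as a count of sectors."""
    return int((md_dir(array, sysfs_root) / "mismatch_cnt").read_text().strip())


def read_sync_action(array: str, sysfs_root: str = "/sys") -> str:
    """Read the current sync action, e.g. 'idle', 'check' or 'repair'."""
    return (md_dir(array, sysfs_root) / "sync_action").read_text().strip()


def start_scrub(array: str, action: str = "check", sysfs_root: str = "/sys") -> None:
    """Request a 'check' or 'repair' pass, but only if the array is idle."""
    if action not in ("check", "repair"):
        raise ValueError(f"unsupported action: {action!r}")
    current = read_sync_action(array, sysfs_root)
    if current != "idle":
        raise RuntimeError(f"{array} is busy ({current}); not starting {action}")
    (md_dir(array, sysfs_root) / "sync_action").write_text(action + "\n")
```

A typical use would be start_scrub("md0", "check"), polling
read_sync_action("md0") until it returns "idle" again, and then logging
read_mismatch_cnt("md0"). The counter is only refreshed by a check or
repair pass, so the value read between passes is the result of the last
one.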

The following graph is the mismatch_cnt from June 2013 to current; each drop
represents a check+repair.  In September, I dropped the kernel/vm caches
before running check/repair and that seemed to help a bit.
http://home.comcast.net/~jpiszcz/20131104/md_raid_mismatch_cnt.png

My question is: is this normal or should the mismatch_cnt always be 0 unless
there is a HW or md/driver issue?

Justin.



* Re: 3.12: raid-1 mismatch_cnt question
  2013-11-04 10:25 3.12: raid-1 mismatch_cnt question Justin Piszcz
@ 2013-11-07 10:54 ` Justin Piszcz
  2013-11-12  0:39   ` Brad Campbell
  0 siblings, 1 reply; 4+ messages in thread
From: Justin Piszcz @ 2013-11-07 10:54 UTC
  To: open list, linux-raid

On Mon, Nov 4, 2013 at 5:25 AM, Justin Piszcz <jpiszcz@lucidpixels.com> wrote:
> Hi,
>
> I run two SSDs in a RAID-1 configuration and I have a swap partition on a
> third SSD.  Over time, the mismatch_cnt between the two devices grows higher
> and higher.
>
> Once a week, I run a check and repair against the md devices to help bring
> the mismatch_cnt down.  When I run the check and repair, the system is live
> so there are various logs/processes writing to disk.  The system also has
> ECC memory and there are no errors reported.
>
> The following graph is the mismatch_cnt from June 2013 to current; each drop
> represents a check+repair.  In September, I dropped the kernel/vm caches
> before running check/repair and that seemed to help a bit.
> http://home.comcast.net/~jpiszcz/20131104/md_raid_mismatch_cnt.png
>
> My question is: is this normal or should the mismatch_cnt always be 0 unless
> there is a HW or md/driver issue?
>
> Justin.
>

Hi,

Could anyone please comment if this is normal/expected behavior?

Thanks,

Justin.


* Re: 3.12: raid-1 mismatch_cnt question
  2013-11-07 10:54 ` Justin Piszcz
@ 2013-11-12  0:39   ` Brad Campbell
  2013-11-12  9:14     ` Justin Piszcz
  0 siblings, 1 reply; 4+ messages in thread
From: Brad Campbell @ 2013-11-12  0:39 UTC
  To: Justin Piszcz, open list, linux-raid

On 11/07/2013 06:54 PM, Justin Piszcz wrote:
> On Mon, Nov 4, 2013 at 5:25 AM, Justin Piszcz <jpiszcz@lucidpixels.com> wrote:
>> Hi,
>>
>> I run two SSDs in a RAID-1 configuration and I have a swap partition on a
>> third SSD.  Over time, the mismatch_cnt between the two devices grows higher
>> and higher.
>>

Are both SSDs identical? Do you have discard enabled on the filesystem?

The reason I ask is I have a RAID10 comprised of 3 Intel and 3 Samsung
SSDs. The Intels return zeros after TRIM while the Samsungs don't, so I
_always_ have a massive mismatch_cnt after I run fstrim. I never use a
repair operation as it's just going to re-write the already trimmed sectors.
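
One way to see which camp a drive claims to be in is to look at the TRIM
lines in the output of `hdparm -I`. The following Python sketch classifies
that text. The three feature strings it searches for are an assumption
based on typical hdparm output and may differ between hdparm versions,
and the function name is hypothetical. It only reports what the drive
advertises, which is not proof of how the drive actually behaves.

```python
def trim_read_behaviour(hdparm_identify_text: str) -> str:
    """Classify what a drive advertises about reads of trimmed sectors.

    The argument is the text printed by `hdparm -I /dev/sdX`.
    The substrings below are assumed from typical hdparm output.
    """
    text = hdparm_identify_text
    if "Data Set Management TRIM supported" not in text:
        return "no-trim"        # drive does not advertise TRIM at all
    if "Deterministic read ZEROs after TRIM" in text:
        return "zeros"          # trimmed sectors read back as zeros
    if "Deterministic read data after TRIM" in text:
        return "deterministic"  # stable data, but not necessarily zeros
    return "unspecified"        # reads after TRIM may return anything
```

If every member of a mirror reports "zeros", trimmed regions should read
back identically on all of them. Any other combination leaves room for
the members to differ in trimmed regions, which a check pass would then
count as mismatches.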


Just a thought.




* Re: 3.12: raid-1 mismatch_cnt question
  2013-11-12  0:39   ` Brad Campbell
@ 2013-11-12  9:14     ` Justin Piszcz
  0 siblings, 0 replies; 4+ messages in thread
From: Justin Piszcz @ 2013-11-12  9:14 UTC
  To: Brad Campbell; +Cc: open list, linux-raid

On Mon, Nov 11, 2013 at 7:39 PM, Brad Campbell
<lists2009@fnarfbargle.com> wrote:
> On 11/07/2013 06:54 PM, Justin Piszcz wrote:
>>
>> On Mon, Nov 4, 2013 at 5:25 AM, Justin Piszcz <jpiszcz@lucidpixels.com>
>> wrote:
>>>
>>> Hi,
>>>
>>> I run two SSDs in a RAID-1 configuration and I have a swap partition on a
>>> third SSD.  Over time, the mismatch_cnt between the two devices grows
>>> higher
>>> and higher.
>>>
>
> Are both SSDs identical? Do you have discard enabled on the filesystem?
Yes (2 x Intel SSDSC2CW240A3) & yes (/dev/root on / type ext4
(rw,relatime,discard,data=ordered))
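
The same check can be scripted by scanning the mount table for the
discard option. The following is a minimal Python sketch that takes the
text of /proc/mounts as input. The function name is illustrative, and
treating a `discard=...` form as "enabled" is an assumption, made so that
filesystems which spell the option with a value are not missed.

```python
from typing import List, Tuple


def mounts_with_discard(mounts_text: str) -> List[Tuple[str, str, str]]:
    """Return (device, mountpoint, fstype) for mounts using online discard.

    Expects text in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mountpoint, fstype, options = fields[:4]
        for opt in options.split(","):
            # Assumption: 'discard' or 'discard=<mode>' both mean enabled.
            if opt == "discard" or opt.startswith("discard="):
                found.append((device, mountpoint, fstype))
                break
    return found
```

On a live system the input would come from reading /proc/mounts, for
example `mounts_with_discard(open("/proc/mounts").read())`.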

>
> The reason I ask is I have a RAID10 comprised of 3 Intel and 3 Samsung
> SSDs. The Intels return zeros after TRIM while the Samsungs don't, so I
> _always_ have a massive mismatch_cnt after I run fstrim. I never use a
> repair operation as it's just going to re-write the already trimmed sectors.
Very interesting and good to know!

Justin.

