All of lore.kernel.org
 help / color / mirror / Atom feed
From: Heming Zhao <heming.zhao@suse.com>
To: LVM general discussion and development <linux-lvm@redhat.com>,
	David Teigland <teigland@redhat.com>
Subject: Re: [linux-lvm] resend patch - bcache may mistakenly write data to another disk when writes error
Date: Thu, 24 Oct 2019 03:06:18 +0000	[thread overview]
Message-ID: <a3738d0e-cfba-af45-afd2-a08ca1819c83@suse.com> (raw)
In-Reply-To: <20191023213101.gpkorwubuobm5w5y@reti>

Hello Joe,

Please see my comments in below. Thanks.

On 10/24/19 5:31 AM, Joe Thornber wrote:
> On Tue, Oct 22, 2019 at 09:47:32AM +0000, Heming Zhao wrote:
>> Hello List & David,
>>
>> This patch is responsible for legacy mail:
>> [linux-lvm] pvresize will cause a meta-data corruption with error message "Error writing device at 4096 length 512"
>>
>> I had send it to our customer, the code ran as expected. I think this code is enough to fix this issue.
>>
>> Thanks
>> zhm
>>
>> ------(patch for branch stable-2.02) ----------
>>   From d0d77d0bdad6136c792c9664444d73dd47b809cb Mon Sep 17 00:00:00 2001
>> From: Zhao Heming <heming.zhao@suse.com>
>> Date: Tue, 22 Oct 2019 17:22:17 +0800
>> Subject: [PATCH] bcache may mistakenly write data to another disk when writes
>>    error
>>
>> When bcache write data error, the errored fd and its data is saved in
>> cache->errored, then this fd is closed. Later lvm will reuse this
>> closed fd to new opened devs, but the fd related data still in
>> cache->errored and flags with BF_DIRTY. It make the data may mistakenly
>> write to another disk.
> 
> I think real issue here is that the flush fails, and the error path for that
> calls invalidate dev, which also fails, but that return value is not checked.
> The fd is subsequently closed, and reopened with data still in the cache.
> 
First fail is in bcache_flush, then bcache_invalidate_fd does nothing because the data
in cache->errored, which not belongs to dirty & clean list. Then the data mistakenly
move from cache->errored into cache->dirty by "bcache_get => _lookup_or_read_block"
(because the data holds BF_DIRTY flag).

> So I think the correct fix is to have a variant of invalidate, that doesn't
> bother retrying the IO, and just throws away the dirty data.  bcache_abort()?
> This should be called when the flush() fails.
> 
Currently lvm2 calls bcache_invalidate_fd when flush fails. So I add abort action
in invalidate(). You can see the handle errored list codes in invalidate(). It's
not necessary to add a new func bcache_abort().

So IMO both bcache_flush &  bcache_invalidate_fd have bug.
- bcache_flush should do more error/fail check.
- invalidate should do abort job.

BTW, yesterday I found my patch a mistake, it overwrites some correct code. And I
only sent it to David with V2 patch.


> - Joe
> 
> _______________________________________________
> linux-lvm mailing list
> linux-lvm@redhat.com
> https://www.redhat.com/mailman/listinfo/linux-lvm
> read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/
> 

  reply	other threads:[~2019-10-24  3:06 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-22  9:47 [linux-lvm] resend patch - bcache may mistakenly write data to another disk when writes error Heming Zhao
2019-10-23 21:31 ` Joe Thornber
2019-10-24  3:06   ` Heming Zhao [this message]
2019-10-28 15:43     ` Joe Thornber
2019-10-29  5:07       ` Heming Zhao
2019-10-29  9:46         ` Heming Zhao
2019-10-29 11:05           ` Joe Thornber
2019-10-29 11:47             ` Heming Zhao
2019-10-29 14:41               ` Joe Thornber
2019-10-29 11:01         ` Joe Thornber
2019-10-29 11:41           ` Heming Zhao
2019-10-24  3:13   ` Heming Zhao
2019-10-28  8:38     ` Heming Zhao
     [not found] <fc8ca0d7-23d9-8145-05e5-27a7ea2a7682@suse.com>
     [not found] ` <872328cd-3d51-97bb-1c50-b54cc194c6f2@suse.com>
2019-11-12 15:21   ` David Teigland
     [not found]     ` <667efc9f-1001-37cc-c0af-b352ff366c03@suse.com>
2019-11-13 15:41       ` David Teigland

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a3738d0e-cfba-af45-afd2-a08ca1819c83@suse.com \
    --to=heming.zhao@suse.com \
    --cc=linux-lvm@redhat.com \
    --cc=teigland@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.