All of lore.kernel.org
 help / color / mirror / Atom feed
From: Heming Zhao <heming.zhao@suse.com>
To: LVM general discussion and development <linux-lvm@redhat.com>,
	Joe Thornber <thornber@redhat.com>
Cc: David Teigland <teigland@redhat.com>
Subject: Re: [linux-lvm] resend patch - bcache may mistakenly write data to another disk when writes error
Date: Tue, 29 Oct 2019 05:07:30 +0000	[thread overview]
Message-ID: <324885c4-a211-fb11-8315-979935391820@suse.com> (raw)
In-Reply-To: <20191028154313.ibspu6qwqqsq54cc@reti>

Hello Joe,

Please check my comments for your commit 2938b4dcc & 6b0d969b

1. b->ref_count is non-zero, and write error happens, the data never release?
    (no place to call _unlink_block & _free_block)

2. when dev_write_bytes failed, call dev_unset_last_byte with "fd=-1" is wrong.

3. I still think below error handling should be added.
    Below base on stable-2.02, but the core idea is same, should access the return value of io_submit & io_getevents.
    
```
static bool _async_issue(struct io_engine *ioe, enum dir d, int fd,
    ... ...
     if (r < 0) {
         _cb_free(e->cbs, cb);
+       ((struct block *)context)->error = r; <== assign errno & print warning
+       log_warn("io_submit <%c> off %llu bytes %llu return %d:%s",
+           (d == DIR_READ) ? 'R' : 'W', (long long unsigned)offset,
+           (long long unsigned)nbytes, r, strerror(-r));
         return false;
     }

static void _issue_low_level(struct block *b, enum dir d)
    ... ...
     dm_list_move(&cache->io_pending, &b->list);
  
     if (!cache->engine->issue(cache->engine, d, b->fd, sb, se, b->data, b)) {
-       /* FIXME: if io_submit() set an errno, return that instead of EIO? */
-       _complete_io(b, -EIO);
+       _complete_io(b, b->error); <=== this pass the right errno to caller.
         return;
     }
  }
  
-static void _wait_all(struct bcache *cache)
+static bool _wait_all(struct bcache *cache) <=== change to return error
  {
+   bool ret = true;
     while (!dm_list_empty(&cache->io_pending))
-       _wait_io(cache);
+       ret = _wait_io(cache);
+   return ret;
  }
  
-static void _wait_specific(struct block *b)
+static bool _wait_specific(struct block *b) <=== change to return error
  {
+   bool ret = true;
     while (_test_flags(b, BF_IO_PENDING))
-       _wait_io(b->cache);
+       ret = _wait_io(b->cache);
+   return ret;
  }

  bool bcache_flush(struct bcache *cache) <==== add more error handling
  {
+   bool write_ret = true, wait_ret = true;
  
     ... ...
         _issue_write(b);
+       if (b->error) write_ret = false;
     }
  
-   _wait_all(cache);
+   wait_ret = _wait_all(cache);
  
-   return dm_list_empty(&cache->errored);
+   if (write_ret == false || wait_ret == false ||
+           !dm_list_empty(&cache->errored))
+       return false;
+   else
+       return true;
  }
```

Thanks,
zhm

On 10/28/19 11:43 PM, Joe Thornber wrote:
> On Thu, Oct 24, 2019 at 03:06:18AM +0000, Heming Zhao wrote:
>> First fail is in bcache_flush, then bcache_invalidate_fd does nothing because the data
>> in cache->errored, which not belongs to dirty & clean list. Then the data mistakenly
>> move from cache->errored into cache->dirty by "bcache_get => _lookup_or_read_block"
>> (because the data holds BF_DIRTY flag).
> 
> I just pushed a couple of patches that will hopefully fix this issue for you:
> 
> commit 6b0d969b2a85ba69046afa26af4d7bcddddbccd5 (HEAD -> master, origin/master, origin/HEAD, 2019-10-11-bcache-purge)
> Author: Joe Thornber <ejt@redhat.com>
> Date:   Mon Oct 28 15:01:47 2019 +0000
> 
>      [label] Use bcache_abort_fd() to ensure blocks are no longer in the cache.
>      
>      The return value from bcache_invalidate_fd() was not being checked.
>      
>      So I've introduced a little function, _invalidate_fd() that always
>      calls bcache_abort_fd() if the write fails.
> 
> commit 2938b4dcca0a1df661758abfab7f402ea7aab018
> Author: Joe Thornber <ejt@redhat.com>
> Date:   Mon Oct 28 14:29:47 2019 +0000
> 
>      [bcache] add bcache_abort()
>      
>      This gives us a way to cope with write failures.
> 
> 
> 
> 
> Also there are big changes to bcache coming, that remove file descriptors from
> the interface completely.  See the branch 2019-09-05-add-io-manager for more info
> (bcache has been renamed io_manager).
> 
> - Joe
> 
> _______________________________________________
> linux-lvm mailing list
> linux-lvm@redhat.com
> https://www.redhat.com/mailman/listinfo/linux-lvm
> read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/
> 

  reply	other threads:[~2019-10-29  5:07 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-22  9:47 [linux-lvm] resend patch - bcache may mistakenly write data to another disk when writes error Heming Zhao
2019-10-23 21:31 ` Joe Thornber
2019-10-24  3:06   ` Heming Zhao
2019-10-28 15:43     ` Joe Thornber
2019-10-29  5:07       ` Heming Zhao [this message]
2019-10-29  9:46         ` Heming Zhao
2019-10-29 11:05           ` Joe Thornber
2019-10-29 11:47             ` Heming Zhao
2019-10-29 14:41               ` Joe Thornber
2019-10-29 11:01         ` Joe Thornber
2019-10-29 11:41           ` Heming Zhao
2019-10-24  3:13   ` Heming Zhao
2019-10-28  8:38     ` Heming Zhao
     [not found] <fc8ca0d7-23d9-8145-05e5-27a7ea2a7682@suse.com>
     [not found] ` <872328cd-3d51-97bb-1c50-b54cc194c6f2@suse.com>
2019-11-12 15:21   ` David Teigland
     [not found]     ` <667efc9f-1001-37cc-c0af-b352ff366c03@suse.com>
2019-11-13 15:41       ` David Teigland

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=324885c4-a211-fb11-8315-979935391820@suse.com \
    --to=heming.zhao@suse.com \
    --cc=linux-lvm@redhat.com \
    --cc=teigland@redhat.com \
    --cc=thornber@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.