From mboxrd@z Thu Jan 1 00:00:00 1970
From: Heming Zhao
Date: Sat, 12 Oct 2019 07:11:57 +0000
References: <6b055125-2e06-df7d-89fa-6c347404a9cd@suse.com> <20191011151405.GA31912@redhat.com> <4139435d-c8fc-71c3-6066-ebfc882e9511@suse.com>
In-Reply-To: <4139435d-c8fc-71c3-6066-ebfc882e9511@suse.com>
Content-Language: en-US
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: Re: [linux-lvm] pvresize will cause a meta-data corruption with error message "Error writing device at 4096 length 512"
Reply-To: LVM general discussion and development
List-Id: LVM general discussion and development
Content-Type: text/plain; charset="us-ascii"
To: David Teigland
Cc: Gang He, "linux-lvm@redhat.com"

Hello List & David,

Below is a patch that fixes the incorrect calls to dev_unset_last_byte.

------------
commit 89cfffeffb7499d8f51112f58c381007aebc372d (HEAD -> master)
Author: Zhao Heming
Date:   Sat Oct 12 15:04:42 2019 +0800

    When dev_write_bytes fails, it releases the fd, so the caller
    can no longer reset the bcache last_byte with dev_unset_last_byte.
    Signed-off-by: Zhao Heming

diff --git a/.gitignore b/.gitignore
index 7ebb8bb3be..cfd5bee1c4 100644
--- a/.gitignore
+++ b/.gitignore
@@ -30,7 +30,7 @@ make.tmpl
 /config.log
 /config.status
 /configure.scan
-/cscope.out
+/cscope.*
 /html/
 /reports/
 /tags
diff --git a/lib/format_text/format-text.c b/lib/format_text/format-text.c
index 6ec47bfcef..fd65f50f5f 100644
--- a/lib/format_text/format-text.c
+++ b/lib/format_text/format-text.c
@@ -277,8 +277,7 @@ static int _raw_write_mda_header(const struct format_type *fmt,
 	dev_set_last_byte(dev, start_byte + MDA_HEADER_SIZE);
 
 	if (!dev_write_bytes(dev, start_byte, MDA_HEADER_SIZE, mdah)) {
-		dev_unset_last_byte(dev);
-		log_error("Failed to write mda header to %s fd %d", dev_name(dev), dev->bcache_fd);
+		log_error("Failed to write mda header to %s", dev_name(dev));
 		return 0;
 	}
 	dev_unset_last_byte(dev);
@@ -988,8 +987,7 @@ static int _vg_write_raw(struct format_instance *fid, struct volume_group *vg,
 			(unsigned long long)write2_size);
 
 	if (!dev_write_bytes(mdac->area.dev, write1_start, (size_t)write1_size, write_buf)) {
-		log_error("Failed to write metadata to %s fd %d", devname, mdac->area.dev->bcache_fd);
-		dev_unset_last_byte(mdac->area.dev);
+		log_error("Failed to write metadata to %s", devname);
 		goto out;
 	}
 
@@ -1001,8 +999,7 @@ static int _vg_write_raw(struct format_instance *fid, struct volume_group *vg,
 
 		if (!dev_write_bytes(mdac->area.dev, write2_start, write2_size,
 				     write_buf + new_size - new_wrap)) {
-			log_error("Failed to write metadata wrap to %s fd %d", devname, mdac->area.dev->bcache_fd);
-			dev_unset_last_byte(mdac->area.dev);
+			log_error("Failed to write metadata wrap to %s", devname);
 			goto out;
 		}
 	}
@@ -1019,7 +1016,7 @@ static int _vg_write_raw(struct format_instance *fid, struct volume_group *vg,
 
 	r = 1;
 
- out:
+out:
 	if (!r) {
 		free(fidtc->write_buf);
 		fidtc->write_buf = NULL;
diff --git a/lib/label/label.c b/lib/label/label.c
index 60ad387219..f4787b18cb 100644
--- a/lib/label/label.c
+++ b/lib/label/label.c
@@ -218,7 +218,7 @@ int label_write(struct device *dev, struct label *label)
 
 	if (!dev_write_bytes(dev, offset, LABEL_SIZE, buf)) {
 		log_debug_devs("Failed to write label to %s", dev_name(dev));
-		r = 0;
+		return 0;
 	}
 
 	dev_unset_last_byte(dev);
@@ -1415,7 +1415,8 @@ bool dev_write_bytes(struct device *dev, uint64_t start, size_t len, void *data)
 
 	if (!scan_bcache) {
 		/* Should not happen */
-		log_error("dev_write bcache not set up %s", dev_name(dev));
+		log_error("dev_write bcache not set up %s fd %d", dev_name(dev),
+			dev->bcache_fd);
 		return false;
 	}
 
@@ -1434,21 +1435,25 @@ bool dev_write_bytes(struct device *dev, uint64_t start, size_t len, void *data)
 		dev->flags |= DEV_BCACHE_WRITE;
 		if (!label_scan_open(dev)) {
 			log_error("Error opening device %s for writing at %llu length %u.",
-				  dev_name(dev), (unsigned long long)start, (uint32_t)len);
+				dev_name(dev), (unsigned long long)start, (uint32_t)len);
 			return false;
 		}
 	}
 
 	if (!bcache_write_bytes(scan_bcache, dev->bcache_fd, start, len, data)) {
-		log_error("Error writing device %s at %llu length %u.",
-			  dev_name(dev), (unsigned long long)start, (uint32_t)len);
+		log_error("Error writing device %s at %llu length %u fd %d.",
+			dev_name(dev), (unsigned long long)start, (uint32_t)len,
+			dev->bcache_fd);
+		dev_unset_last_byte(dev);
 		label_scan_invalidate(dev);
 		return false;
 	}
 
 	if (!bcache_flush(scan_bcache)) {
-		log_error("Error writing device %s at %llu length %u.",
-			  dev_name(dev), (unsigned long long)start, (uint32_t)len);
+		log_error("Error writing device %s at %llu length %u fd %d.",
+			dev_name(dev), (unsigned long long)start, (uint32_t)len,
+			dev->bcache_fd);
+		dev_unset_last_byte(dev);
 		label_scan_invalidate(dev);
 		return false;
 	}
diff --git a/lib/metadata/mirror.c b/lib/metadata/mirror.c
index 75dc18c113..c8280f9c47 100644
--- a/lib/metadata/mirror.c
+++ b/lib/metadata/mirror.c
@@ -266,7 +266,6 @@ static int _write_log_header(struct cmd_context *cmd, struct logical_volume *lv)
 	dev_set_last_byte(dev, sizeof(log_header));
 
 	if (!dev_write_bytes(dev, UINT64_C(0), sizeof(log_header), &log_header)) {
-		dev_unset_last_byte(dev);
 		log_error("Failed to write log header to %s.", name);
 		return 0;
 	}
---
Thanks
zhm

On 10/12/19 2:34 PM, Heming Zhao wrote:
> Hello David,
>
> Thank you for your reply.
>
> While analyzing the code over the past few days, I found that the code
> below can be improved.
> (The changes are based on the git master branch.)
>
> ---------------
> commit 3768196011fb01e4016510bfab9eef0c7bdc04f5 (HEAD -> master)
> Author: Zhao Heming
> Date:   Sat Oct 12 14:28:06 2019 +0800
>
>     fix typo in lib/cache/lvmcache.c
>     enhance error handling in bcache
>     fix constant var 'error' in _scan_list
>     fix gcc warning in _lvconvert_split_cache_single
>
>     Signed-off-by: Zhao Heming
>
> diff --git a/lib/cache/lvmcache.c b/lib/cache/lvmcache.c
> index f6e792459b..499f9437cb 100644
> --- a/lib/cache/lvmcache.c
> +++ b/lib/cache/lvmcache.c
> @@ -939,7 +939,7 @@ int lvmcache_label_rescan_vg_rw(struct cmd_context *cmd, const char *vgname, con
>  	 * incorrectly placed PVs should have been moved from the orphan vginfo
>  	 * onto their correct vginfo's, and the orphan vginfo should (in theory)
>  	 * represent only real orphan PVs. (Note: if lvmcache_label_scan is run
> -	 * after vg_read udpates to lvmcache state, then the lvmcache will be
> +	 * after vg_read updates to lvmcache state, then the lvmcache will be
>  	 * incorrect again, so do not run lvmcache_label_scan during the
>  	 * processing phase.)
>  	 *
> diff --git a/lib/device/bcache.c b/lib/device/bcache.c
> index d100419770..cfe01bac2f 100644
> --- a/lib/device/bcache.c
> +++ b/lib/device/bcache.c
> @@ -292,6 +292,10 @@ static bool _async_issue(struct io_engine *ioe, enum dir d, int fd,
>  	} while (r == -EAGAIN);
>
>  	if (r < 0) {
> +		((struct block *)context)->error = r;
> +		log_warn("io_submit <%c> off %llu bytes %llu return %d:%s",
> +			(d == DIR_READ) ? 'R' : 'W', (long long unsigned)offset,
> +			(long long unsigned)nbytes, r, strerror(-r));
>  		_cb_free(e->cbs, cb);
>  		return false;
>  	}
> @@ -842,7 +846,7 @@ static void _complete_io(void *context, int err)
>
>  	if (b->error) {
>  		dm_list_add(&cache->errored, &b->list);
> -
> +		log_warn("fd: %d error: %d", b->fd, err);
>  	} else {
>  		_clear_flags(b, BF_DIRTY);
>  		_link_block(b);
> @@ -869,8 +873,7 @@ static void _issue_low_level(struct block *b, enum dir d)
>  	dm_list_move(&cache->io_pending, &b->list);
>
>  	if (!cache->engine->issue(cache->engine, d, b->fd, sb, se, b->data, b)) {
> -		/* FIXME: if io_submit() set an errno, return that instead of EIO? */
> -		_complete_io(b, -EIO);
> +		_complete_io(b, b->error);
>  		return;
>  	}
>  }
> diff --git a/lib/label/label.c b/lib/label/label.c
> index dc4d32d151..60ad387219 100644
> --- a/lib/label/label.c
> +++ b/lib/label/label.c
> @@ -647,7 +647,6 @@ static int _scan_list(struct cmd_context *cmd, struct dev_filter *f,
>  	int submit_count;
>  	int scan_failed;
>  	int is_lvm_device;
> -	int error;
>  	int ret;
>
>  	dm_list_init(&wait_devs);
> @@ -694,12 +693,12 @@ static int _scan_list(struct cmd_context *cmd, struct dev_filter *f,
>
>  	dm_list_iterate_items_safe(devl, devl2, &wait_devs) {
>  		bb = NULL;
> -		error = 0;
>  		scan_failed = 0;
>  		is_lvm_device = 0;
>
>  		if (!bcache_get(scan_bcache, devl->dev->bcache_fd, 0, 0, &bb)) {
> -			log_debug_devs("Scan failed to read %s error %d.", dev_name(devl->dev), error);
> +			log_debug_devs("Scan failed to read %s error %d.",
> +				dev_name(devl->dev), bb ? bb->error : 0);
>  			scan_failed = 1;
>  			scan_read_errors++;
>  			scan_failed_count++;
> diff --git a/tools/lvconvert.c b/tools/lvconvert.c
> index 60ab956614..4939e5ec7d 100644
> --- a/tools/lvconvert.c
> +++ b/tools/lvconvert.c
> @@ -4676,7 +4676,7 @@ static int _lvconvert_split_cache_single(struct cmd_context *cmd,
>  	struct logical_volume *lv_main = NULL;
>  	struct logical_volume *lv_fast = NULL;
>  	struct lv_segment *seg;
> -	int ret;
> +	int ret = 0;
>
>  	if (lv_is_writecache(lv)) {
>  		lv_main = lv;
>
> ---
> Thanks
> zhm
>
> On 10/11/19 11:14 PM, David Teigland wrote:
>> On Fri, Oct 11, 2019 at 08:11:29AM +0000, Heming Zhao wrote:
>>
>>> I have analyzed this issue for some days. It looks like a new bug.
>>
>> Yes, thanks for the thorough analysis.
>>
>>> On the user's machine, this write failed; the PV header data (first
>>> 4K) was saved in bcache (the cache->errored list) and then written (by
>>> bcache_flush) to another disk (f748).
>>
>> It looks like we need to get rid of cache->errored completely.
>>
>>> If dev_write_bytes fails, bcache never clears last_byte. The fd is
>>> closed at the same time, but cache->errored still holds the errored
>>> fd's data. When lvm later opens a new disk, the new fd may reuse the
>>> old errored fd number, and the stale data will be written when lvm
>>> later calls bcache_flush.
>>
>> That's a bad bug.
>>
>>> 2> duplicated pv header.
>>> As described in <1>, the fc68 metadata was written over f748.
>>> This is caused by the lvm bug (described in <1>).
>>>
>>> 3> device not correct
>>> I don't know why the disk scsi-360060e80072a670000302a670000fc68 has the wrong metadata below:
>>>
>>> pre_pvr/scsi-360060e80072a670000302a670000fc68
>>> (please also read the comments in the metadata area below.)
>>> ```
>>> vgpocdbcdb1_r2 {
>>>     id = "PWd17E-xxx-oANHbq"
>>>     seqno = 20
>>>     format = "lvm2"
>>>     status = ["RESIZEABLE", "READ", "WRITE"]
>>>     flags = []
>>>     extent_size = 65536
>>>     max_lv = 0
>>>     max_pv = 0
>>>     metadata_copies = 0
>>>
>>>     physical_volumes {
>>>
>>>         pv0 {
>>>             id = "3KTOW5-xxxx-8g0Rf2"
>>>             device = "/dev/disk/by-id/scsi-360060e80072a660000302a660000f768"
>>>                                                                 Wrong!! ^^^^^
>>>             I don't know why there is f768, please ask customer
>>>             status = ["ALLOCATABLE"]
>>>             flags = []
>>>             dev_size = 860160
>>>             pe_start = 2048
>>>             pe_count = 13
>>>         }
>>>     }
>>> ```
>>> fc68 => f768: the 'c' (b1100) changed to '7' (b0111).
>>> Maybe a bit flipped on the disk, maybe lvm has a bug. I don't know and have no idea.
>>
>> Is scsi-360060e80072a660000302a660000f768 the correct device for
>> PVID 3KTOW5...? If so, then it's consistent. If not, then I suspect
>> this is a result of duplicating the PVID on multiple devices above.
>>
>>
>>> On 9/11/19 5:17 PM, Gang He wrote:
>>>> Hello List,
>>>>
>>>> Our user encountered a meta-data corruption problem when running the pvresize command after upgrading to LVM2 v2.02.180 from v2.02.120.
>>>>
>>>> The details are as below,
>>>> we have the following environment:
>>>> - Storage: HP XP7 (SAN) - LUNs are presented to ESX via RDM
>>>> - VMWare ESXi 6.5
>>>> - SLES 12 SP 4 Guest
>>>>
>>>> Resize happened this way (this has been our standard way for years) - however - this is our first resize after upgrading SLES 12 SP3 to SLES 12 SP4 - until this upgrade, we
>>>> never had a problem like this:
>>>> - split continuous access on storage box, resize lun on XP7
>>>> - recreate ca on XP7
>>>> - scan on ESX
>>>> - rescan-scsi-bus.sh -s on SLES VM
>>>> - pvresize (at this step the error happened)
>>>>
>>>> huns1vdb01:~ # pvresize /dev/disk/by-id/scsi-360060e80072a660000302a6600003274
>>>
>>> _______________________________________________
>>> linux-lvm mailing list
>>> linux-lvm@redhat.com
>>> https://www.redhat.com/mailman/listinfo/linux-lvm
>>> read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/
>>
>
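
The fd-reuse hazard described in the quoted analysis (a closed fd number being handed out again while cache->errored still holds data for it) can be seen outside LVM. The following is only a minimal sketch and is not taken from the patches or from the LVM tree: `struct stale_entry` and `fd_number_is_reused()` are hypothetical names, and it assumes a single-threaded POSIX program where both paths can be opened read-only. It relies on the POSIX rule that open() returns the lowest-numbered descriptor that is not currently open.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-in for a cached block that remembers only the fd number. */
struct stale_entry {
	int fd;
};

/*
 * Returns 1 if the fd number kept by a stale entry is handed out again
 * for a different file, 0 if it is not, and -1 if an open() call fails.
 */
int fd_number_is_reused(const char *first_path, const char *second_path)
{
	struct stale_entry entry;
	int new_fd;
	int reused;

	entry.fd = open(first_path, O_RDONLY);
	if (entry.fd < 0)
		return -1;

	/* Simulate the error path: the device is closed, the entry stays behind. */
	close(entry.fd);

	/* POSIX: open() returns the lowest-numbered descriptor not currently open. */
	new_fd = open(second_path, O_RDONLY);
	if (new_fd < 0)
		return -1;

	/* If the numbers match, anything still keyed by entry.fd now targets second_path. */
	reused = (new_fd == entry.fd);
	close(new_fd);
	return reused;
}
```

With no other descriptor activity in the process, a call such as fd_number_is_reused("/dev/null", "/dev/zero") is expected to return 1, which is the situation where a later flush of stale blocks would land on the wrong device.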