From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mx1.redhat.com (ext-mx17.extmail.prod.ext.phx2.redhat.com [10.5.110.46]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 34F8E2D22F for ; Mon, 26 Nov 2018 18:24:39 +0000 (UTC) Received: from mail-vs1-f46.google.com (mail-vs1-f46.google.com [209.85.217.46]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 4DC833001587 for ; Mon, 26 Nov 2018 18:24:37 +0000 (UTC) Received: by mail-vs1-f46.google.com with SMTP id z3so11986603vsf.7 for ; Mon, 26 Nov 2018 10:24:37 -0800 (PST) MIME-Version: 1.0 From: Andrew Hall Date: Mon, 26 Nov 2018 18:24:24 +0000 Message-ID: Subject: [linux-lvm] lvextend operation results in suspended volume Reply-To: LVM general discussion and development List-Id: LVM general discussion and development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , List-Id: Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit To: linux-lvm@redhat.com Can anyone confirm if the following situation is recoverable or not ? Thanks very much. 1. We have an LV which was recently extended using a VG with sufficient PE available. A filesystem resize operation was included with the -r flag : description = "Created *before* executing 'lvextend -L+100G -r /dev/vgname/lvname'" creation_time = 1542042235 # Mon Nov 12 17:03:55 2018 2. We then see the following in syslog confirming completion of this operation : Nov 12 17:04:06 servername kernel: EXT4-fs (dm-2): resizing filesystem from 7739015168 to 7765229568 blocks Nov 12 17:04:08 servername kernel: EXT4-fs (dm-2): resized filesystem to 7765229568 3. A few minutes later we see a further extend and resize operation : description = "Created *before* executing 'lvextend -L+200G -r /dev/vgname/lvname'" creation_time = 1542042358 # Mon Nov 12 17:05:58 2018 4. We then see the following in the syslog confirming errors with this operation : Nov 12 17:05:59 servername kernel: device-mapper: table: 253:2: sdb1 too small for target: start=35156248576, len=37756928, dev_size=35156248576 Nov 12 17:08:09 servername kernel: INFO: task jbd2/dm-2-8:698 blocked for more than 120 seconds. Nov 12 17:08:09 servername kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Nov 12 17:08:09 servername kernel: jbd2/dm-2-8 D ffff8804682b7c30 0 698 2 0x00000000 Nov 12 17:08:09 servername kernel: ffff8804682b7ad0 0000000000000046 ffff8804639ab980 ffff8804682b7fd8 Nov 12 17:08:09 servername kernel: ffff8804682b7fd8 ffff8804682b7fd8 ffff8804639ab980 ffff88047fc54780 Nov 12 17:08:09 servername kernel: 0000000000000000 7fffffffffffffff ffffffff812112d0 ffff8804682b7c30 Nov 12 17:08:09 servername kernel: Call Trace: Nov 12 17:08:09 servername kernel: [] ? generic_block_bmap+0x70/0x70 Nov 12 17:08:09 servername kernel: [] schedule+0x29/0x70 Nov 12 17:08:09 servername kernel: [] schedule_timeout+0x209/0x2d0 Nov 12 17:08:09 servername kernel: [] ? read_tsc+0x9/0x10 Nov 12 17:08:09 servername kernel: [] ? ktime_get_ts64+0x4c/0xf0 Nov 12 17:08:09 servername kernel: [] ? read_tsc+0x9/0x10 Nov 12 17:08:09 servername kernel: [] ? generic_block_bmap+0x70/0x70 Nov 12 17:08:09 servername kernel: [] io_schedule_timeout+0xae/0x130 Nov 12 17:08:09 servername kernel: [] io_schedule+0x18/0x20 Nov 12 17:08:09 servername kernel: [] sleep_on_buffer+0xe/0x20 Nov 12 17:08:09 servername kernel: [] __wait_on_bit+0x60/0x90 Nov 12 17:08:09 servername kernel: [] ? generic_block_bmap+0x70/0x70 Nov 12 17:08:09 servername kernel: [] out_of_line_wait_on_bit+0x87/0xb0 Nov 12 17:08:09 servername kernel: [] ? wake_atomic_t_function+0x40/0x40 Nov 12 17:08:09 servername kernel: [] __wait_on_buffer+0x2a/0x30 Nov 12 17:08:09 servername kernel: [] jbd2_journal_commit_transaction+0xe88/0x19a0 [jbd2] Nov 12 17:08:09 servername kernel: [] ? __switch_to+0xf8/0x4b0 Nov 12 17:08:09 servername kernel: [] kjournald2+0xc9/0x260 [jbd2] Nov 12 17:08:09 servername kernel: [] ? wake_up_atomic_t+0x30/0x30 Nov 12 17:08:09 servername kernel: [] ? commit_timeout+0x10/0x10 [jbd2] Nov 12 17:08:09 servername kernel: [] kthread+0xcf/0xe0 Nov 12 17:08:09 servername kernel: [] ? kthread_create_on_node+0x140/0x140 Nov 12 17:08:09 servername kernel: [] ret_from_fork+0x58/0x90 Nov 12 17:08:09 servername kernel: [] ? kthread_create_on_node+0x140/0x140 Nov 12 17:08:09 servername kernel: INFO: task kworker/u12:1:17864 blocked for more than 120 seconds. Nov 12 17:08:09 servername kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Nov 12 17:08:09 servername kernel: kworker/u12:1 D ffff88025a3f36b0 0 17864 2 0x00000080 Nov 12 17:08:09 servername kernel: Workqueue: writeback bdi_writeback_workfn (flush-253:2) Nov 12 17:08:09 servername kernel: ffff88025a3f3550 0000000000000046 ffff880018d2e780 ffff88025a3f3fd8 Nov 12 17:08:09 servername kernel: ffff88025a3f3fd8 ffff88025a3f3fd8 ffff880018d2e780 ffff88047fd14780 Nov 12 17:08:09 servername kernel: 0000000000000000 7fffffffffffffff ffffffffa0428720 ffff88025a3f36b0 Nov 12 17:08:09 servername kernel: Call Trace: Nov 12 17:08:09 servername kernel: [] ? start_this_handle+0x5d0/0x5d0 [jbd2] Nov 12 17:08:09 servername kernel: [] schedule+0x29/0x70 Nov 12 17:08:09 servername kernel: [] schedule_timeout+0x209/0x2d0 Nov 12 17:08:09 servername kernel: [] ? blk_throtl_bio+0xc9/0x550 Nov 12 17:08:09 servername kernel: [] ? read_tsc+0x9/0x10 Nov 12 17:08:09 servername kernel: [] ? start_this_handle+0x5d0/0x5d0 [jbd2] Nov 12 17:08:09 servername kernel: [] io_schedule_timeout+0xae/0x130 Nov 12 17:08:09 servername kernel: [] io_schedule+0x18/0x20 Nov 12 17:08:09 servername kernel: [] sleep_on_shadow_bh+0xe/0x20 [jbd2] Nov 12 17:08:09 servername kernel: [] __wait_on_bit+0x60/0x90 Nov 12 17:08:09 servername kernel: [] ? start_this_handle+0x5d0/0x5d0 [jbd2] Nov 12 17:08:09 servername kernel: [] out_of_line_wait_on_bit+0x87/0xb0 Nov 12 17:08:09 servername kernel: [] ? wake_atomic_t_function+0x40/0x40 Nov 12 17:08:09 servername kernel: [] do_get_write_access+0x2dd/0x4e0 [jbd2] Nov 12 17:08:09 servername kernel: [] ? __getblk+0x2d/0x2e0 Nov 12 17:08:09 servername kernel: [] ? __jbd2_journal_file_buffer+0xe4/0x220 [jbd2] Nov 12 17:08:09 servername kernel: [] jbd2_journal_get_write_access+0x27/0x40 [jbd2] Nov 12 17:08:09 servername kernel: [] __ext4_journal_get_write_access+0x3b/0x80 [ext4] Nov 12 17:08:09 servername kernel: [] ext4_reserve_inode_write+0x70/0xa0 [ext4] Nov 12 17:08:09 servername kernel: [] ? ext4_dirty_inode+0x40/0x60 [ext4] Nov 12 17:08:09 servername kernel: [] ext4_mark_inode_dirty+0x53/0x210 [ext4] Nov 12 17:08:09 servername kernel: [] ext4_dirty_inode+0x40/0x60 [ext4] Nov 12 17:08:09 servername kernel: [] __mark_inode_dirty+0xca/0x290 Nov 12 17:08:09 servername kernel: [] ext4_da_update_reserve_space+0x13d/0x190 [ext4] Nov 12 17:08:09 servername kernel: [] ext4_ext_map_blocks+0x9d5/0xed0 [ext4] Nov 12 17:08:09 servername kernel: [] ext4_map_blocks+0x179/0x590 [ext4] Nov 12 17:08:09 servername kernel: [] ext4_writepages+0x692/0xd60 [ext4] Nov 12 17:08:09 servername kernel: [] do_writepages+0x1e/0x40 Nov 12 17:08:09 servername kernel: [] __writeback_single_inode+0x40/0x220 Nov 12 17:08:09 servername kernel: [] writeback_sb_inodes+0x25e/0x420 Nov 12 17:08:09 servername kernel: [] __writeback_inodes_wb+0x9f/0xd0 Nov 12 17:08:09 servername kernel: [] wb_writeback+0x263/0x2f0 Nov 12 17:08:09 servername kernel: [] ? get_nr_inodes+0x4c/0x70 Nov 12 17:08:09 servername kernel: [] bdi_writeback_workfn+0x2cb/0x460 Nov 12 17:08:09 servername kernel: [] process_one_work+0x17b/0x470 Nov 12 17:08:09 servername kernel: [] worker_thread+0x11b/0x400 Nov 12 17:08:09 servername kernel: [] ? rescuer_thread+0x400/0x400 Nov 12 17:08:09 servername kernel: [] kthread+0xcf/0xe0 Nov 12 17:08:09 servername kernel: [] ? kthread_create_on_node+0x140/0x140 Nov 12 17:08:09 servername kernel: [] ret_from_fork+0x58/0x90 Nov 12 17:08:09 servername kernel: [] ? kthread_create_on_node+0x140/0x140 5. The system was then broke at that point as the LV was suspended : [root@servername ~]# lvdisplay /dev/vgname/lvname | grep Status LV Status suspended 6. Any attempt to activate it results in the following : [root@servername ~]# lvchange -ay /dev/vgname/lvname device-mapper: resume ioctl on (253:2) failed: Invalid argument Unable to resume vgname-lvname (253:2) [98337.345943] device-mapper: table: 253:2: sdb1 too small for target: start=35156248576, len=37756928, dev_size=35156248576 [root@servername ~]# pvresize /dev/sdb1 /dev/sdb1: cannot resize to 4291534 extents as 4296143 are allocated. 0 physical volume(s) resized / 1 physical volume(s) not resized 7. Prior to the operation we see the following VG config file : [root@servername archive]# cat vgname_00014-1742536132.vg # Generated by LVM2 version 2.02.130(2)-RHEL7 (2015-10-14): Mon Nov 12 17:05:58 2018 contents = "Text Format Volume Group" version = 1 description = "Created *before* executing 'lvextend -L+200G -r /dev/vgname/lvname'" creation_time = 1542042358 # Mon Nov 12 17:05:58 2018 vgname { id = "D0lvmr-QXbz-4hmv-IFMn-KGku-B0ER-dMt60n" seqno = 14 format = "lvm2" # informational status = ["RESIZEABLE", "READ", "WRITE"] flags = [] extent_size = 8192 # 4 Megabytes max_lv = 0 max_pv = 0 metadata_copies = 0 physical_volumes { pv0 { id = "gw60sW-aoKL-Cfcf-h2aV-cgHY-yrtq-i7zB5C" device = "/dev/sdb1" # Hint only status = ["ALLOCATABLE"] flags = [] dev_size = 36228047093 # 16.87 Terabytes pe_start = 2048 pe_count = 4422369 # 16.87 Terabytes } pv1 { id = "ntEAG3-JMQ8-WEqf-kGcR-Qgeb-lHEF-2jbVbs" device = "/dev/sdb2" # Hint only status = ["ALLOCATABLE"] flags = [] dev_size = 27347265536 # 12.7346 Terabytes pe_start = 2048 pe_count = 3338289 # 12.7346 Terabytes } } logical_volumes { lvname { id = "oUcIdD-uD5m-M0EY-33XQ-U5Yn-7ez1-ejRzbZ" status = ["READ", "WRITE", "VISIBLE"] flags = [] creation_host = "servername" creation_time = 1477501162 # 2016-10-26 17:59:22 +0100 segment_count = 2 segment1 { start_extent = 0 extent_count = 4291534 # 16.3709 Terabytes type = "striped" stripe_count = 1 # linear stripes = [ "pv0", 0 ] } segment2 { start_extent = 4291534 extent_count = 3291698 # 12.5568 Terabytes type = "striped" stripe_count = 1 # linear stripes = [ "pv1", 0 ] } } } } 8. Any operation afterwards shows the following extra segment in the lvname volume : [root@servername archive]# cat vgname_00015-765415489.vg # Generated by LVM2 version 2.02.130(2)-RHEL7 (2015-10-14): Fri Nov 23 10:56:50 2018 contents = "Text Format Volume Group" version = 1 description = "Created *before* executing 'pvresize -v /dev/sdb1'" creation_time = 1542970610 # Fri Nov 23 10:56:50 2018 vgname { id = "D0lvmr-QXbz-4hmv-IFMn-KGku-B0ER-dMt60n" seqno = 15 format = "lvm2" # informational status = ["RESIZEABLE", "READ", "WRITE"] flags = [] extent_size = 8192 # 4 Megabytes max_lv = 0 max_pv = 0 metadata_copies = 0 physical_volumes { pv0 { id = "gw60sW-aoKL-Cfcf-h2aV-cgHY-yrtq-i7zB5C" device = "/dev/sdb1" # Hint only status = ["ALLOCATABLE"] flags = [] dev_size = 36228047093 # 16.87 Terabytes pe_start = 2048 pe_count = 4422369 # 16.87 Terabytes } pv1 { id = "ntEAG3-JMQ8-WEqf-kGcR-Qgeb-lHEF-2jbVbs" device = "/dev/sdb2" # Hint only status = ["ALLOCATABLE"] flags = [] dev_size = 27347265536 # 12.7346 Terabytes pe_start = 2048 pe_count = 3338289 # 12.7346 Terabytes } } logical_volumes { lvname { id = "oUcIdD-uD5m-M0EY-33XQ-U5Yn-7ez1-ejRzbZ" status = ["READ", "WRITE", "VISIBLE"] flags = [] creation_host = "servername" creation_time = 1477501162 # 2016-10-26 17:59:22 +0100 segment_count = 3 segment1 { start_extent = 0 extent_count = 4291534 # 16.3709 Terabytes type = "striped" stripe_count = 1 # linear stripes = [ "pv0", 0 ] } segment2 { start_extent = 4291534 extent_count = 3338289 # 12.7346 Terabytes type = "striped" stripe_count = 1 # linear stripes = [ "pv1", 0 ] } segment3 { start_extent = 7629823 extent_count = 4609 # 18.0039 Gigabytes type = "striped" stripe_count = 1 # linear stripes = [ "pv0", 4291534 ] } } } } 9. Note that segment1 extent_count(4291534) + segment3 extent_count(4609) = 4296143 which relates to the following error : /dev/sdb1: cannot resize to 4291534 extents as 4296143 are allocated. Any thoughts on how to potentially recover from this would be most greatly appreciated ! Thanks very much.