From: "heming.zhao@suse.com"
To: David Teigland
Cc: Zdenek Kabelac, LVM general discussion and development
Date: Wed, 18 Nov 2020 22:45:07 +0800
Subject: Re: [linux-lvm] issue about return value in _lvchange_activate_single
Message-ID: <4d590441-4bf3-cd37-52c2-9fbf34a8194c@suse.com>
In-Reply-To: <20201117154516.GA18257@redhat.com>
References: <20201117154516.GA18257@redhat.com>

On 11/17/20 11:45 PM, David Teigland wrote:
> On Tue, Nov 17, 2020 at 11:52:28AM +0800, heming.zhao@suse.com wrote:
>> In lvm functions, it treats "return 0" as error case.
>>
>> if _lvchange_activate() return ECMD_FAILED, the caller _lvchange_activate_single() think as normal:
>> ```
>> if (!_lvchange_activate(cmd, lv)) <== ECMD_FAILED is 5, won't enter if case.
>>     return_ECMD_FAILED;
>
> Thanks for finding that. In some places 0 is the error value and in other
> places ECMD_FAILED is the error value; they frequently get mixed up.
> I believe this is the bug you are seeing:
> https://sourceware.org/git/?p=lvm2.git;a=commit;h=aba9652e584b6f6a422233dea951eb59326a3de2
>

I recommend using a define (e.g. E_COMM_ERR) instead of a magic number (ZERO) for the return value.

>> 2. node2 change the systemid to itself
>>
>> ```
>> [tb-clustermd2 ~]# vgchange -y --config "local/extra_system_ids='tb-clustermd1'" --systemid tb-clustermd2 vg1
>>   Volume group "vg1" successfully changed
>> [tb-clustermd2 ~]# lvchange -ay vg1/lv1
>> [tb-clustermd2 ~]# dmsetup ls
>> vg1-lv1 (254:0)
>
> This is what the LVM-activate resource agent does, except it wouldn't be
> done while the LV is active on another running host. Just wanted to
> clarify that, I don't think it's the point of your illustration here.
>
>> 3. this time both sides have dm device.
>> ```
>> [tb-clustermd1 ~]# dmsetup ls
>> vg1-lv1 (254:0)
>> [tb-clustermd2 ~]# dmsetup ls
>> vg1-lv1 (254:0)
>> ```
>
> For the sake of anyone looking at this later, this shouldn't happen in a
> properly running cluster. (If you wanted the LV active on two hosts at
> once, you'd use lvmlockd and no system ID on the VG.)
>
>> 4. node1 executes lvchange cmds. please note the return value is 0
>> ```
>> [tb-clustermd1 ~]# lvchange -ay vg1/lv1 ; echo $?
>>   WARNING: Found LVs active in VG vg1 with foreign system ID tb-clustermd2. Possible data corruption.
>>   Cannot activate LVs in a foreign VG.
>> 0
>
> That's the one fixed by the commit above.
>
> Dave
>