All of lore.kernel.org
 help / color / mirror / Atom feed
* RE: Non existing monitor
@ 2014-08-01 10:03 Aanchal Agrawal
  2014-08-01 12:46 ` Gregory Farnum
  0 siblings, 1 reply; 9+ messages in thread
From: Aanchal Agrawal @ 2014-08-01 10:03 UTC (permalink / raw)
  To: ceph-devel; +Cc: Pavan Rallabhandi, Gregory Farnum, Samuel Just

Any help on this ...

-----Original Message-----
From: Aanchal Agrawal
Sent: Wednesday, July 30, 2014 3:37 PM
To: 'ceph-devel@vger.kernel.org'
Cc: Pavan Rallabhandi
Subject: Non existing monitor

Hi,

We found a case(bug?) in ceph mon code, where in, an attempt to start a non-existing monitor is throwing up a levelDB error saying "failed to create new leveldb store", instead we thought an appropriate message say "No Monitor present with that id" would do, by checking for the monitor existence way ahead.

It seems that 'mon_exists()' checks for the existence of the mon data directory(via 'mon_data_exists()') and also for the non-empty nature of that directory(via 'mon_data_empty()'). The fix seemed pretty simple, as to flag the appropriate message if 'mon_data_exists()' were to set 'exists' to 'false', in case mkfs is not set.

The other behavior that we are seeking clarity, again in case of mkfs not being set is, if 'mon_data_exists()' sets 'exists' to 'true' and 'mon_data_empty()' sets 'exists' to 'false' (meaning the mon data directory is present, but it is empty), then the current code seems to be going ahead in an attempt to open the 'store.db', and when open fails, it tries to create a new 'store.db' (though mkfs is not set) and eventually gives up throwing "unable to read magic from mon data".

The questions we had around this were:

1) Though in case of mkfs not being set, what is the reason for creating a new levelDB store in case an attempt to open the 'store.db' is a failure, as levelDB anyways seem to be throwing 'magic' error going forward. Are there any use cases for this scenario?
2) And also, is it valid to flag "No Monitor present" in case the mon data directory is existing, but with no data('store.db') in it, in case mkfs is not set?

Thanks,
Aanchal

________________________________

PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies).


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Non existing monitor
  2014-08-01 10:03 Non existing monitor Aanchal Agrawal
@ 2014-08-01 12:46 ` Gregory Farnum
  2014-08-01 18:20   ` Pavan Rallabhandi
  0 siblings, 1 reply; 9+ messages in thread
From: Gregory Farnum @ 2014-08-01 12:46 UTC (permalink / raw)
  To: Aanchal Agrawal; +Cc: ceph-devel, Pavan Rallabhandi, Samuel Just, Joao Luis

On Fri, Aug 1, 2014 at 6:03 AM, Aanchal Agrawal
<Aanchal.Agrawal@sandisk.com> wrote:
> Any help on this ...
>
> -----Original Message-----
> From: Aanchal Agrawal
> Sent: Wednesday, July 30, 2014 3:37 PM
> To: 'ceph-devel@vger.kernel.org'
> Cc: Pavan Rallabhandi
> Subject: Non existing monitor
>
> Hi,
>
> We found a case(bug?) in ceph mon code, where in, an attempt to start a non-existing monitor is throwing up a levelDB error saying "failed to create new leveldb store", instead we thought an appropriate message say "No Monitor present with that id" would do, by checking for the monitor existence way ahead.
>
> It seems that 'mon_exists()' checks for the existence of the mon data directory(via 'mon_data_exists()') and also for the non-empty nature of that directory(via 'mon_data_empty()'). The fix seemed pretty simple, as to flag the appropriate message if 'mon_data_exists()' were to set 'exists' to 'false', in case mkfs is not set.
>
> The other behavior that we are seeking clarity, again in case of mkfs not being set is, if 'mon_data_exists()' sets 'exists' to 'true' and 'mon_data_empty()' sets 'exists' to 'false' (meaning the mon data directory is present, but it is empty), then the current code seems to be going ahead in an attempt to open the 'store.db', and when open fails, it tries to create a new 'store.db' (though mkfs is not set) and eventually gives up throwing "unable to read magic from mon data".
>
> The questions we had around this were:
>
> 1) Though in case of mkfs not being set, what is the reason for creating a new levelDB store in case an attempt to open the 'store.db' is a failure, as levelDB anyways seem to be throwing 'magic' error going forward. Are there any use cases for this scenario?
> 2) And also, is it valid to flag "No Monitor present" in case the mon data directory is existing, but with no data('store.db') in it, in case mkfs is not set?

This is a little unclear to me. Can you describe exactly what commands
you're running and what the response from the monitor is?
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: Non existing monitor
  2014-08-01 12:46 ` Gregory Farnum
@ 2014-08-01 18:20   ` Pavan Rallabhandi
  2014-08-02 13:17     ` Joao Eduardo Luis
  0 siblings, 1 reply; 9+ messages in thread
From: Pavan Rallabhandi @ 2014-08-01 18:20 UTC (permalink / raw)
  To: Gregory Farnum, Aanchal Agrawal; +Cc: ceph-devel, Samuel Just, Joao Luis

Greg,

The commands used were to start monitor for an instance id that is non existing, for which a leveldb store error is thrown:

<snip>

src/ceph-mon -i foo
2014-08-01 02:47:08.210208 7f7341c3c800 -1 failed to create new leveldb store

<\snip>

The idea is to fix this behavior by throwing a relevant error.

Thanks,
-Pavan.

-----Original Message-----
From: Gregory Farnum [mailto:greg@inktank.com]
Sent: Friday, August 01, 2014 6:16 PM
To: Aanchal Agrawal
Cc: ceph-devel@vger.kernel.org; Pavan Rallabhandi; Samuel Just; Joao Luis
Subject: Re: Non existing monitor

On Fri, Aug 1, 2014 at 6:03 AM, Aanchal Agrawal <Aanchal.Agrawal@sandisk.com> wrote:
> Any help on this ...
>
> -----Original Message-----
> From: Aanchal Agrawal
> Sent: Wednesday, July 30, 2014 3:37 PM
> To: 'ceph-devel@vger.kernel.org'
> Cc: Pavan Rallabhandi
> Subject: Non existing monitor
>
> Hi,
>
> We found a case(bug?) in ceph mon code, where in, an attempt to start a non-existing monitor is throwing up a levelDB error saying "failed to create new leveldb store", instead we thought an appropriate message say "No Monitor present with that id" would do, by checking for the monitor existence way ahead.
>
> It seems that 'mon_exists()' checks for the existence of the mon data directory(via 'mon_data_exists()') and also for the non-empty nature of that directory(via 'mon_data_empty()'). The fix seemed pretty simple, as to flag the appropriate message if 'mon_data_exists()' were to set 'exists' to 'false', in case mkfs is not set.
>
> The other behavior that we are seeking clarity, again in case of mkfs not being set is, if 'mon_data_exists()' sets 'exists' to 'true' and 'mon_data_empty()' sets 'exists' to 'false' (meaning the mon data directory is present, but it is empty), then the current code seems to be going ahead in an attempt to open the 'store.db', and when open fails, it tries to create a new 'store.db' (though mkfs is not set) and eventually gives up throwing "unable to read magic from mon data".
>
> The questions we had around this were:
>
> 1) Though in case of mkfs not being set, what is the reason for creating a new levelDB store in case an attempt to open the 'store.db' is a failure, as levelDB anyways seem to be throwing 'magic' error going forward. Are there any use cases for this scenario?
> 2) And also, is it valid to flag "No Monitor present" in case the mon data directory is existing, but with no data('store.db') in it, in case mkfs is not set?

This is a little unclear to me. Can you describe exactly what commands you're running and what the response from the monitor is?
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

________________________________

PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies).


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Non existing monitor
  2014-08-01 18:20   ` Pavan Rallabhandi
@ 2014-08-02 13:17     ` Joao Eduardo Luis
  2014-08-04  6:43       ` Aanchal Agrawal
  0 siblings, 1 reply; 9+ messages in thread
From: Joao Eduardo Luis @ 2014-08-02 13:17 UTC (permalink / raw)
  To: Pavan Rallabhandi, Gregory Farnum, Aanchal Agrawal
  Cc: ceph-devel, Samuel Just

On 01/08/14 18:20, Pavan Rallabhandi wrote:
> Greg,
>
> The commands used were to start monitor for an instance id that is non existing, for which a leveldb store error is thrown:
>
> <snip>
>
> src/ceph-mon -i foo
> 2014-08-01 02:47:08.210208 7f7341c3c800 -1 failed to create new leveldb store
>
> <\snip>
>
> The idea is to fix this behavior by throwing a relevant error.

The infrastructure is there (ceph_mon.cc) but is not doing what I 
believe it should.  It was introduced in 
1eafe8dc45419a6b7d319345cac7fbc0d684d1b1.  I'm fixing it now.

   -Joao

>
> Thanks,
> -Pavan.
>
> -----Original Message-----
> From: Gregory Farnum [mailto:greg@inktank.com]
> Sent: Friday, August 01, 2014 6:16 PM
> To: Aanchal Agrawal
> Cc: ceph-devel@vger.kernel.org; Pavan Rallabhandi; Samuel Just; Joao Luis
> Subject: Re: Non existing monitor
>
> On Fri, Aug 1, 2014 at 6:03 AM, Aanchal Agrawal <Aanchal.Agrawal@sandisk.com> wrote:
>> Any help on this ...
>>
>> -----Original Message-----
>> From: Aanchal Agrawal
>> Sent: Wednesday, July 30, 2014 3:37 PM
>> To: 'ceph-devel@vger.kernel.org'
>> Cc: Pavan Rallabhandi
>> Subject: Non existing monitor
>>
>> Hi,
>>
>> We found a case(bug?) in ceph mon code, where in, an attempt to start a non-existing monitor is throwing up a levelDB error saying "failed to create new leveldb store", instead we thought an appropriate message say "No Monitor present with that id" would do, by checking for the monitor existence way ahead.
>>
>> It seems that 'mon_exists()' checks for the existence of the mon data directory(via 'mon_data_exists()') and also for the non-empty nature of that directory(via 'mon_data_empty()'). The fix seemed pretty simple, as to flag the appropriate message if 'mon_data_exists()' were to set 'exists' to 'false', in case mkfs is not set.
>>
>> The other behavior that we are seeking clarity, again in case of mkfs not being set is, if 'mon_data_exists()' sets 'exists' to 'true' and 'mon_data_empty()' sets 'exists' to 'false' (meaning the mon data directory is present, but it is empty), then the current code seems to be going ahead in an attempt to open the 'store.db', and when open fails, it tries to create a new 'store.db' (though mkfs is not set) and eventually gives up throwing "unable to read magic from mon data".
>>
>> The questions we had around this were:
>>
>> 1) Though in case of mkfs not being set, what is the reason for creating a new levelDB store in case an attempt to open the 'store.db' is a failure, as levelDB anyways seem to be throwing 'magic' error going forward. Are there any use cases for this scenario?
>> 2) And also, is it valid to flag "No Monitor present" in case the mon data directory is existing, but with no data('store.db') in it, in case mkfs is not set?
>
> This is a little unclear to me. Can you describe exactly what commands you're running and what the response from the monitor is?
> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
>
> ________________________________
>
> PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies).
>


-- 
Joao Eduardo Luis
Software Engineer | http://inktank.com | http://ceph.com

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: Non existing monitor
  2014-08-02 13:17     ` Joao Eduardo Luis
@ 2014-08-04  6:43       ` Aanchal Agrawal
  2014-08-04 14:40         ` Joao Eduardo Luis
  0 siblings, 1 reply; 9+ messages in thread
From: Aanchal Agrawal @ 2014-08-04  6:43 UTC (permalink / raw)
  To: Joao Eduardo Luis, Pavan Rallabhandi, Gregory Farnum
  Cc: ceph-devel, Samuel Just

Hi Joao,

I have already fixed this issue(let me know if the fix looks good) and was about to send the pull request with the following diff.
But I still had couple of unanswered questions, hence this mail chain. And they were:

1) Though in case of mkfs not being set, what is the reason for creating a new levelDB store in case an attempt to open the 'store.db' is a failure, as levelDB anyways seem to be throwing 'magic' error going forward. Are there any use cases for this scenario?
2) And also, is it valid to flag "No Monitor present" in case the mon data directory is existing, but with no data('store.db') in it, in case mkfs is not set?

========================
<snip>
 
@@ -152,11 +152,20 @@ int mon_data_empty(bool *r)
   return code;
 }
 
-int mon_exists(bool *r)
+int mon_exists(bool *r, bool mkfs)
 {
   int code = mon_data_exists(r);
-  if (code || *r == false)
+  if (code)
     return code;
+  // If mon_data(directory) not present, check for mkfs
+  if (*r == false) {
+    if (!mkfs) {
+      cerr << "No monitor with ID " << g_conf->name.get_id() << " in cluster" << std::endl;
+      return 1;
+    }
+    // If mkfs is true, we want to create a new directory, hence return 0
+    return code;
+  }
   return mon_data_empty(r);
 }
 
@@ -268,7 +277,7 @@ int main(int argc, const char **argv)
   }
 
   bool exists;
-  if (mon_exists(&exists))
+  if (mon_exists(&exists, mkfs))
     exit(1);
 
   if (mkfs && exists) {
@@ -452,7 +461,7 @@ int main(int argc, const char **argv)

<\snip>
==========================

Thanks,
Aanchal

-----Original Message-----
From: Joao Eduardo Luis [mailto:joao.luis@inktank.com] 
Sent: Saturday, August 02, 2014 6:48 PM
To: Pavan Rallabhandi; Gregory Farnum; Aanchal Agrawal
Cc: ceph-devel@vger.kernel.org; Samuel Just
Subject: Re: Non existing monitor

On 01/08/14 18:20, Pavan Rallabhandi wrote:
> Greg,
>
> The commands used were to start monitor for an instance id that is non existing, for which a leveldb store error is thrown:
>
> <snip>
>
> src/ceph-mon -i foo
> 2014-08-01 02:47:08.210208 7f7341c3c800 -1 failed to create new 
> leveldb store
>
> <\snip>
>
> The idea is to fix this behavior by throwing a relevant error.

The infrastructure is there (ceph_mon.cc) but is not doing what I believe it should.  It was introduced in 1eafe8dc45419a6b7d319345cac7fbc0d684d1b1.  I'm fixing it now.

   -Joao

>
> Thanks,
> -Pavan.
>
> -----Original Message-----
> From: Gregory Farnum [mailto:greg@inktank.com]
> Sent: Friday, August 01, 2014 6:16 PM
> To: Aanchal Agrawal
> Cc: ceph-devel@vger.kernel.org; Pavan Rallabhandi; Samuel Just; Joao 
> Luis
> Subject: Re: Non existing monitor
>
> On Fri, Aug 1, 2014 at 6:03 AM, Aanchal Agrawal <Aanchal.Agrawal@sandisk.com> wrote:
>> Any help on this ...
>>
>> -----Original Message-----
>> From: Aanchal Agrawal
>> Sent: Wednesday, July 30, 2014 3:37 PM
>> To: 'ceph-devel@vger.kernel.org'
>> Cc: Pavan Rallabhandi
>> Subject: Non existing monitor
>>
>> Hi,
>>
>> We found a case(bug?) in ceph mon code, where in, an attempt to start a non-existing monitor is throwing up a levelDB error saying "failed to create new leveldb store", instead we thought an appropriate message say "No Monitor present with that id" would do, by checking for the monitor existence way ahead.
>>
>> It seems that 'mon_exists()' checks for the existence of the mon data directory(via 'mon_data_exists()') and also for the non-empty nature of that directory(via 'mon_data_empty()'). The fix seemed pretty simple, as to flag the appropriate message if 'mon_data_exists()' were to set 'exists' to 'false', in case mkfs is not set.
>>
>> The other behavior that we are seeking clarity, again in case of mkfs not being set is, if 'mon_data_exists()' sets 'exists' to 'true' and 'mon_data_empty()' sets 'exists' to 'false' (meaning the mon data directory is present, but it is empty), then the current code seems to be going ahead in an attempt to open the 'store.db', and when open fails, it tries to create a new 'store.db' (though mkfs is not set) and eventually gives up throwing "unable to read magic from mon data".
>>
>> The questions we had around this were:
>>
>> 1) Though in case of mkfs not being set, what is the reason for creating a new levelDB store in case an attempt to open the 'store.db' is a failure, as levelDB anyways seem to be throwing 'magic' error going forward. Are there any use cases for this scenario?
>> 2) And also, is it valid to flag "No Monitor present" in case the mon data directory is existing, but with no data('store.db') in it, in case mkfs is not set?
>
> This is a little unclear to me. Can you describe exactly what commands you're running and what the response from the monitor is?
> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
>
> ________________________________
>
> PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies).
>


--
Joao Eduardo Luis
Software Engineer | http://inktank.com | http://ceph.com

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Non existing monitor
  2014-08-04  6:43       ` Aanchal Agrawal
@ 2014-08-04 14:40         ` Joao Eduardo Luis
  2014-08-05 11:01           ` Pavan Rallabhandi
  0 siblings, 1 reply; 9+ messages in thread
From: Joao Eduardo Luis @ 2014-08-04 14:40 UTC (permalink / raw)
  To: Aanchal Agrawal, Pavan Rallabhandi, Gregory Farnum
  Cc: ceph-devel, Samuel Just

Hello Aanchal,

On 04/08/14 06:43, Aanchal Agrawal wrote:
> Hi Joao,
>
> I have already fixed this issue(let me know if the fix looks good) and was about to send the pull request with the following diff.
> But I still had couple of unanswered questions, hence this mail chain. And they were:

I was in the middle of finishing a patch when I read your email. 
Although your patch looks good as a fix for the incorrect behaviour, I 
believe how the code itself was structured and the holes it left opened 
was concerning, so I pushed my branch nonetheless.  You can find it in 
ceph's github repository, branch name 'wip-mon-empty-store'.  A review 
would be appreciated, and any concerns you may have will be welcome.

>
> 1) Though in case of mkfs not being set, what is the reason for creating a new levelDB store in case an attempt to open the 'store.db' is a failure, as levelDB anyways seem to be throwing 'magic' error going forward. Are there any use cases for this scenario?
> 2) And also, is it valid to flag "No Monitor present" in case the mon data directory is existing, but with no data('store.db') in it, in case mkfs is not set?

With regard to 1), it is a bug.  On open we specified that leveldb 
should not create a new database (see db->open() vs 
db->create_and_open()).  However, leveldb will create the directory 
structure if it is missing nonetheless, and this is what was creating 
those issues.  The patch in the branch I mentioned above will work 
around that by checking whether the monitor's data directory is empty 
before opening the database -- thus, if it is indeed empty, it will not 
call on db->open() and therefore avoid populating the directory with 
senseless noise.

About 2), 'store.db' may not be present if we are upgrading a release 
prior to cuttlefish, as at the time the monitor store format was 
different.  We use the absence of 'store.db' as an indicator to check 
for a possible need for a format upgrade (which will then be performed 
by Monitor::StoreConverter).  However, we can use the total and complete 
absence of files in the monitor's data directory as a definite indicator 
that there's no monitor store at that path.

Hope this sheds some light on your questions.

   -Joao




> ========================
> <snip>
>
> @@ -152,11 +152,20 @@ int mon_data_empty(bool *r)
>     return code;
>   }
>
> -int mon_exists(bool *r)
> +int mon_exists(bool *r, bool mkfs)
>   {
>     int code = mon_data_exists(r);
> -  if (code || *r == false)
> +  if (code)
>       return code;
> +  // If mon_data(directory) not present, check for mkfs
> +  if (*r == false) {
> +    if (!mkfs) {
> +      cerr << "No monitor with ID " << g_conf->name.get_id() << " in cluster" << std::endl;
> +      return 1;
> +    }
> +    // If mkfs is true, we want to create a new directory, hence return 0
> +    return code;
> +  }
>     return mon_data_empty(r);
>   }
>
> @@ -268,7 +277,7 @@ int main(int argc, const char **argv)
>     }
>
>     bool exists;
> -  if (mon_exists(&exists))
> +  if (mon_exists(&exists, mkfs))
>       exit(1);
>
>     if (mkfs && exists) {
> @@ -452,7 +461,7 @@ int main(int argc, const char **argv)
>
> <\snip>
> ==========================
>
> Thanks,
> Aanchal
>
> -----Original Message-----
> From: Joao Eduardo Luis [mailto:joao.luis@inktank.com]
> Sent: Saturday, August 02, 2014 6:48 PM
> To: Pavan Rallabhandi; Gregory Farnum; Aanchal Agrawal
> Cc: ceph-devel@vger.kernel.org; Samuel Just
> Subject: Re: Non existing monitor
>
> On 01/08/14 18:20, Pavan Rallabhandi wrote:
>> Greg,
>>
>> The commands used were to start monitor for an instance id that is non existing, for which a leveldb store error is thrown:
>>
>> <snip>
>>
>> src/ceph-mon -i foo
>> 2014-08-01 02:47:08.210208 7f7341c3c800 -1 failed to create new
>> leveldb store
>>
>> <\snip>
>>
>> The idea is to fix this behavior by throwing a relevant error.
>
> The infrastructure is there (ceph_mon.cc) but is not doing what I believe it should.  It was introduced in 1eafe8dc45419a6b7d319345cac7fbc0d684d1b1.  I'm fixing it now.
>
>     -Joao
>
>>
>> Thanks,
>> -Pavan.
>>
>> -----Original Message-----
>> From: Gregory Farnum [mailto:greg@inktank.com]
>> Sent: Friday, August 01, 2014 6:16 PM
>> To: Aanchal Agrawal
>> Cc: ceph-devel@vger.kernel.org; Pavan Rallabhandi; Samuel Just; Joao
>> Luis
>> Subject: Re: Non existing monitor
>>
>> On Fri, Aug 1, 2014 at 6:03 AM, Aanchal Agrawal <Aanchal.Agrawal@sandisk.com> wrote:
>>> Any help on this ...
>>>
>>> -----Original Message-----
>>> From: Aanchal Agrawal
>>> Sent: Wednesday, July 30, 2014 3:37 PM
>>> To: 'ceph-devel@vger.kernel.org'
>>> Cc: Pavan Rallabhandi
>>> Subject: Non existing monitor
>>>
>>> Hi,
>>>
>>> We found a case(bug?) in ceph mon code, where in, an attempt to start a non-existing monitor is throwing up a levelDB error saying "failed to create new leveldb store", instead we thought an appropriate message say "No Monitor present with that id" would do, by checking for the monitor existence way ahead.
>>>
>>> It seems that 'mon_exists()' checks for the existence of the mon data directory(via 'mon_data_exists()') and also for the non-empty nature of that directory(via 'mon_data_empty()'). The fix seemed pretty simple, as to flag the appropriate message if 'mon_data_exists()' were to set 'exists' to 'false', in case mkfs is not set.
>>>
>>> The other behavior that we are seeking clarity, again in case of mkfs not being set is, if 'mon_data_exists()' sets 'exists' to 'true' and 'mon_data_empty()' sets 'exists' to 'false' (meaning the mon data directory is present, but it is empty), then the current code seems to be going ahead in an attempt to open the 'store.db', and when open fails, it tries to create a new 'store.db' (though mkfs is not set) and eventually gives up throwing "unable to read magic from mon data".
>>>
>>> The questions we had around this were:
>>>
>>> 1) Though in case of mkfs not being set, what is the reason for creating a new levelDB store in case an attempt to open the 'store.db' is a failure, as levelDB anyways seem to be throwing 'magic' error going forward. Are there any use cases for this scenario?
>>> 2) And also, is it valid to flag "No Monitor present" in case the mon data directory is existing, but with no data('store.db') in it, in case mkfs is not set?
>>
>> This is a little unclear to me. Can you describe exactly what commands you're running and what the response from the monitor is?
>> -Greg
>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>
>> ________________________________
>>
>> PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies).
>>
>
>
> --
> Joao Eduardo Luis
> Software Engineer | http://inktank.com | http://ceph.com
>


-- 
Joao Eduardo Luis
Software Engineer | http://inktank.com | http://ceph.com

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: Non existing monitor
  2014-08-04 14:40         ` Joao Eduardo Luis
@ 2014-08-05 11:01           ` Pavan Rallabhandi
  0 siblings, 0 replies; 9+ messages in thread
From: Pavan Rallabhandi @ 2014-08-05 11:01 UTC (permalink / raw)
  To: Joao Eduardo Luis, Aanchal Agrawal, Gregory Farnum
  Cc: ceph-devel, Samuel Just

Thanks Joao for the response, will track against the review for any more clarity.

Thanks,
-Pavan.

-----Original Message-----
From: ceph-devel-owner@vger.kernel.org [mailto:ceph-devel-owner@vger.kernel.org] On Behalf Of Joao Eduardo Luis
Sent: Monday, August 04, 2014 8:11 PM
To: Aanchal Agrawal; Pavan Rallabhandi; Gregory Farnum
Cc: ceph-devel@vger.kernel.org; Samuel Just
Subject: Re: Non existing monitor

Hello Aanchal,

On 04/08/14 06:43, Aanchal Agrawal wrote:
> Hi Joao,
>
> I have already fixed this issue(let me know if the fix looks good) and was about to send the pull request with the following diff.
> But I still had couple of unanswered questions, hence this mail chain. And they were:

I was in the middle of finishing a patch when I read your email. 
Although your patch looks good as a fix for the incorrect behaviour, I believe how the code itself was structured and the holes it left opened was concerning, so I pushed my branch nonetheless.  You can find it in ceph's github repository, branch name 'wip-mon-empty-store'.  A review would be appreciated, and any concerns you may have will be welcome.

>
> 1) Though in case of mkfs not being set, what is the reason for creating a new levelDB store in case an attempt to open the 'store.db' is a failure, as levelDB anyways seem to be throwing 'magic' error going forward. Are there any use cases for this scenario?
> 2) And also, is it valid to flag "No Monitor present" in case the mon data directory is existing, but with no data('store.db') in it, in case mkfs is not set?

With regard to 1), it is a bug.  On open we specified that leveldb should not create a new database (see db->open() vs 
db->create_and_open()).  However, leveldb will create the directory
structure if it is missing nonetheless, and this is what was creating those issues.  The patch in the branch I mentioned above will work around that by checking whether the monitor's data directory is empty before opening the database -- thus, if it is indeed empty, it will not call on db->open() and therefore avoid populating the directory with senseless noise.

About 2), 'store.db' may not be present if we are upgrading a release prior to cuttlefish, as at the time the monitor store format was different.  We use the absence of 'store.db' as an indicator to check for a possible need for a format upgrade (which will then be performed by Monitor::StoreConverter).  However, we can use the total and complete absence of files in the monitor's data directory as a definite indicator that there's no monitor store at that path.

Hope this sheds some light on your questions.

   -Joao




> ========================
> <snip>
>
> @@ -152,11 +152,20 @@ int mon_data_empty(bool *r)
>     return code;
>   }
>
> -int mon_exists(bool *r)
> +int mon_exists(bool *r, bool mkfs)
>   {
>     int code = mon_data_exists(r);
> -  if (code || *r == false)
> +  if (code)
>       return code;
> +  // If mon_data(directory) not present, check for mkfs  if (*r == 
> + false) {
> +    if (!mkfs) {
> +      cerr << "No monitor with ID " << g_conf->name.get_id() << " in cluster" << std::endl;
> +      return 1;
> +    }
> +    // If mkfs is true, we want to create a new directory, hence return 0
> +    return code;
> +  }
>     return mon_data_empty(r);
>   }
>
> @@ -268,7 +277,7 @@ int main(int argc, const char **argv)
>     }
>
>     bool exists;
> -  if (mon_exists(&exists))
> +  if (mon_exists(&exists, mkfs))
>       exit(1);
>
>     if (mkfs && exists) {
> @@ -452,7 +461,7 @@ int main(int argc, const char **argv)
>
> <\snip>
> ==========================
>
> Thanks,
> Aanchal
>
> -----Original Message-----
> From: Joao Eduardo Luis [mailto:joao.luis@inktank.com]
> Sent: Saturday, August 02, 2014 6:48 PM
> To: Pavan Rallabhandi; Gregory Farnum; Aanchal Agrawal
> Cc: ceph-devel@vger.kernel.org; Samuel Just
> Subject: Re: Non existing monitor
>
> On 01/08/14 18:20, Pavan Rallabhandi wrote:
>> Greg,
>>
>> The commands used were to start monitor for an instance id that is non existing, for which a leveldb store error is thrown:
>>
>> <snip>
>>
>> src/ceph-mon -i foo
>> 2014-08-01 02:47:08.210208 7f7341c3c800 -1 failed to create new 
>> leveldb store
>>
>> <\snip>
>>
>> The idea is to fix this behavior by throwing a relevant error.
>
> The infrastructure is there (ceph_mon.cc) but is not doing what I believe it should.  It was introduced in 1eafe8dc45419a6b7d319345cac7fbc0d684d1b1.  I'm fixing it now.
>
>     -Joao
>
>>
>> Thanks,
>> -Pavan.
>>
>> -----Original Message-----
>> From: Gregory Farnum [mailto:greg@inktank.com]
>> Sent: Friday, August 01, 2014 6:16 PM
>> To: Aanchal Agrawal
>> Cc: ceph-devel@vger.kernel.org; Pavan Rallabhandi; Samuel Just; Joao 
>> Luis
>> Subject: Re: Non existing monitor
>>
>> On Fri, Aug 1, 2014 at 6:03 AM, Aanchal Agrawal <Aanchal.Agrawal@sandisk.com> wrote:
>>> Any help on this ...
>>>
>>> -----Original Message-----
>>> From: Aanchal Agrawal
>>> Sent: Wednesday, July 30, 2014 3:37 PM
>>> To: 'ceph-devel@vger.kernel.org'
>>> Cc: Pavan Rallabhandi
>>> Subject: Non existing monitor
>>>
>>> Hi,
>>>
>>> We found a case(bug?) in ceph mon code, where in, an attempt to start a non-existing monitor is throwing up a levelDB error saying "failed to create new leveldb store", instead we thought an appropriate message say "No Monitor present with that id" would do, by checking for the monitor existence way ahead.
>>>
>>> It seems that 'mon_exists()' checks for the existence of the mon data directory(via 'mon_data_exists()') and also for the non-empty nature of that directory(via 'mon_data_empty()'). The fix seemed pretty simple, as to flag the appropriate message if 'mon_data_exists()' were to set 'exists' to 'false', in case mkfs is not set.
>>>
>>> The other behavior that we are seeking clarity, again in case of mkfs not being set is, if 'mon_data_exists()' sets 'exists' to 'true' and 'mon_data_empty()' sets 'exists' to 'false' (meaning the mon data directory is present, but it is empty), then the current code seems to be going ahead in an attempt to open the 'store.db', and when open fails, it tries to create a new 'store.db' (though mkfs is not set) and eventually gives up throwing "unable to read magic from mon data".
>>>
>>> The questions we had around this were:
>>>
>>> 1) Though in case of mkfs not being set, what is the reason for creating a new levelDB store in case an attempt to open the 'store.db' is a failure, as levelDB anyways seem to be throwing 'magic' error going forward. Are there any use cases for this scenario?
>>> 2) And also, is it valid to flag "No Monitor present" in case the mon data directory is existing, but with no data('store.db') in it, in case mkfs is not set?
>>
>> This is a little unclear to me. Can you describe exactly what commands you're running and what the response from the monitor is?
>> -Greg
>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>
>> ________________________________
>>
>> PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies).
>>
>
>
> --
> Joao Eduardo Luis
> Software Engineer | http://inktank.com | http://ceph.com
>


--
Joao Eduardo Luis
Software Engineer | http://inktank.com | http://ceph.com
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Non existing monitor
  2014-07-30 10:07 Aanchal Agrawal
@ 2014-08-01 10:10 ` Wido den Hollander
  0 siblings, 0 replies; 9+ messages in thread
From: Wido den Hollander @ 2014-08-01 10:10 UTC (permalink / raw)
  To: Aanchal Agrawal, ceph-devel; +Cc: Pavan Rallabhandi

On 07/30/2014 12:07 PM, Aanchal Agrawal wrote:
> Hi,
>
> We found a case(bug?) in ceph mon code, where in, an attempt to start a non-existing monitor is throwing up a levelDB error saying "failed to create new leveldb store", instead we thought an appropriate message say "No Monitor present with that id" would do, by checking for the monitor existence way ahead.
>
> It seems that 'mon_exists()' checks for the existence of the mon data directory(via 'mon_data_exists()') and also for the non-empty nature of that directory(via 'mon_data_empty()'). The fix seemed pretty simple, as to flag the appropriate message if 'mon_data_exists()' were to set 'exists' to 'false', in case mkfs is not set.
>
> The other behavior that we are seeking clarity, again in case of mkfs not being set is, if 'mon_data_exists()' sets 'exists' to 'true' and 'mon_data_empty()' sets 'exists' to 'false' (meaning the mon data directory is present, but it is empty), then the current code seems to be going ahead in an attempt to open the 'store.db', and when open fails, it tries to create a new 'store.db' (though mkfs is not set) and eventually gives up throwing "unable to read magic from mon data".
>
> The questions we had around this were:
>
> 1) Though in case of mkfs not being set, what is the reason for creating a new levelDB store in case an attempt to open the 'store.db' is a failure, as levelDB anyways seem to be throwing 'magic' error going forward. Are there any use cases for this scenario?

Separate from Ceph it seems like bad behavior of a daemon to start 
creating directories when --mkfs is not supplied.

It should spit our errors about not being able to access it's data 
store, but not start creating them without explicitly being told to do so.

> 2) And also, is it valid to flag "No Monitor present" in case the mon data directory is existing, but with no data('store.db') in it, in case mkfs is not set?
>
> Thanks,
> Aanchal
>
> ________________________________
>
> PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies).
>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>


-- 
Wido den Hollander
42on B.V.
Ceph trainer and consultant

Phone: +31 (0)20 700 9902
Skype: contact42on

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Non existing monitor
@ 2014-07-30 10:07 Aanchal Agrawal
  2014-08-01 10:10 ` Wido den Hollander
  0 siblings, 1 reply; 9+ messages in thread
From: Aanchal Agrawal @ 2014-07-30 10:07 UTC (permalink / raw)
  To: ceph-devel; +Cc: Pavan Rallabhandi

Hi,

We found a case(bug?) in ceph mon code, where in, an attempt to start a non-existing monitor is throwing up a levelDB error saying "failed to create new leveldb store", instead we thought an appropriate message say "No Monitor present with that id" would do, by checking for the monitor existence way ahead.

It seems that 'mon_exists()' checks for the existence of the mon data directory(via 'mon_data_exists()') and also for the non-empty nature of that directory(via 'mon_data_empty()'). The fix seemed pretty simple, as to flag the appropriate message if 'mon_data_exists()' were to set 'exists' to 'false', in case mkfs is not set.

The other behavior that we are seeking clarity, again in case of mkfs not being set is, if 'mon_data_exists()' sets 'exists' to 'true' and 'mon_data_empty()' sets 'exists' to 'false' (meaning the mon data directory is present, but it is empty), then the current code seems to be going ahead in an attempt to open the 'store.db', and when open fails, it tries to create a new 'store.db' (though mkfs is not set) and eventually gives up throwing "unable to read magic from mon data".

The questions we had around this were:

1) Though in case of mkfs not being set, what is the reason for creating a new levelDB store in case an attempt to open the 'store.db' is a failure, as levelDB anyways seem to be throwing 'magic' error going forward. Are there any use cases for this scenario?
2) And also, is it valid to flag "No Monitor present" in case the mon data directory is existing, but with no data('store.db') in it, in case mkfs is not set?

Thanks,
Aanchal

________________________________

PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies).


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2014-08-05 11:01 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-08-01 10:03 Non existing monitor Aanchal Agrawal
2014-08-01 12:46 ` Gregory Farnum
2014-08-01 18:20   ` Pavan Rallabhandi
2014-08-02 13:17     ` Joao Eduardo Luis
2014-08-04  6:43       ` Aanchal Agrawal
2014-08-04 14:40         ` Joao Eduardo Luis
2014-08-05 11:01           ` Pavan Rallabhandi
  -- strict thread matches above, loose matches on Subject: below --
2014-07-30 10:07 Aanchal Agrawal
2014-08-01 10:10 ` Wido den Hollander

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.