* [Ocfs2-devel] [PATCH 0/6] nocontrold: Eliminating ocfs2_controld
@ 2013-09-06  3:26 Goldwyn Rodrigues
  2013-09-06 11:27 ` Lars Marowsky-Bree
  2013-09-06 19:40 ` Mark Fasheh
  0 siblings, 2 replies; 6+ messages in thread
From: Goldwyn Rodrigues @ 2013-09-06  3:26 UTC
  To: ocfs2-devel

Hi,

I am re-sending this patch series because I did not get a response
for the previous set.

This is an effort to remove ocfs2_controld.pcmk and bring ocfs2's DLM
handling up to date with DLM (>= 4.0.1) and corosync (2.3.x). AFAIK,
cman is also being phased out in favor of a unified corosync cluster
stack.

fs/dlm performs all the functions related to fencing and node
management and provides the APIs that ocfs2 needs to use them. In the
rest of this series, DLM refers to the fs/dlm code.

The advantages are:
 + No need to run an additional userspace daemon (ocfs2_controld)
 + No controld device handling and controld protocol
 + Shifting responsibilities of node management to DLM layer
 + Huge reduction in source code, both in kernel and userspace

This feature requires modification in the userspace ocfs2-tools.
The changes can be found at:
https://github.com/goldwynr/ocfs2-tools branch: nocontrold
Currently, not many checks are present in the userspace code,
but that will change soon.

These changes were developed on linux-stable 3.10.y, though they
also apply to current upstream. If you want to give
the entire kernel a spin, the link is:

https://github.com/goldwynr/linux-stable branch: nocontrold

Review comments/suggestions/criticism welcome.

-- 
Goldwyn


* [Ocfs2-devel] [PATCH 0/6] nocontrold: Eliminating ocfs2_controld
  2013-09-06  3:26 [Ocfs2-devel] [PATCH 0/6] nocontrold: Eliminating ocfs2_controld Goldwyn Rodrigues
@ 2013-09-06 11:27 ` Lars Marowsky-Bree
  2013-09-06 19:13   ` Goldwyn Rodrigues
  2013-09-06 19:40 ` Mark Fasheh
  1 sibling, 1 reply; 6+ messages in thread
From: Lars Marowsky-Bree @ 2013-09-06 11:27 UTC
  To: ocfs2-devel

On 2013-09-05T22:26:56, Goldwyn Rodrigues <rgoldwyn@suse.de> wrote:

Hi Goldwyn,

thanks! This looks really good.

> This is an effort of removing ocfs2_controld.pcmk and getting ocfs2 DLM
> handling up to the times with respect to DLM (>=4.0.1) and corosync
> (2.3.x). AFAIK, cman also is being phased out for a unified corosync
> cluster stack.

That's clearly necessary, also to bring OCFS2 more up to date with the
latest happenings in the GFS2 world; it'll allow both file systems to
share exactly the same cluster stack.

> https://github.com/goldwynr/ocfs2-tools branch: nocontrold
> Currently, not many checks are present in the userspace code,
> but that would change soon.

There's one question I have; how will this handle

- the "old" user-space code starting on a new kernel,
- or the "new" user-space code being run on an old kernel?

Is there anything we can do to at least provide a meaningful error
message in the first case? The second should be easier to handle.


Regards,
    Lars

-- 
Architect Storage/HA
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde


* [Ocfs2-devel] [PATCH 0/6] nocontrold: Eliminating ocfs2_controld
  2013-09-06 11:27 ` Lars Marowsky-Bree
@ 2013-09-06 19:13   ` Goldwyn Rodrigues
  2013-09-06 19:38     ` Mark Fasheh
  0 siblings, 1 reply; 6+ messages in thread
From: Goldwyn Rodrigues @ 2013-09-06 19:13 UTC
  To: ocfs2-devel

Hi Lars,

On 09/06/2013 06:22 AM, Lars Marowsky-Bree wrote:
> On 2013-09-05T22:26:56, Goldwyn Rodrigues <rgoldwyn@suse.de> wrote:
>
> Hi Goldwyn,
>
> thanks! This looks really good.
>
>> This is an effort of removing ocfs2_controld.pcmk and getting ocfs2 DLM
>> handling up to the times with respect to DLM (>=4.0.1) and corosync
>> (2.3.x). AFAIK, cman also is being phased out for a unified corosync
>> cluster stack.
>
> That's clearly necessary, also to bring OCFS2 more uptodate with the
> latest happenings in the GFS2 world; it'll allow both file systems to
> share exactly the same cluster stack.
>
>> https://github.com/goldwynr/ocfs2-tools branch: nocontrold
>> Currently, not many checks are present in the userspace code,
>> but that would change soon.
>
> There's one question I have; how will this handle
>
> - the "old" user-space code starting on a new kernel,

ocfs2_controld.pcmk will refuse to start because the control device
that the old kernel code created is absent. Of course, this would deny
mounts as well.
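
[Illustration only, not part of the posted patches: a minimal Python
sketch of the start-up check described above, assuming the old daemon
simply tests whether the control device node exists. The function names
and the path passed in are hypothetical; this thread does not name the
device node.]

```python
import os
import stat


def control_device_present(path):
    """Return True if `path` exists and is a character device."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Missing node, missing parent directory, or no permission:
        # in every case the daemon cannot use the device.
        return False
    return stat.S_ISCHR(mode)


def old_daemon_can_start(path):
    """Mimic the behaviour described above; returns (ok, message)."""
    if control_device_present(path):
        return True, "control device found at %s" % path
    # Assumption: a missing node means the kernel no longer provides it.
    return False, ("control device %s is missing; this kernel may no "
                   "longer provide it, so upgrade ocfs2-tools" % path)
```

A caller would print the message and exit non-zero when the first
element is False, which is also why mounts are denied in this case.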

> - or the "new" user-space code being run on an old kernel?

The kernel code will fail because the userspace daemon is not present,
and userspace reports the resulting ESRCH:
mount.ocfs2: No such process while mounting /dev/sdc1 on /mnt. Check 
'dmesg' for more information on this error.

>
> Is there anything we can do to at least provide a meaningful error
> message in the first case? The second should be easier to handle.

Yes, we can capture the error code and ask the user to upgrade in the 
second case. However, for the first case mount.ocfs2 would give a 
cluster connect failure because ocfs2_controld is not present.
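
[Illustration only: a minimal Python sketch of "capture the error code
and ask the user to upgrade". mount.ocfs2 itself is written in C; the
function name and the wording of the hint are hypothetical, and only the
ESRCH case and the base message format come from this thread.]

```python
import errno
import os


def explain_mount_failure(err, device, mountpoint):
    """Build a user-facing message for a failed mount, given its errno."""
    base = "mount.ocfs2: %s while mounting %s on %s." % (
        os.strerror(err), device, mountpoint)
    if err == errno.ESRCH:
        # Second case: new tools on an old kernel that still expects
        # the ocfs2_controld daemon to be running.
        return base + (" This kernel appears to require ocfs2_controld;"
                       " upgrade the kernel or use matching ocfs2-tools.")
    # Any other error keeps the generic pointer to the kernel log.
    return base + " Check 'dmesg' for more information on this error."
```

Keying the hint on a single errno keeps every other failure path
unchanged, so the generic message is still what users see by default.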

On a different note, we should consider increasing the kernel module 
version shown in dmesg to be in sync with the userspace tools and/or 
possibly increase the version number of both tools and kernel module.

-- 
Goldwyn


* [Ocfs2-devel] [PATCH 0/6] nocontrold: Eliminating ocfs2_controld
  2013-09-06 19:13   ` Goldwyn Rodrigues
@ 2013-09-06 19:38     ` Mark Fasheh
  2013-09-26 22:22       ` Goldwyn Rodrigues
  0 siblings, 1 reply; 6+ messages in thread
From: Mark Fasheh @ 2013-09-06 19:38 UTC
  To: ocfs2-devel

On Fri, Sep 06, 2013 at 02:13:00PM -0500, Goldwyn Rodrigues wrote:
> Hi Lars,
> 
> On 09/06/2013 06:22 AM, Lars Marowsky-Bree wrote:
> > On 2013-09-05T22:26:56, Goldwyn Rodrigues <rgoldwyn@suse.de> wrote:
> >
> > Hi Goldwyn,
> >
> > thanks! This looks really good.
> >
> >> This is an effort of removing ocfs2_controld.pcmk and getting ocfs2 DLM
> >> handling up to the times with respect to DLM (>=4.0.1) and corosync
> >> (2.3.x). AFAIK, cman also is being phased out for a unified corosync
> >> cluster stack.
> >
> > That's clearly necessary, also to bring OCFS2 more uptodate with the
> > latest happenings in the GFS2 world; it'll allow both file systems to
> > share exactly the same cluster stack.
> >
> >> https://github.com/goldwynr/ocfs2-tools branch: nocontrold
> >> Currently, not many checks are present in the userspace code,
> >> but that would change soon.
> >
> > There's one question I have; how will this handle
> >
> > - the "old" user-space code starting on a new kernel,
> 
> The ocfs2_controld.pcmk will refuse to start because of absence of the 
> control device created by the kernel. Of course, this would deny mounts 
> as well.

Do we know how the GFS2 project handled this case? It's going to be a major
problem for people if a kernel update horks their cluster fs.


> > Is there anything we can do to at least provide a meaningful error
> > message in the first case? The second should be easier to handle.
> 
> Yes, we can capture the error code and ask the user to upgrade in the 
> second case. However, for the first case mount.ocfs2 would give a 
> cluster connect failure because ocfs2_controld is not present.
> 
> On a different note, we should consider increasing the kernel module 
> version shown in dmesg to be in sync with the userspace tools and/or 
> possibly increase the version number of both tools and kernel module.

That shouldn't be a problem, the numbers are mostly there for us Ocfs2 devs.
	--Mark

--
Mark Fasheh


* [Ocfs2-devel] [PATCH 0/6] nocontrold: Eliminating ocfs2_controld
  2013-09-06  3:26 [Ocfs2-devel] [PATCH 0/6] nocontrold: Eliminating ocfs2_controld Goldwyn Rodrigues
  2013-09-06 11:27 ` Lars Marowsky-Bree
@ 2013-09-06 19:40 ` Mark Fasheh
  1 sibling, 0 replies; 6+ messages in thread
From: Mark Fasheh @ 2013-09-06 19:40 UTC
  To: ocfs2-devel

Firstly, thanks for developing this, Goldwyn. I've been looking at the
patches and plan to review the kernel side of the series for you.

Quick question - do you have a pointer handy to the development stream on
the dlm / gfs side of things? It would be instructive to see how it was done
in more than one place.
	--Mark

On Thu, Sep 05, 2013 at 10:26:56PM -0500, Goldwyn Rodrigues wrote:
> Hi,
> 
> I am re-sending this patch series because I did not get a response
> for the previous set.
> 
> This is an effort of removing ocfs2_controld.pcmk and getting ocfs2 DLM
> handling up to the times with respect to DLM (>=4.0.1) and corosync
> (2.3.x). AFAIK, cman also is being phased out for a unified corosync
> cluster stack.
> 
> fs/dlm performs all the functions with respect to fencing and node
> management and provides the API's to do so for ocfs2. For all future
> references, DLM stands for fs/dlm code.
> 
> The advantages are:
>  + No need to run an additional userspace daemon (ocfs2_controld)
>  + No controld device handling and controld protocol
>  + Shifting responsibilities of node management to DLM layer
>  + Huge reduction in source code, both in kernel and userspace
> 
> This feature requires modification in the userspace ocfs2-tools.
> The changes can be found at:
> https://github.com/goldwynr/ocfs2-tools branch: nocontrold
> Currently, not many checks are present in the userspace code,
> but that would change soon.
> 
> These changes were developed on linux-stable 3.10.y, though they 
> are applicable at the current upstream as well. If you want to give
> the entire kernel a spin, the link is:
> 
> https://github.com/goldwynr/linux-stable branch: nocontrold
> 
> Review comments/suggestions/criticism welcome.
> 
> -- 
> Goldwyn
--
Mark Fasheh


* [Ocfs2-devel] [PATCH 0/6] nocontrold: Eliminating ocfs2_controld
  2013-09-06 19:38     ` Mark Fasheh
@ 2013-09-26 22:22       ` Goldwyn Rodrigues
  0 siblings, 0 replies; 6+ messages in thread
From: Goldwyn Rodrigues @ 2013-09-26 22:22 UTC
  To: ocfs2-devel

Hi Mark,

Thanks for the review so far. I was on vacation and could not get back 
to this earlier.

On 09/06/2013 02:38 PM, Mark Fasheh wrote:
> On Fri, Sep 06, 2013 at 02:13:00PM -0500, Goldwyn Rodrigues wrote:
>> Hi Lars,
>>
>> On 09/06/2013 06:22 AM, Lars Marowsky-Bree wrote:
>>> On 2013-09-05T22:26:56, Goldwyn Rodrigues <rgoldwyn@suse.de> wrote:
>>>
>>> Hi Goldwyn,
>>>
>>> thanks! This looks really good.
>>>
>>>> This is an effort of removing ocfs2_controld.pcmk and getting ocfs2 DLM
>>>> handling up to the times with respect to DLM (>=4.0.1) and corosync
>>>> (2.3.x). AFAIK, cman also is being phased out for a unified corosync
>>>> cluster stack.
>>>
>>> That's clearly necessary, also to bring OCFS2 more uptodate with the
>>> latest happenings in the GFS2 world; it'll allow both file systems to
>>> share exactly the same cluster stack.
>>>
>>>> https://github.com/goldwynr/ocfs2-tools branch: nocontrold
>>>> Currently, not many checks are present in the userspace code,
>>>> but that would change soon.
>>>
>>> There's one question I have; how will this handle
>>>
>>> - the "old" user-space code starting on a new kernel,
>>
>> The ocfs2_controld.pcmk will refuse to start because of absence of the
>> control device created by the kernel. Of course, this would deny mounts
>> as well.
>
> Do we know how the GFS2 project handled this case? It's going to be a major
> problem for people if a kernel update horks their cluster fs.

Okay, I have managed to work on this and can mount filesystems which are 
used with ocfs2_controld. So, we have backward compatibility.

The only downside is we will not have a code reduction :( I will post 
the patches soon for review.

>
>
>>> Is there anything we can do to at least provide a meaningful error
>>> message in the first case? The second should be easier to handle.
>>
>> Yes, we can capture the error code and ask the user to upgrade in the
>> second case. However, for the first case mount.ocfs2 would give a
>> cluster connect failure because ocfs2_controld is not present.
>>
>> On a different note, we should consider increasing the kernel module
>> version shown in dmesg to be in sync with the userspace tools and/or
>> possibly increase the version number of both tools and kernel module.
>
> That shouldn't be a problem, the numbers are mostly there for us Ocfs2 devs.

Understood. I hope the devs at Oracle do too :) especially from the 
tools POV.


-- 
Goldwyn
