* dashboard in mimic
From: John Spray @ 2017-12-20 10:55 UTC
  To: Ceph Development

Hi all,

There have been some discussions about making big additions to the
dashboard module (the web GUI that runs in ceph-mgr), so as a couple
of people have suggested, let's have a mailing list thread about it!

This is a bit wordy, so I've written it more like a document than an
email; see below.  It's a very broad topic, so what I've written here
is far from complete.  We're still at the point of discussion; no UI
code has been written so far for any of the stuff that I mention
below.

Cheers,
John



What?
=====

Extend the dashboard module to provide management of the cluster, in
addition to monitoring.  This would potentially include anything you
can currently do with the Ceph CLI, plus additional functionality like
calling out to a container framework to spawn additional daemons.

The idea is to wrap things up into friendlier higher-level operations,
rather than just having buttons for the existing CLI operations.
Example workflows of interest (a rough sketch of the first one follows
the list):
 - a CephFS page where you can click "New Filesystem", and the pools
and MDS daemons will all be created for you.
 - similarly for RGW: ability to enable RGW and control the number of
gateway daemons
 - driving OSD addition/retirement, and also format conversions
(e.g. filestore->bluestore)
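
To make the first workflow concrete, here's a rough sketch of what the
"New Filesystem" button might drive inside a mgr module.  This is only
illustrative -- it assumes the CommandResult/send_command pattern that
mgr module code uses, pool names and pg counts are placeholders, and
error handling is minimal:

    import json
    from mgr_module import MgrModule, CommandResult

    class Dashboard(MgrModule):
        def _mon_command(self, cmd):
            # Send one mon command and block until it completes.
            result = CommandResult('')
            self.send_command(result, 'mon', '', json.dumps(cmd), '')
            r, outb, outs = result.wait()
            if r != 0:
                raise RuntimeError(outs)
            return outb

        def create_filesystem(self, name):
            # Everything "New Filesystem" would do, in one place: data
            # and metadata pools, then the filesystem itself.  Spawning
            # MDS daemons is left to the environment (see below).
            self._mon_command({'prefix': 'osd pool create',
                               'pool': name + '_data', 'pg_num': 64})
            self._mon_command({'prefix': 'osd pool create',
                               'pool': name + '_metadata', 'pg_num': 16})
            self._mon_command({'prefix': 'fs new', 'fs_name': name,
                               'metadata': name + '_metadata',
                               'data': name + '_data'})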

Some of the functionality would depend on how Ceph is being run:
especially, anything that detects devices and starts/stops physical
services would depend on an environment that provides that (such as
Kubernetes).

Why build it in?
================

Historically, Ceph management UIs were usually doing lots of non-Ceph
work too, configuring the underlying OS and hardware as well as the
Ceph cluster itself.  Consequently, it often made sense to build the user
interface into an external tool/framework that already knew how to do
all that labour-intensive infrastructure stuff, rather than trying to
reinvent it for a Ceph-specific management tool.

As some of us are moving towards running Ceph in container
environments like Kubernetes, the hardware/OS piece is increasingly
taken care of for us.  The container platform provides a simpler way
to discover and use hosts and block devices, which we can use directly
from Ceph (or from the ceph dashboard).

What about external UIs?
========================

Building more UI functionality into Ceph should not get in the way of
integrating with any external tools/projects.  It should actually
benefit those projects: as we connect up functionality into the
dashboard module, those same ceph-mgr/python code paths can easily be
connected to REST endpoints in the restful module.

The work to actually expose the REST bits will probably still fall on
the people who really want/need that functionality, but it should be a
very lightweight task for things where the functionality already
exists in the dashboard.
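
To illustrate how lightweight that could be: if the dashboard keeps its
logic in plain functions that know nothing about HTTP, a REST endpoint
becomes a thin shim.  A sketch only -- the function and class names are
invented, and the restful module's actual framework may differ:

    import json

    def list_filesystems(mgr):
        # Shared code path: usable from a dashboard page handler and
        # from a REST endpoint alike.
        fsmap = mgr.get('fs_map')
        return [fs['mdsmap']['fs_name'] for fs in fsmap['filesystems']]

    class FilesystemsEndpoint(object):
        # Hypothetical REST shim over the shared function above.
        exposed = True

        def __init__(self, mgr):
            self.mgr = mgr

        def GET(self):
            return json.dumps(list_filesystems(self.mgr))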

Currently modules are somewhat isolated from one another, but I've
recently added an inter-module RPC interface so that we can have
better sharing of state -- the idea is to have some common things like
a table of long-running-jobs that would be shared between the
dashboard and restful modules.
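
For example (a sketch: the jobs-table format and the get_jobs method
are invented, and remote() stands in for the new inter-module RPC
call):

    from mgr_module import MgrModule

    class Restful(MgrModule):
        def list_jobs(self):
            # Ask the dashboard module for its long-running-jobs table
            # over inter-module RPC instead of keeping a private copy.
            jobs = self.remote('dashboard', 'get_jobs')
            return [j for j in jobs if not j.get('finished')]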

Security
========

The dashboard is currently completely read-only: that's convenient
because it makes it less scary to run it over unencrypted http and/or
without login (or in practice, leaving https/login as an exercise to
the sysadmin).  When administrative functionality is added, we'll need
some sort of login, and https too.

The https part can probably be done in the same way as the restful
module: require a user-generated certificate (i.e. for their proper
domain) by default, but also provide a helper for the adventurous user
to run with a self-signed cert if they want to.
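
In CherryPy terms the server side of that is small -- a sketch, where
the config keys are standard CherryPy but the certificate paths are
placeholders:

    import cherrypy

    # Point CherryPy's builtin server at a user-provided certificate;
    # a helper could generate a self-signed pair into the same paths.
    cherrypy.config.update({
        'server.ssl_module': 'builtin',
        'server.ssl_certificate': '/etc/ceph/dashboard.crt',
        'server.ssl_private_key': '/etc/ceph/dashboard.key',
    })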

The login part could be as simple as creating users/passwords using a
CLI and just prompting for them in the GUI, or we could also have some
GUI functionality for managing users.  I wouldn't want to go too far
with the latter: if someone has complex requirements then it's
generally better to be plugging into some external user database.
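
The storage side of the simple variant needs very little code -- a
minimal sketch using only the Python standard library (where the
resulting hash would be kept, e.g. the mgr's config-key store, is left
open):

    import hashlib
    import os

    def hash_password(password):
        # Salted PBKDF2: plenty for a small set of local dashboard users.
        salt = os.urandom(16)
        digest = hashlib.pbkdf2_hmac('sha256', password.encode(),
                                     salt, 100000)
        return salt.hex() + '$' + digest.hex()

    def check_password(password, stored):
        salt_hex, digest_hex = stored.split('$')
        digest = hashlib.pbkdf2_hmac('sha256', password.encode(),
                                     bytes.fromhex(salt_hex), 100000)
        return digest.hex() == digest_hex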

It would still be very nice to retain the read-only mode as an option, of course.


* Re: dashboard in mimic
From: Tim Serong @ 2017-12-21  1:48 UTC
  To: John Spray, Ceph Development

(broad topic :-), trimming back to my immediate comment/question)

On 12/20/2017 09:55 PM, John Spray wrote:
> What?
> =====
> 
> Extend the dashboard module to provide management of the cluster, in
> addition to monitoring.  This would potentially include anything you
> can currently do with the Ceph CLI, plus additional functionality like
> calling out to a container framework to spawn additional daemons.
> 
> The idea is to wrap things up into friendlier higher-level operations,
> rather than just having buttons for the existing CLI operations.
> Example workflows of interest:
>  - a CephFS page where you can click "New Filesystem", and the pools
> and MDS daemons will all be created for you.
>  - similarly for RGW: ability to enable RGW and control the number of
> gateway daemons
>  - driving OSD addition/retirement, and also format conversions
> (e.g. filestore->bluestore)
> 
> Some of the functionality would depend on how Ceph is being run:
> especially, anything that detects devices and starts/stops physical
> services would depend on an environment that provides that (such as
> Kubernetes).

Any configuration/management of things that Ceph already knows about is
"easy" to implement (creating pools, RBD volumes, cluster config, etc.)

For spawning/configuring additional daemons, is it worth considering
some kind of thin layer (another mgr module or modules?) that lets the
admin choose whether this is done by k8s, salt, ansible, whatever?
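
Roughly this shape, perhaps -- purely illustrative, all names invented,
just to make the idea concrete:

    class Orchestrator(object):
        # Small interface that each backend module (k8s, salt,
        # ansible, ...) would implement; the dashboard talks to the
        # interface, never to a specific orchestrator.
        def discover_devices(self, host):
            raise NotImplementedError

        def create_osd(self, host, device):
            raise NotImplementedError

        def scale_service(self, service, count):
            # e.g. service='rgw', count=3
            raise NotImplementedError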

Regards,

Tim
-- 
Tim Serong
Senior Clustering Engineer
SUSE
tserong@suse.com


* Re: dashboard in mimic
From: John Spray @ 2017-12-21 10:42 UTC
  To: Tim Serong; +Cc: Ceph Development

On Thu, Dec 21, 2017 at 1:48 AM, Tim Serong <tserong@suse.com> wrote:
> (broad topic :-), trimming back to my immediate comment/question)
>
> On 12/20/2017 09:55 PM, John Spray wrote:
>> What?
>> =====
>>
>> Extend the dashboard module to provide management of the cluster, in
>> addition to monitoring.  This would potentially include anything you
>> can currently do with the Ceph CLI, plus additional functionality like
>> calling out to a container framework to spawn additional daemons.
>>
>> The idea is to wrap things up into friendlier higher-level operations,
>> rather than just having buttons for the existing CLI operations.
>> Example workflows of interest:
>>  - a CephFS page where you can click "New Filesystem", and the pools
>> and MDS daemons will all be created for you.
>>  - similarly for RGW: ability to enable RGW and control the number of
>> gateway daemons
>>  - driving OSD addition/retirement, and also format conversions
>> (e.g. filestore->bluestore)
>>
>> Some of the functionality would depend on how Ceph is being run:
>> especially, anything that detects devices and starts/stops physical
>> services would depend on an environment that provides that (such as
>> Kubernetes).
>
> Any configuration/management of things that ceph already knows about is
> "easy" to implement (creating pools, rbd volumes, cluster config, etc.)
>
> For spawning/configuring additional daemons, is it worth considering
> some kind of thin layer (another mgr module or modules?) that lets the
> admin choose whether this is done by k8s, salt, ansible, whatever?

Yes, absolutely -- the main two that have been discussed so far are
Kubernetes and a notional bare-metal fallback, where the bare-metal
version would be something that has a very small set of commands
(discover devices, format an OSD, control a service with systemd) run
over SSH.

It's easy to imagine salt being another route to managing bare metal,
and probably working a bit more robustly than the super-simple pure
SSH bare-metal option.  That said, we should think carefully about
which sorts of scenarios need an external orchestration tool in the
loop: for example, I don't think we'd want to be calling into
salt/ansible in cases where they were just forwarding ops into
Kubernetes for us -- the core value of the external orchestrator is in
doing things we can't do ourselves, like working out the OS and
networking configuration and bootstrapping the container environment.
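
To make the super-simple SSH option concrete, a sketch of the device
discovery piece (assuming paramiko; the command set and the parsing
are illustrative only):

    import json
    import paramiko

    def discover_devices(host, user='root'):
        # Bare-metal backend primitive: run lsblk over SSH and return
        # the whole-disk entries from its JSON output.
        client = paramiko.SSHClient()
        client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        client.connect(host, username=user)
        try:
            _, stdout, _ = client.exec_command('lsblk --json --bytes')
            devices = json.loads(stdout.read())['blockdevices']
            return [d for d in devices if d['type'] == 'disk']
        finally:
            client.close()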

John


* Re: dashboard in mimic
From: Lenz Grimmer @ 2017-12-22 23:56 UTC
  To: Ceph Development

Hi John,

On 12/20/2017 11:55 AM, John Spray wrote:

> There have been some discussions about making big additions to the 
> dashboard module (the web gui that runs in ceph-mgr), so as a couple 
> of people have suggested, let's have a mailing list thread about it!

Thanks a lot for kicking this off! Please find some comments inline;
I'd be glad to discuss this in more depth after the holiday break.

> This is a bit wordy so I've written it more like a document than an 
> email, see below.  It's a very broad topic, so what I've written
> here is far from complete.  We're still at the point of discussion,
> there's no UI code being written so far for any of the stuff that I
> mention below.

For additional context, I think it makes sense to mention that
this topic was also discussed in the last CDM call:

https://youtu.be/YNfp_4S7mYE?t=28m37s

The collection of ideas that Sage mentioned during the call have been
noted down here:

http://pad.ceph.com/p/mimic-dashboard

Looking at that list, I think we've implemented several of those items in
openATTIC/DeepSea already (https://www.openattic.org/features.html) and
many of the other topics are on our TODO as well.

As Jan already mentioned during the call, a good first step for us could
be to contribute the Grafana dashboards that we developed and embedded in
openATTIC. They are currently maintained in the DeepSea git (somewhat
hidden at
https://github.com/SUSE/DeepSea/tree/master/srv/salt/ceph/monitoring/grafana/files),
but I think it would make sense to incorporate them upstream instead, or
maintain them as a separate project.

They have been developed to display data collected by the DigitalOcean
Ceph Exporter for Prometheus
(https://github.com/digitalocean/ceph_exporter). We also created an RGW
metrics exporter for the RGW dashboard parts:
https://github.com/SUSE/DeepSea/tree/master/srv/salt/ceph/monitoring/prometheus/exporters

Embedding Grafana dashboards into another web app is actually not
that trivial (a simple iframe is way too inflexible) - we ended up
writing a small proxy for the oA backend that talks to Grafana and
then forwards the filtered output to the oA web UI. You can see some
examples at https://www.openattic.org/galleries/oa-3.x-screenshots/

It should be relatively straightforward to port that to the manager
dashboard.
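
The shape of such a proxy is roughly this (a sketch only: the endpoint,
the Grafana API path and the auth header are placeholders for whatever
the real integration needs):

    import cherrypy
    import requests

    GRAFANA = 'http://localhost:3000'  # placeholder address

    class GrafanaProxy(object):
        # The frontend never talks to Grafana directly; the backend
        # fetches the dashboard definition and filters it down to the
        # panel the UI asked for.
        @cherrypy.expose
        @cherrypy.tools.json_out()
        def panel(self, uid, title):
            resp = requests.get(
                GRAFANA + '/api/dashboards/uid/' + uid,
                headers={'Authorization': 'Bearer <api-key>'})
            resp.raise_for_status()
            panels = resp.json()['dashboard']['panels']
            return [p for p in panels if p.get('title') == title]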

> What?
> =====
> 
> Extend the dashboard module to provide management of the cluster, in
> addition to monitoring.  This would potentially include anything you
> can currently do with the Ceph CLI, plus additional functionality
> like calling out to a container framework to spawn additional
> daemons.
> 
> The idea is to wrap things up into friendlier higher-level
> operations, rather than just having buttons for the existing CLI
> operations. Example workflows of interest:
>  - a CephFS page where you can click "New Filesystem", and the pools
>    and MDS daemons will all be created for you.
>  - similarly for RGW: ability to enable RGW and control the number
>    of gateway daemons
>  - driving OSD addition/retirement, and also format conversions
>    (e.g. filestore->bluestore)

OSD lifecycle management is definitely a frequently occurring task that
would benefit from an easy UI.

I'd focus on addressing the most popular and regular admin chores first
before diving into adding one-off management/deployment features.

> Some of the functionality would depend on how Ceph is being run: 
> especially, anything that detects devices and starts/stops physical 
> services would depend on an environment that provides that (such as 
> Kubernetes).

Right, this part could become quite complex, as there are multiple
methods for deploying and orchestrating Ceph: bare-metal vs. Kubernetes,
using tools like ceph-ansible vs. DeepSea/Salt...

It may make sense to start with adding management functionality that is
based on existing/built-in Ceph APIs, e.g. Pools/RBDs/RGW and CephFS,
starting with read-only methods for obtaining information about these
and then extending that code path incrementally by adding functionality
to modify these objects. This evolutionary approach served us well for
many oA features that we created.

But at some point you will have to reach out to external services and
orchestration tools.

> Why build it in?
> ================
> 
> Historically, Ceph management UIs were usually doing lots of
> non-Ceph work too, configuring the underlying OS and hardware as well
> as the Ceph cluster itself.  Consequently, it often made sense to
> build the user interface into an external tool/framework that already
> knew how to do all that labour-intensive infrastructure stuff, rather
> than trying to reinvent it for a Ceph-specific management tool.

We came to the same conclusion and initially started off from the
assumption that the Ceph cluster is already deployed and up and
running, and that our tool can take it from there.

Of course, everybody wants that GUI-based "one click" install, but it's
the most complicated part to get right, and a lot of effort. Considering
you only use it once in the life cycle of your cluster, we chose to
focus on the more frequently occurring tasks instead...

> As some of us are moving towards running Ceph in container 
> environments like Kubernetes, the hardware/OS piece is increasingly 
> taken care of for us.  The container platform provides a simpler way 
> to discover and use hosts and block devices, which we can use
> directly from Ceph (or from the ceph dashboard).

The key is to make the dashboard usable in as many environments as
possible, even if only with limited functionality.

One thought, however: the current UI framework is likely not well
suited for developing functionality that requires user interaction and
more sophisticated widgets and other UI elements.

While I think that CherryPy is a great choice for the backend
functionality, AngularJS might be a better choice than Rivets.js for the
frontend in the long run. We've had very good experiences with it and
are currently in the process of migrating our UI to Angular2. But this
of course complicates the build and testing process.

> What about external UIs?
> ========================
> 
> Building more UI functionality into Ceph should not get in the way
> of integrating with any external tools/projects.  It should actually 
> benefit those projects: as we connect up functionality into the 
> dashboard module, those same ceph-mgr/python code paths can easily
> be connected to REST endpoints in the restful module.

That would be really useful indeed.

> The work to actually expose the REST bits will probably still fall
> on the people who really want/need that functionality, but it should
> be a very lightweight task for things where the functionality
> already exists in the dashboard.

So the Dashboard won't use the REST API itself by default? Wouldn't it
be better to have a clear separation between the UI and backend here,
and use one common API?

> Currently modules are somewhat isolated from one another, but I've 
> recently added an inter-module RPC interface so that we can have 
> better sharing of state -- the idea is to have some common things
> like a table of long-running-jobs that would be shared between the 
> dashboard and restful modules.

Have you already started working on this part? We created a TaskQueue
implementation for oA that might be worth using here.

> Security
> ========
> 
> The dashboard is currently completely read-only: that's convenient 
> because it makes it less scary to run it over unencrypted http
> and/or without login (or in practice, leaving https/login as an
> exercise to the sysadmin).  When administrative functionality is
> added, we'll need some sort of login, and https too.

Agreed, access control will be required as soon as you are able to
actually modify things.

> The https part can probably be done in the same way as the restful 
> module: require a user-generated certificate (i.e. for their proper 
> domain) by default, but also provide a helper for the adventurous
> user to run with a self-signed cert if they want to.

Sounds good.

> The login part could be as simple as creating users/passwords using
> a CLI and just prompting for them in the GUI, or we could also have
> some GUI functionality for managing users.  I wouldn't want to go too
> far with the latter: if someone has complex requirements then it's 
> generally better to be plugging into some external user database.

Agreed - external auth using SAML/OAuth/LDAP/AD is usually high on the
wishlist for "enterprise" users. But it seems like CherryPy does not
provide any support for these methods yet?

> It would still be very nice to retain the read-only mode as an option,
> of course.

Being able to flag a user as "read-only" might be good enough to begin
with, instead of devising a full-fledged role/privilege system.
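
A single flag checked before any mutating request would do to begin
with -- a sketch, where the CherryPy tool hook is real API but the
session/user model is invented:

    import cherrypy

    def block_if_read_only():
        # Reject mutating HTTP methods for users flagged read-only.
        user = cherrypy.session.get('user', {})
        if cherrypy.request.method != 'GET' and user.get('read_only'):
            raise cherrypy.HTTPError(403, 'User is read-only')

    cherrypy.tools.read_only_guard = cherrypy.Tool(
        'before_handler', block_if_read_only)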

Thanks for kicking this off! I think our work on openATTIC and the
experiences that we've gathered while doing so might be useful here, so
we should continue this conversation.

Lenz

-- 
SUSE Linux GmbH - Maxfeldstr. 5 - 90409 Nuernberg (Germany)
GF:Felix Imendörffer,Jane Smithard,Graham Norton,HRB 21284 (AG Nürnberg)



* Re: dashboard_v2 (was: dashboard in mimic)
From: Lenz Grimmer @ 2018-01-25 13:51 UTC
  To: Ceph Development

Hi,

Time to follow up on this...

On 12/23/2017 12:56 AM, Lenz Grimmer wrote:

> http://pad.ceph.com/p/mimic-dashboard
>
> Looking at that list, I think we've implemented several of those items
> in openATTIC/DeepSea already
> (https://www.openattic.org/features.html) and many of the other
> topics are on our TODO as well.
>
>> Extend the dashboard module to provide management of the cluster,
>> in addition to monitoring.  This would potentially include anything
>> you can currently do with the Ceph CLI, plus additional
>> functionality like calling out to a container framework to spawn
>> additional daemons.

[...]

> Thanks for kicking this off! I think our work on openATTIC and the
> experiences that we've gathered while doing so might be useful here,
> so we should continue this conversation.

We did have some initial conversations with Sage and John about this
in the meantime. After creating a prototype of the current dashboard
based on Angular, we proposed the following: we (the openATTIC team at
SUSE) will go ahead and start a new Manager module, currently dubbed
"dashboard_v2".

The code and architecture of this module are derived from and inspired by
the openATTIC Ceph management and monitoring tool (both the backend and
WebUI). The development is actively driven by the team behind openATTIC
and we aim to migrate as much of the existing openATTIC functionality as
possible (plus all the functionality currently provided by the existing
dashboard).

The groundwork is still in the early stages, but we opened a WIP pull
request so you can follow the ongoing development here:

  https://github.com/ceph/ceph/pull/20103

Please see the PR description and the README included in the module for
more background information and details.

At the moment, we have a basic backend framework (based on CherryPy) in
place, including a simple authentication mechanism based on username and
password (freely definable).
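
In CherryPy such a mechanism can stay very small -- the sketch below is
illustrative only (the real code lives in the PR above; the
check_password()/stored_hash_for() helpers are stand-ins, and CherryPy
sessions must be enabled via 'tools.sessions.on'):

    import cherrypy

    class Auth(object):
        # Minimal username/password login backed by a server-side
        # session.
        @cherrypy.expose
        def login(self, username, password):
            if not check_password(password, stored_hash_for(username)):
                raise cherrypy.HTTPError(401, 'Invalid credentials')
            cherrypy.session['user'] = username
            return 'OK'

        @cherrypy.expose
        def logout(self):
            cherrypy.session.pop('user', None)
            return 'OK'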

Next up is adding the initial WebUI scaffold, which will be based on
Angular. After this groundwork is done, we'll migrate the openATTIC and
dashboard modules one by one.

Our main development branch is located here:

  https://github.com/openattic/ceph/commits/wip-mgr-dashboard_v2

For the time being, pull requests for the module are managed on this fork:

  https://github.com/openattic/ceph/pulls

If you have any questions or would like to help, don't hesitate to get
in touch with us!

We also have two weekly conference calls where we discuss the progress
and open issues. If you're interested in joining, please contact John
Spray or myself for details.

Thanks!

Lenz

-- 
SUSE Linux GmbH - Maxfeldstr. 5 - 90409 Nuernberg (Germany)
GF:Felix Imendörffer,Jane Smithard,Graham Norton,HRB 21284 (AG Nürnberg)



