* [PATCH 00/21] ceph: Ceph distributed file system client v0.9
@ 2009-06-19 22:31 Sage Weil
  2009-06-19 22:31 ` [PATCH 01/21] fs: add fs/staging directory Sage Weil
                   ` (2 more replies)
  0 siblings, 3 replies; 36+ messages in thread
From: Sage Weil @ 2009-06-19 22:31 UTC
  To: linux-kernel, linux-fsdevel, greg; +Cc: Sage Weil

This is a patch series for v0.9 of the Ceph distributed file system
client (against v2.6.30).

Greg, the first patch in the series creates an fs/staging/ directory.
This is analogous to drivers/staging/ (not built by allyesconfig,
modpost will mark the module with 'staging', etc.), except you can
find it under the File Systems section (and it doesn't get hidden
along with drivers/ on UML).

If that looks reasonable, I would love to see this go into the staging
tree.  The remaining patches add Ceph at fs/staging/ceph.

Changes since v0.7 (the last lkml series):
 * Fixes to readdir (versus llseek())
 * Fixed problem with snapshots versus truncate()
 * Responds to memory pressure from the MDS, to avoid pinning
   too much memory on the server
 * CRUSH algorithm fixes, improvements
 * Protocol updates to match userspace
 * Bug fixes

The patchset is based on 2.6.30, and can be pulled from
    git://ceph.newdream.net/linux-ceph-client.git master

As always, questions, comments, and/or review are most welcome.

Thanks!
sage


---

Ceph is a distributed file system designed for reliability, scalability, 
and performance.  The storage system consists of some (potentially 
large) number of storage servers (bricks), a smaller set of metadata 
server daemons, and a few monitor daemons for managing cluster 
membership and state.  The storage daemons rely on btrfs for storing 
data (and take advantage of btrfs' internal transactions to keep the 
local data set in a consistent state).  This makes the storage cluster 
simple to deploy, while providing scalability not currently available 
from block-based Linux cluster file systems.

Additionally, Ceph brings a few new things to Linux.  Directory 
granularity snapshots allow users to create a read-only snapshot of any 
directory (and its nested contents) with 'mkdir .snap/my_snapshot' [1]. 
Deletion is similarly trivial ('rmdir .snap/old_snapshot').  Ceph also 
maintains recursive accounting statistics on the number of nested files, 
directories, and file sizes for each directory, making it much easier 
for an administrator to manage usage [2].
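
As a rough illustration of the snapshot interface described above, the
following user-space sketch wraps the two shell commands in C.  The
function names (snap_path, snap_create, snap_remove) and the 0755 mode
are assumptions made for this example; only the '.snap/<name>' naming
convention comes from the text above.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Build "<dir>/.snap/<name>" in buf.  Returns 0 on success, -1 if the
 * name is empty, contains a '/', or the result does not fit in buf.
 * (Hypothetical helper, not part of the Ceph client.) */
int snap_path(char *buf, size_t len, const char *dir, const char *name)
{
	int n;

	assert(buf && dir && name);
	if (name[0] == '\0' || strchr(name, '/'))
		return -1;
	n = snprintf(buf, len, "%s/.snap/%s", dir, name);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Same effect as: mkdir <dir>/.snap/<name> */
int snap_create(const char *dir, const char *name)
{
	char path[4096];

	if (snap_path(path, sizeof(path), dir, name) < 0) {
		errno = EINVAL;
		return -1;
	}
	return mkdir(path, 0755);	/* mode chosen arbitrarily here */
}

/* Same effect as: rmdir <dir>/.snap/<name> */
int snap_remove(const char *dir, const char *name)
{
	char path[4096];

	if (snap_path(path, sizeof(path), dir, name) < 0) {
		errno = EINVAL;
		return -1;
	}
	return rmdir(path);
}
```

On a mounted Ceph file system, snap_create("/some/dir", "my_snapshot")
would be expected to behave like the mkdir command quoted above; on any
other file system the mkdir() call simply fails because no .snap
directory exists.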

Basic features include:

 * Strong data and metadata consistency between clients
 * High availability and reliability.  No single points of failure.
 * N-way replication of all data across storage nodes
 * Scalability from 1 to potentially many thousands of nodes
 * Fast recovery from node failures
 * Automatic rebalancing of data on node addition/removal
 * Easy deployment: most FS components are userspace daemons

In contrast to cluster filesystems like GFS2 and OCFS2 that rely on 
symmetric access by all clients to shared block devices, Ceph separates 
data and metadata management into independent server clusters, similar 
to Lustre.  Unlike Lustre, however, metadata and storage nodes run 
entirely as user space daemons.  The storage daemon utilizes btrfs to 
store data objects, leveraging its advanced features (transactions, 
checksumming, metadata replication, etc.).  File data is striped across 
storage nodes in large chunks to distribute workload and facilitate high 
throughputs. When storage nodes fail, data is re-replicated in a 
distributed fashion by the storage nodes themselves (with some minimal 
coordination from the cluster monitor), making the system extremely 
efficient and scalable.
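
To make the striping idea concrete, here is a deliberately simplified
sketch that maps a byte offset in a file to an object number and an
offset within that object.  It assumes a single fixed object size and
nothing else; the struct and function names are hypothetical, and the
layout actually used by the client may involve more parameters than
this.

```c
#include <assert.h>
#include <stdint.h>

/* Where a given file byte lives (illustrative type, not from the patches). */
struct chunk_loc {
	uint64_t object_no;	/* index of the object holding the byte */
	uint64_t object_off;	/* byte offset inside that object */
};

/* Map a file offset to an object, assuming the file is cut into
 * consecutive objects of object_size bytes each. */
struct chunk_loc locate_chunk(uint64_t file_off, uint64_t object_size)
{
	struct chunk_loc loc;

	assert(object_size > 0);
	loc.object_no = file_off / object_size;
	loc.object_off = file_off % object_size;
	return loc;
}
```

With an assumed object size of 4 MiB, byte 4194314 of a file falls in
object 1 at offset 10.  Placing each object on a storage node is a
separate step that this sketch does not attempt.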

Metadata servers effectively form a large, consistent, distributed
in-memory cache above the storage cluster that is scalable,
dynamically redistributes metadata in response to workload changes,
and can tolerate arbitrary (well, non-Byzantine) node failures.  The
metadata server embeds inodes with only a single link inside the
directories that contain them, allowing entire directories of dentries
and inodes to be loaded into its cache with a single I/O operation.
Hard links are supported via an auxiliary table facilitating inode
lookup by number.  The contents of large directories can be fragmented
and managed by independent metadata servers, allowing scalable
concurrent access.
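
The embedded-inode scheme can be pictured with a small data-structure
sketch.  Everything below (type names, fields, the linear table scan)
is invented for illustration and is not taken from the metadata server
source; it only shows the idea that a directory entry either carries
its inode inline or refers to it by number through an auxiliary table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct inode_rec {
	uint64_t ino;
	uint32_t nlink;
	uint64_t size;
};

/* A directory entry either embeds its inode or names it by number. */
struct dentry_rec {
	const char *name;
	int embedded;			/* 1: inode stored inline in this entry */
	struct inode_rec inode;		/* valid when embedded != 0 */
	uint64_t remote_ino;		/* valid when embedded == 0 (hard link) */
};

/* Auxiliary table entry: inode number -> entry holding the inline copy. */
struct anchor_rec {
	uint64_t ino;
	const struct dentry_rec *primary;
};

/* Return the inode behind a directory entry, or NULL if a hard link
 * cannot be resolved through the table. */
const struct inode_rec *resolve_inode(const struct dentry_rec *d,
				      const struct anchor_rec *tab, size_t n)
{
	assert(d != NULL);
	if (d->embedded)
		return &d->inode;
	for (size_t i = 0; i < n; i++)
		if (tab[i].ino == d->remote_ino)
			return &tab[i].primary->inode;
	return NULL;
}
```

In this picture the common case (one link per inode) needs no table
lookup at all, which is the property that lets a whole directory be
loaded in one read.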

The system offers automatic data rebalancing/migration when scaling from 
a small cluster of just a few nodes to many hundreds, without requiring 
an administrator to carve the data set into static volumes or go through 
the tedious process of migrating data between servers.  When the file 
system approaches full, new storage nodes can be easily added and things 
will "just work."

A git tree containing just the client (and this patch series) is at
	git://ceph.newdream.net/linux-ceph-client.git

A standalone tree with just the client kernel module is at
	git://ceph.newdream.net/ceph-client.git

The source for the full system is at
	git://ceph.newdream.net/ceph.git

The corresponding user space daemons need to be built in order to test
it.  Instructions for getting a test setup running are at
        http://ceph.newdream.net/wiki/

Debian packages are available from
	http://ceph.newdream.net/debian

The Ceph home page is at
	http://ceph.newdream.net

[1] Snapshots
        http://marc.info/?l=linux-fsdevel&m=122341525709480&w=2
[2] Recursive accounting
        http://marc.info/?l=linux-fsdevel&m=121614651204667&w=2

---
 Documentation/filesystems/ceph.txt |  175 +++
 fs/Kconfig                         |    2 +
 fs/Makefile                        |    1 +
 fs/staging/Kconfig                 |   48 +
 fs/staging/Makefile                |    6 +
 fs/staging/ceph/Kconfig            |   14 +
 fs/staging/ceph/Makefile           |   35 +
 fs/staging/ceph/addr.c             | 1101 +++++++++++++++
 fs/staging/ceph/caps.c             | 2499 +++++++++++++++++++++++++++++++++
 fs/staging/ceph/ceph_debug.h       |   86 ++
 fs/staging/ceph/ceph_fs.h          |  913 ++++++++++++
 fs/staging/ceph/ceph_ver.h         |    6 +
 fs/staging/ceph/crush/crush.c      |  140 ++
 fs/staging/ceph/crush/crush.h      |  188 +++
 fs/staging/ceph/crush/hash.h       |   90 ++
 fs/staging/ceph/crush/mapper.c     |  597 ++++++++
 fs/staging/ceph/crush/mapper.h     |   19 +
 fs/staging/ceph/debugfs.c          |  607 ++++++++
 fs/staging/ceph/decode.h           |  151 ++
 fs/staging/ceph/dir.c              | 1129 +++++++++++++++
 fs/staging/ceph/export.c           |  156 +++
 fs/staging/ceph/file.c             |  794 +++++++++++
 fs/staging/ceph/inode.c            | 2356 +++++++++++++++++++++++++++++++
 fs/staging/ceph/ioctl.c            |   65 +
 fs/staging/ceph/ioctl.h            |   12 +
 fs/staging/ceph/mds_client.c       | 2694 ++++++++++++++++++++++++++++++++++++
 fs/staging/ceph/mds_client.h       |  347 +++++
 fs/staging/ceph/mdsmap.c           |  132 ++
 fs/staging/ceph/mdsmap.h           |   45 +
 fs/staging/ceph/messenger.c        | 2394 ++++++++++++++++++++++++++++++++
 fs/staging/ceph/messenger.h        |  273 ++++
 fs/staging/ceph/mon_client.c       |  451 ++++++
 fs/staging/ceph/mon_client.h       |  135 ++
 fs/staging/ceph/msgr.h             |  155 +++
 fs/staging/ceph/osd_client.c       |  987 +++++++++++++
 fs/staging/ceph/osd_client.h       |  151 ++
 fs/staging/ceph/osdmap.c           |  703 ++++++++++
 fs/staging/ceph/osdmap.h           |   83 ++
 fs/staging/ceph/rados.h            |  398 ++++++
 fs/staging/ceph/snap.c             |  895 ++++++++++++
 fs/staging/ceph/super.c            | 1200 ++++++++++++++++
 fs/staging/ceph/super.h            |  946 +++++++++++++
 fs/staging/ceph/types.h            |   27 +
 fs/staging/fsstaging.c             |   19 +
 scripts/mod/modpost.c              |    4 +-
 45 files changed, 23228 insertions(+), 1 deletions(-)

* [PATCH 00/21] ceph distributed file system client
@ 2009-09-22 17:38 Sage Weil
  2009-09-22 17:38 ` [PATCH 01/21] ceph: documentation Sage Weil
  0 siblings, 1 reply; 36+ messages in thread
From: Sage Weil @ 2009-09-22 17:38 UTC
  To: linux-fsdevel, linux-kernel, akpm; +Cc: yehuda, Sage Weil

Hi,

This is v0.15 of the Ceph distributed file system client.  Changes since
v0.14:

 - checkpatch, sparse cleanups
 - ioctl number documented
 - some message api simplifications, avoiding more memory allocations
 - message pools to avoid additional ENOMEM situations
 - new ioctl to determine object name and location/address for given file offset
 - osd failure handling bug fix
 - debugfs cleanups
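
For readers unfamiliar with the message pool idea in the list above,
the following user-space sketch shows the general technique: allocate a
fixed number of messages up front so that later requests do not depend
on the allocator succeeding.  It is not the fs/ceph/msgpool.c
interface; all names are hypothetical, and the sketch is
single-threaded with no locking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct msg {
	struct msg *next;	/* free-list link, used only while pooled */
	size_t len;
	unsigned char *data;
};

struct msgpool {
	struct msg *free;	/* singly linked list of idle messages */
};

/* Release every message currently sitting in the pool. */
void msgpool_destroy(struct msgpool *p)
{
	while (p->free) {
		struct msg *m = p->free;

		p->free = m->next;
		free(m->data);
		free(m);
	}
}

/* Preallocate count messages of len bytes each; all or nothing. */
int msgpool_init(struct msgpool *p, int count, size_t len)
{
	assert(p && len > 0);
	p->free = NULL;
	for (int i = 0; i < count; i++) {
		struct msg *m = malloc(sizeof(*m));

		if (m)
			m->data = malloc(len);
		if (!m || !m->data) {
			free(m);
			msgpool_destroy(p);
			return -1;
		}
		m->len = len;
		m->next = p->free;
		p->free = m;
	}
	return 0;
}

/* Take a message from the pool without allocating; NULL if empty. */
struct msg *msgpool_get(struct msgpool *p)
{
	struct msg *m = p->free;

	if (m)
		p->free = m->next;
	return m;
}

/* Hand a message back so that it can be reused. */
void msgpool_put(struct msgpool *p, struct msg *m)
{
	m->next = p->free;
	p->free = m;
}
```

The only allocations happen in msgpool_init(), so a failure shows up
once at setup time rather than in the middle of handling a reply.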

I've pretty much run out of substantive feedback to address with
this code.  There are a few more memory preallocation issues I am
continuing to look at, but I don't think they are show stoppers.  The
code has been running on my test cluster for the last week without
problems, and would greatly benefit from broader testing.

Any additional review or suggestions for how to get this merged would
be much appreciated.

Thanks-
sage

Kernel client git tree:
        git://ceph.newdream.net/linux-ceph-client.git

System:
	git://ceph.newdream.net/ceph.git

---
 Documentation/filesystems/ceph.txt   |  140 ++
 Documentation/ioctl/ioctl-number.txt |    1 +
 fs/Kconfig                           |    1 +
 fs/Makefile                          |    1 +
 fs/ceph/Kconfig                      |   26 +
 fs/ceph/Makefile                     |   35 +
 fs/ceph/addr.c                       | 1117 +++++++++++++
 fs/ceph/buffer.h                     |   83 +
 fs/ceph/caps.c                       | 2800 ++++++++++++++++++++++++++++++++
 fs/ceph/ceph_debug.h                 |   35 +
 fs/ceph/ceph_fs.h                    |  937 +++++++++++
 fs/ceph/ceph_ver.h                   |    6 +
 fs/ceph/crush/crush.c                |  140 ++
 fs/ceph/crush/crush.h                |  188 +++
 fs/ceph/crush/hash.h                 |   90 ++
 fs/ceph/crush/mapper.c               |  589 +++++++
 fs/ceph/crush/mapper.h               |   20 +
 fs/ceph/debugfs.c                    |  430 +++++
 fs/ceph/decode.h                     |  136 ++
 fs/ceph/dir.c                        | 1175 ++++++++++++++
 fs/ceph/export.c                     |  222 +++
 fs/ceph/file.c                       |  902 +++++++++++
 fs/ceph/inode.c                      | 2404 ++++++++++++++++++++++++++++
 fs/ceph/ioctl.c                      |  157 ++
 fs/ceph/ioctl.h                      |   39 +
 fs/ceph/mds_client.c                 | 2915 ++++++++++++++++++++++++++++++++++
 fs/ceph/mds_client.h                 |  321 ++++
 fs/ceph/mdsmap.c                     |  139 ++
 fs/ceph/mdsmap.h                     |   47 +
 fs/ceph/messenger.c                  | 1868 ++++++++++++++++++++++
 fs/ceph/messenger.h                  |  255 +++
 fs/ceph/mon_client.c                 |  694 ++++++++
 fs/ceph/mon_client.h                 |  109 ++
 fs/ceph/msgpool.c                    |  167 ++
 fs/ceph/msgpool.h                    |   26 +
 fs/ceph/msgr.h                       |  157 ++
 fs/ceph/osd_client.c                 | 1292 +++++++++++++++
 fs/ceph/osd_client.h                 |  144 ++
 fs/ceph/osdmap.c                     |  872 ++++++++++
 fs/ceph/osdmap.h                     |   94 ++
 fs/ceph/rados.h                      |  426 +++++
 fs/ceph/snap.c                       |  897 +++++++++++
 fs/ceph/super.c                      | 1015 ++++++++++++
 fs/ceph/super.h                      |  945 +++++++++++
 fs/ceph/types.h                      |   27 +
 45 files changed, 24084 insertions(+), 0 deletions(-)

* [PATCH 00/21] ceph distributed file system client
@ 2009-10-05 22:50 Sage Weil
  2009-10-05 22:50 ` [PATCH 01/21] ceph: documentation Sage Weil
  0 siblings, 1 reply; 36+ messages in thread
From: Sage Weil @ 2009-10-05 22:50 UTC
  To: linux-fsdevel, linux-kernel; +Cc: yehuda, Sage Weil

Hi,

This is v0.16 of the Ceph distributed file system client.  This version 
addresses comments from Andrew and Andi, and fixes a few bugs.  Changes 
since v0.15 include:

 - corrected much inline abuse
 - marked init only methods with __init
 - use KMEM_CACHE where possible
 - use sockaddr_storage for on-wire types (for eventual ipv6 support)
 - slightly improved ceph_buffer use of vmalloc
 - use pr_fmt
 - use smp_mb instead of spinlock for ceph_i_test
 - xattr cleanups
 - fix invalidate bug
 - fix msgr queue accounting bug
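
To illustrate why sockaddr_storage helps with eventual IPv6 support,
here is a small user-space sketch.  The wire_addr type and
wire_addr_set() are hypothetical and do not reflect the structures in
the patches; the point is only that one fixed-size field can hold
either an IPv4 or an IPv6 address.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical address record; not the layout used in fs/ceph. */
struct wire_addr {
	struct sockaddr_storage ss;	/* fits sockaddr_in and sockaddr_in6 */
};

/* Fill in a from a textual IPv4 or IPv6 address and a port.
 * Returns 0 on success, -1 if ip is not a valid address. */
int wire_addr_set(struct wire_addr *a, const char *ip, unsigned short port)
{
	struct in_addr a4;
	struct in6_addr a6;

	assert(a && ip);
	memset(a, 0, sizeof(*a));
	if (inet_pton(AF_INET, ip, &a4) == 1) {
		struct sockaddr_in *v4 = (struct sockaddr_in *)&a->ss;

		v4->sin_family = AF_INET;
		v4->sin_port = htons(port);
		v4->sin_addr = a4;
		return 0;
	}
	if (inet_pton(AF_INET6, ip, &a6) == 1) {
		struct sockaddr_in6 *v6 = (struct sockaddr_in6 *)&a->ss;

		v6->sin6_family = AF_INET6;
		v6->sin6_port = htons(port);
		v6->sin6_addr = a6;
		return 0;
	}
	return -1;
}
```

Because the record has the same size for both families, code that
copies or compares addresses does not need to change when IPv6 is
added.  A real on-wire format would also have to pin down the byte
order and encoding of the family field, which this sketch leaves in
host form.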

Unless anyone sees any major problems here, I plan to send this to 
Stephen shortly for inclusion in linux-next, and will ask Linus to pull 
during the .33 window.

Thank you everyone who has taken the time for review so far!

sage


Kernel client git tree:
        git://ceph.newdream.net/linux-ceph-client.git

System:
	git://ceph.newdream.net/ceph.git

---
 Documentation/filesystems/ceph.txt   |  139 ++
 Documentation/ioctl/ioctl-number.txt |    1 +
 MAINTAINERS                          |    9 +
 fs/Kconfig                           |    1 +
 fs/Makefile                          |    1 +
 fs/ceph/Kconfig                      |   26 +
 fs/ceph/Makefile                     |   36 +
 fs/ceph/addr.c                       | 1115 +++++++++++++
 fs/ceph/buffer.c                     |   34 +
 fs/ceph/buffer.h                     |   55 +
 fs/ceph/caps.c                       | 2830 +++++++++++++++++++++++++++++++++
 fs/ceph/ceph_debug.h                 |   37 +
 fs/ceph/ceph_frag.c                  |   21 +
 fs/ceph/ceph_frag.h                  |  109 ++
 fs/ceph/ceph_fs.c                    |   80 +
 fs/ceph/ceph_fs.h                    |  629 ++++++++
 fs/ceph/ceph_strings.c               |  163 ++
 fs/ceph/ceph_ver.h                   |    6 +
 fs/ceph/crush/crush.c                |  140 ++
 fs/ceph/crush/crush.h                |  188 +++
 fs/ceph/crush/hash.h                 |   90 ++
 fs/ceph/crush/mapper.c               |  589 +++++++
 fs/ceph/crush/mapper.h               |   20 +
 fs/ceph/debugfs.c                    |  425 +++++
 fs/ceph/decode.h                     |  136 ++
 fs/ceph/dir.c                        | 1212 ++++++++++++++
 fs/ceph/export.c                     |  223 +++
 fs/ceph/file.c                       |  904 +++++++++++
 fs/ceph/inode.c                      | 1620 +++++++++++++++++++
 fs/ceph/ioctl.c                      |  157 ++
 fs/ceph/ioctl.h                      |   39 +
 fs/ceph/mds_client.c                 | 2912 ++++++++++++++++++++++++++++++++++
 fs/ceph/mds_client.h                 |  321 ++++
 fs/ceph/mdsmap.c                     |  166 ++
 fs/ceph/mdsmap.h                     |   53 +
 fs/ceph/messenger.c                  | 2019 +++++++++++++++++++++++
 fs/ceph/messenger.h                  |  243 +++
 fs/ceph/mon_client.c                 |  694 ++++++++
 fs/ceph/mon_client.h                 |  109 ++
 fs/ceph/msgpool.c                    |  167 ++
 fs/ceph/msgpool.h                    |   26 +
 fs/ceph/msgr.h                       |  157 ++
 fs/ceph/osd_client.c                 | 1294 +++++++++++++++
 fs/ceph/osd_client.h                 |  144 ++
 fs/ceph/osdmap.c                     |  875 ++++++++++
 fs/ceph/osdmap.h                     |  123 ++
 fs/ceph/rados.h                      |  372 +++++
 fs/ceph/snap.c                       |  897 +++++++++++
 fs/ceph/super.c                      |  936 +++++++++++
 fs/ceph/super.h                      |  890 +++++++++++
 fs/ceph/types.h                      |   28 +
 fs/ceph/xattr.c                      |  833 ++++++++++
 52 files changed, 24294 insertions(+), 0 deletions(-)


Thread overview: 36+ messages
2009-06-19 22:31 [PATCH 00/21] ceph: Ceph distributed file system client v0.9 Sage Weil
2009-06-19 22:31 ` [PATCH 01/21] fs: add fs/staging directory Sage Weil
2009-06-19 22:31   ` [PATCH 02/21] ceph: documentation Sage Weil
2009-06-19 22:31     ` [PATCH 03/21] ceph: on-wire types Sage Weil
2009-06-19 22:31       ` [PATCH 04/21] ceph: client types Sage Weil
2009-06-19 22:31         ` [PATCH 05/21] ceph: super.c Sage Weil
2009-06-19 22:31           ` [PATCH 06/21] ceph: inode operations Sage Weil
2009-06-19 22:31             ` [PATCH 07/21] ceph: directory operations Sage Weil
2009-06-19 22:31               ` [PATCH 08/21] ceph: file operations Sage Weil
2009-06-19 22:31                 ` [PATCH 09/21] ceph: address space operations Sage Weil
2009-06-19 22:31                   ` [PATCH 10/21] ceph: MDS client Sage Weil
2009-06-19 22:31                     ` [PATCH 11/21] ceph: OSD client Sage Weil
2009-06-19 22:31                       ` [PATCH 12/21] ceph: CRUSH mapping algorithm Sage Weil
2009-06-19 22:31                         ` [PATCH 13/21] ceph: monitor client Sage Weil
2009-06-19 22:31                           ` [PATCH 14/21] ceph: capability management Sage Weil
2009-06-19 22:31                             ` [PATCH 15/21] ceph: snapshot management Sage Weil
2009-06-19 22:31                               ` [PATCH 16/21] ceph: messenger library Sage Weil
2009-06-19 22:31                                 ` [PATCH 17/21] ceph: nfs re-export support Sage Weil
2009-06-19 22:31                                   ` [PATCH 18/21] ceph: ioctls Sage Weil
2009-06-19 22:31                                     ` [PATCH 19/21] ceph: debugging Sage Weil
2009-06-19 22:31                                       ` [PATCH 20/21] ceph: debugfs Sage Weil
2009-06-19 22:31                                         ` [PATCH 21/21] ceph: Kconfig, Makefile Sage Weil
2009-06-20  9:12                                   ` [PATCH 17/21] ceph: nfs re-export support Stefan Richter
2009-06-20  9:12                                     ` Stefan Richter
2009-06-20 20:39                                     ` Sage Weil
2009-06-20 21:22                                       ` Stefan Richter
2009-06-20 21:22                                         ` Stefan Richter
2009-06-19 22:44 ` [PATCH 00/21] ceph: Ceph distributed file system client v0.9 Greg KH
2009-06-19 23:15   ` Sage Weil
2009-06-19 23:20     ` Greg KH
2009-06-19 22:45 ` Greg KH
2009-06-19 22:54   ` Stephen Rothwell
2009-06-19 23:12   ` Sage Weil
2009-06-19 23:19     ` Greg KH
2009-09-22 17:38 [PATCH 00/21] ceph distributed file system client Sage Weil
2009-09-22 17:38 ` [PATCH 01/21] ceph: documentation Sage Weil
2009-09-22 17:38   ` [PATCH 02/21] ceph: on-wire types Sage Weil
2009-09-22 17:38     ` [PATCH 03/21] ceph: client types Sage Weil
2009-09-22 17:38       ` [PATCH 04/21] ceph: ref counted buffer Sage Weil
2009-09-22 17:38         ` [PATCH 05/21] ceph: super.c Sage Weil
2009-09-22 17:38           ` [PATCH 06/21] ceph: inode operations Sage Weil
2009-09-22 17:38             ` [PATCH 07/21] ceph: directory operations Sage Weil
2009-09-22 17:38               ` [PATCH 08/21] ceph: file operations Sage Weil
2009-10-05 22:50 [PATCH 00/21] ceph distributed file system client Sage Weil
2009-10-05 22:50 ` [PATCH 01/21] ceph: documentation Sage Weil
2009-10-05 22:50   ` [PATCH 02/21] ceph: on-wire types Sage Weil
2009-10-05 22:50     ` [PATCH 03/21] ceph: client types Sage Weil
2009-10-05 22:50       ` [PATCH 04/21] ceph: ref counted buffer Sage Weil
2009-10-05 22:50         ` [PATCH 05/21] ceph: super.c Sage Weil
2009-10-05 22:50           ` [PATCH 06/21] ceph: inode operations Sage Weil
2009-10-05 22:50             ` [PATCH 07/21] ceph: directory operations Sage Weil
2009-10-05 22:50               ` [PATCH 08/21] ceph: file operations Sage Weil
