* Best insertion point for storage shim
@ 2012-08-24 15:49 Stephen Perkins
  2012-08-24 16:28 ` Tommi Virtanen
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Stephen Perkins @ 2012-08-24 15:49 UTC (permalink / raw)
  To: ceph-devel

Hi all,

I'd like to get feedback from folks as to where the best place would be to
insert a "shim" into the RADOS object storage.

Currently, you can configure RADOS to use copy based storage to store
redundant copies of a file (I like 3 redundant copies so I will use that as
an example).  So... each file is stored in three locations on independent
hardware.   The redundancy has a cost of 3x the storage.

I would assume that it is "possible" to configure RADOS to store only 1 copy
of a file (bear with me here).

I'd like to see where it may be possible to insert a "shim" in the storage
such that I can take the file to be stored and apply some erasure coding to
it. Therefore, the file now becomes multiple files that are handed off to
RADOS.  

The shim would also have to take read file requests and read some small
portion of the fragments and recombine.
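
Roughly, here is the kind of thing I imagine the shim doing, sketched with a
trivial single-parity code just to make the idea concrete (a real shim would
use Reed-Solomon, e.g. zfec as Tahoe-LAFS does, and would hand each fragment
off to RADOS rather than keep it in memory):

    # Minimal sketch: split an object into K data fragments plus one
    # XOR parity fragment, so any single lost fragment can be rebuilt.
    K = 3

    def encode(data, k=K):
        """Return k data fragments plus one parity fragment, and the pad length."""
        pad = (-len(data)) % k
        data += b"\0" * pad
        size = len(data) // k
        frags = [data[i * size:(i + 1) * size] for i in range(k)]
        parity = bytearray(size)
        for frag in frags:
            for i, byte in enumerate(frag):
                parity[i] ^= byte
        return frags + [bytes(parity)], pad

    def decode(frags, pad, k=K):
        """Rebuild the original data from k+1 fragments, at most one of them None."""
        missing = [i for i, f in enumerate(frags) if f is None]
        if missing:
            size = len(next(f for f in frags if f is not None))
            rebuilt = bytearray(size)
            for f in frags:
                if f is not None:
                    for i, byte in enumerate(f):
                        rebuilt[i] ^= byte
            frags[missing[0]] = bytes(rebuilt)
        data = b"".join(frags[:k])
        return data[:len(data) - pad] if pad else data

    frags, pad = encode(b"hello rados, this is a test payload")
    frags[1] = None                 # simulate losing one fragment
    assert decode(frags, pad) == b"hello rados, this is a test payload"

The real write path would store each fragment as its own object (or in its
own pool), and the read path would only need to fetch some subset of them.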

Basically... what I am asking is...  where would be the best place to start
looking at adding this:
	https://tahoe-lafs.org/trac/tahoe-lafs#
	
(just the erasure coded part).

Here is the real rationale.  Extreme availability at only 1.3x or 1.6x
redundancy:

	http://www.zdnet.com/videos/whiteboard/dispersed-storage/156114

Thoughts appreciated,

- Steve

P.S. Yes... I posted on this earlier.  Microsoft Azure storage takes this
approach, lazily erasure-coding inactive files to significantly reduce
storage costs while increasing reliability.




* Re: Best insertion point for storage shim
  2012-08-24 15:49 Best insertion point for storage shim Stephen Perkins
@ 2012-08-24 16:28 ` Tommi Virtanen
  2012-08-24 16:42 ` Atchley, Scott
  2012-08-24 16:42 ` Sage Weil
  2 siblings, 0 replies; 8+ messages in thread
From: Tommi Virtanen @ 2012-08-24 16:28 UTC (permalink / raw)
  To: Stephen Perkins; +Cc: ceph-devel

On Fri, Aug 24, 2012 at 8:49 AM, Stephen Perkins <perkins@netmass.com> wrote:
> I'd like to get feedback from folks as to where the best place would be to
> insert a "shim" into the RADOS object storage.
...
> I would assume that it is "possible" to configure RADOS to store only 1 copy
> of a file (bear with me here).

RADOS stores objects, not files.

The Ceph File System stores file metadata in MDS journals and
directory objects, and stripes file data over several objects.

> I'd like to see where it may be possible to insert a "shim" in the storage
> such that I can take the file to be stored and apply some erasure coding to
> it. Therefore, the file now becomes multiple files that are handed off to
> RADOS.
>
> The shim would also have to take read file requests and read some small
> portion of the fragments and recombine.

Sounds like you want a client library on top of RADOS.

Getting that even remotely performant is going to be a huge
engineering challenge. Zooko will tell you that Tahoe-LAFS ain't a
filesystem, largely for that reason (and if you do run into him, say
hi from Tv!).

Good luck!


* Re: Best insertion point for storage shim
  2012-08-24 15:49 Best insertion point for storage shim Stephen Perkins
  2012-08-24 16:28 ` Tommi Virtanen
@ 2012-08-24 16:42 ` Atchley, Scott
  2012-08-24 16:42 ` Sage Weil
  2 siblings, 0 replies; 8+ messages in thread
From: Atchley, Scott @ 2012-08-24 16:42 UTC (permalink / raw)
  To: Stephen Perkins; +Cc: ceph-devel

On Aug 24, 2012, at 11:49 AM, Stephen Perkins wrote:

> Hi all,
> 
> I'd like to get feedback from folks as to where the best place would be to
> insert a "shim" into the RADOS object storage.
> 
> Currently, you can configure RADOS to use copy based storage to store
> redundant copies of a file (I like 3 redundant copies so I will use that as
> an example).  So... each file is stored in three locations on independent
> hardware.   The redundancy has a cost of 3x the storage.
> 
> I would assume that it is "possible" to configure RADOS to store only 1 copy
> of a file (bear with me here).
> 
> I'd like to see where it may be possible to insert a "shim" in the storage
> such that I can take the file to be stored and apply some erasure coding to
> it. Therefore, the file now becomes multiple files that are handed off to
> RADOS.
> 
> The shim would also have to take read file requests and read some small
> portion of the fragments and recombine.

This sounds more like a modification to the POSIX file system interface than to the RADOS object store, which knows nothing of files.

> Basically... what I am asking is...  where would be the best place to start
> looking at adding this:
> 	https://tahoe-lafs.org/trac/tahoe-lafs#
> 	
> (just the erasure coded part).
> 
> Here is the real rationale.  Extreme availability at only 1.3 or 1.6 time
> redundancy:
> 
> 	http://www.zdnet.com/videos/whiteboard/dispersed-storage/156114

The "extreme" reliability is a bit oversold. I worked on a project a decade ago that stored blocks of files over servers scattered around the globe. Each block was checksummed and optionally encrypted (they were not our servers, so we did not assume that we could trust the admins). To handle reliability, we implemented both replication (copies) and error coding (Reed-Solomon based erasure coding). There is a trade-off between the two.

Copies are nice since they do not require extra computation, and they can be handled between servers so that the client only has to store the data once (which is what the Ceph file system does). Copies also let you load-balance reads over more servers and increase read throughput (Ceph does not do this explicitly, but since copies are pseudo-randomly placed, they _should_ load-balance on average). With good CRUSH rules, they also provide better fault tolerance (e.g. if a rack goes down, pull from a copy on another rack). With N copies you can tolerate at most N-1 failures, and your total usable storage is 1/Nth of the raw capacity.

Error coding allows you to tolerate a greater number of failures at the expense of computation and memory usage. When using error coding, you break up a file into blocks (as mentioned in the video). For each set of M data blocks (the coding block set size), you create one or more (N) coding blocks. In the video example, 1.3 corresponds to one coding block per three data blocks (N=1, M=3): the original data can be recomputed from any 3 of the M + N = 4 blocks, so you can tolerate losing any one of them. A level of 1.6 is simply two coding blocks per three data blocks, which can survive losing any two blocks. Using three coding blocks per three data blocks (not mentioned in the video) lets you survive three failures at the cost of half the raw capacity, which is clearly a win over simple replication.
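
To make the arithmetic concrete, here is a tiny illustration of those
trade-offs (my own numbers, nothing Ceph-specific):

    def overhead(m, n):
        """Raw-to-usable storage ratio for m data blocks plus n coding blocks."""
        return float(m + n) / m

    # (m data, n coding) -> storage overhead and failures tolerated
    for m, n in [(3, 1), (3, 2), (3, 3)]:
        print("m=%d n=%d: %.2fx raw storage, survives %d lost blocks"
              % (m, n, overhead(m, n), n))

    # Plain replication for comparison: N copies cost Nx raw storage
    # and survive N-1 lost copies.
    for copies in (2, 3):
        print("%d copies: %dx raw storage, survives %d lost copies"
              % (copies, copies, copies - 1))

The (3, 1) and (3, 2) rows are the 1.3 and 1.6 figures from the video.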

The downside is that calculating the erasure coding is not cheap, and it requires an extra block's worth of memory until it is complete. It is best to implement the coding at the client, since the client has all the data; servers do not, and would have to copy the data to whichever server performs the computation. It is possible to pipeline the storing of blocks and hopefully mask this cost, but it adds to the processing requirements for normal usage (not to mention when handling failures). Also, if you need to read a block that is not available, you are no longer reading one block (e.g. 4 MB) but the whole coding set (M blocks of 4 MB each), which increases the network traffic M times.
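
With the numbers above (4 MB blocks, M = 3), the degraded-read penalty
looks like this:

    block_mb = 4                      # size of one block
    m = 3                             # data blocks per coding set
    normal_read = block_mb            # healthy read: just the block you want
    degraded_read = m * block_mb      # lost block: fetch m surviving blocks and recompute
    print(normal_read, degraded_read) # 4 MB vs 12 MB -- 3x the network traffic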

Erasure coding is no magic bullet. It has its uses, but it is complicated and it increases the computing resources required.

Scott




* Re: Best insertion point for storage shim
  2012-08-24 15:49 Best insertion point for storage shim Stephen Perkins
  2012-08-24 16:28 ` Tommi Virtanen
  2012-08-24 16:42 ` Atchley, Scott
@ 2012-08-24 16:42 ` Sage Weil
  2012-08-31 14:37   ` Stephen Perkins
  2 siblings, 1 reply; 8+ messages in thread
From: Sage Weil @ 2012-08-24 16:42 UTC (permalink / raw)
  To: Stephen Perkins; +Cc: ceph-devel

On Fri, 24 Aug 2012, Stephen Perkins wrote:
> Hi all,
> 
> I'd like to get feedback from folks as to where the best place would be to
> insert a "shim" into the RADOS object storage.
> 
> Currently, you can configure RADOS to use copy based storage to store
> redundant copies of a file (I like 3 redundant copies so I will use that as
> an example).  So... each file is stored in three locations on independent
> hardware.   The redundancy has a cost of 3x the storage.
> 
> I would assume that it is "possible" to configure RADOS to store only 1 copy
> of a file (bear with me here).
> 
> I'd like to see where it may be possible to insert a "shim" in the storage
> such that I can take the file to be stored and apply some erasure coding to
> it. Therefore, the file now becomes multiple files that are handed off to
> RADOS.  
> 
> The shim would also have to take read file requests and read some small
> portion of the fragments and recombine.
> 
> Basically... what I am asking is...  where would be the best place to start
> looking at adding this:
> 	https://tahoe-lafs.org/trac/tahoe-lafs#
> 	
> (just the erasure coded part).
> 
> Here is the real rationale.  Extreme availability at only 1.3 or 1.6 time
> redundancy:
> 
> 	http://www.zdnet.com/videos/whiteboard/dispersed-storage/156114
> 
> Thoughts appreciated,

The good news is that CRUSH has a mode that is intended for erasure/parity 
coding, and there are fields reserved in many ceph structures to support 
this type of thing.  The bad news is that in order to make it work it 
needs to live inside of rados, not on top of it.  The reason is that you 
need to separate the fragments across devices/failure domains/etc, which 
happens at the PG level; users of librados have no control over that 
(objects are randomly hashed into PGs, and then PGs are mapped to 
devices).

And in order to implement it properly, a lot of code shuffling and wading 
through OSD internals will be necessary.  There are some basic 
abstractions in place, but they are largely ignored and need to be shifted 
around because replication has been the only implementation for some time 
now.

I think the only way to layer this on top of rados and align your 
fragments with failure domains would be to create N different pools with 
distinct devices, and store one fragment in each pool...
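
Very roughly, from a librados client that layering would look something
like the sketch below (the frag-pool-* names are hypothetical, and each
pool would need its own CRUSH rule pinning it to a distinct set of
devices/failure domains):

    import rados

    K = 4   # fragments per object, one per pool

    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
    cluster.connect()

    # Assumed to already exist, each mapped to disjoint failure domains
    # by its CRUSH rule.
    ioctxs = [cluster.open_ioctx("frag-pool-%d" % i) for i in range(K)]

    def put_fragments(name, fragments):
        """Store one erasure-coded fragment of `name` in each pool."""
        for ioctx, frag in zip(ioctxs, fragments):
            ioctx.write_full(name, frag)

    def get_fragments(name, frag_size):
        """Read back whichever fragments survive; the decoder fills the gaps."""
        frags = []
        for ioctx in ioctxs:
            try:
                frags.append(ioctx.read(name, length=frag_size))
            except rados.ObjectNotFound:
                frags.append(None)
        return frags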

sage


* RE: Best insertion point for storage shim
  2012-08-24 16:42 ` Sage Weil
@ 2012-08-31 14:37   ` Stephen Perkins
  2012-08-31 15:15     ` Tommi Virtanen
  0 siblings, 1 reply; 8+ messages in thread
From: Stephen Perkins @ 2012-08-31 14:37 UTC (permalink / raw)
  To: 'Sage Weil'; +Cc: ceph-devel

Hi all,

Excellent points all.  Many thanks for the clarification. 

*Tommi* - Yep... I wrote file but had object in mind.  However... now that
you bring up the distinction... I may actually mean file (see below).  I
don't know Zooko personally, but will definitely pass that along if I meet
him!

*Scott* - Agreed.

As to the performance... I am also in agreement.  My thoughts were to make
the operation lazy.  By this, I mean that items that are not likely to
change much (think archive items) could be converted from N copies to an
erasure-coded equivalent.  The lazy piece would help reduce the processing
overhead.  The encoded items could also be decoded back to N copies on
demand if they are accessed more often than a given threshold.  This is not
exactly tiered storage... but it has many of the same characteristics.

Sage... it may be that your second approach, N storage pools with one
fragment written to each, is the best one.  The reasoning is that I'm not
sure RADOS would have any idea of "which" objects are candidates for lazy
erasure coding.  If done closer to the POSIX level, files and directories
that have not been accessed recently could become candidates for the
coding.

My personal desire is to have this available for archiving large,
file-based datasets in an economical fashion.  The files would be
"generated" by commercial file archiving software (with the source data
contained on other systems) and would be stored on a Ceph cluster via
either CephFS or an RBD device with a standard file system on it.

Then, because of domain-specific knowledge about the data (i.e. it is
archive data), I would know that much of the data will probably never be
touched again.  IMHO, that makes it good candidate data for erasure coding.

One approach would be to have a standard CephFS mount configured with
RADOS keeping N copies of the data.  The second would be a new RS-CephFS
(Reed-Solomon encoded) mount point (possibly using Sage's many-storage-pools
approach).  Then... using available tiering software, files could be
"moved" from one mount point to the other based on some criteria.  Pointers
on the original mount point make this basically invisible.  If a file is
accessed too many times, it can be "moved" back to the CephFS mount point.
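
As a rough sketch of the tiering piece (hypothetical paths and a made-up
atime policy; the symlink is the "pointer" I have in mind):

    import os
    import shutil
    import time

    REPL_ROOT = "/mnt/cephfs"      # replicated CephFS mount (hypothetical path)
    EC_ROOT = "/mnt/rscephfs"      # erasure-coded mount (hypothetical path)
    COLD_AFTER = 90 * 24 * 3600    # demote files untouched for 90 days

    def demote_cold_files():
        """Move cold files to the erasure-coded mount, leaving a symlink behind."""
        now = time.time()
        for dirpath, _, filenames in os.walk(REPL_ROOT):
            for fn in filenames:
                src = os.path.join(dirpath, fn)
                if os.path.islink(src) or now - os.stat(src).st_atime < COLD_AFTER:
                    continue           # already demoted, or still warm
                dst = os.path.join(EC_ROOT, os.path.relpath(src, REPL_ROOT))
                os.makedirs(os.path.dirname(dst), exist_ok=True)
                shutil.move(src, dst)  # copies across the two mounts
                os.symlink(dst, src)   # "pointer" keeps the old path working

Promotion back would be the same move in reverse, triggered by an access
counter or similar.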

Would this require 2 clusters because of the need to have RADOS keep N
copies on one and 1 copy on the other? 

I appreciate the discussion... it is helping me fashion what I'm really
interested in...

- Steve


* Re: Best insertion point for storage shim
  2012-08-31 14:37   ` Stephen Perkins
@ 2012-08-31 15:15     ` Tommi Virtanen
  2012-08-31 15:59       ` Atchley, Scott
  0 siblings, 1 reply; 8+ messages in thread
From: Tommi Virtanen @ 2012-08-31 15:15 UTC (permalink / raw)
  To: Stephen Perkins; +Cc: Sage Weil, ceph-devel

On Fri, Aug 31, 2012 at 10:37 AM, Stephen Perkins <perkins@netmass.com> wrote:
> Would this require 2 clusters because of the need to have RADOS keep N
> copies on one and 1 copy on the other?

That's doable with just multiple RADOS pools, no need for multiple clusters.

And CephFS is even able to pick what pool to put a file in, at the
time of its creation (see set_layout).
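
For example, on a client new enough to expose layouts as virtual xattrs,
choosing the pool for a new (still empty) file looks roughly like this;
the attribute name and "ecpool" are assumptions, and older clients would
use the cephfs tool's set_layout command instead:

    import os

    path = "/mnt/cephfs/archive/newfile"   # hypothetical CephFS path
    open(path, "w").close()                # layout can only be set while the file is empty

    os.setxattr(path, "ceph.file.layout.pool", b"ecpool")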


* Re: Best insertion point for storage shim
  2012-08-31 15:15     ` Tommi Virtanen
@ 2012-08-31 15:59       ` Atchley, Scott
  2012-08-31 16:08         ` Tommi Virtanen
  0 siblings, 1 reply; 8+ messages in thread
From: Atchley, Scott @ 2012-08-31 15:59 UTC (permalink / raw)
  To: Tommi Virtanen; +Cc: Stephen Perkins, Sage Weil, ceph-devel

On Aug 31, 2012, at 11:15 AM, Tommi Virtanen wrote:

> On Fri, Aug 31, 2012 at 10:37 AM, Stephen Perkins <perkins@netmass.com> wrote:
>> Would this require 2 clusters because of the need to have RADOS keep N
>> copies on one and 1 copy on the other?
> 
> That's doable with just multiple RADOS pools, no need for multiple clusters.
> 
> And CephFS is even able to pick what pool to put a file in, at the
> time of its creation (see set_layout).

I think what he is looking for is not to bring data to a client to convert from replication to/from erasure coding, but to have the servers do it based on some metric _or_ have the client indicate which file needs to be converted and have the servers do the work.

I believe what you are saying is that I can have a directory using the replicated pool and another directory (or sub-directory) that uses the coding pool. The client would then copy the file from one directory to the other. The question becomes "Who does the erasure encoding?". The client (read back from the replica pool and write to the erasure pool) or the servers (copy data to the erasure pool and calculate on the servers)?

Scott


* Re: Best insertion point for storage shim
  2012-08-31 15:59       ` Atchley, Scott
@ 2012-08-31 16:08         ` Tommi Virtanen
  0 siblings, 0 replies; 8+ messages in thread
From: Tommi Virtanen @ 2012-08-31 16:08 UTC (permalink / raw)
  To: Atchley, Scott; +Cc: Stephen Perkins, Sage Weil, ceph-devel

On Fri, Aug 31, 2012 at 11:59 AM, Atchley, Scott <atchleyes@ornl.gov> wrote:
> I think what he is looking for is not to bring data to a client to convert from replication to/from erasure coding, but to have the servers do it based on some metric _or_ have the client indicate which file needs to be converted and have the servers do the work.
>
> I believe what you are saying is that I can have a directory using the replicated pool and another directory (or sub-directory) that uses the coding pool. The client would then copy the file from one directory to the other. The question becomes "Who does the erasure encoding?". The client (read back from the replica pool and write to the erasure pool) or the servers (copy data to the erasure pool and calculate on the servers)?

I was aiming more for "I can see where this could sit in the
architecture". You don't have to make the client effectively download
and re-upload a file; that logic can be pushed closer to the OSDs.

