linux-kernel.vger.kernel.org archive mirror
* Userspace Block Device
@ 2015-05-18 19:01 Bill Speirs
  2015-05-19  5:34 ` Rob Landley
  0 siblings, 1 reply; 5+ messages in thread
From: Bill Speirs @ 2015-05-18 19:01 UTC (permalink / raw)
  To: linux-kernel

My goal is to provide Amazon S3 or Google Cloud Storage as a block
device. I would like to leverage the libraries that exist for both
systems by servicing requests via a user space program.

I found 2 LKML threads that talk about a "userspace block device":

2005-11-09: http://article.gmane.org/gmane.linux.kernel/346883
2009-07-27: http://article.gmane.org/gmane.linux.kernel/869784

The first thread resulted in Michael Clark suggesting his kernel
module: https://github.com/michaeljclark/userblk The second
essentially resulted in "use nbd". Mr. Clark's module is now over 10
years old, and nbd seems like a bit of a Rube Goldberg solution.

Does the kernel now support a facility for servicing bio requests from
user space? If not, what would be the best approach to take? Updating
Mr. Clark's code? Or is there a newer, more efficient facility for
kernel <-> user space communication and data transfer?

Thanks...

Bill-


* Re: Userspace Block Device
  2015-05-18 19:01 Userspace Block Device Bill Speirs
@ 2015-05-19  5:34 ` Rob Landley
  2015-05-19 14:42   ` Bill Speirs
  0 siblings, 1 reply; 5+ messages in thread
From: Rob Landley @ 2015-05-19  5:34 UTC (permalink / raw)
  To: Bill Speirs; +Cc: Kernel Mailing List

On Mon, May 18, 2015 at 2:01 PM, Bill Speirs <bill.speirs@gmail.com> wrote:
> My goal is to provide Amazon S3 or Google Cloud Storage as a block
> device. I would like to leverage the libraries that exist for both
> systems by servicing requests via a user space program.
>
> I found 2 LKML threads that talk about a "userspace block device":
>
> 2005-11-09: http://article.gmane.org/gmane.linux.kernel/346883
> 2009-07-27: http://article.gmane.org/gmane.linux.kernel/869784
>
> The first thread resulted in Michael Clark suggesting his kernel
> module: https://github.com/michaeljclark/userblk The second
> essentially resulted in "use nbd". Mr. Clark's module is now over 10
> years old, and nbd seems like a bit of a Rube Goldberg solution.

I wrote the busybox and toybox nbd clients, and have a todo list item
to write an nbd server for toybox. I believe there's also an nbd
server in qemu. I haven't found any decent documentation on the
protocol yet, but what specifically makes you describe it as Rube
Goldberg?

Rob


* Re: Userspace Block Device
  2015-05-19  5:34 ` Rob Landley
@ 2015-05-19 14:42   ` Bill Speirs
  2015-05-19 15:19     ` One Thousand Gnomes
  0 siblings, 1 reply; 5+ messages in thread
From: Bill Speirs @ 2015-05-19 14:42 UTC (permalink / raw)
  To: Rob Landley; +Cc: Kernel Mailing List

On Tue, May 19, 2015 at 1:34 AM, Rob Landley <rob@landley.net> wrote:
> On Mon, May 18, 2015 at 2:01 PM, Bill Speirs <bill.speirs@gmail.com> wrote:
>> My goal is to provide Amazon S3 or Google Cloud Storage as a block
>> device. I would like to leverage the libraries that exist for both
>> systems by servicing requests via a user space program.
>> ... nbd seems like a bit of a Rube Goldberg solution.
>
> I wrote the busybox and toybox nbd clients, and have a todo list item
> to write an nbd server for toybox. I believe there's also an nbd
> server in qemu. I haven't found any decent documentation on the
> protocol yet, but what specifically makes you describe it as rube
> goldberg?

My understanding of using nbd is:
- Write an nbd server that is essentially a gateway between nbd and
S3/Google. For each nbd request, I translate it into the appropriate
S3/Google request and respond appropriately.
- I'd run the above server on the machine, listening on some port.
- I'd run a client on the same machine, connecting to 127.0.0.1 and the
above port, providing the nbd block device.
- Go drink a beer as I rack up a huge bill with Amazon or Google

Seems a bit much to run a client & server on the same machine with
socket overhead, etc. In looking at the code for your nbd-client
(https://github.com/landley/toybox/blob/master/toys/other/nbd_client.c)
I'm wondering if I couldn't just pass a pipe instead of a socket in the
ioctl(nbd, NBD_SET_SOCK, sock) step, then have the same process (or a
fork) listening on the pipe so it's all in a single process/codebase.
Thoughts on this approach?

That said, clearly my bottleneck in all of this will be the
communication with S3/Google, and using something like dm-cache would
make it appear fast for most requests. So maybe my Rube Goldberg
comment was over-the-top.

Thank you for the pointers and the feedback!

Bill-


* Re: Userspace Block Device
  2015-05-19 14:42   ` Bill Speirs
@ 2015-05-19 15:19     ` One Thousand Gnomes
  2015-05-19 15:33       ` Bill Speirs
  0 siblings, 1 reply; 5+ messages in thread
From: One Thousand Gnomes @ 2015-05-19 15:19 UTC (permalink / raw)
  To: Bill Speirs; +Cc: Rob Landley, Kernel Mailing List

> - Write an nbd server that is essentially a gateway between nbd and
> S3/Google. For each nbd request, I translate it into the appropriate
> S3/Google request and respond appropriately.
> - I'd run the above server on the machine on some port.
> - I'd run a client on the same server using 127.0.0.1 and the above
> port, providing the nbd block device.
> - Go drink a beer as I rack up a huge bill with Amazon or Google

And you probably would because the block layer will see a lot of I/O
requests that you would really want to process locally, as well as stuff
caused by working at the block not file level (like readaheads).

You also can't deal with coherency this way - e.g. sharing the virtual disk
between two systems because the file system code isn't expecting other
clients to modify the disk under it.

Rather than nbd you could also look at drbd or some similar kind of
setup where you keep the entire filestore locally and write back changes
to the remote copy. As you can never share the filestore when mounted you
can cache it pretty aggressively.

Alan


* Re: Userspace Block Device
  2015-05-19 15:19     ` One Thousand Gnomes
@ 2015-05-19 15:33       ` Bill Speirs
  0 siblings, 0 replies; 5+ messages in thread
From: Bill Speirs @ 2015-05-19 15:33 UTC (permalink / raw)
  To: One Thousand Gnomes; +Cc: Kernel Mailing List

On Tue, May 19, 2015 at 11:19 AM, One Thousand Gnomes
<gnomes@lxorguk.ukuu.org.uk> wrote:
>> ... rack up a huge bill with Amazon or Google
>
> And you probably would because the block layer will see a lot of I/O
> requests that you would really want to process locally, as well as stuff
> caused by working at the block not file level (like readaheads).
>
> You also can't deal with coherency this way - eg sharing the virtual disk
> between two systems because the file system code isn't expecting other
> clients to modify the disk under it.
>
> Rather than nbd you could also look at drbd or some similar kind of
> setup where you keep the entire filestore locally and write back changes
> to the remote copy. As you can never share the filestore when mounted you
> can cache it pretty aggressively.

What kinds of things could I process locally? I was thinking I could
keep a bitmap of "sectors" that have never been written to, then just
return zeroed-out sectors for those. What else could I do? Thoughts?

I'm not looking to share the filesystem, just never have to buy a
bigger disk again and get pseudo-backup along with it (I realize
things in my cache would be lost if my house burned to the ground).

drbd isn't really what I'm looking for, because I don't want to have
to buy a disk that's large enough to fit everything. Just a small fast
SSD (or RAM disk) to cache commonly used files, then spill-over to the
cloud for everything else. In theory, I would have a /home that is
"infinite", and fairly fast for things that are cached.

Thanks for the thoughts/points!

Bill-

