linux-btrfs.vger.kernel.org archive mirror
From: Joseph Dunn <jdunn14@gmail.com>
To: Chris Murphy <lists@colorremedies.com>
Cc: Anand Jain <anand.jain@oracle.com>,
	Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: btrfs seed question
Date: Thu, 12 Oct 2017 11:50:20 -0400
Message-ID: <20171012115020.070248a7@olive.ig.local>
In-Reply-To: <CAJCQCtRrCMqp3FHaCqe-ZjNLryOGvhZE3MoXMqarqO7zsK4_FQ@mail.gmail.com>

On Thu, 12 Oct 2017 16:30:36 +0100
Chris Murphy <lists@colorremedies.com> wrote:

> On Thu, Oct 12, 2017 at 3:44 PM, Joseph Dunn <jdunn14@gmail.com> wrote:
> > On Thu, 12 Oct 2017 15:32:24 +0100
> > Chris Murphy <lists@colorremedies.com> wrote:
> >  
> >> On Thu, Oct 12, 2017 at 2:20 PM, Joseph Dunn <jdunn14@gmail.com> wrote:  
> >> >
> >> > On Thu, 12 Oct 2017 12:18:01 +0800
> >> > Anand Jain <anand.jain@oracle.com> wrote:
> >> >  
> >> > > On 10/12/2017 08:47 AM, Joseph Dunn wrote:  
> >> > > > After seeing how btrfs seeds work I wondered if it was possible to push
> >> > > > specific files from the seed to the rw device.  I know that removing
> >> > > > the seed device will flush all the contents over to the rw device, but
> >> > > > what about flushing individual files on demand?
> >> > > >
> >> > > > I found that opening a file, reading the contents, seeking back to 0,
> >> > > > and writing out the contents does what I want, but I was hoping for a
> >> > > > bit less of a hack.
> >> > > >
> >> > > > Is there maybe an ioctl or something else that might trigger a similar
> >> > > > action?  
> >> > >
> >> > >    You mean to say - seed-device delete to trigger copy of only the
> >> > > specified or the modified files only, instead of whole of seed-device ?
> >> > > What's the use case around this ?
> >> > >  
> >> >
> >> > Not quite.  While the seed device is still connected I would like to
> >> > force some files over to the rw device.  The use case is basically a
> >> > much slower link to a seed device holding significantly more data than
> >> > we currently need.  An example would be a slower iscsi link to the seed
> >> > device and a local rw ssd.  I would like fast access to a certain subset
> >> > of files, likely larger than the memory cache will accommodate.  If at
> >> > a later time I want to discard the image as a whole I could unmount the
> >> > file system or if I want a full local copy I could delete the
> >> > seed-device to sync the fs.  In the mean time I would have access to
> >> > all the files, with some slower (iscsi) and some faster (ssd) and the
> >> > ability to pick which ones are in the faster group at the cost of one
> >> > content transfer.  
> >>
> >>
> >> Multiple seeds?
> >>
> >> Seed A has everything, is remote. Create sprout B also remotely,
> >> deleting the things you don't absolutely need, then make it a seed.
> >> Now via iSCSI you can mount both A and B seeds. Add local rw sprout C
> >> to seed B, then delete B to move files to fast local storage.
> >>  
> > Interesting thought.  I haven't tried working with multiple seeds but
> > I'll see what that can do.  I will say that this approach would require
> > more pre-planning meaning that the choice of fast files could not be
> > made based on current access patterns to tasks at hand.  This might
> > make sense for a core set of files, but it doesn't quite solve the
> > whole problem.  
> 
> 
> I think the use case really dictates a dynamic solution that's smarter
> than either of the proposed ideas (mine or yours). Basically you want
> something that recognizes slow vs fast storage, and intelligently
> populates fast storage with frequently used files.
> 
> Ostensibly this is the realm of dmcache. But I can't tell you whether
> dmcache or via LVM tools, if it's possible to set the proper policy to
> make it work for your use case. And also I have no idea how to set it
> up after the fact, on an already created file system, rather than
> block devices.
> 
> The hot vs cold files thing, is something I thought the VFS folks were
> looking into.
> 
As a consumer of the file system data I tend to see things at a file
level rather than as blocks, but from a block level this does feel
dmcache-ish and I'll look into it.

I did try the multiple seeds approach.  Correct me if I'm wrong, but
once the files are deleted in the second seed they are no longer
accessible in anything sprouted from it.  The fully dynamic solution
would be nice, but I'm perfectly happy to pick the files for the ssd at
run time, just not at file system preparation time.  In any case, I may
fall back on inotify and overwriting file contents if I don't end up
with a better solution using dmcache or LVM tricks.
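
For reference, the read-and-rewrite workaround described at the start of
the thread (open the file, read it, seek back, write the same bytes out)
could be sketched roughly as below.  This is only an illustration, not a
btrfs interface: the function name rewrite_in_place and the chunk_size
parameter are made up here, and it works chunk by chunk instead of
reading the whole file first so that large files need not fit in memory.
Whether the rewritten data really ends up on the rw device depends on
the copy-on-write behaviour observed earlier in the thread.

```python
import os


def rewrite_in_place(path, chunk_size=1 << 20):
    """Read `path` chunk by chunk and write the same bytes back in place.

    Hypothetical helper: the contents do not change, the file system is
    only forced to handle a write for every byte. Returns the number of
    bytes rewritten.
    """
    total = 0
    with open(path, "r+b") as f:
        while True:
            offset = f.tell()          # where this chunk starts
            data = f.read(chunk_size)
            if not data:               # end of file
                break
            f.seek(offset)             # go back to the start of the chunk
            f.write(data)              # write the identical bytes
            total += len(data)
        f.flush()
        os.fsync(f.fileno())           # push the writes to stable storage
    return total
```

The helper could be called by hand for each file wanted on the ssd, or
from whatever watcher decides which files are worth copying.  Note that
it is not safe if another process is writing to the file at the same
time.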

-Joseph Dunn
