From: Andreas Rohner <e0502196-oe7qfRrRQffzPE21tAIdciO7C/xPubJB@public.gmane.org>
To: linux-nilfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: cleaner optimization and online defragmentation: status update
Date: Tue, 18 Jun 2013 20:30:09 +0200
Message-ID: <51C0A731.9060509@student.tuwien.ac.at>

Hi,

I have written a simple defragmentation tool and some extensions to the
cleaner. The implementations are not tested enough yet, but I would
like some feedback on whether I am headed in the right direction. I
would be very happy about any suggestions for improvement, because I am
quite new to kernel development.

* Cleaner:

Links to Commits: [1] [2]

I have implemented two new policies for the cleaner: "Greedy", which
selects the segments with the most free blocks, and "Cost/Benefit",
which is inspired by this paper [3]. f2fs apparently uses the same
algorithms.

Unfortunately both of them require that the file system keeps track of
the free blocks per segment. This is not trivial, mainly because of
snapshots. I chose to simply ignore snapshots altogether. That way the
tracking is simple. The problem, of course, is that the selection policy
sometimes goes wrong, because the actual number of free blocks is less
than reported. But it should still perform better than the "Timestamp"
policy.
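
To make the difference between the three policies concrete, here is a
minimal sketch in Python. It is not taken from the linked commits. The
Segment class, BLOCKS_PER_SEGMENT and the sample numbers are made up for
illustration, and the Cost/Benefit score uses the (1 - u) * age / (1 + u)
formula from the paper in [3], which may differ from what the patch
actually computes.

```python
from dataclasses import dataclass

BLOCKS_PER_SEGMENT = 2048  # assumed value, for illustration only


@dataclass
class Segment:
    segnum: int
    live_blocks: int  # stands in for a per-segment live-block counter
    lastmod: int      # time of last modification, in seconds


def greedy_score(seg):
    # Greedy: prefer the segment with the most free blocks.
    return BLOCKS_PER_SEGMENT - seg.live_blocks


def cost_benefit_score(seg, now):
    # Cost/Benefit as in the LFS paper: (1 - u) * age / (1 + u),
    # where u is the fraction of the segment that is still live.
    u = seg.live_blocks / BLOCKS_PER_SEGMENT
    age = now - seg.lastmod
    return (1.0 - u) * age / (1.0 + u)


def timestamp_score(seg, now):
    # Timestamp: prefer the oldest segment, ignoring utilization.
    return now - seg.lastmod


def select_victim(segments, score):
    # The segment with the highest score is cleaned first.
    return max(segments, key=score)


now = 1000
segments = [
    Segment(0, live_blocks=1900, lastmod=100),  # old, nearly full
    Segment(1, live_blocks=200, lastmod=900),   # young, mostly free
    Segment(2, live_blocks=1024, lastmod=200),  # half full, fairly old
]

greedy_victim = select_victim(segments, greedy_score)
cb_victim = select_victim(segments, lambda s: cost_benefit_score(s, now))
ts_victim = select_victim(segments, lambda s: timestamp_score(s, now))
print(greedy_victim.segnum, cb_victim.segnum, ts_victim.segnum)
```

With these sample values each policy picks a different segment, which
shows why the per-segment counter matters: without it, only the
"Timestamp" choice is available.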

If a block is part of a snapshot, gets deleted, is cleaned and then the
snapshot is deleted, this block is actually free, but is not counted as
such.

Steps:
1. Block A is part of a snapshot
2. A gets deleted
3. A is cleaned (it is considered live because it is in the snapshot)
4. the snapshot is deleted
5. A is still counted in su_nblocks, but the counter will never be
   decremented for it
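
The steps above can be replayed with a toy model. This is only one
possible reading of the scenario, not the kernel code: the dictionary
stands in for the per-segment su_nblocks counters, and the segment names
"old" and "new" are hypothetical.

```python
# Toy model of the five steps; su_nblocks[seg] is the live-block counter.
su_nblocks = {"old": 1, "new": 0}

# Step 1: block A lives in segment "old" and is part of a snapshot.
snapshot_holds_a = True

# Step 2: A gets deleted. The tracking ignores snapshots, so the counter
# of the segment holding A is decremented right away.
su_nblocks["old"] -= 1

# Step 3: the cleaner still considers A live because of the snapshot,
# so it copies A into segment "new", which now counts it.
if snapshot_holds_a:
    su_nblocks["new"] += 1

# Step 4: the snapshot is deleted. Nothing references A any more.
snapshot_holds_a = False

# Step 5: A was already deleted in step 2, so no later event will
# decrement the counter of "new" for it.
really_live_in_new = 0
stale = su_nblocks["new"] - really_live_in_new
print(su_nblocks, stale)
```

The counter for "new" ends up one higher than the number of blocks that
are really live there, which is the drift described above.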

If there is a whole segment full of these uncounted blocks, it will
never be cleaned. To prevent this kind of starvation I just reset
all the counters to zero after a snapshot gets deleted. This makes the
deletion of snapshots quite expensive and temporarily degrades the
performance of the "Cost/Benefit" policy to that of "Timestamp". This
solution is quite ugly and I would prefer something better, but I found
no other way to prevent this problem.
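
Why the reset makes "Cost/Benefit" behave like "Timestamp" can be seen
from a small sketch. It assumes the (1 - u) * age / (1 + u) score from
the paper in [3]. The tuples and BLOCKS_PER_SEGMENT are made-up sample
data, not values from the patch.

```python
BLOCKS_PER_SEGMENT = 2048  # assumed value, for illustration only


def cost_benefit_score(live_blocks, age):
    # u is the fraction of the segment that the counter reports as live.
    u = live_blocks / BLOCKS_PER_SEGMENT
    return (1.0 - u) * age / (1.0 + u)


def ranking(segs):
    # Segment numbers ordered from best to worst cleaning candidate.
    ordered = sorted(segs, key=lambda s: cost_benefit_score(s[1], s[2]),
                     reverse=True)
    return [segnum for segnum, _, _ in ordered]


# (segment number, live-block counter, age in seconds)
segments = [(0, 1900, 900), (1, 200, 100), (2, 1024, 800)]

before = ranking(segments)

# After a snapshot is deleted, every counter is reset to zero.
after_reset = ranking([(n, 0, age) for n, _, age in segments])

# "Timestamp" simply orders the segments from oldest to youngest.
oldest_first = [n for n, _, _ in
                sorted(segments, key=lambda s: s[2], reverse=True)]

print(before, after_reset, oldest_first)
```

With all counters at zero, u is 0 for every segment and the score
reduces to the age alone, so the ranking is the same as oldest-first
until the counters have been rebuilt.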

There is one additional potential problem though. To track the used
blocks per segment I used the su_nblocks attribute of struct
nilfs_segment_usage. This value is currently never used, except once in
the cleaner. But since the cleaner only cleans segments that are not
active or dirty, I assume that it will always be a full segment with
nilfs_get_blocks_per_segment(nilfs) blocks. I haven't tested that enough
yet, but it seems to work. Alternatively I could just add another
attribute, su_live_nblocks, to struct nilfs_segment_usage.

* Defragmentation:

Links to Commits: [4] [5]

It's just a simple proof-of-concept tool called nilfs-defrag [filename].
It takes the file to defragment as an argument.

It tries to find the mount point and get a pointer to struct nilfs, to
find out the block size and the number of blocks per segment. It uses
the FIEMAP ioctl to get the extent information, and if the number of
extents per segment exceeds a certain value, it tries to defragment
those extents.
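
The per-segment check can be sketched as follows. This is a rough
model, not the code of nilfs-defrag: it assumes the extents have already
been converted to (physical start block, length in blocks) pairs, and
the function names, the threshold and the sample extents are
hypothetical.

```python
from collections import Counter


def extents_per_segment(extents, blocks_per_segment):
    # Count how many extents touch each segment.
    counts = Counter()
    for start, length in extents:
        first = start // blocks_per_segment
        last = (start + length - 1) // blocks_per_segment
        for seg in range(first, last + 1):
            counts[seg] += 1
    return counts


def segments_to_defragment(extents, blocks_per_segment, max_extents):
    # Segments holding more extents of the file than the threshold.
    counts = extents_per_segment(extents, blocks_per_segment)
    return sorted(seg for seg, n in counts.items() if n > max_extents)


# Four small extents scattered over segment 0, one large one in segment 1.
extents = [(0, 4), (10, 2), (20, 1), (30, 3), (2048, 2048)]
counts = extents_per_segment(extents, blocks_per_segment=2048)
candidates = segments_to_defragment(extents, blocks_per_segment=2048,
                                    max_extents=3)
print(dict(counts), candidates)
```

Only the extents in the segments returned by segments_to_defragment
would then be handed to the kernel to be rewritten.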

I added a simple new ioctl, NILFS_IOCTL_MARK_EXTENT_DIRTY, which just
reads in the corresponding blocks and marks them dirty. The dirty blocks
are automatically written out to a new segment and will hopefully be
less fragmented. It seems to work quite nicely, but it needs more
testing. I am not quite sure if I got the locking right in the kernel
code.

Sorry for the long post

Best Regards,
Andreas Rohner

[1] https://github.com/zeitgeist87/linux/commit/bc763ac47c04893d3fece4f2db59f46187415cc4
[2] https://github.com/zeitgeist87/nilfs-utils/commit/ec8281964b3b57b1b79452d9cb03887e04a089b3
[3] http://dl.acm.org/citation.cfm?id=121137
[4] https://github.com/zeitgeist87/linux/commit/9ce900df854b1cbc968d35fd7ed892d9bf3b52d8
[5] https://github.com/zeitgeist87/nilfs-utils/commit/d32c43e26ad5059b79c0ecc3ff167a78b0f6c814