From: Qu Wenruo <wqu@suse.com>
To: linux-btrfs@vger.kernel.org
Cc: dsterba@suse.cz, jeffm@suse.com
Subject: [PATCH 00/14] Qgroup metadata reservation rework
Date: Tue, 12 Dec 2017 15:34:22 +0800
Message-ID: <20171212073436.16447-1-wqu@suse.com>

[Overall]
The previous rework of the qgroup reservation system put most of its
effort into data reservation, which works quite well.

It paid less attention to metadata reservation, which causes problems
such as metadata reservation underflow and noisy kernel warnings.

This patchset tries to address the remaining problems with metadata
reservation.

The idea of the new qgroup metadata reservation is to use 2 types of
metadata reservation:
1) Per-transaction reservation
   Its life span is inside a transaction. It is freed at transaction
   commit time.

2) Preallocated reservation
   For cases where we reserve space before starting a transaction.
   Operations like delalloc and delayed-inode/item belong to this type.

   This works similarly to block_rsv: the reservation can be
   reserved/released at any time the caller likes.

   The only point to note is that if a preallocated reservation is used
   and finishes without problem, it should be converted to the
   per-transaction type instead of just being freed.
   This is to cooperate with the qgroup update at commit time.
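
The life cycle of the two types can be sketched as a small userspace
model. This is only an illustration under assumptions, not the code in
the patches: the struct, the model_* names and the limit check are all
hypothetical, and -1 merely stands in for an EDQUOT-style failure.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model types; not the kernel's definitions. */
enum model_rsv_type {
	MODEL_META_PERTRANS,	/* lives inside one transaction */
	MODEL_META_PREALLOC,	/* reserved before a transaction starts */
	MODEL_RSV_LAST,
};

struct model_qgroup {
	uint64_t limit;			/* assumed quota limit, in bytes */
	uint64_t rsv[MODEL_RSV_LAST];	/* reserved bytes per type */
};

/* Reserve bytes of one type; returns 0, or -1 if the limit is exceeded. */
static int model_reserve(struct model_qgroup *qg, enum model_rsv_type type,
			 uint64_t bytes)
{
	uint64_t total = qg->rsv[MODEL_META_PERTRANS] +
			 qg->rsv[MODEL_META_PREALLOC];

	if (total + bytes > qg->limit)
		return -1;
	qg->rsv[type] += bytes;
	return 0;
}

/* Release a preallocated reservation that was not needed after all. */
static void model_free_prealloc(struct model_qgroup *qg, uint64_t bytes)
{
	assert(qg->rsv[MODEL_META_PREALLOC] >= bytes);	/* else underflow */
	qg->rsv[MODEL_META_PREALLOC] -= bytes;
}

/* The preallocated space was used: move it to the per-trans type. */
static void model_convert_prealloc(struct model_qgroup *qg, uint64_t bytes)
{
	assert(qg->rsv[MODEL_META_PREALLOC] >= bytes);	/* else underflow */
	qg->rsv[MODEL_META_PREALLOC] -= bytes;
	qg->rsv[MODEL_META_PERTRANS] += bytes;
}

/* Transaction commit: every per-trans reservation is dropped at once. */
static void model_commit(struct model_qgroup *qg)
{
	qg->rsv[MODEL_META_PERTRANS] = 0;
}
```

In this model a caller reserves MODEL_META_PREALLOC before starting a
transaction, calls model_convert_prealloc() for the part it actually
used and model_free_prealloc() for the rest, and model_commit() then
clears whatever was converted.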

For the preallocated type, this patchset integrates it into the
inode_rsv mechanism reworked by Josef, and into the delayed-inode/item
reservation.


[Problem: Over-reserve]
Currently the patchset addresses metadata underflow quite well, but
because btrfs over-reserves by nature and qgroup metadata reservation
is tightly bound to inode_rsv, qgroup metadata reservation also tends
to over-reserve.

This is especially obvious for small limits.
With a 128M limit, only about 70M can be written before hitting the
quota limit.
With a larger limit, like 5G, it can reach 4.5G or more before hitting
the limit.
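
To put the two figures above on the same scale, the usable share of
each limit can be computed with a tiny helper. usable_percent() is a
hypothetical name used only for this illustration, and the result is
rounded down.

```c
#include <assert.h>
#include <stdint.h>

/* Share of the limit that could be written before hitting the quota. */
static unsigned int usable_percent(uint64_t written, uint64_t limit)
{
	assert(limit != 0);
	return (unsigned int)(written * 100 / limit);
}
```

usable_percent(70, 128) gives 54 for the 128M case, while
usable_percent(4608, 5120) gives 90 for 4.5G out of 5G (both in MiB).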

Such over-reserving behavior can cause problems with existing test
cases (where the limit is normally less than 20M).

It could be addressed by using more accurate space estimates instead of
maximum estimates.

For example, to calculate the metadata needed for delalloc, we use
btrfs_calc_trans_metadata_size(), which always tries to reserve space
for CoWing a full-height tree, and also includes csum size.
Both calculations are overkill for qgroup metadata reservation.
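
A rough userspace sketch of what a full-height worst-case estimate
looks like is shown below. The formula (nodesize times maximum tree
level, doubled to allow for splits), the value 8 for the maximum level
and the model_* names are assumptions made for illustration; check them
against the kernel source before relying on them. The csum part is left
out entirely.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: stands in for the maximum b-tree height. */
#define MODEL_MAX_LEVEL 8

/*
 * Worst-case bytes for modifying num_items items: CoW one full-height
 * path per item, doubled on the assumption that nodes may split.
 */
static uint64_t model_worst_case_bytes(uint32_t nodesize,
				       unsigned int num_items)
{
	assert(nodesize != 0);
	return (uint64_t)nodesize * MODEL_MAX_LEVEL * 2 * num_items;
}

/* How many single-item worst-case reservations fit under a limit. */
static uint64_t model_items_under_limit(uint64_t limit, uint32_t nodesize)
{
	return limit / model_worst_case_bytes(nodesize, 1);
}
```

With a 16K nodesize this model charges 256K per item, so a 128M limit
holds 512 such reservations and a 20M limit only 80, which illustrates
why small limits are hit early when every operation is charged the
maximum.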

[Patch structure]
The patchset consists of 2 main parts:
1) Type based qgroup reservation
   The original patchset was sent several months ago.
   Nothing is modified at all, just rebased, and there were no conflicts.

   It's from patch 1 to patch 6.

2) Split meta qgroup reservation into per-trans and prealloc sub types
   The real work to address metadata underflow.
   Due to the over-reserve problem, this part is still in RFC state.
   But the framework should mostly be fine; it only needs extra
   fine-tuning to get a more accurate qgroup rsv and avoid hitting the
   limit too early.

   It's from patch 7 to 14.

Qu Wenruo (14):
  btrfs: qgroup: Skeleton to support separate qgroup reservation type
  btrfs: qgroup: Introduce helpers to update and access new qgroup rsv
  btrfs: qgroup: Make qgroup_reserve and its callers to use separate
    reservation type
  btrfs: qgroup: Fix wrong qgroup reservation update for relationship
    modification
  btrfs: qgroup: Update trace events to use new separate rsv types
  btrfs: qgroup: Cleanup the remaining old reservation counters
  btrfs: qgroup: Split meta rsv type into meta_prealloc and
    meta_pertrans
  btrfs: qgroup: Don't use root->qgroup_meta_rsv for qgroup
  btrfs: qgroup: Introduce function to convert META_PREALLOC into
    META_PERTRANS
  btrfs: qgroup: Use separate meta reservation type for delalloc
  btrfs: delayed-inode: Use new qgroup meta rsv for delayed inode and
    item
  btrfs: qgroup: Use root->qgroup_meta_rsv_* to record qgroup meta
    reserved space
  btrfs: qgroup: Update trace events for metadata reservation
  Revert "btrfs: qgroups: Retry after commit on getting EDQUOT"

 fs/btrfs/ctree.h             |  15 +-
 fs/btrfs/delayed-inode.c     |  50 +++++--
 fs/btrfs/disk-io.c           |   2 +-
 fs/btrfs/extent-tree.c       |  49 +++---
 fs/btrfs/file.c              |  15 +-
 fs/btrfs/free-space-cache.c  |   2 +-
 fs/btrfs/inode-map.c         |   4 +-
 fs/btrfs/inode.c             |  27 ++--
 fs/btrfs/ioctl.c             |  10 +-
 fs/btrfs/ordered-data.c      |   2 +-
 fs/btrfs/qgroup.c            | 350 ++++++++++++++++++++++++++++++++-----------
 fs/btrfs/qgroup.h            | 102 ++++++++++++-
 fs/btrfs/relocation.c        |   9 +-
 fs/btrfs/transaction.c       |   8 +-
 include/trace/events/btrfs.h |  73 ++++++++-
 15 files changed, 537 insertions(+), 181 deletions(-)

-- 
2.15.1


