linux-btrfs.vger.kernel.org archive mirror
* [PATCH 00/42][v5] My current patch queue
@ 2018-10-12 19:32 Josef Bacik
  2018-10-12 19:32 ` [PATCH 01/42] btrfs: add btrfs_delete_ref_head helper Josef Bacik
                   ` (41 more replies)
  0 siblings, 42 replies; 49+ messages in thread
From: Josef Bacik @ 2018-10-12 19:32 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

v3->v4:
- added stacktraces to all the changelogs
- added the various reviewed-by's.
- fixed the loop in inode_rsv_refill to not use goto again.

v2->v3:
- reworked the truncate/evict throttling, we were still occasionally hitting
  enospc aborts in production in these paths because we were too aggressive with
  space usage.
- reworked the delayed iput stuff to be a little less racy and less deadlocky.
- Addressed the comments from Dave and Omar.
- A lot of production testing.

v1->v2:
- addressed all of the issues brought up.
- added more comments.
- split up some patches.

original message:

This is the current queue of things that I've been working on.  The main thing
these patches are doing is separating out the delayed refs reservations from the
global reserve into their own block rsv.  We have been consistently hitting
issues in production where we abort a transaction because we run out of the
global reserve either while running delayed refs or while updating dirty block
groups.  This is because the math around global reserves is made up bullshit
magic that has been tweaked more and more throughout the years.  The result is
something that is inconsistent across the board and sometimes wrong.  So instead
we need a way to know exactly how much space we need to keep around in order to
satisfy our outstanding delayed refs and our dirty block groups.

Since we don't know how many delayed refs we need at the start of any
modification, we simply use the nr_items passed into btrfs_start_transaction() as
a guess for what we may need.  This has the side effect of putting more pressure
on the ENOSPC system, but it's pressure we can deal with more intelligently
because we always know how much space we have outstanding, instead of guessing
with weird global reserve math.

This works similarly to every other reservation we have: we reserve the worst
case up front, and then at transaction end time we free up any space we didn't
actually use for delayed refs.
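
To make the reserve-then-release pattern above concrete, here is a minimal
userspace sketch.  It is a toy model under stated assumptions, not the code
from these patches: struct toy_fs, struct toy_block_rsv, both toy_* functions
and the TOY_BYTES_PER_ITEM value are all hypothetical names and numbers
introduced only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed worst-case metadata cost of one item, in bytes (made-up value). */
#define TOY_BYTES_PER_ITEM 16384ULL

/* Hypothetical stand-in for a block rsv; not the kernel's structure. */
struct toy_block_rsv {
	uint64_t reserved;	/* bytes currently set aside */
};

struct toy_fs {
	uint64_t free_bytes;			/* unreserved space */
	struct toy_block_rsv delayed_refs_rsv;	/* tracked on its own */
};

/* Reserve the worst case up front, using nr_items as the guess. */
static int toy_start_transaction(struct toy_fs *fs, unsigned int nr_items)
{
	uint64_t need = (uint64_t)nr_items * TOY_BYTES_PER_ITEM;

	if (fs->free_bytes < need)
		return -ENOSPC;
	fs->free_bytes -= need;
	fs->delayed_refs_rsv.reserved += need;
	return 0;
}

/* At transaction end, hand back whatever the guess over-reserved. */
static void toy_end_transaction(struct toy_fs *fs, unsigned int nr_items,
				unsigned int refs_generated)
{
	uint64_t giveback;

	/* The sketch assumes the guess was never too small. */
	assert(refs_generated <= nr_items);
	giveback = (uint64_t)(nr_items - refs_generated) * TOY_BYTES_PER_ITEM;

	fs->delayed_refs_rsv.reserved -= giveback;
	fs->free_bytes += giveback;
}
```

A caller would pair toy_start_transaction(&fs, n) with
toy_end_transaction(&fs, n, used).  Space for the refs that were actually
generated stays in delayed_refs_rsv, so the outstanding amount is always known
exactly rather than estimated.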

My performance tests show that we are a bit faster now since we can do more
intelligent flushing and don't have to fall back on simply committing the
transaction in hopes that we have enough space for everything we need to do.

That leads me to the 2nd part of this pull: there's a bunch of fixes around
ENOSPC.  Because we are a bit faster now there were a bunch of things uncovered
in testing, but they seem to be all resolved now.

The final chunk of fixes is around transaction aborts.  There were a lot of
accounting bugs I was running into while running generic/435, so I fixed a bunch
of those up so now it runs cleanly.

I have been running these patches through xfstests on multiple machines for a
while; they are pretty solid and ready for wider testing and review.  Thanks,

Josef


* [PATCH 00/42][v4] My current patch queue
@ 2018-10-11 19:53 Josef Bacik
  2018-10-11 19:53 ` [PATCH 01/42] btrfs: add btrfs_delete_ref_head helper Josef Bacik
  0 siblings, 1 reply; 49+ messages in thread
From: Josef Bacik @ 2018-10-11 19:53 UTC (permalink / raw)
  To: kernel-team, linux-btrfs

* [PATCH 00/42][v3] My current patch queue
@ 2018-09-28 11:17 Josef Bacik
  2018-09-28 11:17 ` [PATCH 01/42] btrfs: add btrfs_delete_ref_head helper Josef Bacik
  0 siblings, 1 reply; 49+ messages in thread
From: Josef Bacik @ 2018-09-28 11:17 UTC (permalink / raw)
  To: kernel-team, linux-btrfs


end of thread, other threads:[~2018-10-18 16:46 UTC | newest]

Thread overview: 49+ messages
2018-10-12 19:32 [PATCH 00/42][v5] My current patch queue Josef Bacik
2018-10-12 19:32 ` [PATCH 01/42] btrfs: add btrfs_delete_ref_head helper Josef Bacik
2018-10-12 19:32 ` [PATCH 02/42] btrfs: add cleanup_ref_head_accounting helper Josef Bacik
2018-10-12 19:32 ` [PATCH 03/42] btrfs: cleanup extent_op handling Josef Bacik
2018-10-12 19:32 ` [PATCH 04/42] btrfs: only track ref_heads in delayed_ref_updates Josef Bacik
2018-10-12 19:32 ` [PATCH 05/42] btrfs: only count ref heads run in __btrfs_run_delayed_refs Josef Bacik
2018-10-12 19:32 ` [PATCH 06/42] btrfs: introduce delayed_refs_rsv Josef Bacik
2018-10-12 19:32 ` [PATCH 07/42] btrfs: check if free bgs for commit Josef Bacik
2018-10-12 19:32 ` [PATCH 08/42] btrfs: dump block_rsv whe dumping space info Josef Bacik
2018-10-12 19:32 ` [PATCH 09/42] btrfs: release metadata before running delayed refs Josef Bacik
2018-10-12 19:32 ` [PATCH 10/42] btrfs: protect space cache inode alloc with nofs Josef Bacik
2018-10-12 19:32 ` [PATCH 11/42] btrfs: fix truncate throttling Josef Bacik
2018-10-12 19:32 ` [PATCH 12/42] btrfs: don't use global rsv for chunk allocation Josef Bacik
2018-10-12 19:32 ` [PATCH 13/42] btrfs: add ALLOC_CHUNK_FORCE to the flushing code Josef Bacik
2018-10-12 19:32 ` [PATCH 14/42] btrfs: reset max_extent_size properly Josef Bacik
2018-10-12 19:32 ` [PATCH 15/42] btrfs: don't enospc all tickets on flush failure Josef Bacik
2018-10-12 19:32 ` [PATCH 16/42] btrfs: loop in inode_rsv_refill Josef Bacik
2018-10-12 19:32 ` [PATCH 17/42] btrfs: run delayed iputs before committing Josef Bacik
2018-10-12 19:32 ` [PATCH 18/42] btrfs: move the dio_sem higher up the callchain Josef Bacik
2018-10-18 16:46   ` David Sterba
2018-10-12 19:32 ` [PATCH 19/42] btrfs: set max_extent_size properly Josef Bacik
2018-10-17 11:16   ` David Sterba
2018-10-12 19:32 ` [PATCH 20/42] btrfs: don't use ctl->free_space for max_extent_size Josef Bacik
2018-10-12 19:32 ` [PATCH 21/42] btrfs: reset max_extent_size on clear in a bitmap Josef Bacik
2018-10-12 19:32 ` [PATCH 22/42] btrfs: only run delayed refs if we're committing Josef Bacik
2018-10-12 19:32 ` [PATCH 23/42] btrfs: make sure we create all new bgs Josef Bacik
2018-10-12 19:32 ` [PATCH 24/42] btrfs: assert on non-empty delayed iputs Josef Bacik
2018-10-12 19:32 ` [PATCH 25/42] btrfs: pass delayed_refs_root to btrfs_delayed_ref_lock Josef Bacik
2018-10-12 19:32 ` [PATCH 26/42] btrfs: make btrfs_destroy_delayed_refs use btrfs_delayed_ref_lock Josef Bacik
2018-10-12 19:32 ` [PATCH 27/42] btrfs: make btrfs_destroy_delayed_refs use btrfs_delete_ref_head Josef Bacik
2018-10-12 19:32 ` [PATCH 28/42] btrfs: handle delayed ref head accounting cleanup in abort Josef Bacik
2018-10-12 19:32 ` [PATCH 29/42] btrfs: call btrfs_create_pending_block_groups unconditionally Josef Bacik
2018-10-12 19:32 ` [PATCH 30/42] btrfs: just delete pending bgs if we are aborted Josef Bacik
2018-10-12 19:32 ` [PATCH 31/42] btrfs: cleanup pending bgs on transaction abort Josef Bacik
2018-10-12 19:32 ` [PATCH 32/42] btrfs: only free reserved extent if we didn't insert it Josef Bacik
2018-10-12 19:32 ` [PATCH 33/42] btrfs: fix insert_reserved error handling Josef Bacik
2018-10-12 19:32 ` [PATCH 34/42] btrfs: wait on ordered extents on abort cleanup Josef Bacik
2018-10-12 19:32 ` [PATCH 35/42] MAINTAINERS: update my email address for btrfs Josef Bacik
2018-10-12 19:32 ` [PATCH 36/42] btrfs: wait on caching when putting the bg cache Josef Bacik
2018-10-12 19:32 ` [PATCH 37/42] btrfs: wakeup cleaner thread when adding delayed iput Josef Bacik
2018-10-12 19:32 ` [PATCH 38/42] btrfs: be more explicit about allowed flush states Josef Bacik
2018-10-12 19:32 ` [PATCH 39/42] btrfs: replace cleaner_delayed_iput_mutex with a waitqueue Josef Bacik
2018-10-12 19:32 ` [PATCH 40/42] btrfs: drop min_size from evict_refill_and_join Josef Bacik
2018-10-12 19:32 ` [PATCH 41/42] btrfs: reserve extra space during evict() Josef Bacik
2018-10-12 19:32 ` [PATCH 42/42] btrfs: don't run delayed_iputs in commit Josef Bacik
2018-10-12 20:45   ` Filipe Manana
2018-10-17 11:45   ` David Sterba
  -- strict thread matches above, loose matches on Subject: below --
2018-10-11 19:53 [PATCH 00/42][v4] My current patch queue Josef Bacik
2018-10-11 19:53 ` [PATCH 01/42] btrfs: add btrfs_delete_ref_head helper Josef Bacik
2018-09-28 11:17 [PATCH 00/42][v3] My current patch queue Josef Bacik
2018-09-28 11:17 ` [PATCH 01/42] btrfs: add btrfs_delete_ref_head helper Josef Bacik
