linux-btrfs.vger.kernel.org archive mirror
* [GIT PULL] Btrfs updates for 4.18
@ 2018-06-04 15:43 David Sterba
  2018-06-09 16:21 ` Filipe Manana
  0 siblings, 1 reply; 8+ messages in thread
From: David Sterba @ 2018-06-04 15:43 UTC (permalink / raw)
  To: torvalds; +Cc: David Sterba, clm, linux-btrfs, linux-kernel

Hi,

there are some new features and a usual load of cleanups, more details below.

Specifically, there's a set of new non-privileged ioctls to allow
subvolume listing.  It works but still needs a security review as it's a
new interface and we might need to do some tweaks to the data
structures. The fixes could be considered regressions but may touch the
interfaces too.

Currently there are no merge conflicts but linux-next has reported a few
in the past, originating from other *FS trees.

Please pull, thanks.

---

User visible features:

- added support for the ioctl FS_IOC_FSGETXATTR, per-inode flags, successor
  of GET/SETFLAGS; now supports only existing flags: append, immutable,
  noatime, nodump, sync

- 3 new unprivileged ioctls to allow users to enumerate subvolumes

- dedupe syscall implementation does not restrict the range to 16MiB, though it
  still splits the whole range to 16MiB chunks

- on user demand, rmdir() is able to delete an empty subvolume, export the
  capability in sysfs

- fix inode number types in tracepoints, other cleanups

- send: improved speed when dealing with a large removed directory,
  measurements show decrease from 2000 minutes to 2 minutes on a directory with
  2 million entries

- pre-commit check of superblock to detect a mysterious in-memory corruption

- log message updates
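
To illustrate the dedupe item in the list above: a request longer than 16MiB
is now accepted, but it is still processed in 16MiB pieces. The following is
a minimal sketch of that splitting arithmetic only, not the kernel code. The
function name split_dedupe_range and its parameters are hypothetical and
chosen for illustration.

```python
MIB = 1024 * 1024
CHUNK = 16 * MIB  # chunk size named in the feature note; everything else is illustrative


def split_dedupe_range(offset, length, chunk=CHUNK):
    """Yield (offset, length) pieces that cover [offset, offset + length).

    Hypothetical helper: it only shows how a large range breaks down into
    chunk-sized pieces, with a shorter final piece when the length is not
    a multiple of the chunk size.
    """
    if offset < 0 or length < 0 or chunk <= 0:
        raise ValueError("offset and length must be >= 0, chunk must be > 0")
    end = offset + length
    while offset < end:
        piece = min(chunk, end - offset)
        yield offset, piece
        offset += piece


if __name__ == "__main__":
    # A 40MiB request is covered by two full chunks and one partial chunk.
    for off, ln in split_dedupe_range(0, 40 * MIB):
        print(off // MIB, ln // MIB)
```

The caller sees a single request; only the internal processing is chunked.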


Other changes:

- orphan inode cleanup improved, does not keep long-standing reservations that
  could lead to early ENOSPC in some cases

- slight improvement of handling snapshotted NOCOW files by avoiding some
  unnecessary tree searches

- avoid OOM when dealing with many unmergeable small extents at flush time

- speedup conversion of free space tree representations from/to bitmap/tree

- code refactoring, deletion, cleanups
  - delayed refs
  - delayed iput
  - redundant argument removals
  - memory barrier cleanups
  - remove a redundant mutex supposedly excluding several ioctls to run in
    parallel

- new tracepoints for blockgroup manipulation

- more sanity checks of compressed headers
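
For readers unfamiliar with the free space tree item in the list above: free
space can be described either as a list of extents or as a bitmap, and the
tree converts between the two. The sketch below only illustrates the two
representations on a toy scale. It is not the kernel's on-disk format and
not the optimized conversion from this pull. The function names and the
unit-sized "blocks" are assumptions made for the example.

```python
def extents_to_bitmap(extents, nbits):
    """Turn (start, length) free extents into a list of 0/1 bits.

    Hypothetical helper: one bit per block, 1 means free.
    """
    bitmap = [0] * nbits
    for start, length in extents:
        if start < 0 or length <= 0 or start + length > nbits:
            raise ValueError("extent out of range")
        for i in range(start, start + length):
            bitmap[i] = 1
    return bitmap


def bitmap_to_extents(bitmap):
    """Turn a 0/1 bitmap back into (start, length) runs of set bits."""
    extents = []
    run_start = None
    for i, bit in enumerate(bitmap):
        if bit and run_start is None:
            run_start = i  # a new run of free blocks begins here
        elif not bit and run_start is not None:
            extents.append((run_start, i - run_start))
            run_start = None
    if run_start is not None:
        # the bitmap ended while a run was still open
        extents.append((run_start, len(bitmap) - run_start))
    return extents
```

Note that adjacent extents merge into one run on the way back, so the round
trip preserves the free space but not necessarily the original extent list.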

----------------------------------------------------------------
The following changes since commit b04e217704b7f879c6b91222b066983a44a7a09f:

  Linux 4.17-rc7 (2018-05-27 13:01:47 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-4.18-tag

for you to fetch changes up to 23d0b79dfaed2305b500b0215b0421701ada6b1a:

  btrfs: Add unprivileged version of ino_lookup ioctl (2018-05-31 11:35:24 +0200)

----------------------------------------------------------------
Al Viro (1):
      btrfs: take the last remnants of ->d_fsdata use out

Anand Jain (19):
      btrfs: add comment about BTRFS_FS_EXCL_OP
      btrfs: rename struct btrfs_fs_devices::list
      btrfs: cleanup __btrfs_open_devices() drop head pointer
      btrfs: rename __btrfs_close_devices to close_fs_devices
      btrfs: rename __btrfs_open_devices to open_fs_devices
      btrfs: cleanup find_device() drop list_head pointer
      btrfs: cleanup btrfs_rm_device() promote fs_devices pointer
      btrfs: move btrfs_raid_type_names values to btrfs_raid_attr table
      btrfs: move btrfs_raid_group values to btrfs_raid_attr table
      btrfs: move btrfs_raid_mindev_errorvalues to btrfs_raid_attr table
      btrfs: reduce uuid_mutex critical section while scanning devices
      btrfs: use existing cur_devices, cleanup btrfs_rm_device
      btrfs: document uuid_mutex uasge in read_chunk_tree
      btrfs: replace uuid_mutex by device_list_mutex in btrfs_open_devices
      btrfs: drop uuid_mutex in btrfs_dev_replace_finishing
      btrfs: drop uuid_mutex in btrfs_destroy_dev_replace_tgtdev
      btrfs: use common variable for fs_devices in btrfs_destroy_dev_replace_tgtdev
      btrfs: add prefix "balance:" for log messages
      btrfs: fix describe_relocation when printing unknown flags

Chengguang Xu (1):
      btrfs: return original error code when failing from option parsing

Colin Ian King (1):
      btrfs: send: fix spelling mistake: "send_in_progres" -> "send_in_progress"

David Sterba (38):
      btrfs: tracepoints, use correct type for inode number
      btrfs: tracepoints, use %llu instead of %Lu
      btrfs: tracepoints, drop unnecessary ULL casts
      btrfs: tracepoints, fix whitespace in strings
      btrfs: tracepoints, use extended format with UUID where possible
      btrfs: tests: pass fs_info to extent_map tests
      btrfs: use fs_info for btrfs_handle_em_exist tracepoint
      btrfs: squeeze btrfs_dev_replace_continue_on_mount to its caller
      btrfs: make success path out of btrfs_init_dev_replace_tgtdev more clear
      btrfs: export and rename free_device
      btrfs: move btrfs_init_dev_replace_tgtdev to dev-replace.c and make static
      btrfs: move volume_mutex to callers of btrfs_rm_device
      btrfs: move clearing of EXCL_OP out of __cancel_balance
      btrfs: add proper safety check before resuming dev-replace
      btrfs: add sanity check when resuming balance after mount
      btrfs: cleanup helpers that reset balance state
      btrfs: remove wrong use of volume_mutex from btrfs_dev_replace_start
      btrfs: kill btrfs_fs_info::volume_mutex
      btrfs: track running balance in a simpler way
      btrfs: move and comment read-only check in btrfs_cancel_balance
      btrfs: drop lock parameter from update_ioctl_balance_args and rename
      btrfs: use mutex in btrfs_resume_balance_async
      btrfs: open code set_balance_control
      btrfs: remove redundant btrfs_balance_control::fs_info
      btrfs: introduce conditional wakeup helpers
      btrfs: add barriers to btrfs_sync_log before log_commit_wait wakeups
      btrfs: replace waitqueue_actvie with cond_wake_up
      btrfs: rename btrfs_update_iflags to reflect which flags it touches
      btrfs: rename btrfs_mask_flags to reflect which flags it touches
      btrfs: rename check_flags to reflect which flags it touches
      btrfs: rename btrfs_flags_to_ioctl to reflect which flags it touches
      btrfs: add helpers for FS_XFLAG_* conversion
      btrfs: add FS_IOC_FSGETXATTR ioctl
      btrfs: add FS_IOC_FSSETXATTR ioctl
      btrfs: unify naming of flags variables for SETFLAGS and XFLAGS
      btrfs: use kvzalloc for EXTENT_SAME temporary data
      btrfs: tests: add helper for error messages and update them
      btrfs: tests: drop newline from test_msg strings

Ethan Lien (2):
      btrfs: lift some btrfs_cross_ref_exist checks in nocow path
      btrfs: balance dirty metadata pages in btrfs_finish_ordered_io

Gu JinXiang (2):
      btrfs: drop unused parameter qgroup_reserved
      btrfs: drop useless member qgroup_reserved of btrfs_pending_snapshot

Gu Jinxiang (3):
      btrfs: remove unused fs_info parameter
      btrfs: do reverse path readahead in btrfs_shrink_device
      btrfs: propagate failures of __exclude_logged_extent to upper caller

Howard McLauchlan (3):
      btrfs: clean up le_bitmap_{set, clear}()
      btrfs: optimize free space tree bitmap conversion
      btrfs: remove unused le_test_bit()

Kees Cook (1):
      btrfs: raid56: Remove VLA usage

Liu Bo (7):
      Btrfs: add parent_transid parameter to veirfy_level_key
      Btrfs: remove superfluous free_extent_buffer in read_block_for_search
      Btrfs: use more straightforward extent_buffer_uptodate check
      Btrfs: move get root out of btrfs_search_slot to a helper
      Btrfs: grab write lock directly if write_lock_level is the max level
      Btrfs: remove always true check in unlock_up
      Btrfs: remove unused check of skip_locking

Lu Fengqi (3):
      btrfs: drop unused space_info parameter from create_space_info
      btrfs: Remove fs_info argument from btrfs_uuid_tree_add
      btrfs: Remove fs_info argument from btrfs_uuid_tree_rem

Misono Tomohiro (5):
      btrfs: Move may_destroy_subvol() from ioctl.c to inode.c
      btrfs: Factor out the main deletion process from btrfs_ioctl_snap_destroy()
      btrfs: Allow rmdir(2) to delete an empty subvolume
      btrfs: sysfs: Add entry which shows if rmdir can work on subvolumes
      btrfs: use error code returned by btrfs_read_fs_root_no_name in search ioctl

Nikolay Borisov (54):
      btrfs: Replace owner argument in add_pinned_bytes with a boolean
      btrfs: Drop delayed_refs argument from btrfs_check_delayed_seq
      btrfs: Use while loop instead of labels in __endio_write_update_ordered
      btrfs: Fix lock release order
      btrfs: Consolidate error checking for btrfs_alloc_chunk
      btrfs: Sink extent_tree arguments in try_release_extent_mapping
      btrfs: Remove map argument from try_release_extent_state
      btrfs: Remove redundant tree argument from extent_readpages
      btrfs: Use list_empty instead of list_empty_careful
      btrfs: Remove tree argument from extent_writepages
      btrfs: Remove btrfs_wait_and_free_delalloc_work
      btrfs: Drop add_delayed_ref_head fs_info parameter
      btrfs: Drop fs_info parameter from add_delayed_data_ref
      btrfs: Drop fs_info parameter from btrfs_merge_delayed_refs
      btrfs: Remove delayed_iput parameter of btrfs_start_delalloc_roots
      btrfs: Remove delayed_iput parameter from btrfs_start_delalloc_inodes
      btrfs: Remove delay_iput parameter from __start_delalloc_inodes
      btrfs: Remove delayed_iput member from btrfs_delalloc_work
      btrfs: Unexport btrfs_alloc_delalloc_work
      btrfs: Remove devid parameter from btrfs_rmap_block
      btrfs: Factor out common delayed refs init code
      btrfs: Use init_delayed_ref_common in add_delayed_tree_ref
      btrfs: Use init_delayed_ref_common in add_delayed_data_ref
      btrfs: Open-code add_delayed_tree_ref
      btrfs: Open-code add_delayed_data_ref
      btrfs: Introduce init_delayed_ref_head
      btrfs: Use init_delayed_ref_head in add_delayed_ref_head
      btrfs: split delayed ref head initialization and addition
      btrfs: Add assert in __btrfs_del_delalloc_inode
      btrfs: Make btrfs_init_dummy_trans initialize trans' fs_info field
      btrfs: Remove fs_info argument from add_block_group_free_space
      btrfs: Remove fs_info argument from __add_block_group_free_space
      btrfs: Remove fs_info argument from __add_to_free_space_tree
      btrfs: Remove fs_info parameter from add_new_free_space_info
      btrfs: Remove fs_info argument from add_new_free_space
      btrfs: Remove fs_info parameter from remove_block_group_free_space
      btrfs: Remove fs_info argument from convert_free_space_to_bitmaps
      btrfs: Remove fs_info parameter from convert_free_space_to_extents
      btrfs: Remove fs_info argument from update_free_space_extent_count
      btrfs: Remove fs_info argument from modify_free_space_bitmap
      btrfs: Remove fs_info argument from add_free_space_extent
      btrfs: Remove fs_info argument from remove_free_space_extent
      btrfs: Remove fs_info argument from __remove_from_free_space_tree
      btrfs: Remove fs_info argument from remove_from_free_space_tree
      btrfs: Remove fs_info argument from add_to_free_space_tree
      btrfs: Remove fs_info argument from populate_free_space_tree
      btrfs: Unexport and rename btrfs_invalidate_inodes
      btrfs: Remove stale comment about select_delayed_ref
      btrfs: Remove fs_info argument from alloc_reserved_tree_block
      btrfs: Simplify alloc_reserved_tree_block interface
      btrfs: Pass btrfs_delayed_extent_op to alloc_reserved_tree_block
      btrfs: Streamline shared ref check in alloc_reserved_tree_block
      btrfs: Factor out read portion of btrfs_get_blocks_direct
      btrfs: Factor out write portion of btrfs_get_blocks_direct

Omar Sandoval (16):
      Btrfs: update stale comments referencing vmtruncate()
      Btrfs: fix error handling in btrfs_truncate_inode_items()
      Btrfs: don't BUG_ON() in btrfs_truncate_inode_items()
      Btrfs: stop creating orphan items for truncate
      Btrfs: get rid of BTRFS_INODE_HAS_ORPHAN_ITEM
      Btrfs: delete dead code in btrfs_orphan_commit_root()
      Btrfs: don't return ino to ino cache if inode item removal fails
      Btrfs: refactor btrfs_evict_inode() reserve refill dance
      Btrfs: fix ENOSPC caused by orphan items reservations
      Btrfs: get rid of unused orphan infrastructure
      Btrfs: renumber BTRFS_INODE_ runtime flags and switch to enums
      Btrfs: reserve space for O_TMPFILE orphan item deletion
      Btrfs: allow empty subvol= again
      Btrfs: fix clone vs chattr NODATASUM race
      Btrfs: fix memory and mount leak in btrfs_ioctl_rm_dev_v2()
      Btrfs: clean up error handling in btrfs_truncate()

Qu Wenruo (15):
      btrfs: print-tree: Add eb locking status output for debug build
      btrfs: trace: Remove unnecessary fs_info parameter for btrfs__reserve_extent event class
      btrfs: trace: Add trace points for unused block groups
      btrfs: trace: Allow trace_qgroup_update_counters() to record old rfer/excl value
      btrfs: qgroup: Allow trace_btrfs_qgroup_account_extent() to record its transid
      btrfs: Move btrfs_check_super_valid() to avoid forward declaration
      btrfs: Refactor btrfs_check_super_valid
      btrfs: Do super block verification before writing it to disk
      btrfs: qgroup: Search commit root for rescan to avoid missing extent
      btrfs: qgroup: Finish rescan when hit the last leaf of extent tree
      btrfs: compression: Add linux/sizes.h for compression.h
      btrfs: lzo: document the compressed data format
      btrfs: lzo: Add header length check to avoid potential out-of-bounds access
      btrfs: lzo: Harden inline lzo compressed extent decompression
      btrfs: qgroup: show more meaningful qgroup_rescan_init error message

Robbie Ko (2):
      btrfs: incremental send, move allocation until it's needed in orphan_dir_info
      btrfs: incremental send, improve rmdir performance for large directory

Su Yue (3):
      btrfs: rename btrfs_get_block_group_info and make it static
      btrfs: return error value if create_io_em failed in cow_file_range
      btrfs: return ENOMEM if path allocation fails in btrfs_cross_ref_exist

Timofey Titovets (3):
      Btrfs: split btrfs_extent_same
      Btrfs: dedupe_file_range ioctl: remove 16MiB restriction
      Btrfs: reuse cmp workspace in EXTENT_SAME ioctl

Tomohiro Misono (4):
      btrfs: sysfs: Use enum/define value for feature array definitions
      btrfs: Add unprivileged ioctl which returns subvolume information
      btrfs: Add unprivileged ioctl which returns subvolume's ROOT_REF
      btrfs: Add unprivileged version of ino_lookup ioctl

 fs/btrfs/btrfs_inode.h                 |   22 +-
 fs/btrfs/compression.c                 |    7 +-
 fs/btrfs/compression.h                 |    2 +
 fs/btrfs/ctree.c                       |  123 +--
 fs/btrfs/ctree.h                       |   76 +-
 fs/btrfs/delayed-inode.c               |    9 +-
 fs/btrfs/delayed-ref.c                 |  275 +++----
 fs/btrfs/delayed-ref.h                 |    5 +-
 fs/btrfs/dev-replace.c                 |  150 +++-
 fs/btrfs/disk-io.c                     |  391 +++++----
 fs/btrfs/extent-tree.c                 |  253 +++---
 fs/btrfs/extent_io.c                   |   62 +-
 fs/btrfs/extent_io.h                   |   20 +-
 fs/btrfs/extent_map.c                  |    6 +-
 fs/btrfs/extent_map.h                  |    3 +-
 fs/btrfs/free-space-cache.c            |    6 +-
 fs/btrfs/free-space-tree.c             |  192 +++--
 fs/btrfs/free-space-tree.h             |    8 -
 fs/btrfs/inode.c                       | 1371 ++++++++++++++++----------------
 fs/btrfs/ioctl.c                       | 1210 ++++++++++++++++++----------
 fs/btrfs/locking.c                     |   34 +-
 fs/btrfs/lzo.c                         |   76 +-
 fs/btrfs/ordered-data.c                |   14 +-
 fs/btrfs/print-tree.c                  |   21 +
 fs/btrfs/qgroup.c                      |   69 +-
 fs/btrfs/raid56.c                      |   38 +-
 fs/btrfs/relocation.c                  |    8 +-
 fs/btrfs/scrub.c                       |    1 +
 fs/btrfs/send.c                        |   46 +-
 fs/btrfs/super.c                       |    7 +-
 fs/btrfs/sysfs.c                       |   52 +-
 fs/btrfs/sysfs.h                       |    4 +-
 fs/btrfs/tests/btrfs-tests.c           |    4 +-
 fs/btrfs/tests/btrfs-tests.h           |    6 +-
 fs/btrfs/tests/extent-buffer-tests.c   |   56 +-
 fs/btrfs/tests/extent-io-tests.c       |   75 +-
 fs/btrfs/tests/extent-map-tests.c      |   90 ++-
 fs/btrfs/tests/free-space-tests.c      |  177 +++--
 fs/btrfs/tests/free-space-tree-tests.c |  129 +--
 fs/btrfs/tests/inode-tests.c           |  312 ++++----
 fs/btrfs/tests/qgroup-tests.c          |  100 +--
 fs/btrfs/transaction.c                 |   15 +-
 fs/btrfs/transaction.h                 |    1 -
 fs/btrfs/tree-log.c                    |   28 +-
 fs/btrfs/uuid-tree.c                   |   10 +-
 fs/btrfs/volumes.c                     |  506 ++++++------
 fs/btrfs/volumes.h                     |   24 +-
 include/trace/events/btrfs.h           |  323 ++++----
 include/uapi/linux/btrfs.h             |   97 +++
 49 files changed, 3579 insertions(+), 2935 deletions(-)

* Re: [GIT PULL] Btrfs updates for 4.18
  2018-06-04 15:43 [GIT PULL] Btrfs updates for 4.18 David Sterba
@ 2018-06-09 16:21 ` Filipe Manana
  2018-06-11  8:14   ` Anand Jain
  0 siblings, 1 reply; 8+ messages in thread
From: Filipe Manana @ 2018-06-09 16:21 UTC (permalink / raw)
  To: David Sterba; +Cc: linux-btrfs, Anand Jain

On Mon, Jun 4, 2018 at 4:43 PM, David Sterba <dsterba@suse.com> wrote:
> Hi,
>
> there are some new features and a usual load of cleanups, more details below.
>
> Specifically, there's a set of new non-privileged ioctls to allow
> subvolume listing.  It works but still needs a security review as it's a
> new interface and we might need to do some tweaks to the data
> structures. The fixes could be considered regressions but may touch the
> interfaces too.
>
> Currently there are no merge conflicts but linux-next has reported a few
> in the past, originating from other *FS trees.
>
> Please pull, thanks.
>
> ---
>
> User visible features:
>
> - added support for the ioctl FS_IOC_FSGETXATTR, per-inode flags, successor
>   of GET/SETFLAGS; now supports only existing flags: append, immutable,
>   noatime, nodump, sync
>
> - 3 new unprivileged ioctls to allow users to enumerate subvolumes
>
> - dedupe syscall implementation does not restrict the range to 16MiB, though it
>   still splits the whole range to 16MiB chunks
>
> - on user demand, rmdir() is able to delete an empty subvolume, export the
>   capability in sysfs
>
> - fix inode number types in tracepoints, other cleanups
>
> - send: improved speed when dealing with a large removed directory,
>   measurements show decrease from 2000 minutes to 2 minutes on a directory with
>   2 million entries
>
> - pre-commit check of superblock to detect a mysterious in-memory corruption
>
> - log message updates
>
>
> Other changes:
>
> - orphan inode cleanup improved, does not keep long-standing reservations that
>   could lead to early ENOSPC in some cases
>
> - slight improvement of handling snapshotted NOCOW files by avoiding some
>   unnecessary tree searches
>
> - avoid OOM when dealing with many unmergeable small extents at flush time
>
> - speedup conversion of free space tree representations from/to bitmap/tree
>
> - code refactoring, deletion, cleanups
>   - delayed refs
>   - delayed iput
>   - redundant argument removals
>   - memory barrier cleanups
>   - remove a redundant mutex supposedly excluding several ioctls to run in
>     parallel
>
> - new tracepoints for blockgroup manipulation
>
> - more sanity checks of compressed headers
>
> ----------------------------------------------------------------
> The following changes since commit b04e217704b7f879c6b91222b066983a44a7a09f:
>
>   Linux 4.17-rc7 (2018-05-27 13:01:47 -0700)
>
> are available in the Git repository at:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-4.18-tag
>
> for you to fetch changes up to 23d0b79dfaed2305b500b0215b0421701ada6b1a:
>
>   btrfs: Add unprivileged version of ino_lookup ioctl (2018-05-31 11:35:24 +0200)
>
> ----------------------------------------------------------------
> Al Viro (1):
>       btrfs: take the last remnants of ->d_fsdata use out
>
> Anand Jain (19):
>       btrfs: add comment about BTRFS_FS_EXCL_OP
>       btrfs: rename struct btrfs_fs_devices::list
>       btrfs: cleanup __btrfs_open_devices() drop head pointer
>       btrfs: rename __btrfs_close_devices to close_fs_devices
>       btrfs: rename __btrfs_open_devices to open_fs_devices
>       btrfs: cleanup find_device() drop list_head pointer
>       btrfs: cleanup btrfs_rm_device() promote fs_devices pointer
>       btrfs: move btrfs_raid_type_names values to btrfs_raid_attr table
>       btrfs: move btrfs_raid_group values to btrfs_raid_attr table
>       btrfs: move btrfs_raid_mindev_errorvalues to btrfs_raid_attr table
>       btrfs: reduce uuid_mutex critical section while scanning devices
>       btrfs: use existing cur_devices, cleanup btrfs_rm_device
>       btrfs: document uuid_mutex uasge in read_chunk_tree
>       btrfs: replace uuid_mutex by device_list_mutex in btrfs_open_devices

This change (commit 542c5908abfe84f7b4c1717492ecc92ea0ea328d, "btrfs:
replace uuid_mutex by device_list_mutex in btrfs_open_devices"), at
the very least, introduces a lockdep warning:

[  865.021049] ======================================================
[  865.021950] WARNING: possible circular locking dependency detected
[  865.022828] 4.17.0-rc7-btrfs-next-59+ #1 Not tainted
[  865.023491] ------------------------------------------------------
[  865.024342] fsstress/27897 is trying to acquire lock:
[  865.025070] 0000000099260c12 (&fs_info->reloc_mutex){+.+.}, at: btrfs_record_root_in_trans+0x43/0x62 [btrfs]
[  865.026369]
[  865.026369] but task is already holding lock:
[  865.027206] 000000008dc17c22 (&mm->mmap_sem){++++}, at: vm_mmap_pgoff+0x77/0xe8
[  865.028251]
[  865.028251] which lock already depends on the new lock.
[  865.028251]
[  865.029482]
[  865.029482] the existing dependency chain (in reverse order) is:
[  865.030523]
[  865.030523] -> #7 (&mm->mmap_sem){++++}:
[  865.031241]        _copy_to_user+0x1e/0x63
[  865.031745]        filldir+0x9e/0xef
[  865.032285]        dir_emit_dots+0x3b/0xbd
[  865.032881]        dcache_readdir+0x22/0xbb
[  865.033502]        iterate_dir+0xa3/0x13e
[  865.034131]        __do_sys_getdents+0xa1/0x106
[  865.034821]        do_syscall_64+0x51/0x5f
[  865.035423]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  865.036212]
[  865.036212] -> #6 (&sb->s_type->i_mutex_key#4){++++}:
[  865.037155]        start_creating+0x65/0xd2
[  865.037752]        debugfs_create_dir+0xc/0x9b
[  865.038374]        blk_mq_debugfs_register+0x30/0xec
[  865.039083]        blk_register_queue+0x11e/0x199
[  865.039753]        __device_add_disk+0x36d/0x44b
[  865.040434]        sd_probe_async+0xf6/0x19f [sd_mod]
[  865.041136]        async_run_entry_fn+0x34/0xe0
[  865.041811]        process_one_work+0x295/0x4b8
[  865.042446]        worker_thread+0x1ab/0x25e
[  865.043032]        kthread+0xf5/0xfa
[  865.043568]        ret_from_fork+0x3a/0x50
[  865.044163]
[  865.044163] -> #5 (&q->sysfs_lock){+.+.}:
[  865.044916]        blk_mq_sysfs_unregister+0x1d/0x53
[  865.045576]        blk_mq_realloc_hw_ctxs+0x2e/0x410
[  865.046209]        blk_mq_init_allocated_queue+0xaf/0x40d
[  865.046853]        blk_mq_init_queue+0x34/0x50
[  865.047494]        loop_add+0xf9/0x27f [loop]
[  865.048110]        param_set_lid_init_state+0x8e/0x94 [button]
[  865.048867]        do_one_initcall+0x11b/0x2de
[  865.049509]        do_init_module+0x5b/0x1ff
[  865.050077]        load_module+0x1c78/0x22b5
[  865.050669]        __do_sys_finit_module+0x7b/0x86
[  865.051288]        do_syscall_64+0x51/0x5f
[  865.051886]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  865.052700]
[  865.052700] -> #4 (loop_index_mutex){+.+.}:
[  865.053473]        lo_open+0x17/0x47 [loop]
[  865.054046]        __blkdev_get+0x145/0x42a
[  865.054649]        blkdev_get+0x1aa/0x2e9
[  865.055187]        do_dentry_open+0x17a/0x288
[  865.055843]        path_openat+0x534/0x699
[  865.056438]        do_filp_open+0x4d/0xa3
[  865.057026]        do_sys_open+0x69/0xee
[  865.057631]        do_syscall_64+0x51/0x5f
[  865.058227]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  865.058971]
[  865.058971] -> #3 (&bdev->bd_mutex){+.+.}:
[  865.059785]        __blkdev_get+0x409/0x42a
[  865.060377]        blkdev_get+0x1aa/0x2e9
[  865.060942]        blkdev_get_by_path+0x2c/0x5f
[  865.061555]        btrfs_get_bdev_and_sb+0x1b/0x97 [btrfs]
[  865.062264]        open_fs_devices+0x81/0x1f6 [btrfs]
[  865.063030]        btrfs_open_devices+0x5c/0x74 [btrfs]
[  865.063803]        btrfs_mount_root+0x1f7/0x45c [btrfs]
[  865.064554]        mount_fs+0x64/0x10b
[  865.065116]        vfs_kern_mount+0x68/0xce
[  865.069630]        btrfs_mount+0x12e/0x764 [btrfs]
[  865.070361]        mount_fs+0x64/0x10b
[  865.070962]        vfs_kern_mount+0x68/0xce
[  865.071613]        do_mount+0x6e5/0x973
[  865.072161]        ksys_mount+0x72/0x97
[  865.072732]        __x64_sys_mount+0x21/0x24
[  865.073356]        do_syscall_64+0x51/0x5f
[  865.073928]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  865.074687]
[  865.074687] -> #2 (&fs_devs->device_list_mutex){+.+.}:
[  865.075596]        btrfs_run_dev_stats+0x37/0x2fe [btrfs]
[  865.076339]        commit_cowonly_roots+0x87/0x261 [btrfs]
[  865.076921]        btrfs_commit_transaction+0x3b8/0x760 [btrfs]
[  865.077691]        btrfs_create_uuid_tree+0x9e/0x106 [btrfs]
[  865.078476]        open_ctree+0x1c1c/0x1ef9 [btrfs]
[  865.079140]        btrfs_mount_root+0x342/0x45c [btrfs]
[  865.079796]        mount_fs+0x64/0x10b
[  865.080297]        vfs_kern_mount+0x68/0xce
[  865.080902]        btrfs_mount+0x12e/0x764 [btrfs]
[  865.081566]        mount_fs+0x64/0x10b
[  865.082165]        vfs_kern_mount+0x68/0xce
[  865.082778]        do_mount+0x6e5/0x973
[  865.083308]        ksys_mount+0x72/0x97
[  865.083869]        __x64_sys_mount+0x21/0x24
[  865.084453]        do_syscall_64+0x51/0x5f
[  865.084991]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  865.085746]
[  865.085746] -> #1 (&fs_info->tree_log_mutex){+.+.}:
[  865.086729]        btrfs_commit_transaction+0x366/0x760 [btrfs]
[  865.087580]        btrfs_create_uuid_tree+0x9e/0x106 [btrfs]
[  865.088412]        open_ctree+0x1c1c/0x1ef9 [btrfs]
[  865.089092]        btrfs_mount_root+0x342/0x45c [btrfs]
[  865.089752]        mount_fs+0x64/0x10b
[  865.090256]        vfs_kern_mount+0x68/0xce
[  865.090895]        btrfs_mount+0x12e/0x764 [btrfs]
[  865.091564]        mount_fs+0x64/0x10b
[  865.092090]        vfs_kern_mount+0x68/0xce
[  865.092662]        do_mount+0x6e5/0x973
[  865.093224]        ksys_mount+0x72/0x97
[  865.093789]        __x64_sys_mount+0x21/0x24
[  865.094344]        do_syscall_64+0x51/0x5f
[  865.094887]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  865.095579]
[  865.095579] -> #0 (&fs_info->reloc_mutex){+.+.}:
[  865.096401]        __mutex_lock+0x81/0x3ee
[  865.097026]        btrfs_record_root_in_trans+0x43/0x62 [btrfs]
[  865.097885]        start_transaction+0x29f/0x377 [btrfs]
[  865.098679]        btrfs_dirty_inode+0x3c/0xbb [btrfs]
[  865.099349]        touch_atime+0x82/0xa1
[  865.099899]        btrfs_file_mmap+0x2d/0x44 [btrfs]
[  865.100590]        mmap_region+0x27b/0x421
[  865.101153]        do_mmap+0x3f0/0x492
[  865.101673]        vm_mmap_pgoff+0xa1/0xe8
[  865.102167]        ksys_mmap_pgoff+0x18d/0x1b1
[  865.102641]        do_syscall_64+0x51/0x5f
[  865.103126]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  865.103914]
[  865.103914] other info that might help us debug this:
[  865.103914]
[  865.105096] Chain exists of:
[  865.105096]   &fs_info->reloc_mutex --> &sb->s_type->i_mutex_key#4 --> &mm->mmap_sem
[  865.105096]
[  865.106636]  Possible unsafe locking scenario:
[  865.106636]
[  865.107435]        CPU0                    CPU1
[  865.108071]        ----                    ----
[  865.108725]   lock(&mm->mmap_sem);
[  865.109243]                                lock(&sb->s_type->i_mutex_key#4);
[  865.110144]                                lock(&mm->mmap_sem);
[  865.110961]   lock(&fs_info->reloc_mutex);
[  865.111568]
[  865.111568]  *** DEADLOCK ***
[  865.111568]
[  865.112401] 3 locks held by fsstress/27897:
[  865.112953]  #0: 000000008dc17c22 (&mm->mmap_sem){++++}, at: vm_mmap_pgoff+0x77/0xe8
[  865.113955]  #1: 00000000bf2b52fc (sb_writers#11){.+.+}, at: touch_atime+0x3b/0xa1
[  865.115020]  #2: 00000000a7121e15 (sb_internal#2){.+.+}, at: start_transaction+0x1b6/0x377 [btrfs]
[  865.116274]
[  865.116274] stack backtrace:
[  865.116937] CPU: 3 PID: 27897 Comm: fsstress Not tainted 4.17.0-rc7-btrfs-next-59+ #1
[  865.118063] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
[  865.119676] Call Trace:
[  865.120092]  dump_stack+0x5f/0x86
[  865.120641]  print_circular_bug.isra.21+0x1c7/0x1d4
[  865.121367]  __lock_acquire+0xb97/0xf09
[  865.121929]  ? lock_acquire+0x16a/0x1af
[  865.122524]  lock_acquire+0x16a/0x1af
[  865.123101]  ? btrfs_record_root_in_trans+0x43/0x62 [btrfs]
[  865.123854]  __mutex_lock+0x81/0x3ee
[  865.124438]  ? btrfs_record_root_in_trans+0x43/0x62 [btrfs]
[  865.125233]  ? module_assert_mutex_or_preempt+0x13/0x2d
[  865.126011]  ? btrfs_record_root_in_trans+0x43/0x62 [btrfs]
[  865.126839]  ? join_transaction+0x376/0x38d [btrfs]
[  865.127545]  ? btrfs_record_root_in_trans+0x43/0x62 [btrfs]
[  865.128277]  btrfs_record_root_in_trans+0x43/0x62 [btrfs]
[  865.129022]  start_transaction+0x29f/0x377 [btrfs]
[  865.129726]  btrfs_dirty_inode+0x3c/0xbb [btrfs]
[  865.130326]  touch_atime+0x82/0xa1
[  865.130863]  btrfs_file_mmap+0x2d/0x44 [btrfs]
[  865.131533]  mmap_region+0x27b/0x421
[  865.132081]  do_mmap+0x3f0/0x492
[  865.132561]  vm_mmap_pgoff+0xa1/0xe8
[  865.133097]  ksys_mmap_pgoff+0x18d/0x1b1
[  865.133540]  ? do_syscall_64+0x12/0x5f
[  865.134059]  do_syscall_64+0x51/0x5f
[  865.134648]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  865.135358] RIP: 0033:0x7f88758e2ad3
[  865.135909] RSP: 002b:00007ffd668823e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
[  865.136928] RAX: ffffffffffffffda RBX: 000000000001e000 RCX: 00007f88758e2ad3
[  865.137804] RDX: 0000000000000002 RSI: 000000000000a7ef RDI: 0000000000000000
[  865.138734] RBP: 0000000000000000 R08: 0000000000000003 R09: 000000000001e000
[  865.139668] R10: 0000000000000002 R11: 0000000000000246 R12: 0000000000000002
[  865.140601] R13: 000000000000a7ef R14: 0000000000000002 R15: 0000000000000003

I haven't looked closely enough to see if it's really possible to deadlock.
Also, after a quick glance, the locking rules comment at the top of
volumes.c, which says:

 * uuid_mutex (global lock)
 * ------------------------
 * protects the fs_uuids list that tracks all per-fs fs_devices, resulting from
 * the SCAN_DEV ioctl registration or from mount either implicitly (the first
 * device) or requested by the device= mount option
 *
 * the mutex can be very coarse and can cover long-running operations
 *
 * protects: updates to fs_devices counters like missing devices, rw devices,
 * seeding, structure cloning, openning/closing devices at mount/umount time

generates some confusion, since after that commit btrfs_open_devices()
no longer takes the uuid_mutex, yet it still updates some fs_devices
counters (opened, open_devices, etc).

Always reproducible by running btrfs/004 from fstests.


>       btrfs: drop uuid_mutex in btrfs_dev_replace_finishing
>       btrfs: drop uuid_mutex in btrfs_destroy_dev_replace_tgtdev
>       btrfs: use common variable for fs_devices in btrfs_destroy_dev_replace_tgtdev
>       btrfs: add prefix "balance:" for log messages
>       btrfs: fix describe_relocation when printing unknown flags
>
> Chengguang Xu (1):
>       btrfs: return original error code when failing from option parsing
>
> Colin Ian King (1):
>       btrfs: send: fix spelling mistake: "send_in_progres" -> "send_in_progress"
>
> David Sterba (38):
>       btrfs: tracepoints, use correct type for inode number
>       btrfs: tracepoints, use %llu instead of %Lu
>       btrfs: tracepoints, drop unnecessary ULL casts
>       btrfs: tracepoints, fix whitespace in strings
>       btrfs: tracepoints, use extended format with UUID where possible
>       btrfs: tests: pass fs_info to extent_map tests
>       btrfs: use fs_info for btrfs_handle_em_exist tracepoint
>       btrfs: squeeze btrfs_dev_replace_continue_on_mount to its caller
>       btrfs: make success path out of btrfs_init_dev_replace_tgtdev more clear
>       btrfs: export and rename free_device
>       btrfs: move btrfs_init_dev_replace_tgtdev to dev-replace.c and make static
>       btrfs: move volume_mutex to callers of btrfs_rm_device
>       btrfs: move clearing of EXCL_OP out of __cancel_balance
>       btrfs: add proper safety check before resuming dev-replace
>       btrfs: add sanity check when resuming balance after mount
>       btrfs: cleanup helpers that reset balance state
>       btrfs: remove wrong use of volume_mutex from btrfs_dev_replace_start
>       btrfs: kill btrfs_fs_info::volume_mutex
>       btrfs: track running balance in a simpler way
>       btrfs: move and comment read-only check in btrfs_cancel_balance
>       btrfs: drop lock parameter from update_ioctl_balance_args and rename
>       btrfs: use mutex in btrfs_resume_balance_async
>       btrfs: open code set_balance_control
>       btrfs: remove redundant btrfs_balance_control::fs_info
>       btrfs: introduce conditional wakeup helpers
>       btrfs: add barriers to btrfs_sync_log before log_commit_wait wakeups
>       btrfs: replace waitqueue_actvie with cond_wake_up
>       btrfs: rename btrfs_update_iflags to reflect which flags it touches
>       btrfs: rename btrfs_mask_flags to reflect which flags it touches
>       btrfs: rename check_flags to reflect which flags it touches
>       btrfs: rename btrfs_flags_to_ioctl to reflect which flags it touches
>       btrfs: add helpers for FS_XFLAG_* conversion
>       btrfs: add FS_IOC_FSGETXATTR ioctl
>       btrfs: add FS_IOC_FSSETXATTR ioctl
>       btrfs: unify naming of flags variables for SETFLAGS and XFLAGS
>       btrfs: use kvzalloc for EXTENT_SAME temporary data
>       btrfs: tests: add helper for error messages and update them
>       btrfs: tests: drop newline from test_msg strings
>
> Ethan Lien (2):
>       btrfs: lift some btrfs_cross_ref_exist checks in nocow path
>       btrfs: balance dirty metadata pages in btrfs_finish_ordered_io
>
> Gu JinXiang (2):
>       btrfs: drop unused parameter qgroup_reserved
>       btrfs: drop useless member qgroup_reserved of btrfs_pending_snapshot
>
> Gu Jinxiang (3):
>       btrfs: remove unused fs_info parameter
>       btrfs: do reverse path readahead in btrfs_shrink_device
>       btrfs: propagate failures of __exclude_logged_extent to upper caller
>
> Howard McLauchlan (3):
>       btrfs: clean up le_bitmap_{set, clear}()
>       btrfs: optimize free space tree bitmap conversion
>       btrfs: remove unused le_test_bit()
>
> Kees Cook (1):
>       btrfs: raid56: Remove VLA usage
>
> Liu Bo (7):
>       Btrfs: add parent_transid parameter to veirfy_level_key
>       Btrfs: remove superfluous free_extent_buffer in read_block_for_search
>       Btrfs: use more straightforward extent_buffer_uptodate check
>       Btrfs: move get root out of btrfs_search_slot to a helper
>       Btrfs: grab write lock directly if write_lock_level is the max level
>       Btrfs: remove always true check in unlock_up
>       Btrfs: remove unused check of skip_locking
>
> Lu Fengqi (3):
>       btrfs: drop unused space_info parameter from create_space_info
>       btrfs: Remove fs_info argument from btrfs_uuid_tree_add
>       btrfs: Remove fs_info argument from btrfs_uuid_tree_rem
>
> Misono Tomohiro (5):
>       btrfs: Move may_destroy_subvol() from ioctl.c to inode.c
>       btrfs: Factor out the main deletion process from btrfs_ioctl_snap_destroy()
>       btrfs: Allow rmdir(2) to delete an empty subvolume
>       btrfs: sysfs: Add entry which shows if rmdir can work on subvolumes
>       btrfs: use error code returned by btrfs_read_fs_root_no_name in search ioctl
>
> Nikolay Borisov (54):
>       btrfs: Replace owner argument in add_pinned_bytes with a boolean
>       btrfs: Drop delayed_refs argument from btrfs_check_delayed_seq
>       btrfs: Use while loop instead of labels in __endio_write_update_ordered
>       btrfs: Fix lock release order
>       btrfs: Consolidate error checking for btrfs_alloc_chunk
>       btrfs: Sink extent_tree arguments in try_release_extent_mapping
>       btrfs: Remove map argument from try_release_extent_state
>       btrfs: Remove redundant tree argument from extent_readpages
>       btrfs: Use list_empty instead of list_empty_careful
>       btrfs: Remove tree argument from extent_writepages
>       btrfs: Remove btrfs_wait_and_free_delalloc_work
>       btrfs: Drop add_delayed_ref_head fs_info parameter
>       btrfs: Drop fs_info parameter from add_delayed_data_ref
>       btrfs: Drop fs_info parameter from btrfs_merge_delayed_refs
>       btrfs: Remove delayed_iput parameter of btrfs_start_delalloc_roots
>       btrfs: Remove delayed_iput parameter from btrfs_start_delalloc_inodes
>       btrfs: Remove delay_iput parameter from __start_delalloc_inodes
>       btrfs: Remove delayed_iput member from btrfs_delalloc_work
>       btrfs: Unexport btrfs_alloc_delalloc_work
>       btrfs: Remove devid parameter from btrfs_rmap_block
>       btrfs: Factor out common delayed refs init code
>       btrfs: Use init_delayed_ref_common in add_delayed_tree_ref
>       btrfs: Use init_delayed_ref_common in add_delayed_data_ref
>       btrfs: Open-code add_delayed_tree_ref
>       btrfs: Open-code add_delayed_data_ref
>       btrfs: Introduce init_delayed_ref_head
>       btrfs: Use init_delayed_ref_head in add_delayed_ref_head
>       btrfs: split delayed ref head initialization and addition
>       btrfs: Add assert in __btrfs_del_delalloc_inode
>       btrfs: Make btrfs_init_dummy_trans initialize trans' fs_info field
>       btrfs: Remove fs_info argument from add_block_group_free_space
>       btrfs: Remove fs_info argument from __add_block_group_free_space
>       btrfs: Remove fs_info argument from __add_to_free_space_tree
>       btrfs: Remove fs_info parameter from add_new_free_space_info
>       btrfs: Remove fs_info argument from add_new_free_space
>       btrfs: Remove fs_info parameter from remove_block_group_free_space
>       btrfs: Remove fs_info argument from convert_free_space_to_bitmaps
>       btrfs: Remove fs_info parameter from convert_free_space_to_extents
>       btrfs: Remove fs_info argument from update_free_space_extent_count
>       btrfs: Remove fs_info argument from modify_free_space_bitmap
>       btrfs: Remove fs_info argument from add_free_space_extent
>       btrfs: Remove fs_info argument from remove_free_space_extent
>       btrfs: Remove fs_info argument from __remove_from_free_space_tree
>       btrfs: Remove fs_info argument from remove_from_free_space_tree
>       btrfs: Remove fs_info argument from add_to_free_space_tree
>       btrfs: Remove fs_info argument from populate_free_space_tree
>       btrfs: Unexport and rename btrfs_invalidate_inodes
>       btrfs: Remove stale comment about select_delayed_ref
>       btrfs: Remove fs_info argument from alloc_reserved_tree_block
>       btrfs: Simplify alloc_reserved_tree_block interface
>       btrfs: Pass btrfs_delayed_extent_op to alloc_reserved_tree_block
>       btrfs: Streamline shared ref check in alloc_reserved_tree_block
>       btrfs: Factor out read portion of btrfs_get_blocks_direct
>       btrfs: Factor out write portion of btrfs_get_blocks_direct
>
> Omar Sandoval (16):
>       Btrfs: update stale comments referencing vmtruncate()
>       Btrfs: fix error handling in btrfs_truncate_inode_items()
>       Btrfs: don't BUG_ON() in btrfs_truncate_inode_items()
>       Btrfs: stop creating orphan items for truncate
>       Btrfs: get rid of BTRFS_INODE_HAS_ORPHAN_ITEM
>       Btrfs: delete dead code in btrfs_orphan_commit_root()
>       Btrfs: don't return ino to ino cache if inode item removal fails
>       Btrfs: refactor btrfs_evict_inode() reserve refill dance
>       Btrfs: fix ENOSPC caused by orphan items reservations
>       Btrfs: get rid of unused orphan infrastructure
>       Btrfs: renumber BTRFS_INODE_ runtime flags and switch to enums
>       Btrfs: reserve space for O_TMPFILE orphan item deletion
>       Btrfs: allow empty subvol= again
>       Btrfs: fix clone vs chattr NODATASUM race
>       Btrfs: fix memory and mount leak in btrfs_ioctl_rm_dev_v2()
>       Btrfs: clean up error handling in btrfs_truncate()
>
> Qu Wenruo (15):
>       btrfs: print-tree: Add eb locking status output for debug build
>       btrfs: trace: Remove unnecessary fs_info parameter for btrfs__reserve_extent event class
>       btrfs: trace: Add trace points for unused block groups
>       btrfs: trace: Allow trace_qgroup_update_counters() to record old rfer/excl value
>       btrfs: qgroup: Allow trace_btrfs_qgroup_account_extent() to record its transid
>       btrfs: Move btrfs_check_super_valid() to avoid forward declaration
>       btrfs: Refactor btrfs_check_super_valid
>       btrfs: Do super block verification before writing it to disk
>       btrfs: qgroup: Search commit root for rescan to avoid missing extent
>       btrfs: qgroup: Finish rescan when hit the last leaf of extent tree
>       btrfs: compression: Add linux/sizes.h for compression.h
>       btrfs: lzo: document the compressed data format
>       btrfs: lzo: Add header length check to avoid potential out-of-bounds access
>       btrfs: lzo: Harden inline lzo compressed extent decompression
>       btrfs: qgroup: show more meaningful qgroup_rescan_init error message
>
> Robbie Ko (2):
>       btrfs: incremental send, move allocation until it's needed in orphan_dir_info
>       btrfs: incremental send, improve rmdir performance for large directory
>
> Su Yue (3):
>       btrfs: rename btrfs_get_block_group_info and make it static
>       btrfs: return error value if create_io_em failed in cow_file_range
>       btrfs: return ENOMEM if path allocation fails in btrfs_cross_ref_exist
>
> Timofey Titovets (3):
>       Btrfs: split btrfs_extent_same
>       Btrfs: dedupe_file_range ioctl: remove 16MiB restriction
>       Btrfs: reuse cmp workspace in EXTENT_SAME ioctl
>
> Tomohiro Misono (4):
>       btrfs: sysfs: Use enum/define value for feature array definitions
>       btrfs: Add unprivileged ioctl which returns subvolume information
>       btrfs: Add unprivileged ioctl which returns subvolume's ROOT_REF
>       btrfs: Add unprivileged version of ino_lookup ioctl
>
>  fs/btrfs/btrfs_inode.h                 |   22 +-
>  fs/btrfs/compression.c                 |    7 +-
>  fs/btrfs/compression.h                 |    2 +
>  fs/btrfs/ctree.c                       |  123 +--
>  fs/btrfs/ctree.h                       |   76 +-
>  fs/btrfs/delayed-inode.c               |    9 +-
>  fs/btrfs/delayed-ref.c                 |  275 +++----
>  fs/btrfs/delayed-ref.h                 |    5 +-
>  fs/btrfs/dev-replace.c                 |  150 +++-
>  fs/btrfs/disk-io.c                     |  391 +++++----
>  fs/btrfs/extent-tree.c                 |  253 +++---
>  fs/btrfs/extent_io.c                   |   62 +-
>  fs/btrfs/extent_io.h                   |   20 +-
>  fs/btrfs/extent_map.c                  |    6 +-
>  fs/btrfs/extent_map.h                  |    3 +-
>  fs/btrfs/free-space-cache.c            |    6 +-
>  fs/btrfs/free-space-tree.c             |  192 +++--
>  fs/btrfs/free-space-tree.h             |    8 -
>  fs/btrfs/inode.c                       | 1371 ++++++++++++++++----------------
>  fs/btrfs/ioctl.c                       | 1210 ++++++++++++++++++----------
>  fs/btrfs/locking.c                     |   34 +-
>  fs/btrfs/lzo.c                         |   76 +-
>  fs/btrfs/ordered-data.c                |   14 +-
>  fs/btrfs/print-tree.c                  |   21 +
>  fs/btrfs/qgroup.c                      |   69 +-
>  fs/btrfs/raid56.c                      |   38 +-
>  fs/btrfs/relocation.c                  |    8 +-
>  fs/btrfs/scrub.c                       |    1 +
>  fs/btrfs/send.c                        |   46 +-
>  fs/btrfs/super.c                       |    7 +-
>  fs/btrfs/sysfs.c                       |   52 +-
>  fs/btrfs/sysfs.h                       |    4 +-
>  fs/btrfs/tests/btrfs-tests.c           |    4 +-
>  fs/btrfs/tests/btrfs-tests.h           |    6 +-
>  fs/btrfs/tests/extent-buffer-tests.c   |   56 +-
>  fs/btrfs/tests/extent-io-tests.c       |   75 +-
>  fs/btrfs/tests/extent-map-tests.c      |   90 ++-
>  fs/btrfs/tests/free-space-tests.c      |  177 +++--
>  fs/btrfs/tests/free-space-tree-tests.c |  129 +--
>  fs/btrfs/tests/inode-tests.c           |  312 ++++----
>  fs/btrfs/tests/qgroup-tests.c          |  100 +--
>  fs/btrfs/transaction.c                 |   15 +-
>  fs/btrfs/transaction.h                 |    1 -
>  fs/btrfs/tree-log.c                    |   28 +-
>  fs/btrfs/uuid-tree.c                   |   10 +-
>  fs/btrfs/volumes.c                     |  506 ++++++------
>  fs/btrfs/volumes.h                     |   24 +-
>  include/trace/events/btrfs.h           |  323 ++++----
>  include/uapi/linux/btrfs.h             |   97 +++
>  49 files changed, 3579 insertions(+), 2935 deletions(-)



-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”


* Re: [GIT PULL] Btrfs updates for 4.18
  2018-06-09 16:21 ` Filipe Manana
@ 2018-06-11  8:14   ` Anand Jain
  2018-06-11  9:50     ` Filipe Manana
  0 siblings, 1 reply; 8+ messages in thread
From: Anand Jain @ 2018-06-11  8:14 UTC (permalink / raw)
  To: fdmanana, David Sterba; +Cc: linux-btrfs



On 06/10/2018 12:21 AM, Filipe Manana wrote:
> On Mon, Jun 4, 2018 at 4:43 PM, David Sterba <dsterba@suse.com> wrote:
>> Hi,
>>
>> there are some new features and a usual load of cleanups, more details below.
>>
>> Specifically, there's a set of new non-privileged ioctls to allow
>> subvolume listing.  It works but still needs a security review as it's a
>> new interface and we might need to do some tweaks to the data
>> structures. The fixes could be considered regressions but may touch the
>> interfaces too.
>>
>> Currently there are no merge conflicts but linux-next has reported a few
>> in the past, originating from other *FS trees.
>>
>> Please pull, thanks.
>>
>> ---
>>
>> User visible features:
>>
>> - added support for the ioctl FS_IOC_FSGETXATTR, per-inode flags, successor
>>    of GET/SETFLAGS; now supports only existing flags: append, immutable,
>>    noatime, nodump, sync
>>
>> - 3 new unprivileged ioctls to allow users to enumerate subvolumes
>>
>> - dedupe syscall implementation does not restrict the range to 16MiB, though it
>>    still splits the whole range to 16MiB chunks
>>
>> - on user demand, rmdir() is able to delete an empty subvolume, export the
>>    capability in sysfs
>>
>> - fix inode number types in tracepoints, other cleanups
>>
>> - send: improved speed when dealing with a large removed directory,
>>    measurements show decrease from 2000 minutes to 2 minutes on a directory with
>>    2 million entries
>>
>> - pre-commit check of superblock to detect a mysterious in-memory corruption
>>
>> - log message updates
>>
>>
>> Other changes:
>>
>> - orphan inode cleanup improved, does not keep long-standing reservations that
>>    could lead up to early ENOSPC in some cases
>>
>> - slight improvement of handling snapshotted NOCOW files by avoiding some
>>    unnecessary tree searches
>>
>> - avoid OOM when dealing with many unmergeable small extents at flush time
>>
>> - speedup conversion of free space tree representations from/to bitmap/tree
>>
>> - code refactoring, deletion, cleanups
>>    - delayed refs
>>    - delayed iput
>>    - redundant argument removals
>>    - memory barrier cleanups
>>    - remove a redundant mutex supposedly excluding several ioctls to run in
>>      parallel
>>
>> - new tracepoints for blockgroup manipulation
>>
>> - more sanity checks of compressed headers
>>
>> ----------------------------------------------------------------
>> The following changes since commit b04e217704b7f879c6b91222b066983a44a7a09f:
>>
>>    Linux 4.17-rc7 (2018-05-27 13:01:47 -0700)
>>
>> are available in the Git repository at:
>>
>>    git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-4.18-tag
>>
>> for you to fetch changes up to 23d0b79dfaed2305b500b0215b0421701ada6b1a:
>>
>>    btrfs: Add unprivileged version of ino_lookup ioctl (2018-05-31 11:35:24 +0200)
>>
>> ----------------------------------------------------------------
>> Al Viro (1):
>>        btrfs: take the last remnants of ->d_fsdata use out
>>
>> Anand Jain (19):
>>        btrfs: add comment about BTRFS_FS_EXCL_OP
>>        btrfs: rename struct btrfs_fs_devices::list
>>        btrfs: cleanup __btrfs_open_devices() drop head pointer
>>        btrfs: rename __btrfs_close_devices to close_fs_devices
>>        btrfs: rename __btrfs_open_devices to open_fs_devices
>>        btrfs: cleanup find_device() drop list_head pointer
>>        btrfs: cleanup btrfs_rm_device() promote fs_devices pointer
>>        btrfs: move btrfs_raid_type_names values to btrfs_raid_attr table
>>        btrfs: move btrfs_raid_group values to btrfs_raid_attr table
>>        btrfs: move btrfs_raid_mindev_errorvalues to btrfs_raid_attr table
>>        btrfs: reduce uuid_mutex critical section while scanning devices
>>        btrfs: use existing cur_devices, cleanup btrfs_rm_device
>>        btrfs: document uuid_mutex uasge in read_chunk_tree
>>        btrfs: replace uuid_mutex by device_list_mutex in btrfs_open_devices
> 
> This change (commit 542c5908abfe84f7b4c1717492ecc92ea0ea328d, "btrfs:
> replace uuid_mutex by device_list_mutex in btrfs_open_devices"), at
> the very least
> introduces a lockdep warning:
> 
> [  865.021049] ======================================================
> [  865.021950] WARNING: possible circular locking dependency detected
> [  865.022828] 4.17.0-rc7-btrfs-next-59+ #1 Not tainted
> [  865.023491] ------------------------------------------------------
> [  865.024342] fsstress/27897 is trying to acquire lock:
> [  865.025070] 0000000099260c12 (&fs_info->reloc_mutex){+.+.}, at: btrfs_record_root_in_trans+0x43/0x62 [btrfs]
> [  865.026369]
> [  865.026369] but task is already holding lock:
> [  865.027206] 000000008dc17c22 (&mm->mmap_sem){++++}, at: vm_mmap_pgoff+0x77/0xe8
> [  865.028251]
> [  865.028251] which lock already depends on the new lock.
> [  865.028251]
> [  865.029482]
> [  865.029482] the existing dependency chain (in reverse order) is:
> [  865.030523]
> [  865.030523] -> #7 (&mm->mmap_sem){++++}:
> [  865.031241]        _copy_to_user+0x1e/0x63
> [  865.031745]        filldir+0x9e/0xef
> [  865.032285]        dir_emit_dots+0x3b/0xbd
> [  865.032881]        dcache_readdir+0x22/0xbb
> [  865.033502]        iterate_dir+0xa3/0x13e
> [  865.034131]        __do_sys_getdents+0xa1/0x106
> [  865.034821]        do_syscall_64+0x51/0x5f
> [  865.035423]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [  865.036212]
> [  865.036212] -> #6 (&sb->s_type->i_mutex_key#4){++++}:
> [  865.037155]        start_creating+0x65/0xd2
> [  865.037752]        debugfs_create_dir+0xc/0x9b
> [  865.038374]        blk_mq_debugfs_register+0x30/0xec
> [  865.039083]        blk_register_queue+0x11e/0x199
> [  865.039753]        __device_add_disk+0x36d/0x44b
> [  865.040434]        sd_probe_async+0xf6/0x19f [sd_mod]
> [  865.041136]        async_run_entry_fn+0x34/0xe0
> [  865.041811]        process_one_work+0x295/0x4b8
> [  865.042446]        worker_thread+0x1ab/0x25e
> [  865.043032]        kthread+0xf5/0xfa
> [  865.043568]        ret_from_fork+0x3a/0x50
> [  865.044163]
> [  865.044163] -> #5 (&q->sysfs_lock){+.+.}:
> [  865.044916]        blk_mq_sysfs_unregister+0x1d/0x53
> [  865.045576]        blk_mq_realloc_hw_ctxs+0x2e/0x410
> [  865.046209]        blk_mq_init_allocated_queue+0xaf/0x40d
> [  865.046853]        blk_mq_init_queue+0x34/0x50
> [  865.047494]        loop_add+0xf9/0x27f [loop]
> [  865.048110]        param_set_lid_init_state+0x8e/0x94 [button]
> [  865.048867]        do_one_initcall+0x11b/0x2de
> [  865.049509]        do_init_module+0x5b/0x1ff
> [  865.050077]        load_module+0x1c78/0x22b5
> [  865.050669]        __do_sys_finit_module+0x7b/0x86
> [  865.051288]        do_syscall_64+0x51/0x5f
> [  865.051886]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [  865.052700]
> [  865.052700] -> #4 (loop_index_mutex){+.+.}:
> [  865.053473]        lo_open+0x17/0x47 [loop]
> [  865.054046]        __blkdev_get+0x145/0x42a
> [  865.054649]        blkdev_get+0x1aa/0x2e9
> [  865.055187]        do_dentry_open+0x17a/0x288
> [  865.055843]        path_openat+0x534/0x699
> [  865.056438]        do_filp_open+0x4d/0xa3
> [  865.057026]        do_sys_open+0x69/0xee
> [  865.057631]        do_syscall_64+0x51/0x5f
> [  865.058227]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [  865.058971]
> [  865.058971] -> #3 (&bdev->bd_mutex){+.+.}:
> [  865.059785]        __blkdev_get+0x409/0x42a
> [  865.060377]        blkdev_get+0x1aa/0x2e9
> [  865.060942]        blkdev_get_by_path+0x2c/0x5f
> [  865.061555]        btrfs_get_bdev_and_sb+0x1b/0x97 [btrfs]
> [  865.062264]        open_fs_devices+0x81/0x1f6 [btrfs]
> [  865.063030]        btrfs_open_devices+0x5c/0x74 [btrfs]
> [  865.063803]        btrfs_mount_root+0x1f7/0x45c [btrfs]
> [  865.064554]        mount_fs+0x64/0x10b
> [  865.065116]        vfs_kern_mount+0x68/0xce
> [  865.069630]        btrfs_mount+0x12e/0x764 [btrfs]
> [  865.070361]        mount_fs+0x64/0x10b
> [  865.070962]        vfs_kern_mount+0x68/0xce
> [  865.071613]        do_mount+0x6e5/0x973
> [  865.072161]        ksys_mount+0x72/0x97
> [  865.072732]        __x64_sys_mount+0x21/0x24
> [  865.073356]        do_syscall_64+0x51/0x5f
> [  865.073928]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [  865.074687]
> [  865.074687] -> #2 (&fs_devs->device_list_mutex){+.+.}:
> [  865.075596]        btrfs_run_dev_stats+0x37/0x2fe [btrfs]
> [  865.076339]        commit_cowonly_roots+0x87/0x261 [btrfs]
> [  865.076921]        btrfs_commit_transaction+0x3b8/0x760 [btrfs]
> [  865.077691]        btrfs_create_uuid_tree+0x9e/0x106 [btrfs]
> [  865.078476]        open_ctree+0x1c1c/0x1ef9 [btrfs]
> [  865.079140]        btrfs_mount_root+0x342/0x45c [btrfs]
> [  865.079796]        mount_fs+0x64/0x10b
> [  865.080297]        vfs_kern_mount+0x68/0xce
> [  865.080902]        btrfs_mount+0x12e/0x764 [btrfs]
> [  865.081566]        mount_fs+0x64/0x10b
> [  865.082165]        vfs_kern_mount+0x68/0xce
> [  865.082778]        do_mount+0x6e5/0x973
> [  865.083308]        ksys_mount+0x72/0x97
> [  865.083869]        __x64_sys_mount+0x21/0x24
> [  865.084453]        do_syscall_64+0x51/0x5f
> [  865.084991]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [  865.085746]
> [  865.085746] -> #1 (&fs_info->tree_log_mutex){+.+.}:
> [  865.086729]        btrfs_commit_transaction+0x366/0x760 [btrfs]
> [  865.087580]        btrfs_create_uuid_tree+0x9e/0x106 [btrfs]
> [  865.088412]        open_ctree+0x1c1c/0x1ef9 [btrfs]
> [  865.089092]        btrfs_mount_root+0x342/0x45c [btrfs]
> [  865.089752]        mount_fs+0x64/0x10b
> [  865.090256]        vfs_kern_mount+0x68/0xce
> [  865.090895]        btrfs_mount+0x12e/0x764 [btrfs]
> [  865.091564]        mount_fs+0x64/0x10b
> [  865.092090]        vfs_kern_mount+0x68/0xce
> [  865.092662]        do_mount+0x6e5/0x973
> [  865.093224]        ksys_mount+0x72/0x97
> [  865.093789]        __x64_sys_mount+0x21/0x24
> [  865.094344]        do_syscall_64+0x51/0x5f
> [  865.094887]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [  865.095579]
> [  865.095579] -> #0 (&fs_info->reloc_mutex){+.+.}:
> [  865.096401]        __mutex_lock+0x81/0x3ee
> [  865.097026]        btrfs_record_root_in_trans+0x43/0x62 [btrfs]
> [  865.097885]        start_transaction+0x29f/0x377 [btrfs]
> [  865.098679]        btrfs_dirty_inode+0x3c/0xbb [btrfs]
> [  865.099349]        touch_atime+0x82/0xa1
> [  865.099899]        btrfs_file_mmap+0x2d/0x44 [btrfs]
> [  865.100590]        mmap_region+0x27b/0x421
> [  865.101153]        do_mmap+0x3f0/0x492
> [  865.101673]        vm_mmap_pgoff+0xa1/0xe8
> [  865.102167]        ksys_mmap_pgoff+0x18d/0x1b1
> [  865.102641]        do_syscall_64+0x51/0x5f
> [  865.103126]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [  865.103914]
> [  865.103914] other info that might help us debug this:
> [  865.103914]
> [  865.105096] Chain exists of:
> [  865.105096]   &fs_info->reloc_mutex --> &sb->s_type->i_mutex_key#4 --> &mm->mmap_sem
> [  865.105096]
> [  865.106636]  Possible unsafe locking scenario:
> [  865.106636]
> [  865.107435]        CPU0                    CPU1
> [  865.108071]        ----                    ----
> [  865.108725]   lock(&mm->mmap_sem);
> [  865.109243]                                lock(&sb->s_type->i_mutex_key#4);
> [  865.110144]                                lock(&mm->mmap_sem);
> [  865.110961]   lock(&fs_info->reloc_mutex);
> [  865.111568]
> [  865.111568]  *** DEADLOCK ***
> [  865.111568]
> [  865.112401] 3 locks held by fsstress/27897:
> [  865.112953]  #0: 000000008dc17c22 (&mm->mmap_sem){++++}, at: vm_mmap_pgoff+0x77/0xe8
> [  865.113955]  #1: 00000000bf2b52fc (sb_writers#11){.+.+}, at: touch_atime+0x3b/0xa1
> [  865.115020]  #2: 00000000a7121e15 (sb_internal#2){.+.+}, at: start_transaction+0x1b6/0x377 [btrfs]
> [  865.116274]
> [  865.116274] stack backtrace:
> [  865.116937] CPU: 3 PID: 27897 Comm: fsstress Not tainted 4.17.0-rc7-btrfs-next-59+ #1
> [  865.118063] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
> [  865.119676] Call Trace:
> [  865.120092]  dump_stack+0x5f/0x86
> [  865.120641]  print_circular_bug.isra.21+0x1c7/0x1d4
> [  865.121367]  __lock_acquire+0xb97/0xf09
> [  865.121929]  ? lock_acquire+0x16a/0x1af
> [  865.122524]  lock_acquire+0x16a/0x1af
> [  865.123101]  ? btrfs_record_root_in_trans+0x43/0x62 [btrfs]
> [  865.123854]  __mutex_lock+0x81/0x3ee
> [  865.124438]  ? btrfs_record_root_in_trans+0x43/0x62 [btrfs]
> [  865.125233]  ? module_assert_mutex_or_preempt+0x13/0x2d
> [  865.126011]  ? btrfs_record_root_in_trans+0x43/0x62 [btrfs]
> [  865.126839]  ? join_transaction+0x376/0x38d [btrfs]
> [  865.127545]  ? btrfs_record_root_in_trans+0x43/0x62 [btrfs]
> [  865.128277]  btrfs_record_root_in_trans+0x43/0x62 [btrfs]
> [  865.129022]  start_transaction+0x29f/0x377 [btrfs]
> [  865.129726]  btrfs_dirty_inode+0x3c/0xbb [btrfs]
> [  865.130326]  touch_atime+0x82/0xa1
> [  865.130863]  btrfs_file_mmap+0x2d/0x44 [btrfs]
> [  865.131533]  mmap_region+0x27b/0x421
> [  865.132081]  do_mmap+0x3f0/0x492
> [  865.132561]  vm_mmap_pgoff+0xa1/0xe8
> [  865.133097]  ksys_mmap_pgoff+0x18d/0x1b1
> [  865.133540]  ? do_syscall_64+0x12/0x5f
> [  865.134059]  do_syscall_64+0x51/0x5f
> [  865.134648]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [  865.135358] RIP: 0033:0x7f88758e2ad3
> [  865.135909] RSP: 002b:00007ffd668823e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
> [  865.136928] RAX: ffffffffffffffda RBX: 000000000001e000 RCX: 00007f88758e2ad3
> [  865.137804] RDX: 0000000000000002 RSI: 000000000000a7ef RDI: 0000000000000000
> [  865.138734] RBP: 0000000000000000 R08: 0000000000000003 R09: 000000000001e000
> [  865.139668] R10: 0000000000000002 R11: 0000000000000246 R12: 0000000000000002
> [  865.140601] R13: 000000000000a7ef R14: 0000000000000002 R15: 0000000000000003
> 
> I haven't looked closely enough to tell whether a deadlock is really possible.
> Also, after a quick glance, the locking rules comment at the top of
> volumes.c, which says:
> 
>   * uuid_mutex (global lock)
>   * ------------------------
>   * protects the fs_uuids list that tracks all per-fs fs_devices, resulting from
>   * the SCAN_DEV ioctl registration or from mount either implicitly (the first
>   * device) or requested by the device= mount option
>   *
>   * the mutex can be very coarse and can cover long-running operations
>   *
>   * protects: updates to fs_devices counters like missing devices, rw devices,
>   * seeding, structure cloning, openning/closing devices at mount/umount time
> 
> generates some confusion, since after that commit btrfs_open_devices()
> no longer takes the uuid_mutex, yet it still updates some fs_devices
> counters (opened, open_devices, etc).

  As uuid_mutex is a global lock for fs_uuids, using it for per-fsid
  operations doesn't make any sense.

  This problem is reproducible only on for-4.18; misc-next is fine.
  I am looking deeper.

  Thanks for the report.

-Anand



> Always reproducible by running btrfs/004 from fstests.
> 
> 
>>        btrfs: drop uuid_mutex in btrfs_dev_replace_finishing
>>        btrfs: drop uuid_mutex in btrfs_destroy_dev_replace_tgtdev
>>        btrfs: use common variable for fs_devices in btrfs_destroy_dev_replace_tgtdev
>>        btrfs: add prefix "balance:" for log messages
>>        btrfs: fix describe_relocation when printing unknown flags
>>
>> Chengguang Xu (1):
>>        btrfs: return original error code when failing from option parsing
>>
>> Colin Ian King (1):
>>        btrfs: send: fix spelling mistake: "send_in_progres" -> "send_in_progress"
>>
>> David Sterba (38):
>>        btrfs: tracepoints, use correct type for inode number
>>        btrfs: tracepoints, use %llu instead of %Lu
>>        btrfs: tracepoints, drop unnecessary ULL casts
>>        btrfs: tracepoints, fix whitespace in strings
>>        btrfs: tracepoints, use extended format with UUID where possible
>>        btrfs: tests: pass fs_info to extent_map tests
>>        btrfs: use fs_info for btrfs_handle_em_exist tracepoint
>>        btrfs: squeeze btrfs_dev_replace_continue_on_mount to its caller
>>        btrfs: make success path out of btrfs_init_dev_replace_tgtdev more clear
>>        btrfs: export and rename free_device
>>        btrfs: move btrfs_init_dev_replace_tgtdev to dev-replace.c and make static
>>        btrfs: move volume_mutex to callers of btrfs_rm_device
>>        btrfs: move clearing of EXCL_OP out of __cancel_balance
>>        btrfs: add proper safety check before resuming dev-replace
>>        btrfs: add sanity check when resuming balance after mount
>>        btrfs: cleanup helpers that reset balance state
>>        btrfs: remove wrong use of volume_mutex from btrfs_dev_replace_start
>>        btrfs: kill btrfs_fs_info::volume_mutex
>>        btrfs: track running balance in a simpler way
>>        btrfs: move and comment read-only check in btrfs_cancel_balance
>>        btrfs: drop lock parameter from update_ioctl_balance_args and rename
>>        btrfs: use mutex in btrfs_resume_balance_async
>>        btrfs: open code set_balance_control
>>        btrfs: remove redundant btrfs_balance_control::fs_info
>>        btrfs: introduce conditional wakeup helpers
>>        btrfs: add barriers to btrfs_sync_log before log_commit_wait wakeups
>>        btrfs: replace waitqueue_actvie with cond_wake_up
>>        btrfs: rename btrfs_update_iflags to reflect which flags it touches
>>        btrfs: rename btrfs_mask_flags to reflect which flags it touches
>>        btrfs: rename check_flags to reflect which flags it touches
>>        btrfs: rename btrfs_flags_to_ioctl to reflect which flags it touches
>>        btrfs: add helpers for FS_XFLAG_* conversion
>>        btrfs: add FS_IOC_FSGETXATTR ioctl
>>        btrfs: add FS_IOC_FSSETXATTR ioctl
>>        btrfs: unify naming of flags variables for SETFLAGS and XFLAGS
>>        btrfs: use kvzalloc for EXTENT_SAME temporary data
>>        btrfs: tests: add helper for error messages and update them
>>        btrfs: tests: drop newline from test_msg strings
>>
>> Ethan Lien (2):
>>        btrfs: lift some btrfs_cross_ref_exist checks in nocow path
>>        btrfs: balance dirty metadata pages in btrfs_finish_ordered_io
>>
>> Gu JinXiang (2):
>>        btrfs: drop unused parameter qgroup_reserved
>>        btrfs: drop useless member qgroup_reserved of btrfs_pending_snapshot
>>
>> Gu Jinxiang (3):
>>        btrfs: remove unused fs_info parameter
>>        btrfs: do reverse path readahead in btrfs_shrink_device
>>        btrfs: propagate failures of __exclude_logged_extent to upper caller
>>
>> Howard McLauchlan (3):
>>        btrfs: clean up le_bitmap_{set, clear}()
>>        btrfs: optimize free space tree bitmap conversion
>>        btrfs: remove unused le_test_bit()
>>
>> Kees Cook (1):
>>        btrfs: raid56: Remove VLA usage
>>
>> Liu Bo (7):
>>        Btrfs: add parent_transid parameter to veirfy_level_key
>>        Btrfs: remove superfluous free_extent_buffer in read_block_for_search
>>        Btrfs: use more straightforward extent_buffer_uptodate check
>>        Btrfs: move get root out of btrfs_search_slot to a helper
>>        Btrfs: grab write lock directly if write_lock_level is the max level
>>        Btrfs: remove always true check in unlock_up
>>        Btrfs: remove unused check of skip_locking
>>
>> Lu Fengqi (3):
>>        btrfs: drop unused space_info parameter from create_space_info
>>        btrfs: Remove fs_info argument from btrfs_uuid_tree_add
>>        btrfs: Remove fs_info argument from btrfs_uuid_tree_rem
>>
>> Misono Tomohiro (5):
>>        btrfs: Move may_destroy_subvol() from ioctl.c to inode.c
>>        btrfs: Factor out the main deletion process from btrfs_ioctl_snap_destroy()
>>        btrfs: Allow rmdir(2) to delete an empty subvolume
>>        btrfs: sysfs: Add entry which shows if rmdir can work on subvolumes
>>        btrfs: use error code returned by btrfs_read_fs_root_no_name in search ioctl
>>
>> Nikolay Borisov (54):
>>        btrfs: Replace owner argument in add_pinned_bytes with a boolean
>>        btrfs: Drop delayed_refs argument from btrfs_check_delayed_seq
>>        btrfs: Use while loop instead of labels in __endio_write_update_ordered
>>        btrfs: Fix lock release order
>>        btrfs: Consolidate error checking for btrfs_alloc_chunk
>>        btrfs: Sink extent_tree arguments in try_release_extent_mapping
>>        btrfs: Remove map argument from try_release_extent_state
>>        btrfs: Remove redundant tree argument from extent_readpages
>>        btrfs: Use list_empty instead of list_empty_careful
>>        btrfs: Remove tree argument from extent_writepages
>>        btrfs: Remove btrfs_wait_and_free_delalloc_work
>>        btrfs: Drop add_delayed_ref_head fs_info parameter
>>        btrfs: Drop fs_info parameter from add_delayed_data_ref
>>        btrfs: Drop fs_info parameter from btrfs_merge_delayed_refs
>>        btrfs: Remove delayed_iput parameter of btrfs_start_delalloc_roots
>>        btrfs: Remove delayed_iput parameter from btrfs_start_delalloc_inodes
>>        btrfs: Remove delay_iput parameter from __start_delalloc_inodes
>>        btrfs: Remove delayed_iput member from btrfs_delalloc_work
>>        btrfs: Unexport btrfs_alloc_delalloc_work
>>        btrfs: Remove devid parameter from btrfs_rmap_block
>>        btrfs: Factor out common delayed refs init code
>>        btrfs: Use init_delayed_ref_common in add_delayed_tree_ref
>>        btrfs: Use init_delayed_ref_common in add_delayed_data_ref
>>        btrfs: Open-code add_delayed_tree_ref
>>        btrfs: Open-code add_delayed_data_ref
>>        btrfs: Introduce init_delayed_ref_head
>>        btrfs: Use init_delayed_ref_head in add_delayed_ref_head
>>        btrfs: split delayed ref head initialization and addition
>>        btrfs: Add assert in __btrfs_del_delalloc_inode
>>        btrfs: Make btrfs_init_dummy_trans initialize trans' fs_info field
>>        btrfs: Remove fs_info argument from add_block_group_free_space
>>        btrfs: Remove fs_info argument from __add_block_group_free_space
>>        btrfs: Remove fs_info argument from __add_to_free_space_tree
>>        btrfs: Remove fs_info parameter from add_new_free_space_info
>>        btrfs: Remove fs_info argument from add_new_free_space
>>        btrfs: Remove fs_info parameter from remove_block_group_free_space
>>        btrfs: Remove fs_info argument from convert_free_space_to_bitmaps
>>        btrfs: Remove fs_info parameter from convert_free_space_to_extents
>>        btrfs: Remove fs_info argument from update_free_space_extent_count
>>        btrfs: Remove fs_info argument from modify_free_space_bitmap
>>        btrfs: Remove fs_info argument from add_free_space_extent
>>        btrfs: Remove fs_info argument from remove_free_space_extent
>>        btrfs: Remove fs_info argument from __remove_from_free_space_tree
>>        btrfs: Remove fs_info argument from remove_from_free_space_tree
>>        btrfs: Remove fs_info argument from add_to_free_space_tree
>>        btrfs: Remove fs_info argument from populate_free_space_tree
>>        btrfs: Unexport and rename btrfs_invalidate_inodes
>>        btrfs: Remove stale comment about select_delayed_ref
>>        btrfs: Remove fs_info argument from alloc_reserved_tree_block
>>        btrfs: Simplify alloc_reserved_tree_block interface
>>        btrfs: Pass btrfs_delayed_extent_op to alloc_reserved_tree_block
>>        btrfs: Streamline shared ref check in alloc_reserved_tree_block
>>        btrfs: Factor out read portion of btrfs_get_blocks_direct
>>        btrfs: Factor out write portion of btrfs_get_blocks_direct
>>
>> Omar Sandoval (16):
>>        Btrfs: update stale comments referencing vmtruncate()
>>        Btrfs: fix error handling in btrfs_truncate_inode_items()
>>        Btrfs: don't BUG_ON() in btrfs_truncate_inode_items()
>>        Btrfs: stop creating orphan items for truncate
>>        Btrfs: get rid of BTRFS_INODE_HAS_ORPHAN_ITEM
>>        Btrfs: delete dead code in btrfs_orphan_commit_root()
>>        Btrfs: don't return ino to ino cache if inode item removal fails
>>        Btrfs: refactor btrfs_evict_inode() reserve refill dance
>>        Btrfs: fix ENOSPC caused by orphan items reservations
>>        Btrfs: get rid of unused orphan infrastructure
>>        Btrfs: renumber BTRFS_INODE_ runtime flags and switch to enums
>>        Btrfs: reserve space for O_TMPFILE orphan item deletion
>>        Btrfs: allow empty subvol= again
>>        Btrfs: fix clone vs chattr NODATASUM race
>>        Btrfs: fix memory and mount leak in btrfs_ioctl_rm_dev_v2()
>>        Btrfs: clean up error handling in btrfs_truncate()
>>
>> Qu Wenruo (15):
>>        btrfs: print-tree: Add eb locking status output for debug build
>>        btrfs: trace: Remove unnecessary fs_info parameter for btrfs__reserve_extent event class
>>        btrfs: trace: Add trace points for unused block groups
>>        btrfs: trace: Allow trace_qgroup_update_counters() to record old rfer/excl value
>>        btrfs: qgroup: Allow trace_btrfs_qgroup_account_extent() to record its transid
>>        btrfs: Move btrfs_check_super_valid() to avoid forward declaration
>>        btrfs: Refactor btrfs_check_super_valid
>>        btrfs: Do super block verification before writing it to disk
>>        btrfs: qgroup: Search commit root for rescan to avoid missing extent
>>        btrfs: qgroup: Finish rescan when hit the last leaf of extent tree
>>        btrfs: compression: Add linux/sizes.h for compression.h
>>        btrfs: lzo: document the compressed data format
>>        btrfs: lzo: Add header length check to avoid potential out-of-bounds access
>>        btrfs: lzo: Harden inline lzo compressed extent decompression
>>        btrfs: qgroup: show more meaningful qgroup_rescan_init error message
>>
>> Robbie Ko (2):
>>        btrfs: incremental send, move allocation until it's needed in orphan_dir_info
>>        btrfs: incremental send, improve rmdir performance for large directory
>>
>> Su Yue (3):
>>        btrfs: rename btrfs_get_block_group_info and make it static
>>        btrfs: return error value if create_io_em failed in cow_file_range
>>        btrfs: return ENOMEM if path allocation fails in btrfs_cross_ref_exist
>>
>> Timofey Titovets (3):
>>        Btrfs: split btrfs_extent_same
>>        Btrfs: dedupe_file_range ioctl: remove 16MiB restriction
>>        Btrfs: reuse cmp workspace in EXTENT_SAME ioctl
>>
>> Tomohiro Misono (4):
>>        btrfs: sysfs: Use enum/define value for feature array definitions
>>        btrfs: Add unprivileged ioctl which returns subvolume information
>>        btrfs: Add unprivileged ioctl which returns subvolume's ROOT_REF
>>        btrfs: Add unprivileged version of ino_lookup ioctl
>>
>>   fs/btrfs/btrfs_inode.h                 |   22 +-
>>   fs/btrfs/compression.c                 |    7 +-
>>   fs/btrfs/compression.h                 |    2 +
>>   fs/btrfs/ctree.c                       |  123 +--
>>   fs/btrfs/ctree.h                       |   76 +-
>>   fs/btrfs/delayed-inode.c               |    9 +-
>>   fs/btrfs/delayed-ref.c                 |  275 +++----
>>   fs/btrfs/delayed-ref.h                 |    5 +-
>>   fs/btrfs/dev-replace.c                 |  150 +++-
>>   fs/btrfs/disk-io.c                     |  391 +++++----
>>   fs/btrfs/extent-tree.c                 |  253 +++---
>>   fs/btrfs/extent_io.c                   |   62 +-
>>   fs/btrfs/extent_io.h                   |   20 +-
>>   fs/btrfs/extent_map.c                  |    6 +-
>>   fs/btrfs/extent_map.h                  |    3 +-
>>   fs/btrfs/free-space-cache.c            |    6 +-
>>   fs/btrfs/free-space-tree.c             |  192 +++--
>>   fs/btrfs/free-space-tree.h             |    8 -
>>   fs/btrfs/inode.c                       | 1371 ++++++++++++++++----------------
>>   fs/btrfs/ioctl.c                       | 1210 ++++++++++++++++++----------
>>   fs/btrfs/locking.c                     |   34 +-
>>   fs/btrfs/lzo.c                         |   76 +-
>>   fs/btrfs/ordered-data.c                |   14 +-
>>   fs/btrfs/print-tree.c                  |   21 +
>>   fs/btrfs/qgroup.c                      |   69 +-
>>   fs/btrfs/raid56.c                      |   38 +-
>>   fs/btrfs/relocation.c                  |    8 +-
>>   fs/btrfs/scrub.c                       |    1 +
>>   fs/btrfs/send.c                        |   46 +-
>>   fs/btrfs/super.c                       |    7 +-
>>   fs/btrfs/sysfs.c                       |   52 +-
>>   fs/btrfs/sysfs.h                       |    4 +-
>>   fs/btrfs/tests/btrfs-tests.c           |    4 +-
>>   fs/btrfs/tests/btrfs-tests.h           |    6 +-
>>   fs/btrfs/tests/extent-buffer-tests.c   |   56 +-
>>   fs/btrfs/tests/extent-io-tests.c       |   75 +-
>>   fs/btrfs/tests/extent-map-tests.c      |   90 ++-
>>   fs/btrfs/tests/free-space-tests.c      |  177 +++--
>>   fs/btrfs/tests/free-space-tree-tests.c |  129 +--
>>   fs/btrfs/tests/inode-tests.c           |  312 ++++----
>>   fs/btrfs/tests/qgroup-tests.c          |  100 +--
>>   fs/btrfs/transaction.c                 |   15 +-
>>   fs/btrfs/transaction.h                 |    1 -
>>   fs/btrfs/tree-log.c                    |   28 +-
>>   fs/btrfs/uuid-tree.c                   |   10 +-
>>   fs/btrfs/volumes.c                     |  506 ++++++------
>>   fs/btrfs/volumes.h                     |   24 +-
>>   include/trace/events/btrfs.h           |  323 ++++----
>>   include/uapi/linux/btrfs.h             |   97 +++
>>   49 files changed, 3579 insertions(+), 2935 deletions(-)

* Re: [GIT PULL] Btrfs updates for 4.18
  2018-06-11  8:14   ` Anand Jain
@ 2018-06-11  9:50     ` Filipe Manana
  2018-06-11 16:16       ` David Sterba
  0 siblings, 1 reply; 8+ messages in thread
From: Filipe Manana @ 2018-06-11  9:50 UTC (permalink / raw)
  To: Anand Jain; +Cc: David Sterba, linux-btrfs

On Mon, Jun 11, 2018 at 9:14 AM, Anand Jain <anand.jain@oracle.com> wrote:
>
>
> On 06/10/2018 12:21 AM, Filipe Manana wrote:
>>
>> On Mon, Jun 4, 2018 at 4:43 PM, David Sterba <dsterba@suse.com> wrote:
>>>
>>> Hi,
>>>
>>> there are some new features and a usual load of cleanups, more details below.
>>>
>>> Specifically, there's a set of new non-privileged ioctls to allow
>>> subvolume listing.  It works but still needs a security review as it's a
>>> new interface and we might need to do some tweaks to the data
>>> structures. The fixes could be considered regressions but may touch the
>>> interfaces too.
>>>
>>> Currently there are no merge conflicts but linux-next has reported a few
>>> in the past, originating from other *FS trees.
>>>
>>> Please pull, thanks.
>>>
>>> ---
>>>
>>> User visible features:
>>>
>>> - added support for the ioctl FS_IOC_FSGETXATTR, per-inode flags, successor
>>>    of GET/SETFLAGS; now supports only existing flags: append, immutable,
>>>    noatime, nodump, sync
>>>
>>> - 3 new unprivileged ioctls to allow users to enumerate subvolumes
>>>
>>> - dedupe syscall implementation does not restrict the range to 16MiB, though it
>>>    still splits the whole range to 16MiB chunks
>>>
>>> - on user demand, rmdir() is able to delete an empty subvolume, export the
>>>    capability in sysfs
>>>
>>> - fix inode number types in tracepoints, other cleanups
>>>
>>> - send: improved speed when dealing with a large removed directory,
>>>    measurements show decrease from 2000 minutes to 2 minutes on a directory with
>>>    2 million entries
>>>
>>> - pre-commit check of superblock to detect a mysterious in-memory corruption
>>>
>>> - log message updates
>>>
>>>
>>> Other changes:
>>>
>>> - orphan inode cleanup improved, does not keep long-standing reservations that
>>>    could lead up to early ENOSPC in some cases
>>>
>>> - slight improvement of handling snapshotted NOCOW files by avoiding some
>>>    unnecessary tree searches
>>>
>>> - avoid OOM when dealing with many unmergeable small extents at flush time
>>>
>>> - speedup conversion of free space tree representations from/to bitmap/tree
>>>
>>> - code refactoring, deletion, cleanups
>>>    - delayed refs
>>>    - delayed iput
>>>    - redundant argument removals
>>>    - memory barrier cleanups
>>>    - remove a redundant mutex supposedly excluding several ioctls to run in
>>>      parallel
>>>
>>> - new tracepoints for blockgroup manipulation
>>>
>>> - more sanity checks of compressed headers
>>>
>>> ----------------------------------------------------------------
>>> The following changes since commit b04e217704b7f879c6b91222b066983a44a7a09f:
>>>
>>>    Linux 4.17-rc7 (2018-05-27 13:01:47 -0700)
>>>
>>> are available in the Git repository at:
>>>
>>>    git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-4.18-tag
>>>
>>> for you to fetch changes up to 23d0b79dfaed2305b500b0215b0421701ada6b1a:
>>>
>>>    btrfs: Add unprivileged version of ino_lookup ioctl (2018-05-31 11:35:24 +0200)
>>>
>>> ----------------------------------------------------------------
>>> Al Viro (1):
>>>        btrfs: take the last remnants of ->d_fsdata use out
>>>
>>> Anand Jain (19):
>>>        btrfs: add comment about BTRFS_FS_EXCL_OP
>>>        btrfs: rename struct btrfs_fs_devices::list
>>>        btrfs: cleanup __btrfs_open_devices() drop head pointer
>>>        btrfs: rename __btrfs_close_devices to close_fs_devices
>>>        btrfs: rename __btrfs_open_devices to open_fs_devices
>>>        btrfs: cleanup find_device() drop list_head pointer
>>>        btrfs: cleanup btrfs_rm_device() promote fs_devices pointer
>>>        btrfs: move btrfs_raid_type_names values to btrfs_raid_attr table
>>>        btrfs: move btrfs_raid_group values to btrfs_raid_attr table
>>>        btrfs: move btrfs_raid_mindev_errorvalues to btrfs_raid_attr table
>>>        btrfs: reduce uuid_mutex critical section while scanning devices
>>>        btrfs: use existing cur_devices, cleanup btrfs_rm_device
>>>        btrfs: document uuid_mutex uasge in read_chunk_tree
>>>        btrfs: replace uuid_mutex by device_list_mutex in btrfs_open_devices
>>
>>
>> This change (commit 542c5908abfe84f7b4c1717492ecc92ea0ea328d, "btrfs:
>> replace uuid_mutex by device_list_mutex in btrfs_open_devices"), at
>> the very least introduces a lockdep warning:
>>
>> [  865.021049] ======================================================
>> [  865.021950] WARNING: possible circular locking dependency detected
>> [  865.022828] 4.17.0-rc7-btrfs-next-59+ #1 Not tainted
>> [  865.023491] ------------------------------------------------------
>> [  865.024342] fsstress/27897 is trying to acquire lock:
>> [  865.025070] 0000000099260c12 (&fs_info->reloc_mutex){+.+.}, at: btrfs_record_root_in_trans+0x43/0x62 [btrfs]
>> [  865.026369]
>> [  865.026369] but task is already holding lock:
>> [  865.027206] 000000008dc17c22 (&mm->mmap_sem){++++}, at: vm_mmap_pgoff+0x77/0xe8
>> [  865.028251]
>> [  865.028251] which lock already depends on the new lock.
>> [  865.028251]
>> [  865.029482]
>> [  865.029482] the existing dependency chain (in reverse order) is:
>> [  865.030523]
>> [  865.030523] -> #7 (&mm->mmap_sem){++++}:
>> [  865.031241]        _copy_to_user+0x1e/0x63
>> [  865.031745]        filldir+0x9e/0xef
>> [  865.032285]        dir_emit_dots+0x3b/0xbd
>> [  865.032881]        dcache_readdir+0x22/0xbb
>> [  865.033502]        iterate_dir+0xa3/0x13e
>> [  865.034131]        __do_sys_getdents+0xa1/0x106
>> [  865.034821]        do_syscall_64+0x51/0x5f
>> [  865.035423]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
>> [  865.036212]
>> [  865.036212] -> #6 (&sb->s_type->i_mutex_key#4){++++}:
>> [  865.037155]        start_creating+0x65/0xd2
>> [  865.037752]        debugfs_create_dir+0xc/0x9b
>> [  865.038374]        blk_mq_debugfs_register+0x30/0xec
>> [  865.039083]        blk_register_queue+0x11e/0x199
>> [  865.039753]        __device_add_disk+0x36d/0x44b
>> [  865.040434]        sd_probe_async+0xf6/0x19f [sd_mod]
>> [  865.041136]        async_run_entry_fn+0x34/0xe0
>> [  865.041811]        process_one_work+0x295/0x4b8
>> [  865.042446]        worker_thread+0x1ab/0x25e
>> [  865.043032]        kthread+0xf5/0xfa
>> [  865.043568]        ret_from_fork+0x3a/0x50
>> [  865.044163]
>> [  865.044163] -> #5 (&q->sysfs_lock){+.+.}:
>> [  865.044916]        blk_mq_sysfs_unregister+0x1d/0x53
>> [  865.045576]        blk_mq_realloc_hw_ctxs+0x2e/0x410
>> [  865.046209]        blk_mq_init_allocated_queue+0xaf/0x40d
>> [  865.046853]        blk_mq_init_queue+0x34/0x50
>> [  865.047494]        loop_add+0xf9/0x27f [loop]
>> [  865.048110]        param_set_lid_init_state+0x8e/0x94 [button]
>> [  865.048867]        do_one_initcall+0x11b/0x2de
>> [  865.049509]        do_init_module+0x5b/0x1ff
>> [  865.050077]        load_module+0x1c78/0x22b5
>> [  865.050669]        __do_sys_finit_module+0x7b/0x86
>> [  865.051288]        do_syscall_64+0x51/0x5f
>> [  865.051886]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
>> [  865.052700]
>> [  865.052700] -> #4 (loop_index_mutex){+.+.}:
>> [  865.053473]        lo_open+0x17/0x47 [loop]
>> [  865.054046]        __blkdev_get+0x145/0x42a
>> [  865.054649]        blkdev_get+0x1aa/0x2e9
>> [  865.055187]        do_dentry_open+0x17a/0x288
>> [  865.055843]        path_openat+0x534/0x699
>> [  865.056438]        do_filp_open+0x4d/0xa3
>> [  865.057026]        do_sys_open+0x69/0xee
>> [  865.057631]        do_syscall_64+0x51/0x5f
>> [  865.058227]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
>> [  865.058971]
>> [  865.058971] -> #3 (&bdev->bd_mutex){+.+.}:
>> [  865.059785]        __blkdev_get+0x409/0x42a
>> [  865.060377]        blkdev_get+0x1aa/0x2e9
>> [  865.060942]        blkdev_get_by_path+0x2c/0x5f
>> [  865.061555]        btrfs_get_bdev_and_sb+0x1b/0x97 [btrfs]
>> [  865.062264]        open_fs_devices+0x81/0x1f6 [btrfs]
>> [  865.063030]        btrfs_open_devices+0x5c/0x74 [btrfs]
>> [  865.063803]        btrfs_mount_root+0x1f7/0x45c [btrfs]
>> [  865.064554]        mount_fs+0x64/0x10b
>> [  865.065116]        vfs_kern_mount+0x68/0xce
>> [  865.069630]        btrfs_mount+0x12e/0x764 [btrfs]
>> [  865.070361]        mount_fs+0x64/0x10b
>> [  865.070962]        vfs_kern_mount+0x68/0xce
>> [  865.071613]        do_mount+0x6e5/0x973
>> [  865.072161]        ksys_mount+0x72/0x97
>> [  865.072732]        __x64_sys_mount+0x21/0x24
>> [  865.073356]        do_syscall_64+0x51/0x5f
>> [  865.073928]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
>> [  865.074687]
>> [  865.074687] -> #2 (&fs_devs->device_list_mutex){+.+.}:
>> [  865.075596]        btrfs_run_dev_stats+0x37/0x2fe [btrfs]
>> [  865.076339]        commit_cowonly_roots+0x87/0x261 [btrfs]
>> [  865.076921]        btrfs_commit_transaction+0x3b8/0x760 [btrfs]
>> [  865.077691]        btrfs_create_uuid_tree+0x9e/0x106 [btrfs]
>> [  865.078476]        open_ctree+0x1c1c/0x1ef9 [btrfs]
>> [  865.079140]        btrfs_mount_root+0x342/0x45c [btrfs]
>> [  865.079796]        mount_fs+0x64/0x10b
>> [  865.080297]        vfs_kern_mount+0x68/0xce
>> [  865.080902]        btrfs_mount+0x12e/0x764 [btrfs]
>> [  865.081566]        mount_fs+0x64/0x10b
>> [  865.082165]        vfs_kern_mount+0x68/0xce
>> [  865.082778]        do_mount+0x6e5/0x973
>> [  865.083308]        ksys_mount+0x72/0x97
>> [  865.083869]        __x64_sys_mount+0x21/0x24
>> [  865.084453]        do_syscall_64+0x51/0x5f
>> [  865.084991]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
>> [  865.085746]
>> [  865.085746] -> #1 (&fs_info->tree_log_mutex){+.+.}:
>> [  865.086729]        btrfs_commit_transaction+0x366/0x760 [btrfs]
>> [  865.087580]        btrfs_create_uuid_tree+0x9e/0x106 [btrfs]
>> [  865.088412]        open_ctree+0x1c1c/0x1ef9 [btrfs]
>> [  865.089092]        btrfs_mount_root+0x342/0x45c [btrfs]
>> [  865.089752]        mount_fs+0x64/0x10b
>> [  865.090256]        vfs_kern_mount+0x68/0xce
>> [  865.090895]        btrfs_mount+0x12e/0x764 [btrfs]
>> [  865.091564]        mount_fs+0x64/0x10b
>> [  865.092090]        vfs_kern_mount+0x68/0xce
>> [  865.092662]        do_mount+0x6e5/0x973
>> [  865.093224]        ksys_mount+0x72/0x97
>> [  865.093789]        __x64_sys_mount+0x21/0x24
>> [  865.094344]        do_syscall_64+0x51/0x5f
>> [  865.094887]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
>> [  865.095579]
>> [  865.095579] -> #0 (&fs_info->reloc_mutex){+.+.}:
>> [  865.096401]        __mutex_lock+0x81/0x3ee
>> [  865.097026]        btrfs_record_root_in_trans+0x43/0x62 [btrfs]
>> [  865.097885]        start_transaction+0x29f/0x377 [btrfs]
>> [  865.098679]        btrfs_dirty_inode+0x3c/0xbb [btrfs]
>> [  865.099349]        touch_atime+0x82/0xa1
>> [  865.099899]        btrfs_file_mmap+0x2d/0x44 [btrfs]
>> [  865.100590]        mmap_region+0x27b/0x421
>> [  865.101153]        do_mmap+0x3f0/0x492
>> [  865.101673]        vm_mmap_pgoff+0xa1/0xe8
>> [  865.102167]        ksys_mmap_pgoff+0x18d/0x1b1
>> [  865.102641]        do_syscall_64+0x51/0x5f
>> [  865.103126]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
>> [  865.103914]
>> [  865.103914] other info that might help us debug this:
>> [  865.103914]
>> [  865.105096] Chain exists of:
>> [  865.105096]   &fs_info->reloc_mutex --> &sb->s_type->i_mutex_key#4 --> &mm->mmap_sem
>> [  865.105096]
>> [  865.106636]  Possible unsafe locking scenario:
>> [  865.106636]
>> [  865.107435]        CPU0                    CPU1
>> [  865.108071]        ----                    ----
>> [  865.108725]   lock(&mm->mmap_sem);
>> [  865.109243]                                lock(&sb->s_type->i_mutex_key#4);
>> [  865.110144]                                lock(&mm->mmap_sem);
>> [  865.110961]   lock(&fs_info->reloc_mutex);
>> [  865.111568]
>> [  865.111568]  *** DEADLOCK ***
>> [  865.111568]
>> [  865.112401] 3 locks held by fsstress/27897:
>> [  865.112953]  #0: 000000008dc17c22 (&mm->mmap_sem){++++}, at: vm_mmap_pgoff+0x77/0xe8
>> [  865.113955]  #1: 00000000bf2b52fc (sb_writers#11){.+.+}, at: touch_atime+0x3b/0xa1
>> [  865.115020]  #2: 00000000a7121e15 (sb_internal#2){.+.+}, at: start_transaction+0x1b6/0x377 [btrfs]
>> [  865.116274]
>> [  865.116274] stack backtrace:
>> [  865.116937] CPU: 3 PID: 27897 Comm: fsstress Not tainted 4.17.0-rc7-btrfs-next-59+ #1
>> [  865.118063] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
>> [  865.119676] Call Trace:
>> [  865.120092]  dump_stack+0x5f/0x86
>> [  865.120641]  print_circular_bug.isra.21+0x1c7/0x1d4
>> [  865.121367]  __lock_acquire+0xb97/0xf09
>> [  865.121929]  ? lock_acquire+0x16a/0x1af
>> [  865.122524]  lock_acquire+0x16a/0x1af
>> [  865.123101]  ? btrfs_record_root_in_trans+0x43/0x62 [btrfs]
>> [  865.123854]  __mutex_lock+0x81/0x3ee
>> [  865.124438]  ? btrfs_record_root_in_trans+0x43/0x62 [btrfs]
>> [  865.125233]  ? module_assert_mutex_or_preempt+0x13/0x2d
>> [  865.126011]  ? btrfs_record_root_in_trans+0x43/0x62 [btrfs]
>> [  865.126839]  ? join_transaction+0x376/0x38d [btrfs]
>> [  865.127545]  ? btrfs_record_root_in_trans+0x43/0x62 [btrfs]
>> [  865.128277]  btrfs_record_root_in_trans+0x43/0x62 [btrfs]
>> [  865.129022]  start_transaction+0x29f/0x377 [btrfs]
>> [  865.129726]  btrfs_dirty_inode+0x3c/0xbb [btrfs]
>> [  865.130326]  touch_atime+0x82/0xa1
>> [  865.130863]  btrfs_file_mmap+0x2d/0x44 [btrfs]
>> [  865.131533]  mmap_region+0x27b/0x421
>> [  865.132081]  do_mmap+0x3f0/0x492
>> [  865.132561]  vm_mmap_pgoff+0xa1/0xe8
>> [  865.133097]  ksys_mmap_pgoff+0x18d/0x1b1
>> [  865.133540]  ? do_syscall_64+0x12/0x5f
>> [  865.134059]  do_syscall_64+0x51/0x5f
>> [  865.134648]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>> [  865.135358] RIP: 0033:0x7f88758e2ad3
>> [  865.135909] RSP: 002b:00007ffd668823e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
>> [  865.136928] RAX: ffffffffffffffda RBX: 000000000001e000 RCX: 00007f88758e2ad3
>> [  865.137804] RDX: 0000000000000002 RSI: 000000000000a7ef RDI: 0000000000000000
>> [  865.138734] RBP: 0000000000000000 R08: 0000000000000003 R09: 000000000001e000
>> [  865.139668] R10: 0000000000000002 R11: 0000000000000246 R12: 0000000000000002
>> [  865.140601] R13: 000000000000a7ef R14: 0000000000000002 R15: 0000000000000003
>>
>> I haven't looked closely enough to see if a deadlock is really possible.
>> Also, after a quick glance, and especially after reading the locking
>> rules comment at the top of volumes.c, which says:
>>
>>   * uuid_mutex (global lock)
>>   * ------------------------
>>   * protects the fs_uuids list that tracks all per-fs fs_devices, resulting from
>>   * the SCAN_DEV ioctl registration or from mount either implicitly (the first
>>   * device) or requested by the device= mount option
>>   *
>>   * the mutex can be very coarse and can cover long-running operations
>>   *
>>   * protects: updates to fs_devices counters like missing devices, rw devices,
>>   * seeding, structure cloning, openning/closing devices at mount/umount time
>>
>> there is some confusion, since btrfs_open_devices() no longer takes the
>> uuid_mutex after that commit, yet it still updates some fs_devices
>> counters (opened, open_devices, etc).
>
>
>  As uuid_mutex is a global lock for fs_uuids, using it for per-fsid
>  operations doesn't make any sense.
>
>  This problem is reproducible only on for-4.18; misc-next is fine.
>  I am looking deeper.

What about the unprotected updates (increments) to fs_devices->opened
and fs_devices->open_devices?
Other functions access and update them while holding the uuid_mutex.

>
>  Thanks for the report.
>
> -Anand
>
>
>
>
>> Always reproducible by running btrfs/004 from fstests.
>>
>>
>>>        btrfs: drop uuid_mutex in btrfs_dev_replace_finishing
>>>        btrfs: drop uuid_mutex in btrfs_destroy_dev_replace_tgtdev
>>>        btrfs: use common variable for fs_devices in btrfs_destroy_dev_replace_tgtdev
>>>        btrfs: add prefix "balance:" for log messages
>>>        btrfs: fix describe_relocation when printing unknown flags
>>>
>>> Chengguang Xu (1):
>>>        btrfs: return original error code when failing from option parsing
>>>
>>> Colin Ian King (1):
>>>        btrfs: send: fix spelling mistake: "send_in_progres" -> "send_in_progress"
>>>
>>> David Sterba (38):
>>>        btrfs: tracepoints, use correct type for inode number
>>>        btrfs: tracepoints, use %llu instead of %Lu
>>>        btrfs: tracepoints, drop unnecessary ULL casts
>>>        btrfs: tracepoints, fix whitespace in strings
>>>        btrfs: tracepoints, use extended format with UUID where possible
>>>        btrfs: tests: pass fs_info to extent_map tests
>>>        btrfs: use fs_info for btrfs_handle_em_exist tracepoint
>>>        btrfs: squeeze btrfs_dev_replace_continue_on_mount to its caller
>>>        btrfs: make success path out of btrfs_init_dev_replace_tgtdev more clear
>>>        btrfs: export and rename free_device
>>>        btrfs: move btrfs_init_dev_replace_tgtdev to dev-replace.c and make static
>>>        btrfs: move volume_mutex to callers of btrfs_rm_device
>>>        btrfs: move clearing of EXCL_OP out of __cancel_balance
>>>        btrfs: add proper safety check before resuming dev-replace
>>>        btrfs: add sanity check when resuming balance after mount
>>>        btrfs: cleanup helpers that reset balance state
>>>        btrfs: remove wrong use of volume_mutex from btrfs_dev_replace_start
>>>        btrfs: kill btrfs_fs_info::volume_mutex
>>>        btrfs: track running balance in a simpler way
>>>        btrfs: move and comment read-only check in btrfs_cancel_balance
>>>        btrfs: drop lock parameter from update_ioctl_balance_args and rename
>>>        btrfs: use mutex in btrfs_resume_balance_async
>>>        btrfs: open code set_balance_control
>>>        btrfs: remove redundant btrfs_balance_control::fs_info
>>>        btrfs: introduce conditional wakeup helpers
>>>        btrfs: add barriers to btrfs_sync_log before log_commit_wait wakeups
>>>        btrfs: replace waitqueue_actvie with cond_wake_up
>>>        btrfs: rename btrfs_update_iflags to reflect which flags it touches
>>>        btrfs: rename btrfs_mask_flags to reflect which flags it touches
>>>        btrfs: rename check_flags to reflect which flags it touches
>>>        btrfs: rename btrfs_flags_to_ioctl to reflect which flags it touches
>>>        btrfs: add helpers for FS_XFLAG_* conversion
>>>        btrfs: add FS_IOC_FSGETXATTR ioctl
>>>        btrfs: add FS_IOC_FSSETXATTR ioctl
>>>        btrfs: unify naming of flags variables for SETFLAGS and XFLAGS
>>>        btrfs: use kvzalloc for EXTENT_SAME temporary data
>>>        btrfs: tests: add helper for error messages and update them
>>>        btrfs: tests: drop newline from test_msg strings
>>>
>>> Ethan Lien (2):
>>>        btrfs: lift some btrfs_cross_ref_exist checks in nocow path
>>>        btrfs: balance dirty metadata pages in btrfs_finish_ordered_io
>>>
>>> Gu JinXiang (2):
>>>        btrfs: drop unused parameter qgroup_reserved
>>>        btrfs: drop useless member qgroup_reserved of
>>> btrfs_pending_snapshot
>>>
>>> Gu Jinxiang (3):
>>>        btrfs: remove unused fs_info parameter
>>>        btrfs: do reverse path readahead in btrfs_shrink_device
>>>        btrfs: propagate failures of __exclude_logged_extent to upper
>>> caller
>>>
>>> Howard McLauchlan (3):
>>>        btrfs: clean up le_bitmap_{set, clear}()
>>>        btrfs: optimize free space tree bitmap conversion
>>>        btrfs: remove unused le_test_bit()
>>>
>>> Kees Cook (1):
>>>        btrfs: raid56: Remove VLA usage
>>>
>>> Liu Bo (7):
>>>        Btrfs: add parent_transid parameter to veirfy_level_key
>>>        Btrfs: remove superfluous free_extent_buffer in
>>> read_block_for_search
>>>        Btrfs: use more straightforward extent_buffer_uptodate check
>>>        Btrfs: move get root out of btrfs_search_slot to a helper
>>>        Btrfs: grab write lock directly if write_lock_level is the max
>>> level
>>>        Btrfs: remove always true check in unlock_up
>>>        Btrfs: remove unused check of skip_locking
>>>
>>> Lu Fengqi (3):
>>>        btrfs: drop unused space_info parameter from create_space_info
>>>        btrfs: Remove fs_info argument from btrfs_uuid_tree_add
>>>        btrfs: Remove fs_info argument from btrfs_uuid_tree_rem
>>>
>>> Misono Tomohiro (5):
>>>        btrfs: Move may_destroy_subvol() from ioctl.c to inode.c
>>>        btrfs: Factor out the main deletion process from
>>> btrfs_ioctl_snap_destroy()
>>>        btrfs: Allow rmdir(2) to delete an empty subvolume
>>>        btrfs: sysfs: Add entry which shows if rmdir can work on
>>> subvolumes
>>>        btrfs: use error code returned by btrfs_read_fs_root_no_name in
>>> search ioctl
>>>
>>> Nikolay Borisov (54):
>>>        btrfs: Replace owner argument in add_pinned_bytes with a boolean
>>>        btrfs: Drop delayed_refs argument from btrfs_check_delayed_seq
>>>        btrfs: Use while loop instead of labels in
>>> __endio_write_update_ordered
>>>        btrfs: Fix lock release order
>>>        btrfs: Consolidate error checking for btrfs_alloc_chunk
>>>        btrfs: Sink extent_tree arguments in try_release_extent_mapping
>>>        btrfs: Remove map argument from try_release_extent_state
>>>        btrfs: Remove redundant tree argument from extent_readpages
>>>        btrfs: Use list_empty instead of list_empty_careful
>>>        btrfs: Remove tree argument from extent_writepages
>>>        btrfs: Remove btrfs_wait_and_free_delalloc_work
>>>        btrfs: Drop add_delayed_ref_head fs_info parameter
>>>        btrfs: Drop fs_info parameter from add_delayed_data_ref
>>>        btrfs: Drop fs_info parameter from btrfs_merge_delayed_refs
>>>        btrfs: Remove delayed_iput parameter of btrfs_start_delalloc_roots
>>>        btrfs: Remove delayed_iput parameter from
>>> btrfs_start_delalloc_inodes
>>>        btrfs: Remove delay_iput parameter from __start_delalloc_inodes
>>>        btrfs: Remove delayed_iput member from btrfs_delalloc_work
>>>        btrfs: Unexport btrfs_alloc_delalloc_work
>>>        btrfs: Remove devid parameter from btrfs_rmap_block
>>>        btrfs: Factor out common delayed refs init code
>>>        btrfs: Use init_delayed_ref_common in add_delayed_tree_ref
>>>        btrfs: Use init_delayed_ref_common in add_delayed_data_ref
>>>        btrfs: Open-code add_delayed_tree_ref
>>>        btrfs: Open-code add_delayed_data_ref
>>>        btrfs: Introduce init_delayed_ref_head
>>>        btrfs: Use init_delayed_ref_head in add_delayed_ref_head
>>>        btrfs: split delayed ref head initialization and addition
>>>        btrfs: Add assert in __btrfs_del_delalloc_inode
>>>        btrfs: Make btrfs_init_dummy_trans initialize trans' fs_info field
>>>        btrfs: Remove fs_info argument from add_block_group_free_space
>>>        btrfs: Remove fs_info argument from __add_block_group_free_space
>>>        btrfs: Remove fs_info argument from __add_to_free_space_tree
>>>        btrfs: Remove fs_info parameter from add_new_free_space_info
>>>        btrfs: Remove fs_info argument from add_new_free_space
>>>        btrfs: Remove fs_info parameter from remove_block_group_free_space
>>>        btrfs: Remove fs_info argument from convert_free_space_to_bitmaps
>>>        btrfs: Remove fs_info parameter from convert_free_space_to_extents
>>>        btrfs: Remove fs_info argument from update_free_space_extent_count
>>>        btrfs: Remove fs_info argument from modify_free_space_bitmap
>>>        btrfs: Remove fs_info argument from add_free_space_extent
>>>        btrfs: Remove fs_info argument from remove_free_space_extent
>>>        btrfs: Remove fs_info argument from __remove_from_free_space_tree
>>>        btrfs: Remove fs_info argument from remove_from_free_space_tree
>>>        btrfs: Remove fs_info argument from add_to_free_space_tree
>>>        btrfs: Remove fs_info argument from populate_free_space_tree
>>>        btrfs: Unexport and rename btrfs_invalidate_inodes
>>>        btrfs: Remove stale comment about select_delayed_ref
>>>        btrfs: Remove fs_info argument from alloc_reserved_tree_block
>>>        btrfs: Simplify alloc_reserved_tree_block interface
>>>        btrfs: Pass btrfs_delayed_extent_op to alloc_reserved_tree_block
>>>        btrfs: Streamline shared ref check in alloc_reserved_tree_block
>>>        btrfs: Factor out read portion of btrfs_get_blocks_direct
>>>        btrfs: Factor out write portion of btrfs_get_blocks_direct
>>>
>>> Omar Sandoval (16):
>>>        Btrfs: update stale comments referencing vmtruncate()
>>>        Btrfs: fix error handling in btrfs_truncate_inode_items()
>>>        Btrfs: don't BUG_ON() in btrfs_truncate_inode_items()
>>>        Btrfs: stop creating orphan items for truncate
>>>        Btrfs: get rid of BTRFS_INODE_HAS_ORPHAN_ITEM
>>>        Btrfs: delete dead code in btrfs_orphan_commit_root()
>>>        Btrfs: don't return ino to ino cache if inode item removal fails
>>>        Btrfs: refactor btrfs_evict_inode() reserve refill dance
>>>        Btrfs: fix ENOSPC caused by orphan items reservations
>>>        Btrfs: get rid of unused orphan infrastructure
>>>        Btrfs: renumber BTRFS_INODE_ runtime flags and switch to enums
>>>        Btrfs: reserve space for O_TMPFILE orphan item deletion
>>>        Btrfs: allow empty subvol= again
>>>        Btrfs: fix clone vs chattr NODATASUM race
>>>        Btrfs: fix memory and mount leak in btrfs_ioctl_rm_dev_v2()
>>>        Btrfs: clean up error handling in btrfs_truncate()
>>>
>>> Qu Wenruo (15):
>>>        btrfs: print-tree: Add eb locking status output for debug build
>>>        btrfs: trace: Remove unnecessary fs_info parameter for
>>> btrfs__reserve_extent event class
>>>        btrfs: trace: Add trace points for unused block groups
>>>        btrfs: trace: Allow trace_qgroup_update_counters() to record old
>>> rfer/excl value
>>>        btrfs: qgroup: Allow trace_btrfs_qgroup_account_extent() to record
>>> its transid
>>>        btrfs: Move btrfs_check_super_valid() to avoid forward declaration
>>>        btrfs: Refactor btrfs_check_super_valid
>>>        btrfs: Do super block verification before writing it to disk
>>>        btrfs: qgroup: Search commit root for rescan to avoid missing
>>> extent
>>>        btrfs: qgroup: Finish rescan when hit the last leaf of extent tree
>>>        btrfs: compression: Add linux/sizes.h for compression.h
>>>        btrfs: lzo: document the compressed data format
>>>        btrfs: lzo: Add header length check to avoid potential
>>> out-of-bounds access
>>>        btrfs: lzo: Harden inline lzo compressed extent decompression
>>>        btrfs: qgroup: show more meaningful qgroup_rescan_init error
>>> message
>>>
>>> Robbie Ko (2):
>>>        btrfs: incremental send, move allocation until it's needed in
>>> orphan_dir_info
>>>        btrfs: incremental send, improve rmdir performance for large
>>> directory
>>>
>>> Su Yue (3):
>>>        btrfs: rename btrfs_get_block_group_info and make it static
>>>        btrfs: return error value if create_io_em failed in cow_file_range
>>>        btrfs: return ENOMEM if path allocation fails in
>>> btrfs_cross_ref_exist
>>>
>>> Timofey Titovets (3):
>>>        Btrfs: split btrfs_extent_same
>>>        Btrfs: dedupe_file_range ioctl: remove 16MiB restriction
>>>        Btrfs: reuse cmp workspace in EXTENT_SAME ioctl
>>>
>>> Tomohiro Misono (4):
>>>        btrfs: sysfs: Use enum/define value for feature array definitions
>>>        btrfs: Add unprivileged ioctl which returns subvolume information
>>>        btrfs: Add unprivileged ioctl which returns subvolume's ROOT_REF
>>>        btrfs: Add unprivileged version of ino_lookup ioctl
>>>
>>>   fs/btrfs/btrfs_inode.h                 |   22 +-
>>>   fs/btrfs/compression.c                 |    7 +-
>>>   fs/btrfs/compression.h                 |    2 +
>>>   fs/btrfs/ctree.c                       |  123 +--
>>>   fs/btrfs/ctree.h                       |   76 +-
>>>   fs/btrfs/delayed-inode.c               |    9 +-
>>>   fs/btrfs/delayed-ref.c                 |  275 +++----
>>>   fs/btrfs/delayed-ref.h                 |    5 +-
>>>   fs/btrfs/dev-replace.c                 |  150 +++-
>>>   fs/btrfs/disk-io.c                     |  391 +++++----
>>>   fs/btrfs/extent-tree.c                 |  253 +++---
>>>   fs/btrfs/extent_io.c                   |   62 +-
>>>   fs/btrfs/extent_io.h                   |   20 +-
>>>   fs/btrfs/extent_map.c                  |    6 +-
>>>   fs/btrfs/extent_map.h                  |    3 +-
>>>   fs/btrfs/free-space-cache.c            |    6 +-
>>>   fs/btrfs/free-space-tree.c             |  192 +++--
>>>   fs/btrfs/free-space-tree.h             |    8 -
>>>   fs/btrfs/inode.c                       | 1371 ++++++++++++++++----------------
>>>   fs/btrfs/ioctl.c                       | 1210 ++++++++++++++++++----------
>>>   fs/btrfs/locking.c                     |   34 +-
>>>   fs/btrfs/lzo.c                         |   76 +-
>>>   fs/btrfs/ordered-data.c                |   14 +-
>>>   fs/btrfs/print-tree.c                  |   21 +
>>>   fs/btrfs/qgroup.c                      |   69 +-
>>>   fs/btrfs/raid56.c                      |   38 +-
>>>   fs/btrfs/relocation.c                  |    8 +-
>>>   fs/btrfs/scrub.c                       |    1 +
>>>   fs/btrfs/send.c                        |   46 +-
>>>   fs/btrfs/super.c                       |    7 +-
>>>   fs/btrfs/sysfs.c                       |   52 +-
>>>   fs/btrfs/sysfs.h                       |    4 +-
>>>   fs/btrfs/tests/btrfs-tests.c           |    4 +-
>>>   fs/btrfs/tests/btrfs-tests.h           |    6 +-
>>>   fs/btrfs/tests/extent-buffer-tests.c   |   56 +-
>>>   fs/btrfs/tests/extent-io-tests.c       |   75 +-
>>>   fs/btrfs/tests/extent-map-tests.c      |   90 ++-
>>>   fs/btrfs/tests/free-space-tests.c      |  177 +++--
>>>   fs/btrfs/tests/free-space-tree-tests.c |  129 +--
>>>   fs/btrfs/tests/inode-tests.c           |  312 ++++----
>>>   fs/btrfs/tests/qgroup-tests.c          |  100 +--
>>>   fs/btrfs/transaction.c                 |   15 +-
>>>   fs/btrfs/transaction.h                 |    1 -
>>>   fs/btrfs/tree-log.c                    |   28 +-
>>>   fs/btrfs/uuid-tree.c                   |   10 +-
>>>   fs/btrfs/volumes.c                     |  506 ++++++------
>>>   fs/btrfs/volumes.h                     |   24 +-
>>>   include/trace/events/btrfs.h           |  323 ++++----
>>>   include/uapi/linux/btrfs.h             |   97 +++
>>>   49 files changed, 3579 insertions(+), 2935 deletions(-)
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>>
>>
>>
>



-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [GIT PULL] Btrfs updates for 4.18
  2018-06-11  9:50     ` Filipe Manana
@ 2018-06-11 16:16       ` David Sterba
  2018-06-28 11:22         ` Anand Jain
  0 siblings, 1 reply; 8+ messages in thread
From: David Sterba @ 2018-06-11 16:16 UTC (permalink / raw)
  To: Filipe Manana; +Cc: Anand Jain, David Sterba, linux-btrfs

On Mon, Jun 11, 2018 at 10:50:54AM +0100, Filipe Manana wrote:
> >>>        btrfs: replace uuid_mutex by device_list_mutex in
> >>> btrfs_open_devices

> >>   *
> >>   * the mutex can be very coarse and can cover long-running operations
> >>   *
> >>   * protects: updates to fs_devices counters like missing devices, rw
> >> devices,
> >>   * seeding, structure cloning, openning/closing devices at mount/umount
> >> time
> >>
> >> generates some confusion since btrfs_open_devices(), after that
> >> commit, no longer takes the uuid_mutex and it
> >> updates some fs_devices counters (opened, open_devices, etc).
> >
> >  As uuid_mutex is a global fs_uuids lock for the per fsid operations
> >  doesn't make any sense.
> >
> >  This problem is reproducible only for-4.18, misc-next if fine.
> >  I am looking deeper.
> 
> What about the unprotected updates (increments) to fs_devices->opened
> and fs_devices->open_devices?
> Other functions are accessing/updating them while holding the uuid mutex.

The goal is to reduce usage of uuid_mutex only to protect search or
update of the fs_uuids list, everything else should be protected by the
device_list_mutex.

The commit 542c5908abfe84f7 (use device_list_mutex in
btrfs_open_devices) implements that but then the access to the ->opened
member is not protected consistently. There are patches that convert the
use to device_list_mutex but haven't been merged due to refinements or
pending review.

At this point I think we should revert the one commit 542c5908abfe84f7
as it introduces the locking problems and revisit the whole fs_devices
locking scheme again in the next dev cycle. That will be post rc1 as
there might be more to revert.
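
To make the intended split concrete, here is a minimal user-space sketch in Python, not kernel code. It assumes a global lock standing in for uuid_mutex that guards only the registry of filesystems (fs_uuids), and a per-filesystem lock standing in for device_list_mutex that guards counters such as opened and open_devices. The class FsDevices and the helpers find_or_add, open_devices and demo are hypothetical names used only for illustration.

```python
import threading

# Stand-in for the global uuid_mutex: guards only the registry below.
uuid_mutex = threading.Lock()
fs_uuids = {}  # fsid -> FsDevices, models the global fs_uuids list


class FsDevices:
    """Hypothetical model of struct btrfs_fs_devices."""

    def __init__(self, fsid):
        self.fsid = fsid
        # Stand-in for fs_devices->device_list_mutex.
        self.device_list_mutex = threading.Lock()
        self.opened = 0
        self.open_devices = 0


def find_or_add(fsid):
    # Search/update of the registry: the only job left for uuid_mutex.
    with uuid_mutex:
        return fs_uuids.setdefault(fsid, FsDevices(fsid))


def open_devices(fs_devices, ndevs):
    # Counter updates are protected by the per-filesystem lock.
    with fs_devices.device_list_mutex:
        if fs_devices.opened == 0:
            fs_devices.open_devices = ndevs
        fs_devices.opened += 1


def demo():
    # Start from an empty registry so the demo is repeatable.
    with uuid_mutex:
        fs_uuids.clear()
    fs = find_or_add("fsid1")
    threads = [threading.Thread(target=open_devices, args=(fs, 2))
               for _ in range(8)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return fs.opened, fs.open_devices
```

In this model the concern raised earlier in the thread corresponds to any code path that touches opened or open_devices without holding device_list_mutex, or while holding only the global lock.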

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [GIT PULL] Btrfs updates for 4.18
  2018-06-11 16:16       ` David Sterba
@ 2018-06-28 11:22         ` Anand Jain
  2018-06-28 18:26           ` David Sterba
  0 siblings, 1 reply; 8+ messages in thread
From: Anand Jain @ 2018-06-28 11:22 UTC (permalink / raw)
  To: dsterba, Filipe Manana, David Sterba, linux-btrfs



On 06/12/2018 12:16 AM, David Sterba wrote:
> On Mon, Jun 11, 2018 at 10:50:54AM +0100, Filipe Manana wrote:
>>>>>         btrfs: replace uuid_mutex by device_list_mutex in
>>>>> btrfs_open_devices
> 
>>>>    *
>>>>    * the mutex can be very coarse and can cover long-running operations
>>>>    *
>>>>    * protects: updates to fs_devices counters like missing devices, rw
>>>> devices,
>>>>    * seeding, structure cloning, openning/closing devices at mount/umount
>>>> time
>>>>
>>>> generates some confusion since btrfs_open_devices(), after that
>>>> commit, no longer takes the uuid_mutex and it
>>>> updates some fs_devices counters (opened, open_devices, etc).
>>>
>>>   As uuid_mutex is a global fs_uuids lock for the per fsid operations
>>>   doesn't make any sense.
>>>
>>>   This problem is reproducible only for-4.18, misc-next if fine.
>>>   I am looking deeper.
>>
>> What about the unprotected updates (increments) to fs_devices->opened
>> and fs_devices->open_devices?
>> Other functions are accessing/updating them while holding the uuid mutex.
> 
> The goal is to reduce usage of uuid_mutex only to protect search or
> update of the fs_uuids list, everything else should be protected by the
> device_list_mutex.
> 
> The commit 542c5908abfe84f7 (use device_list_mutex in
> btrfs_open_devices) implements that but then the access to the ->opened
> member is not protected consistently. There are patches that convert the
> use to device_list_mutex but haven't been merged due to refinements or
> pending review.
> 
> At this point I think we should revert the one commit 542c5908abfe84f7
> as it introduces the locking problems and revisit the whole fs_devices
> locking scheme again in the next dev cycle. That will be post rc1 as
> there might be more to revert.



  I tried to narrow this down, and it appears that some of what the
  circular locking dependency check reports doesn't make sense.
  Below is what I have found so far.

  The test case btrfs/004 can be simplified to the following, which
  also reproduces the problem.

---------------------8<-------------
$ cat 165
#! /bin/bash
# FS QA Test No. btrfs/165
#

seq=`basename $0`
seqres=$RESULT_DIR/$seq
echo "QA output created by $seq"

here=`pwd`
tmp=/tmp/$$
status=1
noise_pid=0

_cleanup()
{
	wait
	rm -f $tmp.*
}
trap "_cleanup; exit \$status" 0 1 2 3 15

# get standard environment, filters and checks
. ./common/rc
. ./common/filter

# real QA test starts here
_supported_fs btrfs
_supported_os Linux
_require_scratch

rm -f $seqres.full

run_check _scratch_mkfs_sized $((2000 * 1024 * 1024))
run_check _scratch_mount
run_check $FSSTRESS_PROG -d $SCRATCH_MNT -w -p 1 -n 2000 $FSSTRESS_AVOID
run_check _scratch_unmount

echo "done"
status=0
exit
---------------------8<-------------

  The circular locking dependency warning occurs while FSSTRESS_PROG
  runs, in particular in doproc() in xfstests/ltp/fsstress.c, randomly
  at any of the commands in
  opdesc_t        ops[] = { ..}
  that involve calling the mmap file operation, when there is something
  to commit.

  The transaction commit does need device_list_mutex, which is also
  taken in btrfs_open_devices() as of commit 542c5908abfe84f7.

  But btrfs_open_devices() is only called at mount, and an mmap() can
  only be established after the mount has completed. Given this, it's
  unclear to me why the circular locking dependency check is warning
  about this.

  Until we have clarity about this and have also solved the other
  problems related to the streamlining of uuid_mutex, I suggest we
  revert 542c5908abfe84f7. Sorry for the inconvenience.
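
One likely explanation for the puzzle above: lockdep validates ordering between lock classes, not between the moments at which locks are actually taken. A dependency recorded once during mount stays in the graph and can later close a cycle with a dependency recorded during mmap(), even if the two code paths never run at the same time. The following is a toy model of that behavior in Python, not lockdep's actual implementation. The class OrderTracker and its methods are hypothetical, and the three-lock chain is a simplification of the longer chain a real report would show.

```python
from collections import defaultdict


class OrderTracker:
    """Toy lock-order validator: remembers 'held -> acquired' edges forever."""

    def __init__(self):
        # lock class -> classes that were taken while holding it
        self.edges = defaultdict(set)

    def _reachable(self, src, dst):
        # Iterative depth-first search over the recorded dependencies.
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges[node])
        return False

    def acquire(self, held, new):
        """Record taking `new` while holding `held`; return False on a cycle."""
        # If `held` is already reachable from `new`, adding held -> new
        # would close a cycle in the class graph.
        if self._reachable(new, held):
            return False
        self.edges[held].add(new)
        return True


tracker = OrderTracker()
# Mount time: the device list lock is held while a block device is opened.
tracker.acquire("device_list_mutex", "bd_mutex")
# Some other path, at any other time: bd_mutex class held, mmap_sem taken.
tracker.acquire("bd_mutex", "mmap_sem")
# Long after mount: mmap_sem is held and a path needs device_list_mutex.
ok = tracker.acquire("mmap_sem", "device_list_mutex")
print("cycle reported" if not ok else "order is fine")
```

The tracker has no notion of "mount has finished", which is the point: the warning says the ordering is inconsistent, not that the deadlock can be hit in practice.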

Thanks, Anand


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [GIT PULL] Btrfs updates for 4.18
  2018-06-28 11:22         ` Anand Jain
@ 2018-06-28 18:26           ` David Sterba
  2018-06-29  6:13             ` Anand Jain
  0 siblings, 1 reply; 8+ messages in thread
From: David Sterba @ 2018-06-28 18:26 UTC (permalink / raw)
  To: Anand Jain; +Cc: dsterba, Filipe Manana, David Sterba, linux-btrfs

On Thu, Jun 28, 2018 at 07:22:59PM +0800, Anand Jain wrote:
>   The circular locking dependency warning occurs while FSSTRESS_PROG
>   runs, in particular in doproc() in xfstests/ltp/fsstress.c, randomly
>   at any of the commands in
>   opdesc_t        ops[] = { ..}
>   that involve calling the mmap file operation, when there is something
>   to commit.
> 
>   The transaction commit does need device_list_mutex, which is also
>   taken in btrfs_open_devices() as of commit 542c5908abfe84f7.
> 
>   But btrfs_open_devices() is only called at mount, and an mmap() can
>   only be established after the mount has completed. Given this, it's
>   unclear to me why the circular locking dependency check is warning
>   about this.
> 
>   Until we have clarity about this and have also solved the other
>   problems related to the streamlining of uuid_mutex, I suggest we
>   revert 542c5908abfe84f7. Sorry for the inconvenience.

Ok, the revert is one option. I'm considering taking both locks, as in
https://patchwork.kernel.org/patch/10478443/ . This would have no
effect, as btrfs_open_devices is called only from the mount path and the
list_sort is done only the first time, when there are no other
users of the list that would not also be under the uuid_mutex.

This passed the syzbot and other tests, so this does not break things
and goes towards pushing the device_list_mutex as the real protection
mechanism for the fs_devices members.

Let me know what you think, the revert should be the last option if we
don't have anything better.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [GIT PULL] Btrfs updates for 4.18
  2018-06-28 18:26           ` David Sterba
@ 2018-06-29  6:13             ` Anand Jain
  0 siblings, 0 replies; 8+ messages in thread
From: Anand Jain @ 2018-06-29  6:13 UTC (permalink / raw)
  To: dsterba, Filipe Manana, David Sterba, linux-btrfs



On 06/29/2018 02:26 AM, David Sterba wrote:
> On Thu, Jun 28, 2018 at 07:22:59PM +0800, Anand Jain wrote:
>>    The circular locking dependency warning occurs while FSSTRESS_PROG
>>    runs, in particular in doproc() in xfstests/ltp/fsstress.c, randomly
>>    at any of the commands in
>>    opdesc_t        ops[] = { ..}
>>    that involve calling the mmap file operation, when there is something
>>    to commit.
>>
>>    The transaction commit does need device_list_mutex, which is also
>>    taken in btrfs_open_devices() as of commit 542c5908abfe84f7.
>>
>>    But btrfs_open_devices() is only called at mount, and an mmap() can
>>    only be established after the mount has completed. Given this, it's
>>    unclear to me why the circular locking dependency check is warning
>>    about this.
>>
>>    Until we have clarity about this and have also solved the other
>>    problems related to the streamlining of uuid_mutex, I suggest we
>>    revert 542c5908abfe84f7. Sorry for the inconvenience.
> 
> Ok, the revert is one option. I'm considering taking both locks, as in
> https://patchwork.kernel.org/patch/10478443/ . This would have no
> effect, as btrfs_open_devices is called only from the mount path and the
> list_sort is done only the first time, when there are no other
> users of the list that would not also be under the uuid_mutex.

> This passed the syzbot and other tests, so this does not break things
> and goes towards pushing the device_list_mutex as the real protection
> mechanism for the fs_devices members.

> Let me know what you think, the revert should be the last option if we
> don't have anything better.

  With this patch [1] applied I still see the circular locking warning [2].
  [1]
   https://patchwork.kernel.org/patch/10478443/
  Test case:

   mkfs.btrfs -fq /dev/sdc && mount /dev/sdc /btrfs && /xfstests/ltp/fsstress -d /btrfs -w -p 1 -n 2000

  However, when the device_list_mutex is removed, the warning goes away.
  Let me investigate the circular locking dependency a bit more.

About using uuid_mutex in btrfs_open_devices().
  I am planning to make a more concrete proposal for using a bitmap
  for the volume flags, which shall also include the EXCL OPS in
  progress flag for the fs_devices. That means we hold uuid_mutex only
  to set/reset the EXCL OPS flag for the fs_devices, so other fsids
  such as fsid2 can still take the uuid_mutex while fsid1 is still
  mounting/opening (which may sleep).
  I hope you will agree to using a bitmap for the volume; we also need
  this bitmap to manage the volume status. If there is a better solution
  I am fine with that. However, holding uuid_mutex isn't one, as it
  blocks fsid2 from mounting.
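
The idea in the last paragraph can be sketched as follows. This is a user-space Python illustration under assumptions, not a proposed patch: the flag name EXCL_OP_IN_PROGRESS, the helper names, and the use of a plain integer as the bitmap are all hypothetical. What it tries to show is that the global lock is held only long enough to test and set a bit, so a second fsid can proceed while the first is still opening its devices.

```python
import threading
import time

uuid_mutex = threading.Lock()      # stand-in for the global uuid_mutex
EXCL_OP_IN_PROGRESS = 1 << 0       # hypothetical bit in a per-volume bitmap
volume_flags = {}                  # fsid -> integer bitmap


def try_begin_exclusive(fsid):
    """Set the flag under the global lock; fail if it is already set."""
    with uuid_mutex:
        flags = volume_flags.get(fsid, 0)
        if flags & EXCL_OP_IN_PROGRESS:
            return False
        volume_flags[fsid] = flags | EXCL_OP_IN_PROGRESS
        return True


def end_exclusive(fsid):
    """Clear the flag; only valid after a successful try_begin_exclusive."""
    with uuid_mutex:
        volume_flags[fsid] &= ~EXCL_OP_IN_PROGRESS


def open_volume(fsid, results):
    if not try_begin_exclusive(fsid):
        results.append((fsid, "busy"))
        return
    try:
        time.sleep(0.05)           # long-running open, done without uuid_mutex
        results.append((fsid, "opened"))
    finally:
        end_exclusive(fsid)


def demo():
    results = []
    threads = [threading.Thread(target=open_volume, args=(fsid, results))
               for fsid in ("fsid1", "fsid2")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(results)
```

A second attempt on the same fsid while the flag is set gets "busy" instead of sleeping on the global lock, which is the behavior the bitmap is meant to provide.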

Thanks, Anand

[2]
-------------------------------------------------------------------
  kernel:
  kernel: ======================================================
  kernel: WARNING: possible circular locking dependency detected
  kernel: 4.18.0-rc1+ #63 Not tainted
  kernel: ------------------------------------------------------
  kernel: fsstress/3062 is trying to acquire lock:
  kernel: 000000007d28aeca (&fs_info->reloc_mutex){+.+.}, at: btrfs_record_root_in_trans+0x43/0x70 [btrfs]
  kernel:
                               but task is already holding lock:
  kernel: 000000002fc78565 (&mm->mmap_sem){++++}, at: vm_mmap_pgoff+0x9f/0x110
  kernel:
                               which lock already depends on the new lock.
  kernel:
                               the existing dependency chain (in reverse order) is:
  kernel:
                               -> #5 (&mm->mmap_sem){++++}:
  kernel:        _copy_from_user+0x1e/0x90
  kernel:        scsi_cmd_ioctl+0x2ba/0x480
  kernel:        cdrom_ioctl+0x3b/0xb2e
  kernel:        sr_block_ioctl+0x7e/0xc0
  kernel:        blkdev_ioctl+0x4ea/0x980
  kernel:        block_ioctl+0x39/0x40
  kernel:        do_vfs_ioctl+0xa2/0x6c0
  kernel:        ksys_ioctl+0x70/0x80
  kernel:        __x64_sys_ioctl+0x16/0x20
  kernel:        do_syscall_64+0x4a/0x180
  kernel:        entry_SYSCALL_64_after_hwframe+0x49/0xbe
  kernel:
                               -> #4 (sr_mutex){+.+.}:
  kernel:        sr_block_open+0x24/0xd0
  kernel:        __blkdev_get+0xcb/0x480
  kernel:        blkdev_get+0x144/0x3a0
  kernel:        do_dentry_open+0x1b1/0x2d0
  kernel:        path_openat+0x57b/0xcc0
  kernel:        do_filp_open+0x9b/0x110
  kernel:        do_sys_open+0x1bd/0x250
  kernel:        do_syscall_64+0x4a/0x180
  kernel:        entry_SYSCALL_64_after_hwframe+0x49/0xbe
  kernel:
                               -> #3 (&bdev->bd_mutex){+.+.}:
  kernel:        __blkdev_get+0x5d/0x480
  kernel:        blkdev_get+0x243/0x3a0
  kernel:        blkdev_get_by_path+0x4a/0x80
  kernel:        btrfs_get_bdev_and_sb+0x1b/0xa0 [btrfs]
  kernel:        open_fs_devices+0x85/0x270 [btrfs]
  kernel:        btrfs_open_devices+0x6b/0x70 [btrfs]
  kernel:        btrfs_mount_root+0x41a/0x7e0 [btrfs]
  kernel:        mount_fs+0x30/0x150
  kernel:        vfs_kern_mount.part.31+0x54/0x140
  kernel:        btrfs_mount+0x175/0x920 [btrfs]
  kernel:        mount_fs+0x30/0x150
  kernel:        vfs_kern_mount.part.31+0x54/0x140
  kernel:        do_mount+0x63b/0xd60
  kernel:        ksys_mount+0x80/0xd0
  kernel:        __x64_sys_mount+0x21/0x30
  kernel:        do_syscall_64+0x4a/0x180
  kernel:        entry_SYSCALL_64_after_hwframe+0x49/0xbe
  kernel:
                               -> #2 (&fs_devs->device_list_mutex){+.+.}:
  kernel:        btrfs_run_dev_stats+0x47/0x3b0 [btrfs]
  kernel:        commit_cowonly_roots+0xb4/0x2b0 [btrfs]
  kernel:        btrfs_commit_transaction+0x3ab/0x9d0 [btrfs]
  kernel:        transaction_kthread+0x156/0x180 [btrfs]
  kernel:        kthread+0x11c/0x140
  kernel:        ret_from_fork+0x3a/0x50
  kernel:
                               -> #1 (&fs_info->tree_log_mutex){+.+.}:
  kernel:        btrfs_commit_transaction+0x350/0x9d0 [btrfs]
  kernel:        transaction_kthread+0x156/0x180 [btrfs]
  kernel:        kthread+0x11c/0x140
  kernel:        ret_from_fork+0x3a/0x50
  kernel:
                               -> #0 (&fs_info->reloc_mutex){+.+.}:
  kernel:        __mutex_lock+0x7f/0x9d0
  kernel:        btrfs_record_root_in_trans+0x43/0x70 [btrfs]
  kernel:        start_transaction+0xa2/0x4a0 [btrfs]
  kernel:        btrfs_dirty_inode+0x42/0xd0 [btrfs]
  kernel:        touch_atime+0xab/0xd0
  kernel:        btrfs_file_mmap+0x3c/0x60 [btrfs]
  kernel:        mmap_region+0x3a8/0x5e0
  kernel:        do_mmap+0x3dd/0x5a0
  kernel:        vm_mmap_pgoff+0xcf/0x110
  kernel:        ksys_mmap_pgoff+0x1b5/0x220
  kernel:        do_syscall_64+0x4a/0x180
  kernel:        entry_SYSCALL_64_after_hwframe+0x49/0xbe
  kernel:
                               other info that might help us debug this:
  kernel: Chain exists of:
                                 &fs_info->reloc_mutex --> sr_mutex --> &mm->mmap_sem
  kernel:  Possible unsafe locking scenario:
  kernel:        CPU0                    CPU1
  kernel:        ----                    ----
  kernel:   lock(&mm->mmap_sem);
  kernel:                                lock(sr_mutex);
  kernel:                                lock(&mm->mmap_sem);
  kernel:   lock(&fs_info->reloc_mutex);
  kernel:
                                *** DEADLOCK ***
  kernel: 3 locks held by fsstress/3062:
  kernel:  #0: 000000002fc78565 (&mm->mmap_sem){++++}, at: vm_mmap_pgoff+0x9f/0x110
  kernel:  #1: 0000000074df19d7 (sb_writers#9){.+.+}, at: touch_atime+0x64/0xd0
  kernel:  #2: 00000000e7f8e0ad (sb_internal#2){.+.+}, at: start_transaction+0x2e8/0x4a0 [btrfs]
  kernel:
                               stack backtrace:
  kernel: CPU: 0 PID: 3062 Comm: fsstress Not tainted 4.18.0-rc1+ #63
  kernel: Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
  kernel: Call Trace:
  kernel:  dump_stack+0x67/0x9b
  kernel:  print_circular_bug.isra.36+0x1ce/0x1db
  kernel:  __lock_acquire+0x1442/0x1540
  kernel:  ? lock_acquire+0xa6/0x200
  kernel:  lock_acquire+0xa6/0x200
  kernel:  ? btrfs_record_root_in_trans+0x43/0x70 [btrfs]
  kernel:  __mutex_lock+0x7f/0x9d0
  kernel:  ? btrfs_record_root_in_trans+0x43/0x70 [btrfs]
  kernel:  ? rcu_read_lock_sched_held+0x74/0x80
  kernel:  ? find_held_lock+0x2d/0x90
  kernel:  ? join_transaction+0x3b0/0x410 [btrfs]
  kernel:  ? btrfs_record_root_in_trans+0x43/0x70 [btrfs]
  kernel:  btrfs_record_root_in_trans+0x43/0x70 [btrfs]
  kernel:  start_transaction+0xa2/0x4a0 [btrfs]
  kernel:  btrfs_dirty_inode+0x42/0xd0 [btrfs]
  kernel:  touch_atime+0xab/0xd0
  kernel:  btrfs_file_mmap+0x3c/0x60 [btrfs]
  kernel:  mmap_region+0x3a8/0x5e0
  kernel:  do_mmap+0x3dd/0x5a0
  kernel:  vm_mmap_pgoff+0xcf/0x110
  kernel:  ksys_mmap_pgoff+0x1b5/0x220
  kernel:  ? trace_hardirqs_off_thunk+0x1a/0x1c
  kernel:  do_syscall_64+0x4a/0x180
  kernel:  entry_SYSCALL_64_after_hwframe+0x49/0xbe
  kernel: RIP: 0033:0x7ff6d43429da
  kernel: Code: 89 f5 41 54 49 89 fc 55 53 74 35 49 63 e8 48 63 da 4d 89 f9 49 89 e8 4d 63 d6 48 89 da 4c 89 ee 4c 89 e7 b8 09 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 56 5b 5d 41 5c 41 5d 41 5e 41 5f c3 0f 1f 00
  kernel: RSP: 002b:00007fff738dc648 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
  kernel: RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007ff6d43429da
  kernel: RDX: 0000000000000002 RSI: 000000000001bcd5 RDI: 0000000000000000
  kernel: RBP: 0000000000000003 R08: 0000000000000003 R09: 000000000008e000
  kernel: R10: 0000000000000002 R11: 0000000000000246 R12: 0000000000000000
  kernel: R13: 000000000001bcd5 R14: 0000000000000002 R15: 000000000008e000
-------------------------------------------------------------------
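
The report above can be read mechanically. Each "-> #N (lock)" entry names a lock class, entry #0 is the lock the task is trying to take, and the highest-numbered entry is the one it already holds. The short Python sketch below extracts those entries and prints the full loop. The REPORT string contains only the entry headers copied from the report, with stack frames omitted; the function name dependency_chain is hypothetical.

```python
import re

# Entry headers copied from the lockdep report above (stack frames omitted).
REPORT = """
-> #5 (&mm->mmap_sem){++++}:
-> #4 (sr_mutex){+.+.}:
-> #3 (&bdev->bd_mutex){+.+.}:
-> #2 (&fs_devs->device_list_mutex){+.+.}:
-> #1 (&fs_info->tree_log_mutex){+.+.}:
-> #0 (&fs_info->reloc_mutex){+.+.}:
"""

# Matches "-> #<index> (<lock class>){" and captures index and class name.
ENTRY = re.compile(r"-> #(\d+) \((.+?)\)\{")


def dependency_chain(report):
    """Return lock class names ordered from entry #0 upward."""
    entries = [(int(index), name) for index, name in ENTRY.findall(report)]
    return [name.lstrip("&") for _, name in sorted(entries)]


chain = dependency_chain(REPORT)
# The task already holds the last lock and wants the first one,
# which is what closes the loop.
print(" --> ".join(chain + [chain[0]]))
```

Read this way, the loop passes through sr_mutex, which belongs to the SCSI CD-ROM driver rather than to btrfs. That suggests the cycle may be stitched together from dependencies recorded on unrelated block devices that share the bd_mutex lock class, though this is an interpretation and not something the report itself states.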


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2018-06-29  6:10 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-04 15:43 [GIT PULL] Btrfs updates for 4.18 David Sterba
2018-06-09 16:21 ` Filipe Manana
2018-06-11  8:14   ` Anand Jain
2018-06-11  9:50     ` Filipe Manana
2018-06-11 16:16       ` David Sterba
2018-06-28 11:22         ` Anand Jain
2018-06-28 18:26           ` David Sterba
2018-06-29  6:13             ` Anand Jain
