linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Matthew Wilcox <willy@infradead.org>
To: Andreas Dilger <adilger@dilger.ca>
Cc: linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org,
	Matthew Wilcox <mawilcox@microsoft.com>
Subject: Re: XArray documentation
Date: Fri, 24 Nov 2017 09:17:12 -0800	[thread overview]
Message-ID: <20171124171712.GB681@bombadil.infradead.org> (raw)
In-Reply-To: <4866F643-97A1-4B80-B5E2-8EF5BEF8EE30@dilger.ca>

On Thu, Nov 23, 2017 at 09:30:21PM -0700, Andreas Dilger wrote:
> > Pointers to be stored in the XArray must have the bottom two bits clear
> > (ie must point to something which is 4-byte aligned).  This includes all
> > objects allocated by calling :c:func:`kmalloc` and :c:func:`alloc_page`,
> > but you cannot store pointers to arbitrary offsets within an object.
> > The XArray does not support storing :c:func:`IS_ERR` pointers; some
> > conflict with data values and others conflict with entries the XArray
> > uses for its own purposes.  If you need to store special values which
> > cannot be confused with real kernel pointers, the values 4, 8, ... 4092
> > are available.
> 
> Thought - if storing error values into the XArray in addition to regular
> pointers is important for some use case, it would be easy to make
> "ERR_PTR_XA()", "PTR_ERR_XA()", and "IS_ERR_XA()" macros that just shift
> the error values up and down by two bits to avoid the conflict.  That
> would still allow error values up (down) to -1023 to be stored without
> any chance of a pointer conflict, which should be enough.

We could do something along those lines.  It's not something anybody's
trying to do with the radix tree today, but I wanted to document it
just in case anybody got a clever idea in the future.  Another source
of ambiguity that I didn't mention is that xa_store() can return
ERR_PTR(-ENOMEM), so if we encode the xarray-error-pointers in some
way, they have to be distinguishable from errors returned by the xa_*
functions.  There's no ambiguity when using the xas_ functions as they
signal errors in the xa_state.

> > Each non-NULL entry in the array has three bits associated with it called
> > tags.  Each tag may be flipped on or off independently of the others.
> > You can search for entries with a given tag set.
> 
> How can it be 3 tag bits, if the pointers only need to be 4-byte aligned?

The same way the radix tree does it -- there are three bitfields in each
node.  So all your PAGECACHE_DIRTY tags are stored in the same bitfield.

(I'm taking these two questions as request for more information on the
implementation, which I don't want to put in this document.  I want
this to be the users manual.  I think the appropriate place to put this
information is in source code comments.)

> > Finally, you can remove all entries from an XArray by calling
> > :c:func:`xa_destroy`.  If the XArray entries are pointers, you may wish
> > to free the entries first.  You can do this by iterating over all non-NULL
> > entries in
> 
> ... the XArray using xa_for_each() ?

I swear I wrote that ... editing error.  New paragraph:

Finally, you can remove all entries from an XArray by calling
:c:func:`xa_destroy`.  If the XArray entries are pointers, you may wish
to free the entries first.  You can do this by iterating over all non-NULL
entries in the XArray using the :c:func:`xa_for_each` iterator.

> > The only error currently generated by the xarray code itself is
> > ENOMEM, but it supports arbitrary errors in case you want to call
> > :c:func:`xas_set_err` yourself.  If the xa_state is holding an ENOMEM
> > error, :c:func:`xas_nomem` will attempt to allocate a single xa_node using
> 
> .. calling :c:func:`xas_nomem` ... ?
> 
> > the specified gfp flags and store it in the xa_state for the next attempt.
> > The idea is that you take the xa_lock, attempt the operation and drop
> > the lock.  Then you allocate memory if there was a memory allocation
> 
> ... then you try to allocate ...

Fair point -- if xas_nomem() fails to allocate memory, it'll return 'false'
to indicate not to bother retrying the operation.

> > failure and retry the operation.  You must call :c:func:`xas_destroy`
> > if you call :c:func:`xas_nomem` in case it's not necessary to use the
> > memory that was allocated.
> 
> This last sentence is not totally clear.  How about:
> 
> If you called :c:func:`xas_nomem` to allocate memory, but didn't need
> to use the memory for some reason, you need to call :c:func:`xas_destroy`
> to free the allocated memory.
> 
> 
> The question is where the "allocated memory" is stored, if it isn't in
> the XArray?  Is it in the XA_STATE? 

That was earlier in the document:

:c:func:`xas_set_err` yourself.  If the xa_state is holding an ENOMEM
error, :c:func:`xas_nomem` will attempt to allocate a single xa_node using
the specified gfp flags and store it in the xa_state for the next attempt.

> How do you know if the allocated
> memory was needed, or is it always safe to call xas_destroy?

It's always safe to call xas_destroy.

> Is the
> allocated memory always consumed if xa_store/xa_cmpxchg are called?

Usually.  If another process added the same node that we were trying to
store to, we won't allocate memory.

> What if there was another process that also added the same entry while
> the xa_lock was dropped?

We'll overwrite it.

I've rewritten this whole section.  Thanks for the feedback!

The xa_state is also used to store errors.  You can call
:c:func:`xas_error` to retrieve the error.  All operations check whether
the xa_state is in an error state before proceeding, so there's no need 
for you to check for an error after each call; you can make multiple
calls in succession and only check at a convenient point.  The only error
currently generated by the xarray code itself is ENOMEM, but it supports
arbitrary errors in case you want to call :c:func:`xas_set_err` yourself.

If the xa_state is holding an ENOMEM error, calling :c:func:`xas_nomem`
will attempt to allocate more memory using the specified gfp flags and
cache it in the xa_state for the next attempt.  The idea is that you take
the xa_lock, attempt the operation and drop the lock.  The operation
attempts to allocate memory while holding the lock, but it is more
likely to fail.  Once you have dropped the lock, :c:func:`xas_nomem`
can try harder to allocate more memory.  It will return ``true`` if it
is worth retrying the operation (ie that there was a memory error *and*
more memory was allocated.  You must call :c:func:`xas_destroy` if you
call :c:func:`xas_nomem` in case memory was allocated and then turned
out not to be needed.

> > When using the advanced API, it's possible to see internal entries
> > in the XArray.  You should never see an :c:func:`xa_is_node` entry,
> > but you may see other internal entries, including sibling entries,
> > skip entries and retry entries.  The :c:func:`xas_retry` function is a
> > useful helper function for handling internal entries, particularly in
> > the middle of iterations.
> 
> How do you know if a returned value is an "internal entry"?  Is there
> some "xas_is_internal()" macro/function that tells you this?

I know I chose the right name for that function, since you guessed it :-)

I've rewritten this whole section.  I'll post the updated document in a bit.

  reply	other threads:[~2017-11-24 17:17 UTC|newest]

Thread overview: 79+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-11-22 21:06 [PATCH 00/62] XArray November 2017 Edition Matthew Wilcox
2017-11-22 21:06 ` [PATCH 01/62] tools: Make __test_and_clear_bit available Matthew Wilcox
2017-11-22 21:06 ` [PATCH 02/62] radix tree test suite: Remove ARRAY_SIZE Matthew Wilcox
2017-11-22 21:06 ` [PATCH 03/62] radix tree test suite: Check reclaim bit Matthew Wilcox
2017-11-22 21:06 ` [PATCH 04/62] idr test suite: Fix ida_test_random() Matthew Wilcox
2017-11-22 21:06 ` [PATCH 05/62] radix tree: Add a missing cast to gfp_t Matthew Wilcox
2017-11-22 21:28   ` Luc Van Oostenryck
2017-11-22 22:24     ` Matthew Wilcox
2017-11-22 22:35       ` Luc Van Oostenryck
2017-11-22 21:06 ` [PATCH 06/62] idr: Make cursor explicit for cyclic allocation Matthew Wilcox
2017-11-22 21:06 ` [PATCH 07/62] idr: Rewrite extended IDR API Matthew Wilcox
2017-11-22 21:06 ` [PATCH 08/62] Explicitly include radix-tree.h Matthew Wilcox
2017-11-22 21:06 ` [PATCH 09/62] arm64: Turn flush_dcache_mmap_lock into a no-op Matthew Wilcox
2017-11-22 21:06 ` [PATCH 10/62] unicore32: " Matthew Wilcox
2017-11-22 21:06 ` [PATCH 11/62] Export __set_page_dirty Matthew Wilcox
2017-11-22 21:06 ` [PATCH 12/62] xfs: Rename xa_ elements to ail_ Matthew Wilcox
2017-11-22 21:06 ` [PATCH 13/62] fscache: Use appropriate radix tree accessors Matthew Wilcox
2017-11-22 21:06 ` [PATCH 14/62] xarray: Add the xa_lock to the radix_tree_root Matthew Wilcox
2017-11-22 21:06 ` [PATCH 15/62] page cache: Use xa_lock Matthew Wilcox
2017-11-22 21:06 ` [PATCH 16/62] xarray: Replace exceptional entries Matthew Wilcox
2017-11-22 21:06 ` [PATCH 17/62] xarray: Change definition of sibling entries Matthew Wilcox
2017-11-22 21:06 ` [PATCH 18/62] xarray: Add definition of struct xarray Matthew Wilcox
2017-11-22 21:06 ` [PATCH 19/62] xarray: Define struct xa_node Matthew Wilcox
2017-11-22 21:06 ` [PATCH 20/62] xarray: Add xa_load Matthew Wilcox
2017-11-22 21:06 ` [PATCH 21/62] xarray: Add xa_get_tag, xa_set_tag and xa_clear_tag Matthew Wilcox
2017-11-22 21:06 ` [PATCH 22/62] xarray: Add xa_store Matthew Wilcox
2017-11-22 21:07 ` [PATCH 23/62] xarray: Add xa_cmpxchg Matthew Wilcox
2017-11-22 21:07 ` [PATCH 24/62] xarray: Add xa_for_each Matthew Wilcox
2017-11-22 21:07 ` [PATCH 25/62] xarray: Add xa_init Matthew Wilcox
2017-11-22 21:07 ` [PATCH 26/62] xarray: Add xas_for_each_tag Matthew Wilcox
2017-11-22 21:07 ` [PATCH 27/62] xarray: Add xa_get_entries and xa_get_tagged Matthew Wilcox
2017-11-22 21:07 ` [PATCH 28/62] xarray: Add xa_destroy Matthew Wilcox
2017-11-22 21:07 ` [PATCH 29/62] xarray: Add xas_prev_any Matthew Wilcox
2017-11-22 21:07 ` [PATCH 30/62] xarray: Add xas_find_any / xas_next_any Matthew Wilcox
2017-11-22 21:07 ` [PATCH 31/62] Convert IDR to use xarray Matthew Wilcox
2017-11-22 21:07 ` [PATCH 32/62] ida: Convert to using xarray Matthew Wilcox
2017-11-22 21:07 ` [PATCH 33/62] page cache: Convert page_cache_next_hole to XArray Matthew Wilcox
2017-11-22 21:07 ` [PATCH 34/62] page cache: Use xarray for adding pages Matthew Wilcox
2017-11-22 21:07 ` [PATCH 35/62] page cache: Convert page_cache_tree_delete to xarray Matthew Wilcox
2017-11-22 21:07 ` [PATCH 36/62] page cache: Convert find_get_entry " Matthew Wilcox
2017-11-22 21:07 ` [PATCH 37/62] shmem: Convert replace " Matthew Wilcox
2017-11-22 21:07 ` [PATCH 38/62] shmem: Convert shmem_confirm_swap to XArray Matthew Wilcox
2017-11-22 21:07 ` [PATCH 39/62] shmem: Convert find_swap_entry " Matthew Wilcox
2017-11-22 21:07 ` [PATCH 40/62] shmem: Convert shmem_tag_pins " Matthew Wilcox
2017-11-22 21:07 ` [PATCH 41/62] shmem: Convert shmem_wait_for_pins " Matthew Wilcox
2017-11-22 21:07 ` [PATCH 42/62] vmalloc: Convert to xarray Matthew Wilcox
2017-11-22 21:07 ` [PATCH 43/62] brd: Convert to XArray Matthew Wilcox
2017-11-22 21:07 ` [PATCH 44/62] xfs: Convert m_perag_tree " Matthew Wilcox
2017-11-22 21:07 ` [PATCH 45/62] xfs: Convert pag_ici_root " Matthew Wilcox
2017-11-22 21:07 ` [PATCH 46/62] xfs: Convert xfs dquot " Matthew Wilcox
2017-11-22 21:07 ` [PATCH 47/62] xfs: Convert mru cache " Matthew Wilcox
2017-11-22 21:07 ` [PATCH 48/62] block: Remove IDR preloading Matthew Wilcox
2017-11-22 21:07 ` [PATCH 49/62] rxrpc: " Matthew Wilcox
2017-11-22 21:07 ` [PATCH 50/62] cgroup: Remove IDR wrappers Matthew Wilcox
2017-11-22 21:07 ` [PATCH 51/62] dca: Remove idr_preload calls Matthew Wilcox
2017-11-22 21:07 ` [PATCH 52/62] ipc: Remove call to idr_preload Matthew Wilcox
2017-11-22 21:07 ` [PATCH 53/62] irq: " Matthew Wilcox
2017-11-22 21:07 ` [PATCH 54/62] scsi: Remove idr_preload in st driver Matthew Wilcox
2017-11-22 21:07 ` [PATCH 55/62] firewire: Remove call to idr_preload Matthew Wilcox
2017-11-22 21:07 ` [PATCH 56/62] drm: Remove drm_minor_lock and idr_preload Matthew Wilcox
2017-11-22 21:07 ` [PATCH 57/62] drm: Remove drm_syncobj_fd_to_handle Matthew Wilcox
2017-11-22 21:07 ` [PATCH 58/62] drm: Remove qxl driver IDR locks Matthew Wilcox
2017-11-22 21:07 ` [PATCH 59/62] drm: Replace virtio IDRs with IDAs Matthew Wilcox
2017-11-22 21:07 ` [PATCH 60/62] drm: Replace vmwgfx " Matthew Wilcox
2017-11-22 21:07 ` [PATCH 61/62] net: Redesign act_api use of IDR Matthew Wilcox
2017-11-22 21:07 ` [PATCH 62/62] mm: Convert page-writeback to XArray Matthew Wilcox
2017-11-23  1:25 ` [PATCH 00/62] XArray November 2017 Edition Dave Chinner
2017-11-23  2:46   ` Matthew Wilcox
2017-11-24  1:16 ` XArray documentation Matthew Wilcox
2017-11-24  4:30   ` Andreas Dilger
2017-11-24 17:17     ` Matthew Wilcox [this message]
2017-11-24 16:50   ` Martin Steigerwald
2017-11-24 17:03     ` Matthew Wilcox
2017-11-24 18:01       ` Martin Steigerwald
2017-11-24 19:48         ` Shakeel Butt
2017-11-24 19:56           ` Matthew Wilcox
2017-11-24 21:18         ` Matthew Wilcox
2017-11-24 22:02           ` Martin Steigerwald
2017-11-24 22:08             ` Matthew Wilcox

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171124171712.GB681@bombadil.infradead.org \
    --to=willy@infradead.org \
    --cc=adilger@dilger.ca \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mawilcox@microsoft.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).