From: "Huang, Ying" <ying.huang@intel.com>
To: Chris Li <chrisl@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>,
	 Kairui Song <ryncsn@gmail.com>,
	linux-mm@kvack.org,  Kairui Song <kasong@tencent.com>,
	 Andrew Morton <akpm@linux-foundation.org>,
	 Barry Song <v-songbaohua@oppo.com>,
	 Ryan Roberts <ryan.roberts@arm.com>,  Neil Brown <neilb@suse.de>,
	 Minchan Kim <minchan@kernel.org>,
	 Hugh Dickins <hughd@google.com>,
	 David Hildenbrand <david@redhat.com>,
	 Yosry Ahmed <yosryahmed@google.com>,
	linux-fsdevel@vger.kernel.org,  linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/8] mm/swap: optimize swap cache search space
Date: Sun, 28 Apr 2024 09:14:25 +0800
Message-ID: <87bk5uqoem.fsf@yhuang6-desk2.ccr.corp.intel.com>
In-Reply-To: <CANeU7Q=YYFWPBMHPPeOQDxO9=yAiQP8w90e2mO0U+hBuzCV1RQ@mail.gmail.com> (Chris Li's message of "Fri, 26 Apr 2024 16:16:01 -0700")

Chris Li <chrisl@kernel.org> writes:

> Hi Ying,
>
> On Tue, Apr 23, 2024 at 7:26 PM Huang, Ying <ying.huang@intel.com> wrote:
>>
>> Hi, Matthew,
>>
>> Matthew Wilcox <willy@infradead.org> writes:
>>
>> > On Mon, Apr 22, 2024 at 03:54:58PM +0800, Huang, Ying wrote:
>> >> Is it possible to add "start_offset" support in xarray, so "index"
>> >> will subtract "start_offset" before looking up / inserting?
>> >
>> > We kind of have that with XA_FLAGS_ZERO_BUSY which is used for
>> > XA_FLAGS_ALLOC1.  But that's just one bit for the entry at 0.  We could
>> > generalise it, but then we'd have to store that somewhere and there's
>> > no obvious good place to store it that wouldn't enlarge struct xarray,
>> > which I'd be reluctant to do.
>> >
>> >> Is it possible to use multiple range locks to protect one xarray to
>> >> improve the lock scalability?  This is why we have multiple "struct
>> >> address_space" for one swap device.  And, we may have same lock
>> >> contention issue for large files too.
>> >
>> > It's something I've considered.  The issue is search marks.  If we delete
>> > an entry, we may have to walk all the way up the xarray clearing bits as
>> > we go and I'd rather not grab a lock at each level.  There's a convenient
>> > 4 byte hole between nr_values and parent where we could put it.
>> >
>> > Oh, another issue is that we use i_pages.xa_lock to synchronise
>> > address_space.nrpages, so I'm not sure that a per-node lock will help.
>>
>> Thanks for looking at this.
>>
>> > But I'm conscious that there are workloads which show contention on
>> > xa_lock as their limiting factor, so I'm open to ideas to improve all
>> > these things.
>>
>> I have no ideas so far because of my very limited knowledge of xarray.
>
> For the swap file usage, I have been considering an idea to remove the
> index part of the xarray from swap cache. Swap cache is different from
> file cache in a few aspects.
> For one, we may want to have a folio equivalent of a "large swap entry".
> Then the natural alignment of those swap offsets does not make
> sense. Ideally we should be able to write the folio to unaligned swap
> file locations.
>
> The other aspect for swap files is that we already have different
> data structures organized around swap offset: swap_map and
> swap_cgroup. If we group the swap-related data structures together, we
> can add a pointer to a union of folio or a shadow swap entry.

The shadow swap entry may be freed, so we need to prepare for that.
Also, in the current design, only swap_map[] is allocated if the swap
space isn't used.  That needs to be considered too.
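
To make the "folio or shadow" slot concrete, here is a rough user-space
sketch of one possible encoding: a single tagged word whose low bit
marks a shadow value, similar in spirit to how xarray value entries are
tagged.  All names here (swap_slot_t, slot_*) are hypothetical and are
not existing kernel interfaces.  It assumes folio pointers are at least
2-byte aligned, so their low bit is always clear.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-swap-offset slot: a folio pointer or a shadow value. */
typedef uintptr_t swap_slot_t;

/* Encode a shadow value; the low bit is set so it cannot look like a pointer. */
static swap_slot_t slot_from_shadow(unsigned long shadow)
{
	return ((uintptr_t)shadow << 1) | 1;
}

/* Encode a pointer; assumes the pointee is at least 2-byte aligned. */
static swap_slot_t slot_from_ptr(void *folio)
{
	return (uintptr_t)folio;
}

/* True if the slot holds a shadow value rather than a pointer. */
static bool slot_is_shadow(swap_slot_t slot)
{
	return (slot & 1) != 0;
}

/* Decode a shadow value; only valid when slot_is_shadow() is true. */
static unsigned long slot_to_shadow(swap_slot_t slot)
{
	return (unsigned long)(slot >> 1);
}

/* Decode a pointer; only valid when slot_is_shadow() is false. */
static void *slot_to_ptr(swap_slot_t slot)
{
	return (void *)slot;
}
```

Such an array would be sized by the number of swap slots, which is why
the "only swap_map[] is allocated for unused swap space" point matters:
a sketch like this says nothing about when the array is allocated or how
a freed shadow is cleared.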

> We can use atomic updates on the swap struct member or break down the
> access lock by ranges just like swap cluster does.

The swap code uses xarray in a simple way.  That gives us an opportunity
to optimize.  For example, it makes it easy to use multiple xarray
instances for one swap device.
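
As a rough illustration of splitting one swap device across several
instances, the user-space sketch below maps a swap offset to a shard
number plus an index inside that shard.  The names (SHARD_SHIFT,
shard_of, index_in_shard, shard_count) and the shift value are
assumptions made for this example only, not the kernel's actual code.

```c
#include <assert.h>

/* Hypothetical: each shard (one xarray plus its lock) covers 2^14 swap slots. */
#define SHARD_SHIFT 14UL
#define SHARD_SLOTS (1UL << SHARD_SHIFT)

/* Number of shards needed to cover nr_slots swap slots, rounded up. */
static unsigned long shard_count(unsigned long nr_slots)
{
	assert(nr_slots > 0);
	return (nr_slots + SHARD_SLOTS - 1) >> SHARD_SHIFT;
}

/* Which shard, and therefore which lock, a swap offset falls into. */
static unsigned long shard_of(unsigned long offset)
{
	return offset >> SHARD_SHIFT;
}

/* Index of the offset within its shard; each shard only sees this range. */
static unsigned long index_in_shard(unsigned long offset)
{
	return offset & (SHARD_SLOTS - 1);
}
```

With this split, lookups of offsets in different shards never touch the
same lock, and each instance only has to index SHARD_SLOTS entries
instead of the whole device.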

> I want to discuss those ideas in the upcoming LSF/MM meet up as well.

Good!

--
Best Regards,
Huang, Ying
