linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Fabio M. De Francesco" <fmdefrancesco@gmail.com>
To: dsterba@suse.cz,
	"Fabio M. De Francesco" <fmdefrancesco@gmail.com>,
	Chris Mason <clm@fb.com>, Josef Bacik <josef@toxicpanda.com>,
	David Sterba <dsterba@suse.com>, Ira Weiny <ira.weiny@intel.com>,
	linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org,
	Christoph Hellwig <hch@infradead.org>,
	Matthew Wilcox <willy@infradead.org>
Subject: Re: [PATCH 0/3] btrfs: Replace kmap() with kmap_local_page()
Date: Thu, 02 Jun 2022 20:01:04 +0200	[thread overview]
Message-ID: <4713391.F8r316W7xa@opensuse> (raw)
In-Reply-To: <20220601132545.GM20633@twin.jikos.cz>

On mercoledì 1 giugno 2022 15:25:45 CEST David Sterba wrote:
> On Tue, May 31, 2022 at 04:53:32PM +0200, Fabio M. De Francesco wrote:
> > This is the first series of patches aimed towards the conversion of 
Btrfs
> > filesystem from the use of kmap() to kmap_local_page().
> 
> We've already had patches converting kmaps and you're changing the last
> ones, so this is could be the last series, with two exceptions.
> 
> 1) You've changed lzo.c and zlib.

and inode.c.

> but the same kmap/kunmap pattern is
>    used in zstd.c.

I thought that these mappings I had worked on were safe to convert.

Instead I wasn't sure that the others I left untouched in zstd.c could so 
easily and mechanically be converted without prior code inspection and 
proper tests.

I did see those in zstd.c, but I decided to postpone those conversions 
because I'm not yet sure if and how the virtual addresses we get currently 
from kmap() are re-used.

I saw assignments like "workspace->in_buf.src = kmap(in_page);". Is 
"in_buf" re-used across different contexts? (I see that Btrfs uses a dozen 
workqueues). 

I also see that kunmap() is called in the same functions that assign 
virtual addresses to "in_buf" and this makes me think that those addresses 
are not handled across contexts, otherwise you already have bugs. But may 
very well be that somewhere in the calls chain the code flushes workqueues 
before returning to the callers (it would mean that when kunmap() is called 
we can be sure that those workqueues are done with using those addresses). 

Furthermore, what can you say about the tens of page_address() spread 
across the whole fs/btrfs? 

If those page_address() take pages from HIGHMEM which were mapped using 
kmap_local_page(), the filesystem will oops the kernel...

About this issue, please see a bug fix ("[PATCH v2] fs/ext2: Avoid 
page_address on pages returned by ext2_get_page") at 
https://lore.kernel.org/lkml/
20210714185448.8707ac239e9f12b3a7f5b9f9@urjc.es/#r

Do they only use physical memory from ZONE_NORMAL?

Can you please confirm that it is safe to convert those left kmap() to 
kmap_local_page() and that those page_address() are safe?

If so, I have no problems to convert what I had left for later. Otherwise 
I'll need to carefully inspect the code.

> 2) kmap_atomic in inode.c, so that's technically not kmap but it's said
>    to be deprecated and also can be replaced by kmap_local_page. The
>    context in check_compressed_csum is atomic (in end io) so the kmap
>    hasn't been used there.

I was not 100% sure about the preemption requirements for those call sites 
so I had not converted them yet. Are you saying that there is no need for 
preempt_disable() at the following sites?

# git grep kmap_atomic fs/btrfs
fs/btrfs/compression.c:                 kaddr = kmap_atomic(page);
fs/btrfs/inode.c:                       kaddr = kmap_atomic(cpage);
fs/btrfs/inode.c:               kaddr = kmap_atomic(page);
fs/btrfs/inode.c:       kaddr = kmap_atomic(page);

> > tweed32:~ # btrfs check -p ~zoek/dev/btrfs.file
> 
> That won't verify if the kmap conversion is OK, it's a runtime thing
> while 'check' verifies the data on device. Have you run any kind of
> stress test with enabled compression before running the check?

Ah, thanks. I didn't know this thing.

I installed (x)fstests a couple of days ago. I think it helps to test this 
and other conversions to local mappings, but I haven't yet had time to 
learn how to use it.

Does (x)fstests cover the compression code? Are there any specific tests I 
should target?

> Please send patches converting zstd.c and the remaining kmap_atomic
> usage in inode.c, otherwise the 3 patches are now in misc-next, thanks.

New version is required in any case because LKP reported four uninitialized 
variables in patch 3/3. 

I'm just reading the reports that both you and Christoph hit. At first 
sight they seem to be due to page_address() calls (but I may be wrong, just 
had few minutes to reply so late) :( 

I was wrong in thinking that these call sites could be converted safely. 
I'll do the tests before posting v2.

Thanks,

Fabio

P.S.: I've just read a message from Ira Weiny about something he saw in the 
unmapping order in zstd_compress_pages(). We must think how to address 
mapping /unmapping order properly.





  parent reply	other threads:[~2022-06-02 18:01 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-31 14:53 [PATCH 0/3] btrfs: Replace kmap() with kmap_local_page() Fabio M. De Francesco
2022-05-31 14:53 ` [PATCH 1/3] btrfs: Replace kmap() with kmap_local_page() in inode.c Fabio M. De Francesco
2022-05-31 15:46   ` Christoph Hellwig
2022-05-31 14:53 ` [PATCH 2/3] btrfs: Replace kmap() with kmap_local_page() in lzo.c Fabio M. De Francesco
2022-05-31 15:53   ` Christoph Hellwig
2022-05-31 14:53 ` [PATCH 3/3] btrfs: Replace kmap() with kmap_local_page() in zlib.c Fabio M. De Francesco
2022-05-31 15:53   ` Christoph Hellwig
2022-05-31 20:35   ` kernel test robot
2022-06-01 19:57     ` Fabio M. De Francesco
2022-06-06 12:11   ` Dan Carpenter
2022-06-01 13:25 ` [PATCH 0/3] btrfs: Replace kmap() with kmap_local_page() David Sterba
2022-06-02 16:22   ` Ira Weiny
2022-06-02 16:46     ` Matthew Wilcox
2022-06-02 18:01   ` Fabio M. De Francesco [this message]
2022-06-02 15:20 ` Christoph Hellwig
2022-06-02 15:55   ` Ira Weiny
2022-06-02 16:28   ` David Sterba
2022-06-05 15:11     ` Fabio M. De Francesco
2022-06-06 10:32       ` David Sterba
2022-06-06 14:32         ` Fabio M. De Francesco

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4713391.F8r316W7xa@opensuse \
    --to=fmdefrancesco@gmail.com \
    --cc=clm@fb.com \
    --cc=dsterba@suse.com \
    --cc=dsterba@suse.cz \
    --cc=hch@infradead.org \
    --cc=ira.weiny@intel.com \
    --cc=josef@toxicpanda.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).