From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 81263C433F5 for ; Wed, 6 Apr 2022 14:53:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235333AbiDFOzC (ORCPT ); Wed, 6 Apr 2022 10:55:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43558 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235238AbiDFOyW (ORCPT ); Wed, 6 Apr 2022 10:54:22 -0400 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.220.29]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DE741608C05; Wed, 6 Apr 2022 04:32:50 -0700 (PDT) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id A39EF1F858; Wed, 6 Apr 2022 11:32:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1649244767; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=3Xisg031nvUyjGa8rCi6g+/z5EvYZTh85G/9xqOb5KY=; b=P/2Ym6ZL6GdwRefQpv/AEkV8+b4JtOemLvBrqv7RjuD2PfNcJpOfI564pSYltQEcy5BwsY Eu6IOAw83soFaggPo3L0EPH/DUt22OI4vFkCZaFECXqiPUsGpfH4pF6bPfpqnY7LnvcFED 1ayP1Qwm9A/UrdjRKmxvE7Yfzzonp98= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1649244767; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=3Xisg031nvUyjGa8rCi6g+/z5EvYZTh85G/9xqOb5KY=; b=vigZobKfTvpGZEYPPCz/hre6y0dvlKsE+qj6qu4Xqu+34dqKdh1/zdH4dAAVBqd/Ov4KA+ WonnjSEUQfLHxdAQ== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 3D96A139F5; Wed, 6 Apr 2022 11:32:47 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id ziL9C196TWIkRgAAMHmgww (envelope-from ); Wed, 06 Apr 2022 11:32:47 +0000 Received: from localhost (brahms.olymp [local]) by brahms.olymp (OpenSMTPD) with ESMTPA id b51e56ad; Wed, 6 Apr 2022 11:33:11 +0000 (UTC) From: =?utf-8?Q?Lu=C3=ADs_Henriques?= To: Xiubo Li Cc: Jeff Layton , Ilya Dryomov , ceph-devel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH v2] ceph: invalidate pages when doing DIO in encrypted inodes References: <20220401133243.1075-1-lhenriques@suse.de> <878rsia391.fsf@brahms.olymp> <6ba91390-83e8-8702-2729-dc432abd3cc5@redhat.com> Date: Wed, 06 Apr 2022 12:33:11 +0100 In-Reply-To: <6ba91390-83e8-8702-2729-dc432abd3cc5@redhat.com> (Xiubo Li's message of "Wed, 6 Apr 2022 19:18:04 +0800") Message-ID: <87zgky8n0o.fsf@brahms.olymp> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Xiubo Li writes: > On 4/6/22 6:57 PM, Lu=C3=ADs Henriques wrote: >> Xiubo Li writes: >> >>> On 4/1/22 9:32 PM, Lu=C3=ADs Henriques wrote: >>>> When doing DIO on an encrypted node, we need to invalidate the page ca= che in >>>> the range being written to, otherwise the cache will include invalid d= ata. >>>> >>>> Signed-off-by: Lu=C3=ADs Henriques >>>> --- >>>> fs/ceph/file.c | 11 ++++++++++- >>>> 1 file changed, 10 insertions(+), 1 deletion(-) >>>> >>>> Changes since v1: >>>> - Replaced truncate_inode_pages_range() by invalidate_inode_pages2_ran= ge >>>> - Call fscache_invalidate with FSCACHE_INVAL_DIO_WRITE if we're doing = DIO >>>> >>>> Note: I'm not really sure this last change is required, it doesn't rea= lly >>>> affect generic/647 result, but seems to be the most correct. >>>> >>>> diff --git a/fs/ceph/file.c b/fs/ceph/file.c >>>> index 5072570c2203..b2743c342305 100644 >>>> --- a/fs/ceph/file.c >>>> +++ b/fs/ceph/file.c >>>> @@ -1605,7 +1605,7 @@ ceph_sync_write(struct kiocb *iocb, struct iov_i= ter *from, loff_t pos, >>>> if (ret < 0) >>>> return ret; >>>> - ceph_fscache_invalidate(inode, false); >>>> + ceph_fscache_invalidate(inode, (iocb->ki_flags & IOCB_DIRECT)); >>>> ret =3D invalidate_inode_pages2_range(inode->i_mapping, >>>> pos >> PAGE_SHIFT, >>>> (pos + count - 1) >> PAGE_SHIFT); >>> The above has already invalidated the pages, why doesn't it work ? >> I suspect the reason is because later on we loop through the number of >> pages, call copy_page_from_iter() and then ceph_fscrypt_encrypt_pages(). > > Checked the 'copy_page_from_iter()', it will do the kmap for the pages bu= t will > kunmap them again later. And they shouldn't update the i_mapping if I did= n't > miss something important. > > For 'ceph_fscrypt_encrypt_pages()' it will encrypt/dencrypt the context i= nplace, > IMO if it needs to map the page and it should also unmap it just like in > 'copy_page_from_iter()'. > > I thought it possibly be when we need to do RMW, it may will update the > i_mapping when reading contents, but I checked the code didn't find any=20 > place is doing this. So I am wondering where tha page caches come from ? = If that > page caches really from reading the contents, then we should discard it i= nstead > of flushing it back ? > > BTW, what's the problem without this fixing ? xfstest fails ? Yes, generic/647 fails if you run it with test_dummy_encryption. And I've also checked that the RMW code was never executed in this test. But yeah I have assumed (perhaps wrongly) that the kmap/kunmap could change the inode->i_mapping. In my debugging this seemed to be the case for the O_DIRECT path. That's why I added this extra call here. Cheers, --=20 Lu=C3=ADs