Subject: Re: [RFC PATCH v7 05/24] ceph: preallocate inode for ops that may create one
To: Jeff Layton, ceph-devel@vger.kernel.org
Cc: lhenriques@suse.de, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, dhowells@redhat.com
References: <20210625135834.12934-1-jlayton@kernel.org> <20210625135834.12934-6-jlayton@kernel.org>
From: Xiubo Li
Message-ID: <83dcbc5c-7a87-b6cd-b364-2ca4aa5bd440@redhat.com>
Date: Wed, 7 Jul 2021 11:37:53 +0800
In-Reply-To: <20210625135834.12934-6-jlayton@kernel.org>
X-Mailing-List: ceph-devel@vger.kernel.org

On 6/25/21 9:58 PM, Jeff Layton wrote:
> When creating a new inode, we need to determine the crypto context
> before we can transmit the RPC. The fscrypt API has a routine for getting
> a crypto context before a create occurs, but it requires an inode.
>
> Change the ceph code to preallocate an inode in advance of a create of
> any sort (open(), mknod(), symlink(), etc). Move the existing code that
> generates the ACL and SELinux blobs into this routine since that's
> mostly common across all the different codepaths.
>
> In most cases, we just want to allow ceph_fill_trace to use that inode
> after the reply comes in, so add a new field to the MDS request for it
> (r_new_inode).
>
> The async create codepath is a bit different though. In that case, we
> want to hash the inode in advance of the RPC so that it can be used
> before the reply comes in. If the call subsequently fails with
> -EJUKEBOX, then just put the references and clean up the as_ctx. Note
> that with this change, we now need to regenerate the as_ctx when this
> occurs, but it's quite rare for it to happen.
>
> Signed-off-by: Jeff Layton
> ---
>  fs/ceph/dir.c        | 70 ++++++++++++++++++++-----------------
>  fs/ceph/file.c       | 62 ++++++++++++++++++++-------------
>  fs/ceph/inode.c      | 82 ++++++++++++++++++++++++++++++++++++++++----
>  fs/ceph/mds_client.c |  3 +-
>  fs/ceph/mds_client.h |  1 +
>  fs/ceph/super.h      |  7 +++-
>  6 files changed, 160 insertions(+), 65 deletions(-)
>
> [...]
> diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
> index eb562e259347..f62785e4dbcb 100644
> --- a/fs/ceph/inode.c
> +++ b/fs/ceph/inode.c
> @@ -52,17 +52,85 @@ static int ceph_set_ino_cb(struct inode *inode, void *data)
>  	return 0;
>  }
>  
> -struct inode *ceph_get_inode(struct super_block *sb, struct ceph_vino vino)
> +/**
> + * ceph_new_inode - allocate a new inode in advance of an expected create
> + * @dir: parent directory for new inode
> + * @dentry: dentry that may eventually point to new inode
> + * @mode: mode of new inode
> + * @as_ctx: pointer to inherited security context
> + *
> + * Allocate a new inode in advance of an operation to create a new inode.
> + * This allocates the inode and sets up the acl_sec_ctx with appropriate
> + * info for the new inode.
> + *
> + * Returns a pointer to the new inode or an ERR_PTR.
> + */
> +struct inode *ceph_new_inode(struct inode *dir, struct dentry *dentry,
> +			     umode_t *mode, struct ceph_acl_sec_ctx *as_ctx)
> +{
> +	int err;
> +	struct inode *inode;
> +
> +	inode = new_inode_pseudo(dir->i_sb);
> +	if (!inode)
> +		return ERR_PTR(-ENOMEM);
> +
> +	if (!S_ISLNK(*mode)) {
> +		err = ceph_pre_init_acls(dir, mode, as_ctx);
> +		if (err < 0)
> +			goto out_err;
> +	}
> +
> +	err = ceph_security_init_secctx(dentry, *mode, as_ctx);
> +	if (err < 0)
> +		goto out_err;
> +
> +	inode->i_state = 0;
> +	inode->i_mode = *mode;
> +	return inode;
> +out_err:
> +	iput(inode);
> +	return ERR_PTR(err);
> +}
> +
> +void ceph_as_ctx_to_req(struct ceph_mds_request *req, struct ceph_acl_sec_ctx *as_ctx)
> +{
> +	if (as_ctx->pagelist) {
> +		req->r_pagelist = as_ctx->pagelist;
> +		as_ctx->pagelist = NULL;
> +	}
> +}
> +
> +/**
> + * ceph_get_inode - find or create/hash a new inode
> + * @sb: superblock to search and allocate in
> + * @vino: vino to search for
> + * @newino: optional new inode to insert if one isn't found (may be NULL)
> + *
> + * Search for or insert a new inode into the hash for the given vino, and return a
> + * reference to it. If new is non-NULL, its reference is consumed.
> + */
> +struct inode *ceph_get_inode(struct super_block *sb, struct ceph_vino vino, struct inode *newino)
>  {
>  	struct inode *inode;
>  
>  	if (ceph_vino_is_reserved(vino))
>  		return ERR_PTR(-EREMOTEIO);
>  
> -	inode = iget5_locked(sb, (unsigned long)vino.ino, ceph_ino_compare,
> -			     ceph_set_ino_cb, &vino);
> -	if (!inode)
> +	if (newino) {
> +		inode = inode_insert5(newino, (unsigned long)vino.ino, ceph_ino_compare,
> +				      ceph_set_ino_cb, &vino);
> +		if (inode != newino)
> +			iput(newino);
> +	} else {
> +		inode = iget5_locked(sb, (unsigned long)vino.ino, ceph_ino_compare,
> +				     ceph_set_ino_cb, &vino);
> +	}
> +
> +	if (!inode) {
> +		dout("No inode found for %llx.%llx\n", vino.ino, vino.snap);
>  		return ERR_PTR(-ENOMEM);
> +	}
>  
>  	dout("get_inode on %llu=%llx.%llx got %p new %d\n", ceph_present_inode(inode),
>  	     ceph_vinop(inode), inode, !!(inode->i_state & I_NEW));
> @@ -78,7 +146,7 @@ struct inode *ceph_get_snapdir(struct inode *parent)
>  		.ino = ceph_ino(parent),
>  		.snap = CEPH_SNAPDIR,
>  	};
> -	struct inode *inode = ceph_get_inode(parent->i_sb, vino);
> +	struct inode *inode = ceph_get_inode(parent->i_sb, vino, NULL);
>  	struct ceph_inode_info *ci = ceph_inode(inode);
>  
>  	if (IS_ERR(inode))

Should we always do the IS_ERR(inode) check before the pointer is used,
i.e. before 'struct ceph_inode_info *ci = ceph_inode(inode);'? That
said, it seems ceph_inode() itself won't introduce any issue here.
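
To illustrate the ordering in question, here is a minimal userspace sketch. It is not kernel code: ERR_PTR(), PTR_ERR() and IS_ERR() are re-implemented in simplified form, and struct inode_model, struct ceph_inode_info_model, ceph_inode_model() and setup_snapdir_model() are hypothetical stand-ins. In the kernel, ceph_inode() is a container_of() wrapper, so it only computes an address; the sketch simply moves that computation after the error check so that nothing is derived from an error pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's ERR_PTR helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical models; the real structures have many more fields. */
struct inode_model {
	unsigned long i_ino;
};

struct ceph_inode_info_model {
	int i_snap_caps;
	struct inode_model vfs_inode;
};

/* Same idea as ceph_inode(): pure pointer arithmetic, no dereference. */
static inline struct ceph_inode_info_model *
ceph_inode_model(struct inode_model *inode)
{
	return (struct ceph_inode_info_model *)
		((char *)inode - offsetof(struct ceph_inode_info_model, vfs_inode));
}

/*
 * Check the lookup result first, and only then derive the containing
 * structure from it. Returns 0 or a negative errno.
 */
static int setup_snapdir_model(struct inode_model *inode, int snap_caps)
{
	struct ceph_inode_info_model *ci;

	assert(inode != NULL);
	if (IS_ERR(inode))
		return (int)PTR_ERR(inode);

	ci = ceph_inode_model(inode);
	ci->i_snap_caps = snap_caps;
	return 0;
}
```

Declaring 'ci' without an initializer and assigning it after the IS_ERR() check costs one extra line and makes the ordering obvious to the reader, even though the existing code is not known to misbehave.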
Thanks,

> @@ -1546,7 +1614,7 @@ static int readdir_prepopulate_inodes_only(struct ceph_mds_request *req,
>  		vino.ino = le64_to_cpu(rde->inode.in->ino);
>  		vino.snap = le64_to_cpu(rde->inode.in->snapid);
>  
> -		in = ceph_get_inode(req->r_dentry->d_sb, vino);
> +		in = ceph_get_inode(req->r_dentry->d_sb, vino, NULL);
>  		if (IS_ERR(in)) {
>  			err = PTR_ERR(in);
>  			dout("new_inode badness got %d\n", err);
> @@ -1748,7 +1816,7 @@ int ceph_readdir_prepopulate(struct ceph_mds_request *req,
>  		if (d_really_is_positive(dn)) {
>  			in = d_inode(dn);
>  		} else {
> -			in = ceph_get_inode(parent->d_sb, tvino);
> +			in = ceph_get_inode(parent->d_sb, tvino, NULL);
>  			if (IS_ERR(in)) {
>  				dout("new_inode badness\n");
>  				d_drop(dn);
[...]
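
The callers quoted above all pass NULL for newino, so they take the lookup-or-allocate branch. As a rough aid for reasoning about the reference handling in both branches, here is a small userspace model. Everything in it is hypothetical (struct inode_model, the fixed-size table, and the *_model() helpers); it only mirrors the shape of the patched ceph_get_inode(), where a preallocated inode that loses to an already-hashed one has its reference dropped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; refcount 0 stands for "would be freed". */
struct inode_model {
	unsigned long long ino;
	int refcount;
};

#define TABLE_SIZE 8
static struct inode_model pool[TABLE_SIZE];   /* backing storage */
static size_t pool_used;
static struct inode_model *table[TABLE_SIZE]; /* stands in for the inode hash */

/* Allocate an unhashed inode holding one reference, or NULL if exhausted. */
static struct inode_model *new_inode_model(void)
{
	struct inode_model *inode;

	if (pool_used == TABLE_SIZE)
		return NULL;
	inode = &pool[pool_used++];
	inode->refcount = 1;
	return inode;
}

static void iput_model(struct inode_model *inode)
{
	assert(inode->refcount > 0);
	inode->refcount--;
}

static struct inode_model *lookup_model(unsigned long long ino)
{
	for (size_t i = 0; i < TABLE_SIZE; i++)
		if (table[i] && table[i]->ino == ino)
			return table[i];
	return NULL;
}

/* Return the hashed inode for ino with an extra reference, else hash newino. */
static struct inode_model *insert_model(struct inode_model *newino,
					unsigned long long ino)
{
	struct inode_model *old = lookup_model(ino);

	if (old) {
		old->refcount++;
		return old;
	}
	for (size_t i = 0; i < TABLE_SIZE; i++) {
		if (!table[i]) {
			newino->ino = ino;
			table[i] = newino;
			return newino;
		}
	}
	return NULL; /* table full (model-only failure case) */
}

/* Mirrors the shape of the patched ceph_get_inode(): newino is consumed. */
static struct inode_model *get_inode_model(unsigned long long ino,
					   struct inode_model *newino)
{
	struct inode_model *inode;

	if (!newino) {
		/* Lookup-or-allocate path, as taken by the NULL callers. */
		inode = lookup_model(ino);
		if (inode) {
			inode->refcount++;
			return inode;
		}
		newino = new_inode_model();
		if (!newino)
			return NULL;
	}

	inode = insert_model(newino, ino);
	if (inode != newino)
		iput_model(newino); /* drop the unused preallocated inode */
	return inode;
}
```

In this model a caller that passes a preallocated inode never has to release it afterwards: either it comes back as the result, or its reference has already been dropped.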