lustre-devel-lustre.org archive mirror
* [lustre-devel] [PATCH 00/18] Port of OpenSFS landing as of July 1, 2020
@ 2020-07-02  0:04 James Simmons
  2020-07-02  0:04 ` [lustre-devel] [PATCH 01/18] lnet: restore an maximal fragments count James Simmons
                   ` (18 more replies)
  0 siblings, 19 replies; 20+ messages in thread
From: James Simmons @ 2020-07-02  0:04 UTC (permalink / raw)
  To: lustre-devel

Port of patches that landed on the OpenSFS branch. A few patches that
were missing have been added; they enable Lustre utilities to
potentially be built against the Linux client. Please review to make
sure everything is okay.

Alexander Boyko (1):
  lustre: llite: don't hold inode_lock for security notify

Alexey Lyashkov (2):
  lnet: restore an maximal fragments count
  lnet: o2ib: fix page mapping error

Amir Shehata (1):
  lnet: handle undefined parameters

Andriy Skulysh (1):
  lustre: llite: truncate deadlock with DoM files

Chris Horn (1):
  lnet: Skip health and resends for single rail configs

Emoly Liu (1):
  lustre: obd: add new LPROCFS_TYPE_*

Gregoire Pichon (1):
  lnet: define new network driver ptl4lnd

Hongchao Zhang (1):
  lustre: mdc: chlg device could be used after free

James Simmons (2):
  lustre: llite: bind kthread thread to accepted node set
  lustre: lov: use lov_pattern_support() to verify lmm

Lai Siyao (1):
  lustre: mdt: don't fetch LOOKUP lock for remote object

Mikhail Pershin (1):
  lustre: ptlrpc: limit rate of lock replays

Sebastien Buisson (5):
  lustre: sec: encryption for write path
  lustre: sec: decryption for read path
  lustre: sec: deal with encrypted object size
  lustre: sec: support truncate for encrypted files
  lustre: sec: ioctls to handle encryption policies

 fs/lustre/include/lprocfs_status.h         |   9 +-
 fs/lustre/include/lustre_import.h          |   2 +
 fs/lustre/include/lustre_osc.h             |   1 +
 fs/lustre/include/obd.h                    |  19 ++-
 fs/lustre/include/obd_class.h              |   3 +-
 fs/lustre/include/obd_support.h            |   5 +
 fs/lustre/ldlm/ldlm_request.c              |  69 ++++++++++-
 fs/lustre/llite/crypto.c                   |  15 ++-
 fs/lustre/llite/dir.c                      |  50 +++++++-
 fs/lustre/llite/file.c                     |  19 ++-
 fs/lustre/llite/llite_internal.h           |   1 +
 fs/lustre/llite/llite_lib.c                | 187 ++++++++++++++++++++++++++++-
 fs/lustre/llite/lproc_llite.c              |   8 +-
 fs/lustre/llite/namei.c                    | 105 +++++++++++++---
 fs/lustre/llite/rw.c                       |  13 +-
 fs/lustre/llite/rw26.c                     |   4 +
 fs/lustre/llite/statahead.c                |  11 +-
 fs/lustre/llite/vvp_io.c                   |  17 ++-
 fs/lustre/lmv/lmv_intent.c                 |  19 ++-
 fs/lustre/lmv/lmv_internal.h               |   1 +
 fs/lustre/lmv/lmv_obd.c                    |   3 +-
 fs/lustre/lov/lov_ea.c                     |   6 +-
 fs/lustre/mdc/mdc_changelog.c              |  46 ++++---
 fs/lustre/mdc/mdc_internal.h               |   1 +
 fs/lustre/mdc/mdc_request.c                |   8 +-
 fs/lustre/obdclass/genops.c                |   1 +
 fs/lustre/obdecho/echo_client.c            |   2 +
 fs/lustre/obdecho/echo_internal.h          |   3 +
 fs/lustre/osc/osc_internal.h               |   1 +
 fs/lustre/osc/osc_request.c                | 121 ++++++++++++++++++-
 fs/lustre/ptlrpc/import.c                  |   8 +-
 include/linux/lnet/lib-lnet.h              |   4 +-
 include/linux/lnet/lib-types.h             |   2 +-
 include/uapi/linux/lnet/nidstr.h           |   1 +
 include/uapi/linux/lustre/lustre_user.h    |   8 ++
 net/lnet/klnds/o2iblnd/o2iblnd.c           |   7 +-
 net/lnet/klnds/o2iblnd/o2iblnd_cb.c        |   3 +-
 net/lnet/klnds/o2iblnd/o2iblnd_modparams.c |   4 +-
 net/lnet/klnds/socklnd/socklnd_modparams.c |   4 +-
 net/lnet/lnet/api-ni.c                     |  26 +++-
 net/lnet/lnet/lib-msg.c                    |  65 +++++++---
 net/lnet/lnet/nidstrings.c                 |   9 ++
 42 files changed, 759 insertions(+), 132 deletions(-)

-- 
1.8.3.1

* [lustre-devel] [PATCH 01/18] lnet: restore an maximal fragments count
  2020-07-02  0:04 [lustre-devel] [PATCH 00/18] Port of OpenSFS landing as of July 1, 2020 James Simmons
@ 2020-07-02  0:04 ` James Simmons
  2020-07-02  0:04 ` [lustre-devel] [PATCH 02/18] lnet: o2ib: fix page mapping error James Simmons
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: James Simmons @ 2020-07-02  0:04 UTC (permalink / raw)
  To: lustre-devel

From: Alexey Lyashkov <c17817@cray.com>

Lowering the number of fragments blocks connections from older clients
that want to use 256 fragments per transfer. Restore this number to
its original value.
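
As a rough illustration of why the page-size-dependent definition can
break interoperability, the sketch below evaluates the old expression
for two page sizes. It assumes LNET_MTU is 1 MiB, a value this patch
does not show, and max_iov_for_page_shift() is a hypothetical helper
used only for illustration.

```c
#include <assert.h>

/* Assumption: LNET_MTU is 1 MiB; the value is not shown in this patch. */
#define LNET_MTU		(1U << 20)
#define LNET_MAX_PAYLOAD	LNET_MTU

/* Hypothetical helper: the old, page-size-dependent fragment limit. */
static unsigned int max_iov_for_page_shift(unsigned int page_shift)
{
	/* Shifting a 32-bit value by 32 or more bits is undefined. */
	assert(page_shift < 32);
	return LNET_MAX_PAYLOAD >> page_shift;
}
```

Under that assumption a 4 KiB page (shift 12) yields 256 fragments,
while a 64 KiB page (shift 16) yields only 16, which is the lowered
limit a peer expecting 256 fragments would run into.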

Fixes: ab7e089b8eda ("lustre: lnet: make LNET_MAX_IOV dependent on page size")

Cray-bug-id: LUS-8139
WC-bug-id: https://jira.whamcloud.com/browse/LU-10157
Lustre-commit: 4072d863c240f ("LU-10157 lnet: restore an maximal fragments count")
Signed-off-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-on: https://review.whamcloud.com/37385
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 include/linux/lnet/lib-types.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/lnet/lib-types.h b/include/linux/lnet/lib-types.h
index 6aa691e..1c016fd 100644
--- a/include/linux/lnet/lib-types.h
+++ b/include/linux/lnet/lib-types.h
@@ -51,7 +51,7 @@
 /* Max payload size */
 #define LNET_MAX_PAYLOAD	LNET_MTU
 
-#define LNET_MAX_IOV		(LNET_MAX_PAYLOAD >> PAGE_SHIFT)
+#define LNET_MAX_IOV		256
 
 /*
  * This is the maximum health value.
-- 
1.8.3.1

* [lustre-devel] [PATCH 02/18] lnet: o2ib: fix page mapping error
  2020-07-02  0:04 [lustre-devel] [PATCH 00/18] Port of OpenSFS landing as of July 1, 2020 James Simmons
  2020-07-02  0:04 ` [lustre-devel] [PATCH 01/18] lnet: restore an maximal fragments count James Simmons
@ 2020-07-02  0:04 ` James Simmons
  2020-07-02  0:04 ` [lustre-devel] [PATCH 03/18] lustre: sec: encryption for write path James Simmons
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: James Simmons @ 2020-07-02  0:04 UTC (permalink / raw)
  To: lustre-devel

From: Alexey Lyashkov <c17817@cray.com>

IB DMA mapping can merge a physically contiguous page region into a
single entry.
This confuses kiblnd_fmr_pool_map(), which expects to see all
fragments mapped.
It generates the error
 (o2iblnd.c:1926:kiblnd_fmr_pool_map()) Failed to map mr 1/16 elements

From studying the IB code, it looks like the ib_map_mr_sg() return
code should be checked against the result of ib_dma_map_sg() instead
of the original fragment count, and the same value should be passed
as the argument to ib_map_mr_sg().
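
The mismatch can be sketched in user space as follows. This is not
the kernel code: struct frag, merge_contiguous() and mapping_ok() are
hypothetical stand-ins that only model how merging adjacent regions
shrinks the entry count that a later mapping step must be compared
against.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one scatterlist entry (not the kernel type). */
struct frag {
	uint64_t addr;	/* physical start address */
	uint32_t len;	/* length in bytes */
};

/*
 * Merge physically adjacent fragments in place, roughly what a DMA
 * mapping layer is allowed to do. Returns the number of entries left.
 */
static size_t merge_contiguous(struct frag *frags, size_t nfrags)
{
	size_t out = 0;
	size_t i;

	assert(frags != NULL || nfrags == 0);
	for (i = 0; i < nfrags; i++) {
		if (out > 0 &&
		    frags[out - 1].addr + frags[out - 1].len == frags[i].addr)
			frags[out - 1].len += frags[i].len;
		else
			frags[out++] = frags[i];
	}
	return out;
}

/* The later mapping step is validated against the merged count. */
static int mapping_ok(size_t mapped_by_mr, size_t merged_count)
{
	return mapped_by_mr == merged_count;
}
```

With 16 adjacent 4 KiB fragments the merged count is 1, so comparing
the mapped count against the original 16 reports a failure even
though the mapping is complete, which matches the "1/16 elements"
message quoted above.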

Cray-bug-id: LUS-8139
WC-bug-id: https://jira.whamcloud.com/browse/LU-13181
Lustre-commit: 40385cda7afbd ("LU-13181 o2ib: fix page mapping error")
Signed-off-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-on: https://review.whamcloud.com/37388
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 net/lnet/klnds/o2iblnd/o2iblnd.c    | 7 ++++---
 net/lnet/klnds/o2iblnd/o2iblnd_cb.c | 3 ++-
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/net/lnet/klnds/o2iblnd/o2iblnd.c b/net/lnet/klnds/o2iblnd/o2iblnd.c
index 3a76447..16edfba 100644
--- a/net/lnet/klnds/o2iblnd/o2iblnd.c
+++ b/net/lnet/klnds/o2iblnd/o2iblnd.c
@@ -1737,10 +1737,11 @@ int kiblnd_fmr_pool_map(struct kib_fmr_poolset *fps, struct kib_tx *tx,
 				}
 
 				n = ib_map_mr_sg(mr, tx->tx_frags,
-						 tx->tx_nfrags, NULL, PAGE_SIZE);
-				if (unlikely(n != tx->tx_nfrags)) {
+						 rd->rd_nfrags, NULL,
+						 PAGE_SIZE);
+				if (unlikely(n != rd->rd_nfrags)) {
 					CERROR("Failed to map mr %d/%d elements\n",
-					       n, tx->tx_nfrags);
+					       n, rd->rd_nfrags);
 					return n < 0 ? n : -EINVAL;
 				}
 
diff --git a/net/lnet/klnds/o2iblnd/o2iblnd_cb.c b/net/lnet/klnds/o2iblnd/o2iblnd_cb.c
index 40e196d..3b9d10d 100644
--- a/net/lnet/klnds/o2iblnd/o2iblnd_cb.c
+++ b/net/lnet/klnds/o2iblnd/o2iblnd_cb.c
@@ -595,7 +595,8 @@ static int kiblnd_init_rdma(struct kib_conn *conn, struct kib_tx *tx, int type,
 	fps = net->ibn_fmr_ps[cpt];
 	rc = kiblnd_fmr_pool_map(fps, tx, rd, nob, 0, &tx->tx_fmr);
 	if (rc) {
-		CERROR("Can't map %u bytes: %d\n", nob, rc);
+		CERROR("Can't map %u bytes (%u/%u)s: %d\n", nob,
+		       tx->tx_nfrags, rd->rd_nfrags, rc);
 		return rc;
 	}
 
-- 
1.8.3.1

* [lustre-devel] [PATCH 03/18] lustre: sec: encryption for write path
  2020-07-02  0:04 [lustre-devel] [PATCH 00/18] Port of OpenSFS landing as of July 1, 2020 James Simmons
  2020-07-02  0:04 ` [lustre-devel] [PATCH 01/18] lnet: restore an maximal fragments count James Simmons
  2020-07-02  0:04 ` [lustre-devel] [PATCH 02/18] lnet: o2ib: fix page mapping error James Simmons
@ 2020-07-02  0:04 ` James Simmons
  2020-07-02  0:04 ` [lustre-devel] [PATCH 04/18] lustre: sec: decryption for read path James Simmons
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: James Simmons @ 2020-07-02  0:04 UTC (permalink / raw)
  To: lustre-devel

From: Sebastien Buisson <sbuisson@ddn.com>

The first aspect is to make sure the encryption context is properly
set on files/dirs that are created or opened/looked up.
Encryption itself is then carried out in osc_brw_prep_request(), just
before pages are added to the request to be sent. Because pages in
the page cache must hold clear text data, we have to use bounce pages
for encryption. The allocation is handled by fscrypt, and for
deallocation we call fscrypt_pullback_bio_page() and/or
fscrypt_finalize_bounce_page().
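
The bookkeeping that lets the clear text offset and count be restored
after the bounce page is released can be sketched as below. This is a
user-space model, not the patch itself: the sketch_ names are
hypothetical, and a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE	4096U	/* assumption: 4 KiB pages */
#define SKETCH_PAGE_MASK	(~(uint64_t)(SKETCH_PAGE_SIZE - 1))

/* Hypothetical, reduced model of the fields added to struct brw_page. */
struct sketch_brw_page {
	uint64_t off;
	uint32_t count;
	uint16_t bp_off_diff;	/* clear text offset within the page */
	uint16_t bp_count_diff;	/* bytes added to reach a full page */
};

/* Switch to a full-page transfer, remembering the clear text values. */
static void sketch_to_bounce(struct sketch_brw_page *pg)
{
	assert(pg->count <= SKETCH_PAGE_SIZE);
	pg->bp_count_diff = (uint16_t)(SKETCH_PAGE_SIZE - pg->count);
	pg->count = SKETCH_PAGE_SIZE;
	pg->bp_off_diff = (uint16_t)(pg->off & ~SKETCH_PAGE_MASK);
	pg->off &= SKETCH_PAGE_MASK;
}

/* Undo sketch_to_bounce() once the bounce page has been released. */
static void sketch_from_bounce(struct sketch_brw_page *pg)
{
	pg->count -= pg->bp_count_diff;
	pg->off += pg->bp_off_diff;
}
```

Storing only the two differences keeps the restore step to one
subtraction and one addition per page.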

WC-bug-id: https://jira.whamcloud.com/browse/LU-12275
Lustre-commit: a9ed5b149646f ("LU-12275 sec: encryption for write path")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/36144
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/include/lustre_osc.h    |  1 +
 fs/lustre/include/obd.h           | 17 ++++++++
 fs/lustre/include/obd_support.h   |  5 +++
 fs/lustre/llite/crypto.c          |  5 ++-
 fs/lustre/llite/dir.c             | 16 +++++++
 fs/lustre/llite/namei.c           | 87 ++++++++++++++++++++++++++++++++++-----
 fs/lustre/llite/rw26.c            |  4 ++
 fs/lustre/obdecho/echo_client.c   |  2 +
 fs/lustre/obdecho/echo_internal.h |  3 ++
 fs/lustre/osc/osc_internal.h      |  1 +
 fs/lustre/osc/osc_request.c       | 77 +++++++++++++++++++++++++++++++++-
 11 files changed, 206 insertions(+), 12 deletions(-)

diff --git a/fs/lustre/include/lustre_osc.h b/fs/lustre/include/lustre_osc.h
index 4b448b9..11b7e92 100644
--- a/fs/lustre/include/lustre_osc.h
+++ b/fs/lustre/include/lustre_osc.h
@@ -52,6 +52,7 @@
 #include <obd.h>
 #include <cl_object.h>
 #include <linux/libcfs/libcfs_hash.h>
+#include <lustre_crypto.h>
 
 struct osc_quota_info {
 	/* linkage for quota hash table */
diff --git a/fs/lustre/include/obd.h b/fs/lustre/include/obd.h
index 0ff19c8..f9e0920 100644
--- a/fs/lustre/include/obd.h
+++ b/fs/lustre/include/obd.h
@@ -116,6 +116,11 @@ struct brw_page {
 	struct page	       *pg;
 	unsigned int		count;
 	u32			flag;
+	/* used for encryption: difference with offset in clear text page */
+	u16			bp_off_diff;
+	/* used for encryption: difference with count in clear text page */
+	u16			bp_count_diff;
+	u32			bp_padding;
 };
 
 struct timeout_item {
@@ -1161,4 +1166,16 @@ static inline void client_adjust_max_dirty(struct client_obd *cli)
 					   1 << (20 - PAGE_SHIFT));
 }
 
+static inline struct inode *page2inode(struct page *page)
+{
+	if (page->mapping) {
+		if (PageAnon(page))
+			return NULL;
+		else
+			return page->mapping->host;
+	} else {
+		return NULL;
+	}
+}
+
 #endif /* __OBD_H */
diff --git a/fs/lustre/include/obd_support.h b/fs/lustre/include/obd_support.h
index b5736f8..35c7ef3 100644
--- a/fs/lustre/include/obd_support.h
+++ b/fs/lustre/include/obd_support.h
@@ -583,4 +583,9 @@ struct obd_heat_instance {
 	u64 ohi_count;
 };
 
+/* Define a fixed 4096-byte encryption unit size */
+#define LUSTRE_ENCRYPTION_BLOCKBITS	12
+#define LUSTRE_ENCRYPTION_UNIT_SIZE	((size_t)1 << LUSTRE_ENCRYPTION_BLOCKBITS)
+#define LUSTRE_ENCRYPTION_MASK		(~(LUSTRE_ENCRYPTION_UNIT_SIZE - 1))
+
 #endif
diff --git a/fs/lustre/llite/crypto.c b/fs/lustre/llite/crypto.c
index 94189c9..f411343 100644
--- a/fs/lustre/llite/crypto.c
+++ b/fs/lustre/llite/crypto.c
@@ -52,7 +52,7 @@ static int ll_set_context(struct inode *inode, const void *ctx, size_t len,
 	struct ptlrpc_request *req = NULL;
 	int rc;
 
-	if (inode == NULL)
+	if (!inode)
 		return 0;
 
 	ext_flags = ll_inode_to_ext_flags(inode->i_flags) | LUSTRE_ENCRYPT_FL;
@@ -80,6 +80,9 @@ static int ll_set_context(struct inode *inode, const void *ctx, size_t len,
 	if (rc)
 		return rc;
 
+	/* used as encryption unit size */
+	if (S_ISREG(inode->i_mode))
+		inode->i_blkbits = LUSTRE_ENCRYPTION_BLOCKBITS;
 	ll_update_inode_flags(inode, ext_flags);
 	return 0;
 }
diff --git a/fs/lustre/llite/dir.c b/fs/lustre/llite/dir.c
index 0ffe134..2c93908 100644
--- a/fs/lustre/llite/dir.c
+++ b/fs/lustre/llite/dir.c
@@ -388,6 +388,7 @@ static int ll_dir_setdirstripe(struct dentry *dparent, struct lmv_user_md *lump,
 					       strlen(dirname)),
 		},
 	};
+	bool encrypt = false;
 	int err;
 
 	if (unlikely(!lmv_user_magic_supported(lump->lum_magic)))
@@ -446,6 +447,18 @@ static int ll_dir_setdirstripe(struct dentry *dparent, struct lmv_user_md *lump,
 	if (IS_ERR(op_data))
 		return PTR_ERR(op_data);
 
+	if (IS_ENCRYPTED(parent) ||
+	    unlikely(llcrypt_dummy_context_enabled(parent))) {
+		err = llcrypt_get_encryption_info(parent);
+		if (err)
+			goto out_op_data;
+		if (!llcrypt_has_encryption_key(parent)) {
+			err = -ENOKEY;
+			goto out_op_data;
+		}
+		encrypt = true;
+	}
+
 	if (sbi->ll_flags & LL_SBI_FILE_SECCTX) {
 		/*
 		 * selinux_dentry_init_security() uses dentry->d_parent and name
@@ -484,6 +497,9 @@ static int ll_dir_setdirstripe(struct dentry *dparent, struct lmv_user_md *lump,
 		err = ll_inode_init_security(&dentry, inode, parent);
 	}
 
+	if (encrypt)
+		err = llcrypt_inherit_context(parent, inode, NULL, false);
+
 out_inode:
 	iput(inode);
 out_request:
diff --git a/fs/lustre/llite/namei.c b/fs/lustre/llite/namei.c
index aa2dd13..2353a8f 100644
--- a/fs/lustre/llite/namei.c
+++ b/fs/lustre/llite/namei.c
@@ -47,7 +47,8 @@
 #include "llite_internal.h"
 
 static int ll_create_it(struct inode *dir, struct dentry *dentry,
-			struct lookup_intent *it, void *secctx, u32 secctxlen);
+			struct lookup_intent *it,
+			void *secctx, u32 secctxlen, bool encrypt);
 
 /* called from iget5_locked->find_inode() under inode_hash_lock spinlock */
 static int ll_test_inode(struct inode *inode, void *opaque)
@@ -605,7 +606,7 @@ static int ll_lookup_it_finish(struct ptlrpc_request *request,
 			       struct lookup_intent *it,
 			       struct inode *parent, struct dentry **de,
 			       void *secctx, u32 secctxlen,
-			       ktime_t kstart)
+			       ktime_t kstart, bool encrypt)
 {
 	struct inode *inode = NULL;
 	u64 bits = 0;
@@ -679,6 +680,16 @@ static int ll_lookup_it_finish(struct ptlrpc_request *request,
 		/* We have the "lookup" lock, so unhide dentry */
 		if (bits & MDS_INODELOCK_LOOKUP)
 			d_lustre_revalidate(*de);
+
+		if (encrypt) {
+			rc = llcrypt_get_encryption_info(inode);
+			if (rc)
+				goto out;
+			if (!llcrypt_has_encryption_key(inode)) {
+				rc = -ENOKEY;
+				goto out;
+			}
+		}
 	} else if (!it_disposition(it, DISP_OPEN_CREATE)) {
 		/*
 		 * If file was created on the server, the dentry is revalidated
@@ -725,7 +736,8 @@ static int ll_lookup_it_finish(struct ptlrpc_request *request,
 static struct dentry *ll_lookup_it(struct inode *parent, struct dentry *dentry,
 				   struct lookup_intent *it, void **secctx,
 				   u32 *secctxlen,
-				   struct pcc_create_attach *pca)
+				   struct pcc_create_attach *pca,
+				   bool encrypt)
 {
 	ktime_t kstart = ktime_get();
 	struct lookup_intent lookup_it = { .it_op = IT_LOOKUP };
@@ -894,7 +906,7 @@ static struct dentry *ll_lookup_it(struct inode *parent, struct dentry *dentry,
 	rc = ll_lookup_it_finish(req, it, parent, &dentry,
 				 secctx ? *secctx : NULL,
 				 secctxlen ? *secctxlen : 0,
-				kstart);
+				 kstart, encrypt);
 	if (rc != 0) {
 		ll_intent_release(it);
 		retval = ERR_PTR(rc);
@@ -952,7 +964,7 @@ static struct dentry *ll_lookup_nd(struct inode *parent, struct dentry *dentry,
 		itp = NULL;
 	else
 		itp = &it;
-	de = ll_lookup_it(parent, dentry, itp, NULL, NULL, NULL);
+	de = ll_lookup_it(parent, dentry, itp, NULL, NULL, NULL, false);
 
 	if (itp)
 		ll_intent_release(itp);
@@ -972,8 +984,9 @@ static int ll_atomic_open(struct inode *dir, struct dentry *dentry,
 	void *secctx = NULL;
 	u32 secctxlen = 0;
 	struct dentry *de;
-	struct ll_sb_info *sbi;
+	struct ll_sb_info *sbi = NULL;
 	struct pcc_create_attach pca = { NULL, NULL };
+	bool encrypt = false;
 	int rc = 0;
 
 	CDEBUG(D_VFSTRACE,
@@ -1025,8 +1038,23 @@ static int ll_atomic_open(struct inode *dir, struct dentry *dentry,
 	it->it_flags = (open_flags & ~O_ACCMODE) | OPEN_FMODE(open_flags);
 	it->it_flags &= ~MDS_OPEN_FL_INTERNAL;
 
+	if (IS_ENCRYPTED(dir)) {
+		/* we know that we are going to create a regular file because
+		 * we set S_IFREG bit on it->it_create_mode above
+		 */
+		rc = llcrypt_get_encryption_info(dir);
+		if (rc)
+			goto out_release;
+		if (!llcrypt_has_encryption_key(dir)) {
+			rc = -ENOKEY;
+			goto out_release;
+		}
+		encrypt = true;
+		rc = 0;
+	}
+
 	/* Dentry added to dcache tree in ll_lookup_it */
-	de = ll_lookup_it(dir, dentry, it, &secctx, &secctxlen, &pca);
+	de = ll_lookup_it(dir, dentry, it, &secctx, &secctxlen, &pca, encrypt);
 	if (IS_ERR(de))
 		rc = PTR_ERR(de);
 	else if (de)
@@ -1035,7 +1063,8 @@ static int ll_atomic_open(struct inode *dir, struct dentry *dentry,
 	if (!rc) {
 		if (it_disposition(it, DISP_OPEN_CREATE)) {
 			/* Dentry instantiated in ll_create_it. */
-			rc = ll_create_it(dir, dentry, it, secctx, secctxlen);
+			rc = ll_create_it(dir, dentry, it, secctx, secctxlen,
+					  encrypt);
 			security_release_secctx(secctx, secctxlen);
 			if (rc) {
 				/* We dget in ll_splice_alias. */
@@ -1150,7 +1179,8 @@ static struct inode *ll_create_node(struct inode *dir, struct lookup_intent *it)
  * with d_instantiate().
  */
 static int ll_create_it(struct inode *dir, struct dentry *dentry,
-			struct lookup_intent *it, void *secctx, u32 secctxlen)
+			struct lookup_intent *it,
+			void *secctx, u32 secctxlen, bool encrypt)
 {
 	struct inode *inode;
 	u64 bits = 0;
@@ -1185,6 +1215,12 @@ static int ll_create_it(struct inode *dir, struct dentry *dentry,
 
 	d_instantiate(dentry, inode);
 
+	if (encrypt) {
+		rc = llcrypt_inherit_context(dir, inode, dentry, true);
+		if (rc)
+			return rc;
+	}
+
 	if (!(ll_i2sbi(inode)->ll_flags & LL_SBI_FILE_SECCTX))
 		rc = ll_inode_init_security(dentry, inode, dir);
 
@@ -1214,10 +1250,11 @@ static int ll_new_node(struct inode *dir, struct dentry *dentry,
 		       u32 opc)
 {
 	struct ptlrpc_request *request = NULL;
-	struct md_op_data *op_data;
+	struct md_op_data *op_data = NULL;
 	struct inode *inode = NULL;
 	struct ll_sb_info *sbi = ll_i2sbi(dir);
 	int tgt_len = 0;
+	int encrypt = 0;
 	int err;
 
 	if (unlikely(tgt))
@@ -1241,6 +1278,19 @@ static int ll_new_node(struct inode *dir, struct dentry *dentry,
 			goto err_exit;
 	}
 
+	if ((IS_ENCRYPTED(dir) &&
+	    (S_ISREG(mode) || S_ISDIR(mode) || S_ISLNK(mode))) ||
+	    (unlikely(llcrypt_dummy_context_enabled(dir)) && S_ISDIR(mode))) {
+		err = llcrypt_get_encryption_info(dir);
+		if (err)
+			goto err_exit;
+		if (!llcrypt_has_encryption_key(dir)) {
+			err = -ENOKEY;
+			goto err_exit;
+		}
+		encrypt = 1;
+	}
+
 	err = md_create(sbi->ll_md_exp, op_data, tgt, tgt_len, mode,
 			from_kuid(&init_user_ns, current_fsuid()),
 			from_kgid(&init_user_ns, current_fsgid()),
@@ -1335,6 +1385,12 @@ static int ll_new_node(struct inode *dir, struct dentry *dentry,
 
 	d_instantiate(dentry, inode);
 
+	if (encrypt) {
+		err = llcrypt_inherit_context(dir, inode, NULL, true);
+		if (err)
+			goto err_exit;
+	}
+
 	if (!(sbi->ll_flags & LL_SBI_FILE_SECCTX))
 		err = ll_inode_init_security(dentry, inode, dir);
 err_exit:
@@ -1547,6 +1603,10 @@ static int ll_link(struct dentry *old_dentry, struct inode *dir,
 	       PFID(ll_inode2fid(src)), src, PFID(ll_inode2fid(dir)), dir,
 	       new_dentry);
 
+	err = llcrypt_prepare_link(old_dentry, dir, new_dentry);
+	if (err)
+		return err;
+
 	op_data = ll_prep_md_op_data(NULL, src, dir, new_dentry->d_name.name,
 				     new_dentry->d_name.len,
 				     0, LUSTRE_OPC_ANY, NULL);
@@ -1584,6 +1644,13 @@ static int ll_rename(struct inode *src, struct dentry *src_dchild,
 	       src_dchild, PFID(ll_inode2fid(src)), src,
 	       tgt_dchild, PFID(ll_inode2fid(tgt)), tgt);
 
+	if (unlikely(d_mountpoint(src_dchild) || d_mountpoint(tgt_dchild)))
+		return -EBUSY;
+
+	err = llcrypt_prepare_rename(src, src_dchild, tgt, tgt_dchild, flags);
+	if (err)
+		return err;
+
 	op_data = ll_prep_md_op_data(NULL, src, tgt, NULL, 0, 0,
 				     LUSTRE_OPC_ANY, NULL);
 	if (IS_ERR(op_data))
diff --git a/fs/lustre/llite/rw26.c b/fs/lustre/llite/rw26.c
index 5e7aa6e..0971185 100644
--- a/fs/lustre/llite/rw26.c
+++ b/fs/lustre/llite/rw26.c
@@ -291,6 +291,10 @@ static ssize_t ll_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
 	loff_t file_offset = iocb->ki_pos;
 	int rw = iov_iter_rw(iter);
 
+	/* if file is encrypted, return 0 so that we fall back to buffered IO */
+	if (IS_ENCRYPTED(inode))
+		return 0;
+
 	/* Check EOF by ourselves */
 	if (rw == READ && file_offset >= i_size_read(inode))
 		return 0;
diff --git a/fs/lustre/obdecho/echo_client.c b/fs/lustre/obdecho/echo_client.c
index 2324e38..a52e0362 100644
--- a/fs/lustre/obdecho/echo_client.c
+++ b/fs/lustre/obdecho/echo_client.c
@@ -1317,6 +1317,8 @@ static int echo_client_kbrw(struct echo_device *ed, int rw, struct obdo *oa,
 		if (!pgp->pg)
 			goto out;
 
+		/* set mapping so page is not considered encrypted */
+		pgp->pg->mapping = ECHO_MAPPING_UNENCRYPTED;
 		pages[i] = pgp->pg;
 		pgp->count = PAGE_SIZE;
 		pgp->off = off;
diff --git a/fs/lustre/obdecho/echo_internal.h b/fs/lustre/obdecho/echo_internal.h
index f9bb0b91..95b0149 100644
--- a/fs/lustre/obdecho/echo_internal.h
+++ b/fs/lustre/obdecho/echo_internal.h
@@ -43,4 +43,7 @@
 int block_debug_setup(void *addr, int len, u64 off, u64 id);
 int block_debug_check(char *who, void *addr, int len, u64 off, u64 id);
 
+/* mapping value to tell page is not encrypted */
+#define ECHO_MAPPING_UNENCRYPTED ((void *)1)
+
 #endif
diff --git a/fs/lustre/osc/osc_internal.h b/fs/lustre/osc/osc_internal.h
index d05595a..6bec6bf 100644
--- a/fs/lustre/osc/osc_internal.h
+++ b/fs/lustre/osc/osc_internal.h
@@ -216,4 +216,5 @@ static inline void osc_set_io_portal(struct ptlrpc_request *req)
 	else
 		req->rq_request_portal = OST_IO_PORTAL;
 }
+
 #endif /* OSC_INTERNAL_H */
diff --git a/fs/lustre/osc/osc_request.c b/fs/lustre/osc/osc_request.c
index b1bf8c6..db97d37 100644
--- a/fs/lustre/osc/osc_request.c
+++ b/fs/lustre/osc/osc_request.c
@@ -36,6 +36,7 @@
 #include <linux/workqueue.h>
 #include <linux/falloc.h>
 #include <linux/highmem.h>
+#include <linux/pagemap.h>
 #include <linux/sched/mm.h>
 
 #include <lustre_dlm.h>
@@ -1354,6 +1355,26 @@ static int osc_checksum_bulk_rw(const char *obd_name,
 	return rc;
 }
 
+static inline void osc_release_bounce_pages(struct brw_page **pga,
+					    u32 page_count)
+{
+#ifdef CONFIG_FS_ENCRYPTION
+	int i;
+
+	for (i = 0; i < page_count; i++) {
+		if (pga[i]->pg->mapping)
+			/* bounce pages are unmapped */
+			continue;
+		if (pga[i]->flag & OBD_BRW_SYNC)
+			/* sync transfer cannot have encrypted pages */
+			continue;
+		llcrypt_finalize_bounce_page(&pga[i]->pg);
+		pga[i]->count -= pga[i]->bp_count_diff;
+		pga[i]->off += pga[i]->bp_off_diff;
+	}
+#endif
+}
+
 static int osc_brw_prep_request(int cmd, struct client_obd *cli,
 				struct obdo *oa, u32 page_count,
 				struct brw_page **pga,
@@ -1371,7 +1392,9 @@ static int osc_brw_prep_request(int cmd, struct client_obd *cli,
 	struct brw_page *pg_prev;
 	void *short_io_buf;
 	const char *obd_name = cli->cl_import->imp_obd->obd_name;
+	struct inode *inode;
 
+	inode = page2inode(pga[0]->pg);
 	if (OBD_FAIL_CHECK(OBD_FAIL_OSC_BRW_PREP_REQ))
 		return -ENOMEM; /* Recoverable */
 	if (OBD_FAIL_CHECK(OBD_FAIL_OSC_BRW_PREP_REQ2))
@@ -1389,6 +1412,51 @@ static int osc_brw_prep_request(int cmd, struct client_obd *cli,
 	if (!req)
 		return -ENOMEM;
 
+	if (opc == OST_WRITE && inode && IS_ENCRYPTED(inode)) {
+		for (i = 0; i < page_count; i++) {
+			struct brw_page *pg = pga[i];
+			struct page *data_page = NULL;
+			bool retried = false;
+			bool lockedbymyself;
+
+retry_encrypt:
+			/* The page can already be locked when we arrive here.
+			 * This is possible when cl_page_assume/vvp_page_assume
+			 * is stuck on wait_on_page_writeback with page lock
+			 * held. In this case there is no risk for the lock to
+			 * be released while we are doing our encryption
+			 * processing, because writeback against that page will
+			 * end in vvp_page_completion_write/cl_page_completion,
+			 * which means only once the page is fully processed.
+			 */
+			lockedbymyself = trylock_page(pg->pg);
+			data_page =
+				llcrypt_encrypt_pagecache_blocks(pg->pg,
+								 PAGE_SIZE, 0,
+								 GFP_NOFS);
+			if (lockedbymyself)
+				unlock_page(pg->pg);
+			if (IS_ERR(data_page)) {
+				rc = PTR_ERR(data_page);
+				if (rc == -ENOMEM && !retried) {
+					retried = true;
+					rc = 0;
+					goto retry_encrypt;
+				}
+				ptlrpc_request_free(req);
+				return rc;
+			}
+			/* len is forced to PAGE_SIZE, and poff to 0
+			 * so store the old, clear text info
+			 */
+			pg->pg = data_page;
+			pg->bp_count_diff = PAGE_SIZE - pg->count;
+			pg->count = PAGE_SIZE;
+			pg->bp_off_diff = pg->off & ~PAGE_MASK;
+			pg->off = pg->off & PAGE_MASK;
+		}
+	}
+
 	for (niocount = i = 1; i < page_count; i++) {
 		if (!can_merge_pages(pga[i - 1], pga[i]))
 			niocount++;
@@ -2115,6 +2183,10 @@ static int brw_interpret(const struct lu_env *env,
 
 	rc = osc_brw_fini_request(req, rc);
 	CDEBUG(D_INODE, "request %p aa %p rc %d\n", req, aa, rc);
+
+	/* restore clear text pages */
+	osc_release_bounce_pages(aa->aa_ppga, aa->aa_page_count);
+
 	/*
 	 * When server returns -EINPROGRESS, client should always retry
 	 * regardless of the number of times the bulk was resent already.
@@ -2430,7 +2502,10 @@ int osc_build_rpc(const struct lu_env *env, struct client_obd *cli,
 		LASSERT(!req);
 
 		kmem_cache_free(osc_obdo_kmem, oa);
-		kfree(pga);
+		if (pga) {
+			osc_release_bounce_pages(pga, page_count);
+			osc_release_ppga(pga, page_count);
+		}
 		/* this should happen rarely and is pretty bad, it makes the
 		 * pending list not follow the dirty order
 		 */
-- 
1.8.3.1

* [lustre-devel] [PATCH 04/18] lustre: sec: decryption for read path
  2020-07-02  0:04 [lustre-devel] [PATCH 00/18] Port of OpenSFS landing as of July 1, 2020 James Simmons
                   ` (2 preceding siblings ...)
  2020-07-02  0:04 ` [lustre-devel] [PATCH 03/18] lustre: sec: encryption for write path James Simmons
@ 2020-07-02  0:04 ` James Simmons
  2020-07-02  0:04 ` [lustre-devel] [PATCH 05/18] lustre: sec: deal with encrypted object size James Simmons
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: James Simmons @ 2020-07-02  0:04 UTC (permalink / raw)
  To: lustre-devel

From: Sebastien Buisson <sbuisson@ddn.com>

With support for encryption, all files need to be opened with
fscrypt_file_open(). fscrypt retrieves the encryption context if the
file is encrypted, or returns immediately if not.
Decryption itself is carried out in osc_brw_fini_request(), right
after the reply has been received from the server.
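
Before decrypting, the read path skips pages that contain only
zeros. A minimal user-space sketch of such a check follows;
sketch_page_is_zero() is a hypothetical name, and the reading that
all-zero pages were never encrypted (for example, data read from a
hole) is an assumption rather than something this message states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return 1 if the buffer holds only zero bytes,
 * scanning one 64-bit word at a time and stopping at the first
 * non-zero word.
 */
static int sketch_page_is_zero(const uint64_t *page, size_t page_size)
{
	size_t nwords = page_size / sizeof(*page);
	size_t i;

	assert(page_size % sizeof(*page) == 0);
	for (i = 0; i < nwords; i++) {
		if (page[i] != 0)
			return 0;
	}
	return 1;
}
```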

WC-bug-id: https://jira.whamcloud.com/browse/LU-12275
Lustre-commit: eecf86131d099 ("LU-12275 sec: decryption for read path")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/36145
Reviewed-by: Patrick Farrell <farr0186@gmail.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/llite/crypto.c    | 10 ++++++++--
 fs/lustre/llite/file.c      |  6 ++++++
 fs/lustre/osc/osc_request.c | 31 +++++++++++++++++++++++++++++++
 3 files changed, 45 insertions(+), 2 deletions(-)

diff --git a/fs/lustre/llite/crypto.c b/fs/lustre/llite/crypto.c
index f411343..157017f 100644
--- a/fs/lustre/llite/crypto.c
+++ b/fs/lustre/llite/crypto.c
@@ -32,6 +32,7 @@
 static int ll_get_context(struct inode *inode, void *ctx, size_t len)
 {
 	struct dentry *dentry;
+	int rc;
 
 	if (hlist_empty(&inode->i_dentry))
 		return -ENODATA;
@@ -39,8 +40,13 @@ static int ll_get_context(struct inode *inode, void *ctx, size_t len)
 	hlist_for_each_entry(dentry, &inode->i_dentry, d_u.d_alias)
 		break;
 
-	return __vfs_getxattr(dentry, inode, LL_XATTR_NAME_ENCRYPTION_CONTEXT,
-			      ctx, len);
+	rc = __vfs_getxattr(dentry, inode, LL_XATTR_NAME_ENCRYPTION_CONTEXT,
+			    ctx, len);
+
+	/* used as encryption unit size */
+	if (S_ISREG(inode->i_mode))
+		inode->i_blkbits = LUSTRE_ENCRYPTION_BLOCKBITS;
+	return rc;
 }
 
 static int ll_set_context(struct inode *inode, const void *ctx, size_t len,
diff --git a/fs/lustre/llite/file.c b/fs/lustre/llite/file.c
index 8264b86..3b04952 100644
--- a/fs/lustre/llite/file.c
+++ b/fs/lustre/llite/file.c
@@ -714,6 +714,12 @@ int ll_file_open(struct inode *inode, struct file *file)
 	it = file->private_data; /* XXX: compat macro */
 	file->private_data = NULL; /* prevent ll_local_open assertion */
 
+	if (S_ISREG(inode->i_mode)) {
+		rc = llcrypt_file_open(inode, file);
+		if (rc)
+			goto out_nofiledata;
+	}
+
 	fd = ll_file_data_get();
 	if (!fd) {
 		rc = -ENOMEM;
diff --git a/fs/lustre/osc/osc_request.c b/fs/lustre/osc/osc_request.c
index db97d37..65d17a8 100644
--- a/fs/lustre/osc/osc_request.c
+++ b/fs/lustre/osc/osc_request.c
@@ -1865,6 +1865,7 @@ static int osc_brw_fini_request(struct ptlrpc_request *req, int rc)
 	const char *obd_name = cli->cl_import->imp_obd->obd_name;
 	struct ost_body *body;
 	u32 client_cksum = 0;
+	struct inode *inode;
 
 	if (rc < 0 && rc != -EDQUOT) {
 		DEBUG_REQ(D_INFO, req, "Failed request: rc = %d", rc);
@@ -2055,6 +2056,36 @@ static int osc_brw_fini_request(struct ptlrpc_request *req, int rc)
 	} else {
 		rc = 0;
 	}
+
+	inode = page2inode(aa->aa_ppga[0]->pg);
+	if (inode && IS_ENCRYPTED(inode)) {
+		int idx;
+
+		if (!llcrypt_has_encryption_key(inode)) {
+			CDEBUG(D_SEC, "no enc key for ino %lu\n", inode->i_ino);
+			goto out;
+		}
+		for (idx = 0; idx < aa->aa_page_count; idx++) {
+			struct brw_page *pg = aa->aa_ppga[idx];
+			u64 *p, *q;
+
+			/* do not decrypt if page is all 0s */
+			p = q = page_address(pg->pg);
+			while (p - q < PAGE_SIZE / sizeof(*p)) {
+				if (*p != 0)
+					break;
+				p++;
+			}
+			if (p - q == PAGE_SIZE / sizeof(*p))
+				continue;
+
+			rc = llcrypt_decrypt_pagecache_blocks(pg->pg,
+							      PAGE_SIZE, 0);
+			if (rc)
+				goto out;
+		}
+	}
+
 out:
 	if (rc >= 0)
 		lustre_get_wire_obdo(&req->rq_import->imp_connect_data,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [lustre-devel] [PATCH 05/18] lustre: sec: deal with encrypted object size
  2020-07-02  0:04 [lustre-devel] [PATCH 00/18] Port of OpenSFS landing as of July 1, 2020 James Simmons
                   ` (3 preceding siblings ...)
  2020-07-02  0:04 ` [lustre-devel] [PATCH 04/18] lustre: sec: decryption for read path James Simmons
@ 2020-07-02  0:04 ` James Simmons
  2020-07-02  0:04 ` [lustre-devel] [PATCH 06/18] lustre: sec: support truncate for encrypted files James Simmons
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: James Simmons @ 2020-07-02  0:04 UTC (permalink / raw)
  To: lustre-devel

From: Sebastien Buisson <sbuisson@ddn.com>

The problem with the size of an encrypted file comes from the fact
that an encrypted page always contains PAGE_SIZE bytes of data, even
if the clear text page holds only a few bytes. The server infers the
object size from the content of the encrypted page, so it would
otherwise see a size rounded up to a full page.

To address this, when the client encrypts the page representing the
end of the file on write, it stores the clear text size of the file
in the o_size field of the request body. The server uses this
information to adjust the isize of the object, while still storing
complete pages on disk.

WC-bug-id: https://jira.whamcloud.com/browse/LU-12275
Lustre-commit: 83d660436a164 ("LU-12275 sec: deal with encrypted object size")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/36146
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/osc/osc_request.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/fs/lustre/osc/osc_request.c b/fs/lustre/osc/osc_request.c
index 65d17a8..b27a259 100644
--- a/fs/lustre/osc/osc_request.c
+++ b/fs/lustre/osc/osc_request.c
@@ -1446,10 +1446,18 @@ static int osc_brw_prep_request(int cmd, struct client_obd *cli,
 				ptlrpc_request_free(req);
 				return rc;
 			}
+			pg->pg = data_page;
+			/* there should be no gap in the middle of page array */
+			if (i == page_count - 1) {
+				struct osc_async_page *oap = brw_page2oap(pg);
+
+				oa->o_size = oap->oap_count +
+					     oap->oap_obj_off +
+					     oap->oap_page_off;
+			}
 			/* len is forced to PAGE_SIZE, and poff to 0
 			 * so store the old, clear text info
 			 */
-			pg->pg = data_page;
 			pg->bp_count_diff = PAGE_SIZE - pg->count;
 			pg->count = PAGE_SIZE;
 			pg->bp_off_diff = pg->off & ~PAGE_MASK;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [lustre-devel] [PATCH 06/18] lustre: sec: support truncate for encrypted files
  2020-07-02  0:04 [lustre-devel] [PATCH 00/18] Port of OpenSFS landing as of July 1, 2020 James Simmons
                   ` (4 preceding siblings ...)
  2020-07-02  0:04 ` [lustre-devel] [PATCH 05/18] lustre: sec: deal with encrypted object size James Simmons
@ 2020-07-02  0:04 ` James Simmons
  2020-07-02  0:04 ` [lustre-devel] [PATCH 07/18] lustre: ptlrpc: limit rate of lock replays James Simmons
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: James Simmons @ 2020-07-02  0:04 UTC (permalink / raw)
  To: lustre-devel

From: Sebastien Buisson <sbuisson@ddn.com>

Truncation of encrypted files is not a trivial operation. The page
corresponding to the point where truncation occurs must be read,
decrypted, zeroed after the truncation point, re-encrypted and then
written back.
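
The sequence above can be sketched in user-space C. This is only an
illustration under loud assumptions: demo_xor() is a toy reversible
transform standing in for real encryption, DEMO_PAGE_SIZE is a
hypothetical 4 KiB page, and the read and write-back steps are reduced to
operating on a buffer in memory. The offset arithmetic mirrors the
ia_size & (PAGE_SIZE - 1) computation in the diff.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_SIZE  (1u << DEMO_PAGE_SHIFT)

/* Toy reversible transform: XOR with a key byte. NOT real encryption. */
static void demo_xor(uint8_t *page, uint8_t key)
{
	for (uint32_t i = 0; i < DEMO_PAGE_SIZE; i++)
		page[i] ^= key;
}

/*
 * Truncate the "encrypted" page that contains new_size.
 * Returns the number of bytes zeroed (0 if new_size is page aligned).
 */
static uint32_t demo_truncate_page(uint8_t *page, uint64_t new_size,
				   uint8_t key)
{
	uint32_t offset = (uint32_t)(new_size & (DEMO_PAGE_SIZE - 1));

	assert(page != NULL);
	if (offset == 0)
		return 0;	/* aligned truncate: no partial page to fix */

	demo_xor(page, key);					/* decrypt */
	memset(page + offset, 0, DEMO_PAGE_SIZE - offset);	/* zero tail */
	demo_xor(page, key);					/* re-encrypt */
	return DEMO_PAGE_SIZE - offset;
}
```

Zeroing the ciphertext directly would not work, since the tail of the
decrypted page would then be garbage rather than zeros; that is why the
decrypt and re-encrypt steps bracket the zeroing.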

WC-bug-id: https://jira.whamcloud.com/browse/LU-12275
Lustre-commit: adf46db962f65 ("LU-12275 sec: support truncate for encrypted files")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/37794
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/llite/file.c      |   7 ++
 fs/lustre/llite/llite_lib.c | 182 +++++++++++++++++++++++++++++++++++++++++++-
 fs/lustre/llite/rw.c        |  13 +++-
 fs/lustre/llite/vvp_io.c    |   9 ++-
 fs/lustre/osc/osc_request.c |   7 +-
 5 files changed, 211 insertions(+), 7 deletions(-)

diff --git a/fs/lustre/llite/file.c b/fs/lustre/llite/file.c
index 3b04952..55ae2b3 100644
--- a/fs/lustre/llite/file.c
+++ b/fs/lustre/llite/file.c
@@ -2086,6 +2086,13 @@ static int ll_lov_setstripe(struct inode *inode, struct file *file,
 			goto out;
 
 		rc = ll_file_getstripe(inode, arg, lum_size);
+		if (S_ISREG(inode->i_mode) && IS_ENCRYPTED(inode) &&
+		    ll_i2info(inode)->lli_clob) {
+			struct iattr attr = { 0 };
+
+			rc = cl_setattr_ost(ll_i2info(inode)->lli_clob, &attr,
+					    OP_XVALID_FLAGS, LUSTRE_ENCRYPT_FL);
+		}
 	}
 
 	cl_lov_delay_create_clear(&file->f_flags);
diff --git a/fs/lustre/llite/llite_lib.c b/fs/lustre/llite/llite_lib.c
index aad19a2..0db9eae 100644
--- a/fs/lustre/llite/llite_lib.c
+++ b/fs/lustre/llite/llite_lib.c
@@ -1665,6 +1665,164 @@ static int ll_md_setattr(struct dentry *dentry, struct md_op_data *op_data)
 	return rc;
 }
 
+/**
+ * Zero portion of page that is part of @inode.
+ * This implies, if necessary:
+ * - taking cl_lock on range corresponding to concerned page
+ * - grabbing vm page
+ * - associating cl_page
+ * - proceeding to clio read
+ * - zeroing range in page
+ * - proceeding to cl_page flush
+ * - releasing cl_lock
+ *
+ * @inode	inode
+ * @index	page index
+ * @offset	offset in page to start zero from
+ * @len	len to zero
+ *
+ * Return:	0 on success
+ *		errno on failure
+ */
+int ll_io_zero_page(struct inode *inode, pgoff_t index, pgoff_t offset,
+		    unsigned int len)
+{
+	struct ll_inode_info *lli = ll_i2info(inode);
+	struct cl_object *clob = lli->lli_clob;
+	u16 refcheck;
+	struct lu_env *env = NULL;
+	struct cl_io *io = NULL;
+	struct cl_page *clpage = NULL;
+	struct page *vmpage = NULL;
+	unsigned int from = index << PAGE_SHIFT;
+	struct cl_lock *lock = NULL;
+	struct cl_lock_descr *descr = NULL;
+	struct cl_2queue *queue = NULL;
+	struct cl_sync_io *anchor = NULL;
+	bool holdinglock = false;
+	bool lockedbymyself = true;
+	int rc;
+
+	env = cl_env_get(&refcheck);
+	if (IS_ERR(env))
+		return PTR_ERR(env);
+
+	io = vvp_env_thread_io(env);
+	io->ci_obj = clob;
+	rc = cl_io_rw_init(env, io, CIT_WRITE, from, PAGE_SIZE);
+	if (rc)
+		goto putenv;
+
+	lock = vvp_env_lock(env);
+	descr = &lock->cll_descr;
+	descr->cld_obj = io->ci_obj;
+	descr->cld_start = cl_index(io->ci_obj, from);
+	descr->cld_end = cl_index(io->ci_obj, from + PAGE_SIZE - 1);
+	descr->cld_mode = CLM_WRITE;
+	descr->cld_enq_flags = CEF_MUST | CEF_NONBLOCK;
+
+	/* request lock for page */
+	rc = cl_lock_request(env, io, lock);
+	/* -ECANCELED indicates a matching lock with a different extent
+	 * was already present, and -EEXIST indicates a matching lock
+	 * on exactly the same extent was already present.
+	 * In both cases it means we are covered.
+	 */
+	if (rc == -ECANCELED || rc == -EEXIST)
+		rc = 0;
+	else if (rc < 0)
+		goto iofini;
+	else
+		holdinglock = true;
+
+	/* grab page */
+	vmpage = grab_cache_page_nowait(inode->i_mapping, index);
+	if (!vmpage) {
+		rc = -EOPNOTSUPP;
+		goto rellock;
+	}
+
+	if (!PageDirty(vmpage)) {
+		/* associate cl_page */
+		clpage = cl_page_find(env, clob, vmpage->index,
+				      vmpage, CPT_CACHEABLE);
+		if (IS_ERR(clpage)) {
+			rc = PTR_ERR(clpage);
+			goto pagefini;
+		}
+
+		cl_page_assume(env, io, clpage);
+	}
+
+	if (!PageUptodate(vmpage) && !PageDirty(vmpage) &&
+	    !PageWriteback(vmpage)) {
+		/* read page */
+		/* set PagePrivate2 to detect special case of empty page
+		 * in osc_brw_fini_request()
+		 */
+		SetPagePrivate2(vmpage);
+		rc = ll_io_read_page(env, io, clpage, NULL);
+		if (!PagePrivate2(vmpage))
+			/* PagePrivate2 was cleared in osc_brw_fini_request()
+			 * meaning we read an empty page. In this case, in order
+			 * to avoid allocating unnecessary block in truncated
+			 * file, we must not zero and write as below. Subsequent
+			 * server-side truncate will handle things correctly.
+			 */
+			goto clpfini;
+		ClearPagePrivate2(vmpage);
+		if (rc)
+			goto clpfini;
+		lockedbymyself = trylock_page(vmpage);
+		cl_page_assume(env, io, clpage);
+	}
+
+	/* zero range in page */
+	zero_user(vmpage, offset, len);
+
+	if (holdinglock && clpage) {
+		/* explicitly write newly modified page */
+		queue = &io->ci_queue;
+		cl_2queue_init(queue);
+		anchor = &vvp_env_info(env)->vti_anchor;
+		cl_sync_io_init(anchor, 1);
+		clpage->cp_sync_io = anchor;
+		cl_page_list_add(&queue->c2_qin, clpage);
+		rc = cl_io_submit_rw(env, io, CRT_WRITE, queue);
+		if (rc)
+			goto queuefini1;
+		rc = cl_sync_io_wait(env, anchor, 0);
+		if (rc)
+			goto queuefini2;
+		cl_page_assume(env, io, clpage);
+
+queuefini2:
+		cl_2queue_discard(env, io, queue);
+queuefini1:
+		cl_2queue_disown(env, io, queue);
+		cl_2queue_fini(env, queue);
+	}
+
+clpfini:
+	if (clpage)
+		cl_page_put(env, clpage);
+pagefini:
+	if (lockedbymyself) {
+		unlock_page(vmpage);
+		put_page(vmpage);
+	}
+rellock:
+	if (holdinglock)
+		cl_lock_release(env, lock);
+iofini:
+	cl_io_fini(env, io);
+putenv:
+	if (env)
+		cl_env_put(env, &refcheck);
+
+	return rc;
+}
+
 /* If this inode has objects allocated to it (lsm != NULL), then the OST
  * object(s) determine the file size and mtime.  Otherwise, the MDS will
  * keep these values until such a time that objects are allocated for it.
@@ -1798,6 +1956,8 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr *attr,
 				goto out;
 			}
 		} else {
+			unsigned int flags = 0;
+
 			/* For truncate and utimes sending attributes to OSTs,
 			 * setting mtime/atime to the past will be performed
 			 * under PW [0:EOF] extent lock (new_size:EOF for
@@ -1806,8 +1966,23 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr *attr,
 			 * it is necessary due to possible time
 			 * de-synchronization between MDT inode and OST objects
 			 */
+			if (S_ISREG(inode->i_mode) && IS_ENCRYPTED(inode) &&
+			    attr->ia_valid & ATTR_SIZE) {
+				xvalid |= OP_XVALID_FLAGS;
+				flags = LUSTRE_ENCRYPT_FL;
+				if (attr->ia_size & ~PAGE_MASK) {
+					pgoff_t offset;
+
+					offset = attr->ia_size & (PAGE_SIZE - 1);
+					rc = ll_io_zero_page(inode,
+							     attr->ia_size >> PAGE_SHIFT,
+							     offset, PAGE_SIZE - offset);
+					if (rc)
+						goto out;
+				}
+			}
 			rc = cl_setattr_ost(ll_i2info(inode)->lli_clob,
-					    attr, xvalid, 0);
+					    attr, xvalid, flags);
 		}
 	}
 
@@ -1875,6 +2050,11 @@ int ll_setattr(struct dentry *de, struct iattr *attr)
 {
 	int mode = d_inode(de)->i_mode;
 	enum op_xvalid xvalid = 0;
+	int rc;
+
+	rc = llcrypt_prepare_setattr(de, attr);
+	if (rc)
+		return rc;
 
 	if ((attr->ia_valid & (ATTR_CTIME | ATTR_SIZE | ATTR_MODE)) ==
 			      (ATTR_CTIME | ATTR_SIZE | ATTR_MODE))
diff --git a/fs/lustre/llite/rw.c b/fs/lustre/llite/rw.c
index ff8f3c6..54f0b9a 100644
--- a/fs/lustre/llite/rw.c
+++ b/fs/lustre/llite/rw.c
@@ -1453,8 +1453,8 @@ int ll_io_read_page(const struct lu_env *env, struct cl_io *io,
 			   struct cl_page *page, struct file *file)
 {
 	struct inode *inode = vvp_object_inode(page->cp_obj);
-	struct ll_file_data *fd = file->private_data;
-	struct ll_readahead_state *ras = &fd->fd_ras;
+	struct ll_file_data *fd = NULL;
+	struct ll_readahead_state *ras = NULL;
 	struct cl_2queue *queue = &io->ci_queue;
 	struct ll_sb_info *sbi = ll_i2sbi(inode);
 	struct cl_sync_io *anchor = NULL;
@@ -1464,10 +1464,15 @@ int ll_io_read_page(const struct lu_env *env, struct cl_io *io,
 	struct vvp_page *vpg;
 	bool uptodate;
 
+	if (file) {
+		fd = file->private_data;
+		ras = &fd->fd_ras;
+	}
+
 	vpg = cl2vvp_page(cl_object_page_slice(page->cp_obj, page));
 	uptodate = vpg->vpg_defer_uptodate;
 
-	if (ll_readahead_enabled(sbi) && !vpg->vpg_ra_updated) {
+	if (ll_readahead_enabled(sbi) && !vpg->vpg_ra_updated && ras) {
 		struct vvp_io *vio = vvp_env_io(env);
 		enum ras_update_flags flags = 0;
 
@@ -1494,7 +1499,7 @@ int ll_io_read_page(const struct lu_env *env, struct cl_io *io,
 	io_start_index = cl_index(io->ci_obj, io->u.ci_rw.crw_pos);
 	io_end_index = cl_index(io->ci_obj, io->u.ci_rw.crw_pos +
 				io->u.ci_rw.crw_count - 1);
-	if (ll_readahead_enabled(sbi)) {
+	if (ll_readahead_enabled(sbi) && ras) {
 		rc2 = ll_readahead(env, io, &queue->c2_qin, ras,
 				   uptodate, file);
 		CDEBUG(D_READA, DFID " %d pages read ahead at %lu\n",
diff --git a/fs/lustre/llite/vvp_io.c b/fs/lustre/llite/vvp_io.c
index 371d988..8df5d39 100644
--- a/fs/lustre/llite/vvp_io.c
+++ b/fs/lustre/llite/vvp_io.c
@@ -620,7 +620,14 @@ static int vvp_io_setattr_lock(const struct lu_env *env,
 	u32 enqflags = 0;
 
 	if (cl_io_is_trunc(io)) {
-		if (io->u.ci_setattr.sa_attr.lvb_size == 0)
+		struct inode *inode = vvp_object_inode(io->ci_obj);
+
+		/* set enqueue flags to CEF_MUST in case of encrypted file,
+		 * to prevent lockless truncate
+		 */
+		if (S_ISREG(inode->i_mode) && IS_ENCRYPTED(inode))
+			enqflags = CEF_MUST;
+		else if (io->u.ci_setattr.sa_attr.lvb_size == 0)
 			enqflags = CEF_DISCARD_DATA;
 	} else if (cl_io_is_fallocate(io)) {
 		lock_start = io->u.ci_setattr.sa_falloc_offset;
diff --git a/fs/lustre/osc/osc_request.c b/fs/lustre/osc/osc_request.c
index b27a259..1968d62 100644
--- a/fs/lustre/osc/osc_request.c
+++ b/fs/lustre/osc/osc_request.c
@@ -2084,8 +2084,13 @@ static int osc_brw_fini_request(struct ptlrpc_request *req, int rc)
 					break;
 				p++;
 			}
-			if (p - q == PAGE_SIZE / sizeof(*p))
+			if (p - q == PAGE_SIZE / sizeof(*p)) {
+				/* if page is empty forward info to upper layers
+				 * (ll_io_zero_page) by clearing PagePrivate2
+				 */
+				ClearPagePrivate2(pg->pg);
 				continue;
+			}
 
 			rc = llcrypt_decrypt_pagecache_blocks(pg->pg,
 							      PAGE_SIZE, 0);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [lustre-devel] [PATCH 07/18] lustre: ptlrpc: limit rate of lock replays
  2020-07-02  0:04 [lustre-devel] [PATCH 00/18] Port of OpenSFS landing as of July 1, 2020 James Simmons
                   ` (5 preceding siblings ...)
  2020-07-02  0:04 ` [lustre-devel] [PATCH 06/18] lustre: sec: support truncate for encrypted files James Simmons
@ 2020-07-02  0:04 ` James Simmons
  2020-07-02  0:04 ` [lustre-devel] [PATCH 08/18] lustre: mdc: chlg device could be used after free James Simmons
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: James Simmons @ 2020-07-02  0:04 UTC (permalink / raw)
  To: lustre-devel

From: Mikhail Pershin <mpershin@whamcloud.com>

Clients send all lock replays at once, which may overwhelm the
server with a huge number of replays in the recovery queue and
cause OOM conditions.

This patch adds rate control for lock replays on the client.
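
The limit check that lock_can_replay() performs in the diff below can be
restated as a small stand-alone predicate. The names demo_lock_can_replay
and DEMO_REPLAY_CAP are hypothetical; the cap of 8 and the extra "+ 1" for
the reference held by the replay path itself are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_REPLAY_CAP 8u	/* upper bound used by the patch */

/*
 * inflight counts outstanding replays plus one reference held while
 * replays are still being queued, hence the "1 +" in the comparison.
 */
static bool demo_lock_can_replay(uint32_t inflight,
				 uint32_t max_rpcs_in_flight)
{
	uint32_t limit = max_rpcs_in_flight < DEMO_REPLAY_CAP ?
			 max_rpcs_in_flight : DEMO_REPLAY_CAP;

	assert(inflight >= 1);
	return inflight < 1 + limit;
}
```

With max_rpcs_in_flight set to 256, the sender is allowed to continue
while at most 7 replays are outstanding (inflight of 8 including the held
reference) and must wait once an eighth is in flight.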

WC-bug-id: https://jira.whamcloud.com/browse/LU-13600
Lustre-commit: 3b613a442b869 ("LU-13600 ptlrpc: limit rate of lock replays")
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38920
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/include/lustre_import.h |  2 ++
 fs/lustre/ldlm/ldlm_request.c     | 69 +++++++++++++++++++++++++++++++++++----
 fs/lustre/obdclass/genops.c       |  1 +
 fs/lustre/ptlrpc/import.c         |  8 ++---
 4 files changed, 70 insertions(+), 10 deletions(-)

diff --git a/fs/lustre/include/lustre_import.h b/fs/lustre/include/lustre_import.h
index 4e9a228..72a303e 100644
--- a/fs/lustre/include/lustre_import.h
+++ b/fs/lustre/include/lustre_import.h
@@ -226,6 +226,8 @@ struct obd_import {
 	atomic_t			imp_unregistering;
 	/** Number of replay requests inflight */
 	atomic_t			imp_replay_inflight;
+	/** In-flight replays rate control */
+	wait_queue_head_t		imp_replay_waitq;
 	/** Number of currently happening import invalidations */
 	atomic_t			imp_inval_count;
 	/** Numbner of request timeouts */
diff --git a/fs/lustre/ldlm/ldlm_request.c b/fs/lustre/ldlm/ldlm_request.c
index 12ee241..e1ba596 100644
--- a/fs/lustre/ldlm/ldlm_request.c
+++ b/fs/lustre/ldlm/ldlm_request.c
@@ -2098,6 +2098,8 @@ static int replay_lock_interpret(const struct lu_env *env,
 	struct obd_export *exp;
 
 	atomic_dec(&req->rq_import->imp_replay_inflight);
+	wake_up(&req->rq_import->imp_replay_waitq);
+
 	if (rc != ELDLM_OK)
 		goto out;
 
@@ -2205,7 +2207,7 @@ static int replay_one_lock(struct obd_import *imp, struct ldlm_lock *lock)
 
 	LDLM_DEBUG(lock, "replaying lock:");
 
-	atomic_inc(&req->rq_import->imp_replay_inflight);
+	atomic_inc(&imp->imp_replay_inflight);
 	aa = ptlrpc_req_async_args(aa, req);
 	aa->lock_handle = body->lock_handle[0];
 	req->rq_interpret_reply = replay_lock_interpret;
@@ -2245,22 +2247,32 @@ static void ldlm_cancel_unused_locks_for_replay(struct ldlm_namespace *ns)
 	       canceled, ldlm_ns_name(ns));
 }
 
-int ldlm_replay_locks(struct obd_import *imp)
+static int lock_can_replay(struct obd_import *imp)
+{
+	struct client_obd *cli = &imp->imp_obd->u.cli;
+
+	CDEBUG(D_HA, "check lock replay limit, inflights = %u(%u)\n",
+	       atomic_read(&imp->imp_replay_inflight) - 1,
+	       cli->cl_max_rpcs_in_flight);
+
+	/* +1 due to ldlm_lock_replay() increment */
+	return atomic_read(&imp->imp_replay_inflight) <
+	       1 + min_t(u32, cli->cl_max_rpcs_in_flight, 8);
+}
+
+int __ldlm_replay_locks(struct obd_import *imp, bool rate_limit)
 {
 	struct ldlm_namespace *ns = imp->imp_obd->obd_namespace;
 	LIST_HEAD(list);
 	struct ldlm_lock *lock;
 	int rc = 0;
 
-	LASSERT(atomic_read(&imp->imp_replay_inflight) == 0);
+	LASSERT(atomic_read(&imp->imp_replay_inflight) == 1);
 
 	/* don't replay locks if import failed recovery */
 	if (imp->imp_vbr_failed)
 		return 0;
 
-	/* ensure this doesn't fall to 0 before all have been queued */
-	atomic_inc(&imp->imp_replay_inflight);
-
 	if (ldlm_cancel_unused_locks_before_replay)
 		ldlm_cancel_unused_locks_for_replay(ns);
 
@@ -2276,9 +2288,54 @@ int ldlm_replay_locks(struct obd_import *imp)
 		}
 		rc = replay_one_lock(imp, lock);
 		LDLM_LOCK_RELEASE(lock);
+
+		if (rate_limit)
+			wait_event_idle_exclusive(imp->imp_replay_waitq,
+						  lock_can_replay(imp));
 	}
 
+	return rc;
+}
+
+/**
+ * Lock replay uses rate control and can sleep waiting so
+ * must be in separate thread from ptlrpcd itself
+ */
+static int ldlm_lock_replay_thread(void *data)
+{
+	struct obd_import *imp = data;
+
+	CDEBUG(D_HA, "lock replay thread %s to %s@%s\n",
+	       imp->imp_obd->obd_name, obd2cli_tgt(imp->imp_obd),
+	       imp->imp_connection->c_remote_uuid.uuid);
+
+	__ldlm_replay_locks(imp, true);
 	atomic_dec(&imp->imp_replay_inflight);
+	ptlrpc_import_recovery_state_machine(imp);
+	class_import_put(imp);
+
+	return 0;
+}
+
+int ldlm_replay_locks(struct obd_import *imp)
+{
+	struct task_struct *task;
+	int rc = 0;
+
+	class_import_get(imp);
+	/* ensure this doesn't fall to 0 before all have been queued */
+	atomic_inc(&imp->imp_replay_inflight);
+
+	task = kthread_run(ldlm_lock_replay_thread, imp, "ldlm_lock_replay");
+	if (IS_ERR(task)) {
+		rc = PTR_ERR(task);
+		CDEBUG(D_HA, "can't start lock replay thread: rc = %d\n", rc);
+
+		/* run lock replay without rate control */
+		rc = __ldlm_replay_locks(imp, false);
+		atomic_dec(&imp->imp_replay_inflight);
+		class_import_put(imp);
+	}
 
 	return rc;
 }
diff --git a/fs/lustre/obdclass/genops.c b/fs/lustre/obdclass/genops.c
index 1647fe9..759d97e 100644
--- a/fs/lustre/obdclass/genops.c
+++ b/fs/lustre/obdclass/genops.c
@@ -1001,6 +1001,7 @@ struct obd_import *class_new_import(struct obd_device *obd)
 	atomic_set(&imp->imp_reqs, 0);
 	atomic_set(&imp->imp_inflight, 0);
 	atomic_set(&imp->imp_replay_inflight, 0);
+	init_waitqueue_head(&imp->imp_replay_waitq);
 	atomic_set(&imp->imp_inval_count, 0);
 	INIT_LIST_HEAD(&imp->imp_conn_list);
 	init_imp_at(&imp->imp_at);
diff --git a/fs/lustre/ptlrpc/import.c b/fs/lustre/ptlrpc/import.c
index 6b0b115..7ec3638 100644
--- a/fs/lustre/ptlrpc/import.c
+++ b/fs/lustre/ptlrpc/import.c
@@ -1486,6 +1486,8 @@ int ptlrpc_import_recovery_state_machine(struct obd_import *imp)
 	int target_len;
 
 	if (imp->imp_state == LUSTRE_IMP_EVICTED) {
+		struct task_struct *task;
+
 		deuuidify(obd2cli_tgt(imp->imp_obd), NULL,
 			  &target_start, &target_len);
 		/* Don't care about MGC eviction */
@@ -1505,8 +1507,6 @@ int ptlrpc_import_recovery_state_machine(struct obd_import *imp)
 		imp->imp_vbr_failed = 0;
 		spin_unlock(&imp->imp_lock);
 
-		{
-		struct task_struct *task;
 		/* bug 17802:  XXX client_disconnect_export vs connect request
 		 * race. if client is evicted at this time, we start
 		 * invalidate thread without reference to import and import can
@@ -1517,13 +1517,13 @@ int ptlrpc_import_recovery_state_machine(struct obd_import *imp)
 				   "ll_imp_inval");
 		if (IS_ERR(task)) {
 			class_import_put(imp);
-			CERROR("error starting invalidate thread: %d\n", rc);
 			rc = PTR_ERR(task);
+			CERROR("%s: can't start invalidate thread: rc = %d\n",
+			       imp->imp_obd->obd_name, rc);
 		} else {
 			rc = 0;
 		}
 		return rc;
-		}
 	}
 
 	if (imp->imp_state == LUSTRE_IMP_REPLAY) {
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [lustre-devel] [PATCH 08/18] lustre: mdc: chlg device could be used after free
  2020-07-02  0:04 [lustre-devel] [PATCH 00/18] Port of OpenSFS landing as of July 1, 2020 James Simmons
                   ` (6 preceding siblings ...)
  2020-07-02  0:04 ` [lustre-devel] [PATCH 07/18] lustre: ptlrpc: limit rate of lock replays James Simmons
@ 2020-07-02  0:04 ` James Simmons
  2020-07-02  0:04 ` [lustre-devel] [PATCH 09/18] lustre: llite: bind kthread thread to accepted node set James Simmons
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: James Simmons @ 2020-07-02  0:04 UTC (permalink / raw)
  To: lustre-devel

From: Hongchao Zhang <hongchao@whamcloud.com>

There are some issues with how the changelog in MDC uses dynamic
devices, which could cause a device to be used after it is freed.
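
The general idea behind the fix, tying the lifetime of the containing
structure to a reference-counted embedded object with a release callback,
can be sketched in user-space C. Everything here (demo_device, demo_entry,
demo_get, demo_put) is hypothetical and only loosely models what
struct device, get_device() and put_device() do in the kernel; it is not
thread safe and is not the driver-model API.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_device {
	int refs;
	void (*release)(struct demo_device *dev);
};

struct demo_entry {
	struct demo_device dev;	/* embedded, first member */
	int *freed_flag;	/* lets a caller observe the release */
};

static void demo_entry_release(struct demo_device *dev)
{
	/* dev is the first member, so this cast recovers the container */
	struct demo_entry *entry = (struct demo_entry *)dev;

	if (entry->freed_flag)
		*entry->freed_flag = 1;
	free(entry);
}

static struct demo_entry *demo_entry_new(int *freed_flag)
{
	struct demo_entry *entry = calloc(1, sizeof(*entry));

	if (!entry)
		return NULL;
	entry->dev.refs = 1;
	entry->dev.release = demo_entry_release;
	entry->freed_flag = freed_flag;
	return entry;
}

static void demo_get(struct demo_device *dev)
{
	dev->refs++;
}

/* The container is freed only when the last reference is dropped. */
static void demo_put(struct demo_device *dev)
{
	assert(dev->refs > 0);
	if (--dev->refs == 0)
		dev->release(dev);
}
```

Because the memory is freed only from the release callback, a holder of an
extra reference can keep using the entry after another path has dropped
its own reference.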

WC-bug-id: https://jira.whamcloud.com/browse/LU-13508
Lustre-commit: 1e992e94eaf8a ("LU-13508 mdc: chlg device could be used after free")
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38658
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/mdc/mdc_changelog.c | 46 ++++++++++++++++++++-----------------------
 fs/lustre/mdc/mdc_internal.h  |  1 +
 fs/lustre/mdc/mdc_request.c   |  8 +++++---
 3 files changed, 27 insertions(+), 28 deletions(-)

diff --git a/fs/lustre/mdc/mdc_changelog.c b/fs/lustre/mdc/mdc_changelog.c
index 3aace7e..8531edb 100644
--- a/fs/lustre/mdc/mdc_changelog.c
+++ b/fs/lustre/mdc/mdc_changelog.c
@@ -61,7 +61,7 @@ struct chlg_registered_dev {
 	char			ced_name[32];
 	/* changelog char device */
 	struct cdev		ced_cdev;
-	struct device		*ced_device;
+	struct device		ced_device;
 	/* OBDs referencing this device (multiple mount point) */
 	struct list_head	ced_obds;
 	/* Reference counter for proper deregistration */
@@ -112,7 +112,7 @@ enum {
 	CDEV_CHLG_MAX_PREFETCH = 1024,
 };
 
-static DEFINE_IDR(chlg_minor_idr);
+DEFINE_IDR(mdc_changelog_minor_idr);
 static DEFINE_SPINLOCK(chlg_minor_lock);
 
 static int chlg_minor_alloc(int *pminor)
@@ -122,7 +122,7 @@ static int chlg_minor_alloc(int *pminor)
 
 	idr_preload(GFP_KERNEL);
 	spin_lock(&chlg_minor_lock);
-	minor = idr_alloc(&chlg_minor_idr, minor_allocated, 0,
+	minor = idr_alloc(&mdc_changelog_minor_idr, minor_allocated, 0,
 			  MDC_CHANGELOG_DEV_COUNT, GFP_NOWAIT);
 	spin_unlock(&chlg_minor_lock);
 	idr_preload_end();
@@ -137,7 +137,7 @@ static int chlg_minor_alloc(int *pminor)
 static void chlg_minor_free(int minor)
 {
 	spin_lock(&chlg_minor_lock);
-	idr_remove(&chlg_minor_idr, minor);
+	idr_remove(&mdc_changelog_minor_idr, minor);
 	spin_unlock(&chlg_minor_lock);
 }
 
@@ -160,8 +160,8 @@ static void chlg_dev_clear(struct kref *kref)
 			     ced_refs);
 
 	list_del(&entry->ced_link);
-	cdev_del(&entry->ced_cdev);
-	device_destroy(mdc_changelog_class, entry->ced_cdev.dev);
+	cdev_device_del(&entry->ced_cdev, &entry->ced_device);
+	put_device(&entry->ced_device);
 }
 
 static inline struct obd_device *chlg_obd_get(struct chlg_registered_dev *dev)
@@ -790,8 +790,6 @@ int mdc_changelog_cdev_init(struct obd_device *obd)
 {
 	struct chlg_registered_dev *exist;
 	struct chlg_registered_dev *entry;
-	struct device *device;
-	dev_t dev;
 	int minor, rc;
 
 	entry = kzalloc(sizeof(*entry), GFP_KERNEL);
@@ -816,35 +814,33 @@ int mdc_changelog_cdev_init(struct obd_device *obd)
 	list_add_tail(&obd->u.cli.cl_chg_dev_linkage, &entry->ced_obds);
 	list_add_tail(&entry->ced_link, &chlg_registered_devices);
 
-	/* Register new character device */
-	cdev_init(&entry->ced_cdev, &chlg_fops);
-	entry->ced_cdev.owner = THIS_MODULE;
-
 	rc = chlg_minor_alloc(&minor);
 	if (rc)
 		goto out_unlock;
 
-	dev = MKDEV(MAJOR(mdc_changelog_dev), minor);
-	rc = cdev_add(&entry->ced_cdev, dev, 1);
+	device_initialize(&entry->ced_device);
+	entry->ced_device.devt = MKDEV(MAJOR(mdc_changelog_dev), minor);
+	entry->ced_device.class = mdc_changelog_class;
+	entry->ced_device.release = chlg_device_release;
+	dev_set_drvdata(&entry->ced_device, entry);
+	rc = dev_set_name(&entry->ced_device, "%s-%s", MDC_CHANGELOG_DEV_NAME,
+			  entry->ced_name);
 	if (rc)
 		goto out_minor;
 
-	device = device_create(mdc_changelog_class, NULL, dev, entry, "%s-%s",
-			       MDC_CHANGELOG_DEV_NAME, entry->ced_name);
-	if (IS_ERR(device)) {
-		rc = PTR_ERR(device);
-		goto out_cdev;
-	}
-
-	device->release = chlg_device_release;
-	entry->ced_device = device;
+	/* Register new character device */
+	cdev_init(&entry->ced_cdev, &chlg_fops);
+	entry->ced_cdev.owner = THIS_MODULE;
+	rc = cdev_device_add(&entry->ced_cdev, &entry->ced_device);
+	if (rc)
+		goto out_device_name;
 
 	entry = NULL;	/* prevent it from being freed below */
 	rc = 0;
 	goto out_unlock;
 
-out_cdev:
-	cdev_del(&entry->ced_cdev);
+out_device_name:
+	kfree_const(entry->ced_device.kobj.name);
 
 out_minor:
 	chlg_minor_free(minor);
diff --git a/fs/lustre/mdc/mdc_internal.h b/fs/lustre/mdc/mdc_internal.h
index 9656231..b7ccc58 100644
--- a/fs/lustre/mdc/mdc_internal.h
+++ b/fs/lustre/mdc/mdc_internal.h
@@ -142,6 +142,7 @@ enum ldlm_mode mdc_lock_match(struct obd_export *exp, u64 flags,
 #define MDC_CHANGELOG_DEV_NAME "changelog"
 extern struct class *mdc_changelog_class;
 extern dev_t mdc_changelog_dev;
+extern struct idr mdc_changelog_minor_idr;
 
 int mdc_changelog_cdev_init(struct obd_device *obd);
 
diff --git a/fs/lustre/mdc/mdc_request.c b/fs/lustre/mdc/mdc_request.c
index 369114b..d6d9f43 100644
--- a/fs/lustre/mdc/mdc_request.c
+++ b/fs/lustre/mdc/mdc_request.c
@@ -3013,10 +3013,12 @@ static int __init mdc_init(void)
 	rc = class_register_type(&mdc_obd_ops, &mdc_md_ops,
 				 LUSTRE_MDC_NAME, &mdc_device_type);
 	if (rc)
-		goto out_dev;
+		goto out_class;
 
 	return 0;
 
+out_class:
+	class_destroy(mdc_changelog_class);
 out_dev:
 	unregister_chrdev_region(mdc_changelog_dev, MDC_CHANGELOG_DEV_COUNT);
 	return rc;
@@ -3024,9 +3026,9 @@ static int __init mdc_init(void)
 
 static void __exit mdc_exit(void)
 {
-	class_destroy(mdc_changelog_class);
-	unregister_chrdev_region(mdc_changelog_dev, MDC_CHANGELOG_DEV_COUNT);
 	class_unregister_type(LUSTRE_MDC_NAME);
+	class_destroy(mdc_changelog_class);
+	idr_destroy(&mdc_changelog_minor_idr);
 }
 
 MODULE_AUTHOR("OpenSFS, Inc. <http://www.lustre.org/>");
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [lustre-devel] [PATCH 09/18] lustre: llite: bind kthread thread to accepted node set
  2020-07-02  0:04 [lustre-devel] [PATCH 00/18] Port of OpenSFS landing as of July 1, 2020 James Simmons
                   ` (7 preceding siblings ...)
  2020-07-02  0:04 ` [lustre-devel] [PATCH 08/18] lustre: mdc: chlg device could be used after free James Simmons
@ 2020-07-02  0:04 ` James Simmons
  2020-07-02  0:04 ` [lustre-devel] [PATCH 10/18] lustre: lov: use lov_pattern_support() to verify lmm James Simmons
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: James Simmons @ 2020-07-02  0:04 UTC (permalink / raw)
  To: lustre-devel

Bind both the agl and statahead kernel threads to a node that is
a part of the cpt table that Lustre uses. This limits pollution
of the cache of HPC applications.
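
As a loose illustration of picking a node from an allowed set, the sketch
below hands out nodes in rotation. The round-robin policy is an assumption
made for the example; the patch only shows that cfs_cpt_spread_node()
returns a node for kthread_create_on_node(), not how it chooses one. All
names prefixed with demo_ are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical set of NUMA nodes that threads are allowed to run on. */
struct demo_node_set {
	const int *nodes;	/* allowed node ids */
	size_t count;		/* number of entries in nodes */
	size_t next;		/* rotation cursor */
};

/* Return the next allowed node, cycling through the set. */
static int demo_spread_node(struct demo_node_set *set)
{
	int node;

	assert(set != NULL && set->count > 0);
	node = set->nodes[set->next % set->count];
	set->next++;
	return node;
}
```

A caller would pass the returned id as the node argument when creating
the thread, so that helper threads stay on the nodes set aside for them.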

WC-bug-id: https://jira.whamcloud.com/browse/LU-9441
Lustre-commit: d6e103e6950d9 ("LU-9441 llite: bind kthread thread to accepted node set")
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/38730
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
---
 fs/lustre/llite/statahead.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/fs/lustre/llite/statahead.c b/fs/lustre/llite/statahead.c
index fb25520..895e496 100644
--- a/fs/lustre/llite/statahead.c
+++ b/fs/lustre/llite/statahead.c
@@ -39,6 +39,7 @@
 
 #define DEBUG_SUBSYSTEM S_LLITE
 
+#include <linux/libcfs/libcfs_cpu.h>
 #include <obd_support.h>
 #include <lustre_dlm.h>
 #include "llite_internal.h"
@@ -954,6 +955,7 @@ static void ll_stop_agl(struct ll_statahead_info *sai)
 /* start agl thread */
 static void ll_start_agl(struct dentry *parent, struct ll_statahead_info *sai)
 {
+	int node = cfs_cpt_spread_node(cfs_cpt_tab, CFS_CPT_ANY);
 	struct ll_inode_info *plli;
 	struct task_struct *task;
 
@@ -961,8 +963,8 @@ static void ll_start_agl(struct dentry *parent, struct ll_statahead_info *sai)
 	       sai, parent);
 
 	plli = ll_i2info(d_inode(parent));
-	task = kthread_create(ll_agl_thread, parent, "ll_agl_%u",
-			      plli->lli_opendir_pid);
+	task = kthread_create_on_node(ll_agl_thread, parent, node, "ll_agl_%u",
+				      plli->lli_opendir_pid);
 	if (IS_ERR(task)) {
 		CERROR("can't start ll_agl thread, rc: %ld\n", PTR_ERR(task));
 		sai->sai_agl_valid = 0;
@@ -1535,6 +1537,7 @@ static int revalidate_statahead_dentry(struct inode *dir,
 static int start_statahead_thread(struct inode *dir, struct dentry *dentry,
 				  bool agl)
 {
+	int node = cfs_cpt_spread_node(cfs_cpt_tab, CFS_CPT_ANY);
 	struct ll_inode_info *lli = ll_i2info(dir);
 	struct ll_statahead_info *sai = NULL;
 	struct task_struct *task;
@@ -1586,8 +1589,8 @@ static int start_statahead_thread(struct inode *dir, struct dentry *dentry,
 	CDEBUG(D_READA, "start statahead thread: [pid %d] [parent %pd]\n",
 	       current->pid, parent);
 
-	task = kthread_create(ll_statahead_thread, parent, "ll_sa_%u",
-			      lli->lli_opendir_pid);
+	task = kthread_create_on_node(ll_statahead_thread, parent, node,
+				      "ll_sa_%u", lli->lli_opendir_pid);
 	if (IS_ERR(task)) {
 		spin_lock(&lli->lli_sa_lock);
 		lli->lli_sai = NULL;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [lustre-devel] [PATCH 10/18] lustre: lov: use lov_pattern_support() to verify lmm
  2020-07-02  0:04 [lustre-devel] [PATCH 00/18] Port of OpenSFS landing as of July 1, 2020 James Simmons
                   ` (8 preceding siblings ...)
  2020-07-02  0:04 ` [lustre-devel] [PATCH 09/18] lustre: llite: bind kthread thread to accepted node set James Simmons
@ 2020-07-02  0:04 ` James Simmons
  2020-07-02  0:04 ` [lustre-devel] [PATCH 11/18] lustre: llite: truncate deadlock with DoM files James Simmons
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: James Simmons @ 2020-07-02  0:04 UTC (permalink / raw)
  To: lustre-devel

We can use lov_pattern_supported(), which is used by the server
and userland code, to ensure lmm is valid instead of open coding
the check.
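
As a rough illustration of the check, here is a standalone C sketch.
The helper body follows the one this patch adds to lustre_user.h; the
numeric pattern values and verify_pattern() are assumptions made only
so that the example compiles on its own.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration; the real ones live in lustre_user.h. */
#define LOV_PATTERN_RAID0		0x001U
#define LOV_PATTERN_MDT			0x100U
#define LOV_PATTERN_OVERSTRIPING	0x200U
#define LOV_PATTERN_F_RELEASED		0x80000000U

/* Same test as the helper added by this patch: ignore the "released"
 * flag, then accept only the three known layouts.
 */
static inline bool lov_pattern_supported(uint32_t pattern)
{
	uint32_t base = pattern & ~LOV_PATTERN_F_RELEASED;

	return base == LOV_PATTERN_RAID0 ||
	       base == (LOV_PATTERN_RAID0 | LOV_PATTERN_OVERSTRIPING) ||
	       base == LOV_PATTERN_MDT;
}

/* Hypothetical caller: one call replaces the open-coded comparison. */
static inline int verify_pattern(uint32_t pattern)
{
	return lov_pattern_supported(pattern) ? 0 : -EINVAL;
}
```

A released RAID0 or MDT layout passes, while any other pattern value
makes the hypothetical verify_pattern() return -EINVAL.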

WC-bug-id: https://jira.whamcloud.com/browse/LU-12511
Lustre-commit: 0f607f22696ff ("LU-12511 lov: use lov_pattern_support() to verify lmm")
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/38791
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
---
 fs/lustre/lov/lov_ea.c                  | 6 ++----
 include/uapi/linux/lustre/lustre_user.h | 8 ++++++++
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/fs/lustre/lov/lov_ea.c b/fs/lustre/lov/lov_ea.c
index 39ff50b..e198536 100644
--- a/fs/lustre/lov/lov_ea.c
+++ b/fs/lustre/lov/lov_ea.c
@@ -84,6 +84,7 @@ static loff_t lov_tgt_maxbytes(struct lov_tgt_desc *tgt)
 static int lsm_lmm_verify_v1v3(struct lov_mds_md *lmm, size_t lmm_size,
 			       u16 stripe_count)
 {
+	u32 pattern = le32_to_cpu(lmm->lmm_pattern);
 	int rc = 0;
 
 	if (stripe_count > LOV_V1_INSANE_STRIPE_COUNT) {
@@ -101,10 +102,7 @@ static int lsm_lmm_verify_v1v3(struct lov_mds_md *lmm, size_t lmm_size,
 		goto out;
 	}
 
-	if (lov_pattern(le32_to_cpu(lmm->lmm_pattern)) != LOV_PATTERN_MDT &&
-	    lov_pattern(le32_to_cpu(lmm->lmm_pattern)) != LOV_PATTERN_RAID0 &&
-	    lov_pattern(le32_to_cpu(lmm->lmm_pattern)) !=
-			(LOV_PATTERN_RAID0 | LOV_PATTERN_OVERSTRIPING)) {
+	if (!lov_pattern_supported(lov_pattern(pattern))) {
 		rc = -EINVAL;
 		CERROR("lov: unrecognized striping pattern: rc = %d\n", rc);
 		lov_dump_lmm_common(D_WARNING, lmm);
diff --git a/include/uapi/linux/lustre/lustre_user.h b/include/uapi/linux/lustre/lustre_user.h
index 121c064..6a2d5f9 100644
--- a/include/uapi/linux/lustre/lustre_user.h
+++ b/include/uapi/linux/lustre/lustre_user.h
@@ -443,6 +443,14 @@ static inline bool lov_pattern_supported_normal_comp(__u32 pattern)
 #define LOV_OFFSET_DEFAULT      ((__u16)-1)
 #define LMV_OFFSET_DEFAULT      ((__u32)-1)
 
+static inline bool lov_pattern_supported(__u32 pattern)
+{
+	return (pattern & ~LOV_PATTERN_F_RELEASED) == LOV_PATTERN_RAID0 ||
+	       (pattern & ~LOV_PATTERN_F_RELEASED) ==
+			(LOV_PATTERN_RAID0 | LOV_PATTERN_OVERSTRIPING) ||
+	       (pattern & ~LOV_PATTERN_F_RELEASED) == LOV_PATTERN_MDT;
+}
+
 #define LOV_MIN_STRIPE_BITS	16	/* maximum PAGE_SIZE (ia64), power of 2 */
 #define LOV_MIN_STRIPE_SIZE	(1 << LOV_MIN_STRIPE_BITS)
 #define LOV_MAX_STRIPE_COUNT_OLD 160
-- 
1.8.3.1

* [lustre-devel] [PATCH 11/18] lustre: llite: truncate deadlock with DoM files
  2020-07-02  0:04 [lustre-devel] [PATCH 00/18] Port of OpenSFS landing as of July 1, 2020 James Simmons
                   ` (9 preceding siblings ...)
  2020-07-02  0:04 ` [lustre-devel] [PATCH 10/18] lustre: lov: use lov_pattern_support() to verify lmm James Simmons
@ 2020-07-02  0:04 ` James Simmons
  2020-07-02  0:04 ` [lustre-devel] [PATCH 12/18] lnet: Skip health and resends for single rail configs James Simmons
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: James Simmons @ 2020-07-02  0:04 UTC (permalink / raw)
  To: lustre-devel

From: Andriy Skulysh <c17819@cray.com>

All MDT intent RPCs are sent with the inode mutex locked, while
read/write and setattr unlock the inode mutex on entry, take the
LDLM lock, lock the inode mutex again and send the RPC. A deadlock
can therefore occur, since the LDLM lock is the same in the DoM case.

In fact, read/write and setattr take lli_trunc_sem, so the inode
mutex can be omitted in the truncate case.

Replace inode_lock with a new lli_setattr_mutex to keep protection
from concurrent setattr time updates.
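
To summarise the resulting locking, the following C sketch models which
locks each setattr path takes once the patch is applied, based on the
vvp_io_setattr_start() hunk in this mail. The enum, struct and function
names are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>

/* Hypothetical model of the three setattr flavours handled in
 * vvp_io_setattr_start(); these names are made up for this sketch.
 */
enum setattr_kind { SETATTR_TRUNC, SETATTR_FALLOCATE, SETATTR_OTHER };

struct setattr_locks {
	bool trunc_sem_write;	/* lli_trunc_sem taken for write */
	bool inode_lock;	/* VFS inode mutex */
	bool setattr_mutex;	/* new lli_setattr_mutex */
};

/* Which locks each path takes after this patch. */
static inline struct setattr_locks setattr_locks_for(enum setattr_kind kind)
{
	struct setattr_locks l = { false, false, false };

	switch (kind) {
	case SETATTR_TRUNC:
		/* truncate no longer takes the inode mutex */
		l.trunc_sem_write = true;
		l.setattr_mutex = true;
		break;
	case SETATTR_FALLOCATE:
		/* fallocate is left unchanged */
		l.inode_lock = true;
		break;
	case SETATTR_OTHER:
		/* plain attribute updates use the new mutex */
		l.setattr_mutex = true;
		break;
	}
	return l;
}
```

Only the fallocate path still takes the inode mutex in this model,
which matches the hunks below.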

HPE-bug-id: LUS-8455
WC-bug-id: https://jira.whamcloud.com/browse/LU-13467
Lustre-commit: 8958ecee22010 ("LU-13467 llite: truncate deadlock with DoM files")
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-on: https://review.whamcloud.com/38288
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/llite/llite_internal.h | 1 +
 fs/lustre/llite/llite_lib.c      | 1 +
 fs/lustre/llite/vvp_io.c         | 8 ++++----
 3 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/fs/lustre/llite/llite_internal.h b/fs/lustre/llite/llite_internal.h
index a688bd8..2556dd8 100644
--- a/fs/lustre/llite/llite_internal.h
+++ b/fs/lustre/llite/llite_internal.h
@@ -191,6 +191,7 @@ struct ll_inode_info {
 			char			       *lli_symlink_name;
 			struct ll_trunc_sem		lli_trunc_sem;
 			struct range_lock_tree		lli_write_tree;
+			struct mutex			lli_setattr_mutex;
 
 			struct rw_semaphore		lli_glimpse_sem;
 			ktime_t				lli_glimpse_time;
diff --git a/fs/lustre/llite/llite_lib.c b/fs/lustre/llite/llite_lib.c
index 0db9eae..b30feb0 100644
--- a/fs/lustre/llite/llite_lib.c
+++ b/fs/lustre/llite/llite_lib.c
@@ -1014,6 +1014,7 @@ void ll_lli_init(struct ll_inode_info *lli)
 		init_rwsem(&lli->lli_lsm_sem);
 	} else {
 		mutex_init(&lli->lli_size_mutex);
+		mutex_init(&lli->lli_setattr_mutex);
 		lli->lli_symlink_name = NULL;
 		ll_trunc_sem_init(&lli->lli_trunc_sem);
 		range_lock_tree_init(&lli->lli_write_tree);
diff --git a/fs/lustre/llite/vvp_io.c b/fs/lustre/llite/vvp_io.c
index 8df5d39..8edd3c1 100644
--- a/fs/lustre/llite/vvp_io.c
+++ b/fs/lustre/llite/vvp_io.c
@@ -702,13 +702,13 @@ static int vvp_io_setattr_start(const struct lu_env *env,
 
 	if (cl_io_is_trunc(io)) {
 		trunc_sem_down_write(&lli->lli_trunc_sem);
-		inode_lock(inode);
+		mutex_lock(&lli->lli_setattr_mutex);
 		inode_dio_wait(inode);
 	} else if (cl_io_is_fallocate(io)) {
 		inode_lock(inode);
 		inode_dio_wait(inode);
 	} else {
-		inode_lock(inode);
+		mutex_lock(&lli->lli_setattr_mutex);
 	}
 
 	if (io->u.ci_setattr.sa_avalid & TIMES_SET_FLAGS)
@@ -729,12 +729,12 @@ static void vvp_io_setattr_end(const struct lu_env *env,
 		 * because osc has already notified to destroy osc_extents.
 		 */
 		vvp_do_vmtruncate(inode, io->u.ci_setattr.sa_attr.lvb_size);
-		inode_unlock(inode);
+		mutex_unlock(&lli->lli_setattr_mutex);
 		trunc_sem_up_write(&lli->lli_trunc_sem);
 	} else if (cl_io_is_fallocate(io)) {
 		inode_unlock(inode);
 	} else {
-		inode_unlock(inode);
+		mutex_unlock(&lli->lli_setattr_mutex);
 	}
 }
 
-- 
1.8.3.1

* [lustre-devel] [PATCH 12/18] lnet: Skip health and resends for single rail configs
  2020-07-02  0:04 [lustre-devel] [PATCH 00/18] Port of OpenSFS landing as of July 1, 2020 James Simmons
                   ` (10 preceding siblings ...)
  2020-07-02  0:04 ` [lustre-devel] [PATCH 11/18] lustre: llite: truncate deadlock with DoM files James Simmons
@ 2020-07-02  0:04 ` James Simmons
  2020-07-02  0:04 ` [lustre-devel] [PATCH 13/18] lustre: sec: ioctls to handle encryption policies James Simmons
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: James Simmons @ 2020-07-02  0:04 UTC (permalink / raw)
  To: lustre-devel

From: Chris Horn <hornc@cray.com>

If the sender of a message only has a single interface it doesn't
make sense to have LNet track the health of that interface, nor
should it attempt to resend a message when it encounters a local
error. There aren't any alternative interfaces to use for a resend.

Similarly, we needn't track health values of a peer's NIs if the peer
only has a single interface. Nor do we need to attempt to resend
a message to a peer with a single interface. There's an exception for
routers. We rely on NI health to determine route aliveness, so even
if a router only has a single interface we still need to track its
health.

We can use the ln_ping_target to get the count of local NIs, and the
lnet_peer struct already contains a count of the number of peer NIs.
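
The decision described above can be sketched as a pure function. This
is only a model of the intent stated in this commit message, not the
lib-msg.c code: struct health_plan, plan_health() and its parameters
are hypothetical names, and local_nnis is assumed to include the
loopback interface in the same way pb_nnis does.

```c
#include <stdbool.h>

/* Hypothetical result type: what to do for a failed message. */
struct health_plan {
	bool local_health;	/* adjust health of the local NI */
	bool local_resend;	/* retry after a local failure */
	bool remote_health;	/* adjust health of the peer NI */
	bool remote_resend;	/* retry after a remote failure */
};

static inline struct health_plan
plan_health(bool tx_committed, bool is_recovery, unsigned int local_nnis,
	    bool have_peer, unsigned int peer_nnis, bool peer_is_router)
{
	struct health_plan p;

	/* Only messages we were sending can be resent. */
	p.local_resend = tx_committed;
	p.remote_resend = tx_committed;

	/* A failed recovery message must not lower health any further. */
	p.local_health = !is_recovery;
	p.remote_health = !is_recovery;

	/* Single local rail (loopback + one NI): nothing to fail over to. */
	if (local_nnis <= 2) {
		p.local_health = false;
		p.local_resend = false;
	}

	/* Single-rail peer: no resend; keep health only for routers. */
	if (have_peer && peer_nnis <= 1) {
		p.remote_resend = false;
		if (!peer_is_router)
			p.remote_health = false;
	}
	return p;
}
```

For a single-rail node talking to a single-rail, non-router peer, all
four fields come out false, so the message is simply finalized.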

HPE-bug-id: LUS-8826
WC-bug-id: https://jira.whamcloud.com/browse/LU-13501
Lustre-commit: c5381d73b1d83 ("LU-13501 lnet: Skip health and resends for single rail configs")
Signed-off-by: Chris Horn <hornc@cray.com>
Reviewed-on: https://review.whamcloud.com/38448
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 net/lnet/lnet/lib-msg.c | 65 ++++++++++++++++++++++++++++++++++---------------
 1 file changed, 46 insertions(+), 19 deletions(-)

diff --git a/net/lnet/lnet/lib-msg.c b/net/lnet/lnet/lib-msg.c
index 7ce9c47..f759b2d 100644
--- a/net/lnet/lnet/lib-msg.c
+++ b/net/lnet/lnet/lib-msg.c
@@ -774,6 +774,10 @@
 	struct lnet_peer_ni *lpni;
 	struct lnet_ni *ni;
 	bool lo = false;
+	bool attempt_local_resend;
+	bool attempt_remote_resend;
+	bool handle_local_health;
+	bool handle_remote_health;
 
 	/* if we're shutting down no point in handling health. */
 	if (the_lnet.ln_mt_state != LNET_MT_STATE_RUNNING)
@@ -800,9 +804,45 @@
 	if (msg->msg_tx_committed) {
 		ni = msg->msg_txni;
 		lpni = msg->msg_txpeer;
+		attempt_local_resend = true;
+		attempt_remote_resend = true;
 	} else {
 		ni = msg->msg_rxni;
 		lpni = msg->msg_rxpeer;
+		attempt_local_resend = false;
+		attempt_remote_resend = false;
+	}
+
+	/* Don't further decrement the health value if a recovery message
+	 * failed.
+	 */
+	if (msg->msg_recovery) {
+		handle_local_health = false;
+		handle_remote_health = false;
+	} else {
+		handle_local_health = true;
+		handle_remote_health = true;
+	}
+
+	/* For local failures, health/recovery/resends are not needed if I only
+	 * have a single (non-lolnd) interface. NB: pb_nnis includes the lolnd
+	 * interface, so a single-rail node would have pb_nnis == 2.
+	 */
+	if (the_lnet.ln_ping_target->pb_nnis <= 2) {
+		handle_local_health = false;
+		attempt_local_resend = false;
+	}
+
+	/* For remote failures, health/recovery/resends are not needed if the
+	 * peer only has a single interface. Special case for routers where we
+	 * rely on health feature to manage route aliveness. NB: unlike pb_nnis
+	 * above, lp_nnis does _not_ include the lolnd, so a single-rail node
+	 * would have lp_nnis == 1.
+	 */
+	if (lpni && lpni->lpni_peer_net->lpn_peer->lp_nnis <= 1) {
+		attempt_remote_resend = false;
+		if (!lnet_isrouter(lpni))
+			handle_remote_health = false;
 	}
 
 	if (!lo)
@@ -865,41 +905,28 @@
 	case LNET_MSG_STATUS_LOCAL_ABORTED:
 	case LNET_MSG_STATUS_LOCAL_NO_ROUTE:
 	case LNET_MSG_STATUS_LOCAL_TIMEOUT:
-		/* don't further decrement the health value if the
-		 * recovery message failed.
-		 */
-		if (!msg->msg_recovery)
+		if (handle_local_health)
 			lnet_handle_local_failure(ni);
-		if (msg->msg_tx_committed)
-			/* add to the re-send queue */
+		if (attempt_local_resend)
 			return lnet_attempt_msg_resend(msg);
 		break;
 
-	/* These errors will not trigger a resend so simply
-	 * finalize the message
-	 */
 	case LNET_MSG_STATUS_LOCAL_ERROR:
-		/* don't further decrement the health value if the
-		 * recovery message failed.
-		 */
-		if (!msg->msg_recovery)
+		if (handle_local_health)
 			lnet_handle_local_failure(ni);
 		return -1;
 
-	/* TODO: since the remote dropped the message we can
-	 * attempt a resend safely.
-	 */
 	case LNET_MSG_STATUS_REMOTE_DROPPED:
-		if (!msg->msg_recovery)
+		if (handle_remote_health)
 			lnet_handle_remote_failure(lpni);
-		if (msg->msg_tx_committed)
+		if (attempt_remote_resend)
 			return lnet_attempt_msg_resend(msg);
 		break;
 
 	case LNET_MSG_STATUS_REMOTE_ERROR:
 	case LNET_MSG_STATUS_REMOTE_TIMEOUT:
 	case LNET_MSG_STATUS_NETWORK_TIMEOUT:
-		if (!msg->msg_recovery)
+		if (handle_remote_health)
 			lnet_handle_remote_failure(lpni);
 		return -1;
 	default:
-- 
1.8.3.1

* [lustre-devel] [PATCH 13/18] lustre: sec: ioctls to handle encryption policies
  2020-07-02  0:04 [lustre-devel] [PATCH 00/18] Port of OpenSFS landing as of July 1, 2020 James Simmons
                   ` (11 preceding siblings ...)
  2020-07-02  0:04 ` [lustre-devel] [PATCH 12/18] lnet: Skip health and resends for single rail configs James Simmons
@ 2020-07-02  0:04 ` James Simmons
  2020-07-02  0:04 ` [lustre-devel] [PATCH 14/18] lnet: define new network driver ptl4lnd James Simmons
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: James Simmons @ 2020-07-02  0:04 UTC (permalink / raw)
  To: lustre-devel

From: Sebastien Buisson <sbuisson@ddn.com>

Introduce support for fscrypt IOCTLs that handle encryption
policies v2. It enables setting/getting encryption policies on
individual directories, letting users decide how they want to
encrypt specific directories.

fscrypt encryption policies v2 are supported from Linux 5.4.
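
From userspace, these are the standard fscrypt ioctls issued on a
directory file descriptor. The sketch below queries the policy version
with FS_IOC_GET_ENCRYPTION_POLICY_EX; it assumes kernel headers from
Linux 5.4 or later, and query_policy_version() is a hypothetical helper
name, not part of Lustre or fscrypt.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fscrypt.h>

/* Hypothetical helper: return the fscrypt policy version of @dir,
 * or a negative errno value if the query fails.
 */
static inline int query_policy_version(const char *dir)
{
	struct fscrypt_get_policy_ex_arg arg;
	int fd, rc;

	fd = open(dir, O_RDONLY);
	if (fd < 0)
		return -errno;

	memset(&arg, 0, sizeof(arg));
	/* in: size of the policy buffer, out: size of the policy */
	arg.policy_size = sizeof(arg.policy);

	rc = ioctl(fd, FS_IOC_GET_ENCRYPTION_POLICY_EX, &arg);
	if (rc < 0)
		rc = -errno;	/* e.g. -ENODATA when not encrypted */
	else
		rc = arg.policy.version;

	close(fd);
	return rc;
}
```

With this patch, a client whose superblock does not have encryption
enabled is expected to answer such a call with EOPNOTSUPP, per the
ll_sbi_has_encrypt() checks in the hunk below.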

WC-bug-id: https://jira.whamcloud.com/browse/LU-12275
Lustre-commit: 3973cf8dc955c ("LU-12275 sec: ioctls to handle encryption policies")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/37673
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/llite/dir.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/fs/lustre/llite/dir.c b/fs/lustre/llite/dir.c
index 2c93908..463c5d7 100644
--- a/fs/lustre/llite/dir.c
+++ b/fs/lustre/llite/dir.c
@@ -48,6 +48,7 @@
 
 #include <obd_support.h>
 #include <obd_class.h>
+#include <uapi/linux/fscrypt.h>
 #include <uapi/linux/lustre/lustre_idl.h>
 #include <uapi/linux/lustre/lustre_ioctl.h>
 #include <lustre_lib.h>
@@ -2103,6 +2104,33 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 		kfree(detach);
 		return rc;
 	}
+#ifdef CONFIG_FS_ENCRYPTION
+	case FS_IOC_SET_ENCRYPTION_POLICY:
+		if (!ll_sbi_has_encrypt(ll_i2sbi(inode)))
+			return -EOPNOTSUPP;
+		return llcrypt_ioctl_set_policy(file, (const void __user *)arg);
+	case FS_IOC_GET_ENCRYPTION_POLICY_EX:
+		if (!ll_sbi_has_encrypt(ll_i2sbi(inode)))
+			return -EOPNOTSUPP;
+		return llcrypt_ioctl_get_policy_ex(file, (void __user *)arg);
+	case FS_IOC_ADD_ENCRYPTION_KEY:
+		if (!ll_sbi_has_encrypt(ll_i2sbi(inode)))
+			return -EOPNOTSUPP;
+		return llcrypt_ioctl_add_key(file, (void __user *)arg);
+	case FS_IOC_REMOVE_ENCRYPTION_KEY:
+		if (!ll_sbi_has_encrypt(ll_i2sbi(inode)))
+			return -EOPNOTSUPP;
+		return llcrypt_ioctl_remove_key(file, (void __user *)arg);
+	case FS_IOC_REMOVE_ENCRYPTION_KEY_ALL_USERS:
+		if (!ll_sbi_has_encrypt(ll_i2sbi(inode)))
+			return -EOPNOTSUPP;
+		return llcrypt_ioctl_remove_key_all_users(file,
+							  (void __user *)arg);
+	case FS_IOC_GET_ENCRYPTION_KEY_STATUS:
+		if (!ll_sbi_has_encrypt(ll_i2sbi(inode)))
+			return -EOPNOTSUPP;
+		return llcrypt_ioctl_get_key_status(file, (void __user *)arg);
+#endif
 	default:
 		return obd_iocontrol(cmd, sbi->ll_dt_exp, 0, NULL,
 				     (void __user *)arg);
-- 
1.8.3.1

* [lustre-devel] [PATCH 14/18] lnet: define new network driver ptl4lnd
  2020-07-02  0:04 [lustre-devel] [PATCH 00/18] Port of OpenSFS landing as of July 1, 2020 James Simmons
                   ` (12 preceding siblings ...)
  2020-07-02  0:04 ` [lustre-devel] [PATCH 13/18] lustre: sec: ioctls to handle encryption policies James Simmons
@ 2020-07-02  0:04 ` James Simmons
  2020-07-02  0:04 ` [lustre-devel] [PATCH 15/18] lustre: llite: don't hold inode_lock for security notify James Simmons
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: James Simmons @ 2020-07-02  0:04 UTC (permalink / raw)
  To: lustre-devel

From: Gregoire Pichon <gregoire.pichon@bull.net>

Assign an ID to the new network driver ptl4lnd, developed by Bull,
which implements an LND based on the Portals 4 API. It is intended
to be used with BXI, the Bull interconnect hardware.

WC-bug-id: https://jira.whamcloud.com/browse/LU-8932
Lustre-commit: c54bf3faa29f5 ("LU-8932 lnet: define new network driver ptl4lnd")
Signed-off-by: Gregoire Pichon <gregoire.pichon@bull.net>
Reviewed-on: https://review.whamcloud.com/24768
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 include/uapi/linux/lnet/nidstr.h | 1 +
 net/lnet/lnet/nidstrings.c       | 9 +++++++++
 2 files changed, 10 insertions(+)

diff --git a/include/uapi/linux/lnet/nidstr.h b/include/uapi/linux/lnet/nidstr.h
index 9133641..34ba497 100644
--- a/include/uapi/linux/lnet/nidstr.h
+++ b/include/uapi/linux/lnet/nidstr.h
@@ -53,6 +53,7 @@ enum {
 	/*MXLND		= 12, removed v2_7_50_0-34-g8be9e41	*/
 	GNILND		= 13,
 	GNIIPLND	= 14,
+	PTL4LND		= 15,
 
 	NUM_LNDS
 };
diff --git a/net/lnet/lnet/nidstrings.c b/net/lnet/lnet/nidstrings.c
index eca5092..fb8d3e2 100644
--- a/net/lnet/lnet/nidstrings.c
+++ b/net/lnet/lnet/nidstrings.c
@@ -692,6 +692,15 @@ int cfs_print_nidlist(char *buffer, int count, struct list_head *nidlist)
 	  .nf_print_addrlist	= libcfs_ip_addr_range_print,
 	  .nf_match_addr	= cfs_ip_addr_match
 	},
+	{ .nf_type		= PTL4LND,
+	  .nf_name		= "ptlf",
+	  .nf_modname		= "kptl4lnd",
+	  .nf_addr2str		= libcfs_decnum_addr2str,
+	  .nf_str2addr		= libcfs_num_str2addr,
+	  .nf_parse_addrlist	= libcfs_num_parse,
+	  .nf_print_addrlist	= libcfs_num_addr_range_print,
+	  .nf_match_addr	= libcfs_num_match
+	},
 };
 
 static const size_t libcfs_nnetstrfns = ARRAY_SIZE(libcfs_netstrfns);
-- 
1.8.3.1

* [lustre-devel] [PATCH 15/18] lustre: llite: don't hold inode_lock for security notify
  2020-07-02  0:04 [lustre-devel] [PATCH 00/18] Port of OpenSFS landing as of July 1, 2020 James Simmons
                   ` (13 preceding siblings ...)
  2020-07-02  0:04 ` [lustre-devel] [PATCH 14/18] lnet: define new network driver ptl4lnd James Simmons
@ 2020-07-02  0:04 ` James Simmons
  2020-07-02  0:04 ` [lustre-devel] [PATCH 16/18] lustre: mdt: don't fetch LOOKUP lock for remote object James Simmons
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: James Simmons @ 2020-07-02  0:04 UTC (permalink / raw)
  To: lustre-devel

From: Alexander Boyko <alexander.boyko@hpe.com>

With SELinux enabled, the client has a deadlock which leads to
client eviction from the MDS.
1 thread                    2 thread
do file open                do stat
inode_lock(parent dir)
                            got LDLM_PR(parent dir)
enqueue LDLM_CW(parent dir) waits on inode_lock to notify security
waits
timeout on enqueue
and client eviction because the client didn't cancel an LDLM_PR lock

security_inode_notifysecctx()->selinux_inode_notifysecctx()->
selinux_inode_setsecurity()
The call of selinux_inode_setsecurity doesn't need to hold
inode_lock.

Fixes: f4d3cf7642 ("lustre: llite: set sec ctx on client's inode at create time")
Cray-bug-id: LUS-8924
WC-bug-id: https://jira.whamcloud.com/browse/LU-13617
Lustre-commit: f87359b51f61a ("LU-13617 llite: don't hold inode_lock for security notify")
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-on: https://review.whamcloud.com/38792
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/llite/dir.c   |  6 ++++--
 fs/lustre/llite/namei.c | 18 ++++++++++++------
 2 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/fs/lustre/llite/dir.c b/fs/lustre/llite/dir.c
index 463c5d7..e3305f7 100644
--- a/fs/lustre/llite/dir.c
+++ b/fs/lustre/llite/dir.c
@@ -489,11 +489,13 @@ static int ll_dir_setdirstripe(struct dentry *dparent, struct lmv_user_md *lump,
 	dentry.d_inode = inode;
 
 	if (sbi->ll_flags & LL_SBI_FILE_SECCTX) {
-		inode_lock(inode);
+		/* no need to protect selinux_inode_setsecurity() by
+		 * inode_lock. Taking it would lead to a client deadlock
+		 * LU-13617
+		 */
 		err = security_inode_notifysecctx(inode,
 						  op_data->op_file_secctx,
 						  op_data->op_file_secctx_size);
-		inode_unlock(inode);
 	} else {
 		err = ll_inode_init_security(&dentry, inode, parent);
 	}
diff --git a/fs/lustre/llite/namei.c b/fs/lustre/llite/namei.c
index 2353a8f..251d6be 100644
--- a/fs/lustre/llite/namei.c
+++ b/fs/lustre/llite/namei.c
@@ -659,10 +659,12 @@ static int ll_lookup_it_finish(struct ptlrpc_request *request,
 		}
 
 		if (secctx && secctxlen != 0) {
-			inode_lock(inode);
+			/* no need to protect selinux_inode_setsecurity() by
+			 * inode_lock. Taking it would lead to a client deadlock
+			 * LU-13617
+			 */
 			rc = security_inode_notifysecctx(inode, secctx,
 							 secctxlen);
-			inode_unlock(inode);
 			if (rc)
 				CWARN("cannot set security context for " DFID ": rc = %d\n",
 				      PFID(ll_inode2fid(inode)), rc);
@@ -1198,13 +1200,15 @@ static int ll_create_it(struct inode *dir, struct dentry *dentry,
 		return PTR_ERR(inode);
 
 	if ((ll_i2sbi(inode)->ll_flags & LL_SBI_FILE_SECCTX) && secctx) {
-		inode_lock(inode);
 		/* must be done before d_instantiate, because it calls
 		 * security_d_instantiate, which means a getxattr if security
 		 * context is not set yet
 		 */
+		/* no need to protect selinux_inode_setsecurity() by
+		 * inode_lock. Taking it would lead to a client deadlock
+		 * LU-13617
+		 */
 		rc = security_inode_notifysecctx(inode, secctx, secctxlen);
-		inode_unlock(inode);
 		if (rc)
 			return rc;
 	}
@@ -1370,15 +1374,17 @@ static int ll_new_node(struct inode *dir, struct dentry *dentry,
 		goto err_exit;
 
 	if (sbi->ll_flags & LL_SBI_FILE_SECCTX) {
-		inode_lock(inode);
 		/* must be done before d_instantiate, because it calls
 		 * security_d_instantiate, which means a getxattr if security
 		 * context is not set yet
 		 */
+		/* no need to protect selinux_inode_setsecurity() by
+		 * inode_lock. Taking it would lead to a client deadlock
+		 * LU-13617
+		 */
 		err = security_inode_notifysecctx(inode,
 						  op_data->op_file_secctx,
 						  op_data->op_file_secctx_size);
-		inode_unlock(inode);
 		if (err)
 			goto err_exit;
 	}
-- 
1.8.3.1

* [lustre-devel] [PATCH 16/18] lustre: mdt: don't fetch LOOKUP lock for remote object
  2020-07-02  0:04 [lustre-devel] [PATCH 00/18] Port of OpenSFS landing as of July 1, 2020 James Simmons
                   ` (14 preceding siblings ...)
  2020-07-02  0:04 ` [lustre-devel] [PATCH 15/18] lustre: llite: don't hold inode_lock for security notify James Simmons
@ 2020-07-02  0:04 ` James Simmons
  2020-07-02  0:04 ` [lustre-devel] [PATCH 17/18] lustre: obd: add new LPROCFS_TYPE_* James Simmons
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: James Simmons @ 2020-07-02  0:04 UTC (permalink / raw)
  To: lustre-devel

From: Lai Siyao <lai.siyao@whamcloud.com>

Pack the parent FID in getattr by FID, so that it can be used to check
whether the child is a remote object on the parent. The helper function
is called mdt_is_remote_object(). NB: a directory shard is not treated
as a remote object, because if it were, the client would need to
revalidate shards whenever the dir is accessed, which would hurt
performance significantly.

For getattr by FID, if the object is a remote file on the parent, don't
fetch the LOOKUP lock, otherwise the client may see stale dir entries.
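
On the client side, the lmv_intent_lookup() hunk in this patch changes
how the target MDT is chosen now that fid1 may carry the parent. A
simplified C model of that choice follows; struct fid, fid_sane() and
pick_target() are hypothetical stand-ins, and treating a zero sequence
as "not sane" is an assumption of this sketch only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct lu_fid; only the sequence is modelled. */
struct fid {
	uint64_t seq;
};

enum locate_by { BY_NAME, BY_CHILD_FID, BY_PARENT_FID };

/* Assumption: a FID with a zero sequence is "not sane" in this sketch. */
static inline bool fid_sane(const struct fid *f)
{
	return f->seq != 0;
}

/* Mirror of the three-way choice made under the retry: label. */
static inline enum locate_by pick_target(const char *name,
					 const struct fid *child)
{
	if (name)
		return BY_NAME;		/* lookup by name in the parent */
	if (fid_sane(child))
		return BY_CHILD_FID;	/* getattr by FID: go to the child */
	return BY_PARENT_FID;		/* fall back to fid1 */
}
```

In the getattr-by-FID case there is no name, so the request goes to
the MDT that holds the child even though fid1 now names the parent.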

WC-bug-id: https://jira.whamcloud.com/browse/LU-13437
Lustre-commit: f9a2da63abab5 ("LU-13437 mdt: don't fetch LOOKUP lock for remote object")
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38561
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/include/obd.h       |  2 +-
 fs/lustre/include/obd_class.h |  3 ++-
 fs/lustre/llite/file.c        |  6 +++---
 fs/lustre/llite/llite_lib.c   |  4 ++--
 fs/lustre/lmv/lmv_intent.c    | 19 +++++++++++++------
 fs/lustre/lmv/lmv_internal.h  |  1 +
 fs/lustre/lmv/lmv_obd.c       |  3 ++-
 7 files changed, 24 insertions(+), 14 deletions(-)

diff --git a/fs/lustre/include/obd.h b/fs/lustre/include/obd.h
index f9e0920..438f4ca 100644
--- a/fs/lustre/include/obd.h
+++ b/fs/lustre/include/obd.h
@@ -1004,7 +1004,7 @@ struct md_ops {
 
 	int (*free_lustre_md)(struct obd_export *, struct lustre_md *);
 
-	int (*merge_attr)(struct obd_export *,
+	int (*merge_attr)(struct obd_export *, const struct lu_fid *fid,
 			  const struct lmv_stripe_md *lsm,
 			  struct cl_attr *attr, ldlm_blocking_callback);
 
diff --git a/fs/lustre/include/obd_class.h b/fs/lustre/include/obd_class.h
index 746782b..78f7b16 100644
--- a/fs/lustre/include/obd_class.h
+++ b/fs/lustre/include/obd_class.h
@@ -1458,6 +1458,7 @@ static inline int md_free_lustre_md(struct obd_export *exp,
 }
 
 static inline int md_merge_attr(struct obd_export *exp,
+				const struct lu_fid *fid,
 				const struct lmv_stripe_md *lsm,
 				struct cl_attr *attr,
 				ldlm_blocking_callback cb)
@@ -1468,7 +1469,7 @@ static inline int md_merge_attr(struct obd_export *exp,
 	if (rc)
 		return rc;
 
-	return MDP(exp->exp_obd, merge_attr)(exp, lsm, attr, cb);
+	return MDP(exp->exp_obd, merge_attr)(exp, fid, lsm, attr, cb);
 }
 
 static inline int md_setxattr(struct obd_export *exp, const struct lu_fid *fid,
diff --git a/fs/lustre/llite/file.c b/fs/lustre/llite/file.c
index 55ae2b3..1849229 100644
--- a/fs/lustre/llite/file.c
+++ b/fs/lustre/llite/file.c
@@ -4500,8 +4500,8 @@ static int ll_inode_revalidate(struct dentry *dentry, enum ldlm_intent_flags op)
 	       PFID(ll_inode2fid(inode)), inode, dentry);
 
 	/* Call getattr by fid, so do not provide name at all. */
-	op_data = ll_prep_md_op_data(NULL, inode, inode, NULL, 0, 0,
-				     LUSTRE_OPC_ANY, NULL);
+	op_data = ll_prep_md_op_data(NULL, dentry->d_parent->d_inode, inode,
+				     NULL, 0, 0, LUSTRE_OPC_ANY, NULL);
 	if (IS_ERR(op_data))
 		return PTR_ERR(op_data);
 
@@ -4548,7 +4548,7 @@ static int ll_merge_md_attr(struct inode *inode)
 		return 0;
 
 	down_read(&lli->lli_lsm_sem);
-	rc = md_merge_attr(ll_i2mdexp(inode), ll_i2info(inode)->lli_lsm_md,
+	rc = md_merge_attr(ll_i2mdexp(inode), &lli->lli_fid, lli->lli_lsm_md,
 			   &attr, ll_md_blocking_ast);
 	up_read(&lli->lli_lsm_sem);
 	if (rc)
diff --git a/fs/lustre/llite/llite_lib.c b/fs/lustre/llite/llite_lib.c
index b30feb0..1a7d805 100644
--- a/fs/lustre/llite/llite_lib.c
+++ b/fs/lustre/llite/llite_lib.c
@@ -1525,8 +1525,8 @@ static int ll_update_lsm_md(struct inode *inode, struct lustre_md *md)
 	}
 
 	/* validate the lsm */
-	rc = md_merge_attr(ll_i2mdexp(inode), lli->lli_lsm_md, attr,
-			   ll_md_blocking_ast);
+	rc = md_merge_attr(ll_i2mdexp(inode), &lli->lli_fid, lli->lli_lsm_md,
+			   attr, ll_md_blocking_ast);
 	if (!rc) {
 		if (md->body->mbo_valid & OBD_MD_FLNLINK)
 			md->body->mbo_nlink = attr->cat_nlink;
diff --git a/fs/lustre/lmv/lmv_intent.c b/fs/lustre/lmv/lmv_intent.c
index a847770..4af449e 100644
--- a/fs/lustre/lmv/lmv_intent.c
+++ b/fs/lustre/lmv/lmv_intent.c
@@ -153,6 +153,7 @@ static int lmv_intent_remote(struct obd_export *exp, struct lookup_intent *it,
 }
 
 int lmv_revalidate_slaves(struct obd_export *exp,
+			  const struct lu_fid *pfid,
 			  const struct lmv_stripe_md *lsm,
 			  ldlm_blocking_callback cb_blocking,
 			  int extra_lock_flags)
@@ -196,7 +197,7 @@ int lmv_revalidate_slaves(struct obd_export *exp,
 		 * which is not needed here.
 		 */
 		memset(op_data, 0, sizeof(*op_data));
-		op_data->op_fid1 = fid;
+		op_data->op_fid1 = *pfid;
 		op_data->op_fid2 = fid;
 
 		tgt = lmv_tgt(lmv, lsm->lsm_md_oinfo[i].lmo_mds);
@@ -444,13 +445,18 @@ static int lmv_intent_lookup(struct obd_export *exp,
 	}
 
 retry:
-	tgt = lmv_locate_tgt(lmv, op_data);
+	if (op_data->op_name) {
+		tgt = lmv_locate_tgt(lmv, op_data);
+		if (!fid_is_sane(&op_data->op_fid2))
+			fid_zero(&op_data->op_fid2);
+	} else if (fid_is_sane(&op_data->op_fid2)) {
+		tgt = lmv_fid2tgt(lmv, &op_data->op_fid2);
+	} else {
+		tgt = lmv_fid2tgt(lmv, &op_data->op_fid1);
+	}
 	if (IS_ERR(tgt))
 		return PTR_ERR(tgt);
 
-	if (!fid_is_sane(&op_data->op_fid2))
-		fid_zero(&op_data->op_fid2);
-
 	CDEBUG(D_INODE,
 	       "LOOKUP_INTENT with fid1=" DFID ", fid2=" DFID ", name='%s' -> mds #%u\n",
 	       PFID(&op_data->op_fid1), PFID(&op_data->op_fid2),
@@ -470,7 +476,8 @@ static int lmv_intent_lookup(struct obd_export *exp,
 		 * during update_inode process (see ll_update_lsm_md)
 		 */
 		if (lmv_dir_striped(op_data->op_mea2)) {
-			rc = lmv_revalidate_slaves(exp, op_data->op_mea2,
+			rc = lmv_revalidate_slaves(exp, &op_data->op_fid2,
+						   op_data->op_mea2,
 						   cb_blocking,
 						   extra_lock_flags);
 			if (rc != 0)
diff --git a/fs/lustre/lmv/lmv_internal.h b/fs/lustre/lmv/lmv_internal.h
index e42b141..756fa27 100644
--- a/fs/lustre/lmv/lmv_internal.h
+++ b/fs/lustre/lmv/lmv_internal.h
@@ -53,6 +53,7 @@ int lmv_fid_alloc(const struct lu_env *env, struct obd_export *exp,
 		  struct lu_fid *fid, struct md_op_data *op_data);
 
 int lmv_revalidate_slaves(struct obd_export *exp,
+			  const struct lu_fid *pfid,
 			  const struct lmv_stripe_md *lsm,
 			  ldlm_blocking_callback cb_blocking,
 			  int extra_lock_flags);
diff --git a/fs/lustre/lmv/lmv_obd.c b/fs/lustre/lmv/lmv_obd.c
index c5f21cd..4131b49 100644
--- a/fs/lustre/lmv/lmv_obd.c
+++ b/fs/lustre/lmv/lmv_obd.c
@@ -3477,6 +3477,7 @@ static int lmv_quotactl(struct obd_device *unused, struct obd_export *exp,
 }
 
 static int lmv_merge_attr(struct obd_export *exp,
+			  const struct lu_fid *fid,
 			  const struct lmv_stripe_md *lsm,
 			  struct cl_attr *attr,
 			  ldlm_blocking_callback cb_blocking)
@@ -3486,7 +3487,7 @@ static int lmv_merge_attr(struct obd_export *exp,
 	if (!lmv_dir_striped(lsm))
 		return 0;
 
-	rc = lmv_revalidate_slaves(exp, lsm, cb_blocking, 0);
+	rc = lmv_revalidate_slaves(exp, fid, lsm, cb_blocking, 0);
 	if (rc < 0)
 		return rc;
 
-- 
1.8.3.1

* [lustre-devel] [PATCH 17/18] lustre: obd: add new LPROCFS_TYPE_*
  2020-07-02  0:04 [lustre-devel] [PATCH 00/18] Port of OpenSFS landing as of July 1, 2020 James Simmons
                   ` (15 preceding siblings ...)
  2020-07-02  0:04 ` [lustre-devel] [PATCH 16/18] lustre: mdt: don't fetch LOOKUP lock for remote object James Simmons
@ 2020-07-02  0:04 ` James Simmons
  2020-07-02  0:04 ` [lustre-devel] [PATCH 18/18] lnet: handle undefined parameters James Simmons
  2020-07-02  4:47 ` [lustre-devel] [PATCH 00/18] Port of OpenSFS landing as of July 1, 2020 NeilBrown
  18 siblings, 0 replies; 20+ messages in thread
From: James Simmons @ 2020-07-02  0:04 UTC (permalink / raw)
  To: lustre-devel

From: Emoly Liu <emoly@whamcloud.com>

Move LPROCFS_TYPE_LATENCY from the llite layer to lprocfs_status.h.
Create a new LPROCFS_TYPE_BYTES_FULL setting.
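
Both settings are plain bitwise ORs of a unit flag and the counter
statistics flags. A standalone sketch of the composition is below; the
LPROCFS_TYPE_* values are the ones visible in this patch's context
lines, while the LPROCFS_CNTR_* values and the two helper functions are
assumptions added for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

enum {
	/* assumed values, for illustration only */
	LPROCFS_CNTR_AVGMINMAX	= 0x0002,
	LPROCFS_CNTR_STDDEV	= 0x0004,

	/* values shown in the lprocfs_status.h hunk of this patch */
	LPROCFS_TYPE_BYTES	= 0x0200,
	LPROCFS_TYPE_USEC	= 0x0800,

	LPROCFS_TYPE_LATENCY	= LPROCFS_TYPE_USEC |
				  LPROCFS_CNTR_AVGMINMAX |
				  LPROCFS_CNTR_STDDEV,
	LPROCFS_TYPE_BYTES_FULL	= LPROCFS_TYPE_BYTES |
				  LPROCFS_CNTR_AVGMINMAX |
				  LPROCFS_CNTR_STDDEV,
};

/* Hypothetical helper: pick a unit label from the type bits. */
static inline const char *counter_unit(uint32_t type)
{
	if (type & LPROCFS_TYPE_BYTES)
		return "bytes";
	if (type & LPROCFS_TYPE_USEC)
		return "usecs";
	return "reqs";
}

/* Hypothetical helper: does this counter track a standard deviation? */
static inline bool counter_has_stddev(uint32_t type)
{
	return (type & LPROCFS_CNTR_STDDEV) != 0;
}
```

Compared with the old llite table entries, which used only
LPROCFS_CNTR_AVGMINMAX | LPROCFS_TYPE_BYTES, read_bytes and write_bytes
gain the STDDEV bit through LPROCFS_TYPE_BYTES_FULL.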

WC-bug-id: https://jira.whamcloud.com/browse/LU-13597
Lustre-commit: cd8fb1e8d300c ("LU-13597 ofd: add more information to job_stats")
Signed-off-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38816
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 fs/lustre/include/lprocfs_status.h | 9 +++++++--
 fs/lustre/llite/lproc_llite.c      | 8 ++------
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/fs/lustre/include/lprocfs_status.h b/fs/lustre/include/lprocfs_status.h
index 0d90b31..759e66b 100644
--- a/fs/lustre/include/lprocfs_status.h
+++ b/fs/lustre/include/lprocfs_status.h
@@ -143,6 +143,13 @@ enum {
 	LPROCFS_TYPE_BYTES		= 0x0200,
 	LPROCFS_TYPE_PAGES		= 0x0400,
 	LPROCFS_TYPE_USEC		= 0x0800,
+
+	LPROCFS_TYPE_LATENCY		= LPROCFS_TYPE_USEC |
+					  LPROCFS_CNTR_AVGMINMAX |
+					  LPROCFS_CNTR_STDDEV,
+	LPROCFS_TYPE_BYTES_FULL		= LPROCFS_TYPE_BYTES |
+					  LPROCFS_CNTR_AVGMINMAX |
+					  LPROCFS_CNTR_STDDEV,
 };
 
 #define LC_MIN_INIT ((~(u64)0) >> 1)
@@ -364,8 +371,6 @@ enum {
 #define JOBSTATS_SESSION		"session"
 
 /* obd_config.c */
-void lustre_register_client_process_config(int (*cpc)(struct lustre_cfg *lcfg));
-
 int lprocfs_stats_alloc_one(struct lprocfs_stats *stats,
 			    unsigned int cpuid);
 int lprocfs_stats_lock(struct lprocfs_stats *stats,
diff --git a/fs/lustre/llite/lproc_llite.c b/fs/lustre/llite/lproc_llite.c
index 4bce3a6..f5a1940 100644
--- a/fs/lustre/llite/lproc_llite.c
+++ b/fs/lustre/llite/lproc_llite.c
@@ -1552,18 +1552,14 @@ static void sbi_kobj_release(struct kobject *kobj)
 	.release	= sbi_kobj_release,
 };
 
-#define LPROCFS_TYPE_LATENCY \
-	(LPROCFS_TYPE_USEC | LPROCFS_CNTR_AVGMINMAX | LPROCFS_CNTR_STDDEV)
 static const struct llite_file_opcode {
 	u32		opcode;
 	u32		type;
 	const char	*opname;
 } llite_opcode_table[LPROC_LL_FILE_OPCODES] = {
 	/* file operation */
-	{ LPROC_LL_READ_BYTES,	LPROCFS_CNTR_AVGMINMAX | LPROCFS_TYPE_BYTES,
-		"read_bytes" },
-	{ LPROC_LL_WRITE_BYTES,	LPROCFS_CNTR_AVGMINMAX | LPROCFS_TYPE_BYTES,
-		"write_bytes" },
+	{ LPROC_LL_READ_BYTES,	LPROCFS_TYPE_BYTES_FULL, "read_bytes" },
+	{ LPROC_LL_WRITE_BYTES,	LPROCFS_TYPE_BYTES_FULL, "write_bytes" },
 	{ LPROC_LL_READ,	LPROCFS_TYPE_LATENCY,	"read" },
 	{ LPROC_LL_WRITE,	LPROCFS_TYPE_LATENCY,	"write" },
 	{ LPROC_LL_IOCTL,	LPROCFS_TYPE_REQS,	"ioctl" },
-- 
1.8.3.1


* [lustre-devel] [PATCH 18/18] lnet: handle undefined parameters
  2020-07-02  0:04 [lustre-devel] [PATCH 00/18] Port of OpenSFS landing as of July 1, 2020 James Simmons
                   ` (16 preceding siblings ...)
  2020-07-02  0:04 ` [lustre-devel] [PATCH 17/18] lustre: obd: add new LPROCFS_TYPE_* James Simmons
@ 2020-07-02  0:04 ` James Simmons
  2020-07-02  4:47 ` [lustre-devel] [PATCH 00/18] Port of OpenSFS landing as of July 1, 2020 NeilBrown
  18 siblings, 0 replies; 20+ messages in thread
From: James Simmons @ 2020-07-02  0:04 UTC (permalink / raw)
  To: lustre-devel

From: Amir Shehata <ashehata@whamcloud.com>

If peer_tx_credits or max_tx_credits is 0, default it to the system
default, 8 and 256 respectively. A peer_timeout of 0 likewise falls
back to DEFAULT_PEER_TIMEOUT (180).

WC-bug-id: https://jira.whamcloud.com/browse/LU-13662
Lustre-commit: d934eb3c4f638 ("LU-13662 lnet: handle undefined parameters")
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38894
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 include/linux/lnet/lib-lnet.h              |  4 +++-
 net/lnet/klnds/o2iblnd/o2iblnd_modparams.c |  4 ++--
 net/lnet/klnds/socklnd/socklnd_modparams.c |  4 ++--
 net/lnet/lnet/api-ni.c                     | 26 +++++++++++++++++++++++---
 4 files changed, 30 insertions(+), 8 deletions(-)
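
A minimal sketch of the "zero means use the default" rule this patch
applies to the common tunables. The struct and the demo_* names are
hypothetical simplifications of struct lnet_ioctl_config_lnd_tunables;
the default values are the ones defined in the lib-lnet.h hunk below.

```c
/* Defaults as defined in the lib-lnet.h hunk. */
#define DEMO_DEFAULT_PEER_TIMEOUT	180
#define DEMO_DEFAULT_PEER_CREDITS	8
#define DEMO_DEFAULT_CREDITS		256

/* Hypothetical, flattened stand-in for the common tunables. */
struct demo_tunables {
	int peer_timeout;
	int peer_tx_credits;
	int peer_rtr_credits;
	int max_tx_credits;
};

/*
 * Replace unset (zero) tunables with their defaults. Values that the
 * caller did set are left alone, and peer_rtr_credits is not touched,
 * matching the fields handled by lnet_set_tune_defaults() in the patch.
 */
static void demo_set_tune_defaults(struct demo_tunables *tun)
{
	if (!tun)
		return;

	if (!tun->peer_timeout)
		tun->peer_timeout = DEMO_DEFAULT_PEER_TIMEOUT;
	if (!tun->peer_tx_credits)
		tun->peer_tx_credits = DEMO_DEFAULT_PEER_CREDITS;
	if (!tun->max_tx_credits)
		tun->max_tx_credits = DEMO_DEFAULT_CREDITS;
}
```

One consequence of this design is that a caller cannot request a literal 0
for these three fields, since 0 is treated as "not specified".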

diff --git a/include/linux/lnet/lib-lnet.h b/include/linux/lnet/lib-lnet.h
index def0923..75c0da7 100644
--- a/include/linux/lnet/lib-lnet.h
+++ b/include/linux/lnet/lib-lnet.h
@@ -81,8 +81,10 @@
 #define LNET_ACCEPTOR_MIN_RESERVED_PORT    512
 #define LNET_ACCEPTOR_MAX_RESERVED_PORT    1023
 
-/* default timeout */
+/* default timeout and credits */
 #define DEFAULT_PEER_TIMEOUT    180
+#define DEFAULT_PEER_CREDITS	8
+#define DEFAULT_CREDITS	256
 
 int choose_ipv4_src(u32 *ret, int interface, u32 dst_ipaddr, struct net *ns);
 
diff --git a/net/lnet/klnds/o2iblnd/o2iblnd_modparams.c b/net/lnet/klnds/o2iblnd/o2iblnd_modparams.c
index 7407ced..f341376 100644
--- a/net/lnet/klnds/o2iblnd/o2iblnd_modparams.c
+++ b/net/lnet/klnds/o2iblnd/o2iblnd_modparams.c
@@ -67,11 +67,11 @@
 MODULE_PARM_DESC(ntx, "# of message descriptors allocated for each pool");
 
 /* NB: this value is shared by all CPTs */
-static int credits = 256;
+static int credits = DEFAULT_CREDITS;
 module_param(credits, int, 0444);
 MODULE_PARM_DESC(credits, "# concurrent sends");
 
-static int peer_credits = 8;
+static int peer_credits = DEFAULT_PEER_CREDITS;
 module_param(peer_credits, int, 0444);
 MODULE_PARM_DESC(peer_credits, "# concurrent sends to 1 peer");
 
diff --git a/net/lnet/klnds/socklnd/socklnd_modparams.c b/net/lnet/klnds/socklnd/socklnd_modparams.c
index b511e54..017627f 100644
--- a/net/lnet/klnds/socklnd/socklnd_modparams.c
+++ b/net/lnet/klnds/socklnd/socklnd_modparams.c
@@ -28,11 +28,11 @@
 module_param(sock_timeout, int, 0644);
 MODULE_PARM_DESC(sock_timeout, "dead socket timeout (seconds)");
 
-static int credits = 256;
+static int credits = DEFAULT_CREDITS;
 module_param(credits, int, 0444);
 MODULE_PARM_DESC(credits, "# concurrent sends");
 
-static int peer_credits = 8;
+static int peer_credits = DEFAULT_PEER_CREDITS;
 module_param(peer_credits, int, 0444);
 MODULE_PARM_DESC(peer_credits, "# concurrent sends to 1 peer");
 
diff --git a/net/lnet/lnet/api-ni.c b/net/lnet/lnet/api-ni.c
index d6694cb..3e69435 100644
--- a/net/lnet/lnet/api-ni.c
+++ b/net/lnet/lnet/api-ni.c
@@ -3093,6 +3093,19 @@ static int lnet_add_net_common(struct lnet_net *net,
 	return rc;
 }
 
+static void
+lnet_set_tune_defaults(struct lnet_ioctl_config_lnd_tunables *tun)
+{
+	if (tun) {
+		if (!tun->lt_cmn.lct_peer_timeout)
+			tun->lt_cmn.lct_peer_timeout = DEFAULT_PEER_TIMEOUT;
+		if (!tun->lt_cmn.lct_peer_tx_credits)
+			tun->lt_cmn.lct_peer_tx_credits = DEFAULT_PEER_CREDITS;
+		if (!tun->lt_cmn.lct_max_tx_credits)
+			tun->lt_cmn.lct_max_tx_credits = DEFAULT_CREDITS;
+	}
+}
+
 static int lnet_handle_legacy_ip2nets(char *ip2nets,
 				      struct lnet_ioctl_config_lnd_tunables *tun)
 {
@@ -3109,6 +3122,8 @@ static int lnet_handle_legacy_ip2nets(char *ip2nets,
 	if (rc < 0)
 		return rc;
 
+	lnet_set_tune_defaults(tun);
+
 	mutex_lock(&the_lnet.ln_api_mutex);
 	while ((net = list_first_entry_or_null(&net_head,
 					       struct lnet_net,
@@ -3172,6 +3187,8 @@ int lnet_dyn_add_ni(struct lnet_ioctl_config_ni *conf)
 	if (!ni)
 		return -ENOMEM;
 
+	lnet_set_tune_defaults(tun);
+
 	mutex_lock(&the_lnet.ln_api_mutex);
 
 	rc = lnet_add_net_common(net, tun);
@@ -3304,13 +3321,16 @@ int lnet_dyn_del_ni(struct lnet_ioctl_config_ni *conf)
 	memset(&tun, 0, sizeof(tun));
 
 	tun.lt_cmn.lct_peer_timeout =
-		conf->cfg_config_u.cfg_net.net_peer_timeout;
+	 (!conf->cfg_config_u.cfg_net.net_peer_timeout) ? DEFAULT_PEER_TIMEOUT :
+	  conf->cfg_config_u.cfg_net.net_peer_timeout;
 	tun.lt_cmn.lct_peer_tx_credits =
-		conf->cfg_config_u.cfg_net.net_peer_tx_credits;
+	 (!conf->cfg_config_u.cfg_net.net_peer_tx_credits) ? DEFAULT_PEER_CREDITS :
+	  conf->cfg_config_u.cfg_net.net_peer_tx_credits;
 	tun.lt_cmn.lct_peer_rtr_credits =
 		conf->cfg_config_u.cfg_net.net_peer_rtr_credits;
 	tun.lt_cmn.lct_max_tx_credits =
-		conf->cfg_config_u.cfg_net.net_max_tx_credits;
+	 (!conf->cfg_config_u.cfg_net.net_max_tx_credits) ? DEFAULT_CREDITS :
+	  conf->cfg_config_u.cfg_net.net_max_tx_credits;
 
 	rc = lnet_add_net_common(net, &tun);
 
-- 
1.8.3.1


* [lustre-devel] [PATCH 00/18] Port of OpenSFS landing as of July 1, 2020
  2020-07-02  0:04 [lustre-devel] [PATCH 00/18] Port of OpenSFS landing as of July 1, 2020 James Simmons
                   ` (17 preceding siblings ...)
  2020-07-02  0:04 ` [lustre-devel] [PATCH 18/18] lnet: handle undefined parameters James Simmons
@ 2020-07-02  4:47 ` NeilBrown
  18 siblings, 0 replies; 20+ messages in thread
From: NeilBrown @ 2020-07-02  4:47 UTC (permalink / raw)
  To: lustre-devel

On Wed, Jul 01 2020, James Simmons wrote:

> Port of patches that landed to the OpenSFS branch. A few patches added
> the were missing that enables potential lustre utilities building
> against the Linux client. Please review to make sure everything is
> okay.

Hi James,
 could you please always be explicit about which Linux client tree
 these are ported to?  It avoids confusion.

Thanks,
NeilBrown



