All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH -next v2 0/5] rhashtable: guarantee first allocation
@ 2018-06-01 16:01 Davidlohr Bueso
  2018-06-01 16:01 ` [PATCH 1/5] lib/rhashtable: convert param sanitations to WARN_ON Davidlohr Bueso
                   ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Davidlohr Bueso @ 2018-06-01 16:01 UTC (permalink / raw)
  To: akpm, torvalds
  Cc: tgraf, herbert, manfred, mhocko, guillaume.knispel, linux-api,
	linux-kernel, Davidlohr Bueso

Changes from v1
lkml.kernel.org/r/20180524211135.27760-1-dave@stgolabs.net

- patch 2 is reworked a bit based on the commments from Herbert Xu.
  o upon failure, retry immediately with GFP_NOFAIL (simpler)
  o the caller now passes the needed semantics, not bucket_table_alloc().
  o we consider min_size when resizing, not just HASH_MIN_SIZE.

- removed patch 3; not need after Michal's patch.

Hi,

This series is the result of the discussion with Linus around ipc
subsystem initialization and how it behaves with error return when
calling rhashtable_init()[1]. Instead of caring about the error
or calling the infamous BUG_ON, Linus suggested we guarantee the
rhashtable allocation.

First two patches modify rhashtable_init() to just return 0, future
patches will update more callers, particularly those that use BUG_ON.

patch 3+4 remove some ipc hacks we no longer need.

patch 5 updates the rhashtable test module. Trivial.

Please consider for v4.18.

Thanks!

[0] https://lkml.org/lkml/2018/5/23/758

Davidlohr Bueso (5):
  lib/rhashtable: convert param sanitations to WARN_ON
  lib/rhashtable: guarantee initial hashtable allocation
  ipc: get rid of ids->tables_initialized hack
  ipc: simplify ipc initialization
  lib/test_rhashtable: rhashtable_init() can no longer fail

 include/linux/ipc_namespace.h |  1 -
 ipc/msg.c                     |  9 ++++-----
 ipc/namespace.c               | 20 ++++----------------
 ipc/sem.c                     | 10 ++++------
 ipc/shm.c                     |  9 ++++-----
 ipc/util.c                    | 41 +++++++++++++----------------------------
 ipc/util.h                    | 18 +++++++++---------
 lib/rhashtable.c              | 22 ++++++++++++++--------
 lib/test_rhashtable.c         |  6 +-----
 9 files changed, 53 insertions(+), 83 deletions(-)

-- 
2.16.3

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/5] lib/rhashtable: convert param sanitations to WARN_ON
  2018-06-01 16:01 [PATCH -next v2 0/5] rhashtable: guarantee first allocation Davidlohr Bueso
@ 2018-06-01 16:01 ` Davidlohr Bueso
  2018-06-01 16:09   ` Herbert Xu
  2018-06-01 16:01 ` [PATCH 2/5] lib/rhashtable: guarantee initial hashtable allocation Davidlohr Bueso
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Davidlohr Bueso @ 2018-06-01 16:01 UTC (permalink / raw)
  To: akpm, torvalds
  Cc: tgraf, herbert, manfred, mhocko, guillaume.knispel, linux-api,
	linux-kernel, Davidlohr Bueso, Davidlohr Bueso

For the purpose of making rhashtable_init() unable to fail,
we can replace the returning -EINVAL with WARN_ONs whenever
the caller passes bogus parameters during initialization.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 lib/rhashtable.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index 9427b5766134..05a4b1b8b8ce 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -1024,12 +1024,11 @@ int rhashtable_init(struct rhashtable *ht,
 
 	size = HASH_DEFAULT_SIZE;
 
-	if ((!params->key_len && !params->obj_hashfn) ||
-	    (params->obj_hashfn && !params->obj_cmpfn))
-		return -EINVAL;
+	WARN_ON((!params->key_len && !params->obj_hashfn) ||
+		(params->obj_hashfn && !params->obj_cmpfn));
 
-	if (params->nulls_base && params->nulls_base < (1U << RHT_BASE_SHIFT))
-		return -EINVAL;
+	WARN_ON(params->nulls_base &&
+		params->nulls_base < (1U << RHT_BASE_SHIFT));
 
 	memset(ht, 0, sizeof(*ht));
 	mutex_init(&ht->mutex);
-- 
2.16.3

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/5] lib/rhashtable: guarantee initial hashtable allocation
  2018-06-01 16:01 [PATCH -next v2 0/5] rhashtable: guarantee first allocation Davidlohr Bueso
  2018-06-01 16:01 ` [PATCH 1/5] lib/rhashtable: convert param sanitations to WARN_ON Davidlohr Bueso
@ 2018-06-01 16:01 ` Davidlohr Bueso
  2018-06-02  4:41   ` Herbert Xu
  2018-06-04 11:46   ` Michal Hocko
  2018-06-01 16:01 ` [PATCH 3/5] ipc: get rid of ids->tables_initialized hack Davidlohr Bueso
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 15+ messages in thread
From: Davidlohr Bueso @ 2018-06-01 16:01 UTC (permalink / raw)
  To: akpm, torvalds
  Cc: tgraf, herbert, manfred, mhocko, guillaume.knispel, linux-api,
	linux-kernel, Davidlohr Bueso, Davidlohr Bueso

rhashtable_init() may fail due to -ENOMEM, thus making the
entire api unusable. This patch removes this scenario,
however unlikely. In order to guarantee memory allocation,
this patch always ends up doing GFP_KERNEL|__GFP_NOFAIL
for both the tbl as well as alloc_bucket_spinlocks().

Upon the first table allocation failure, we shrink the
size to the smallest value that makes sense retry with
__GFP_NOFAIL semantics. With the defaults, this means that
from 64 buckets, we retry with only 4. Any later issues
regarding performance due to collisions or larger table
resizing (when more memory becomes available) is the last
of our problems.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 lib/rhashtable.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index 05a4b1b8b8ce..ae17da6f0c75 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -175,7 +175,7 @@ static struct bucket_table *bucket_table_alloc(struct rhashtable *ht,
 	int i;
 
 	size = sizeof(*tbl) + nbuckets * sizeof(tbl->buckets[0]);
-	if (gfp != GFP_KERNEL)
+	if ((gfp & ~__GFP_NOFAIL) != GFP_KERNEL)
 		tbl = kzalloc(size, gfp | __GFP_NOWARN | __GFP_NORETRY);
 	else
 		tbl = kvzalloc(size, gfp);
@@ -1067,9 +1067,16 @@ int rhashtable_init(struct rhashtable *ht,
 		}
 	}
 
+	/*
+	 * This is api initialization and thus we need to guarantee the
+	 * initial rhashtable allocation. Upon failure, retry with the
+	 * smallest possible size with __GFP_NOFAIL semantics.
+	 */
 	tbl = bucket_table_alloc(ht, size, GFP_KERNEL);
-	if (tbl == NULL)
-		return -ENOMEM;
+	if (unlikely(tbl == NULL)) {
+		size = min_t(u16, ht->p.min_size, HASH_MIN_SIZE);
+		tbl = bucket_table_alloc(ht, size, GFP_KERNEL | __GFP_NOFAIL);
+	}
 
 	atomic_set(&ht->nelems, 0);
 
-- 
2.16.3

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/5] ipc: get rid of ids->tables_initialized hack
  2018-06-01 16:01 [PATCH -next v2 0/5] rhashtable: guarantee first allocation Davidlohr Bueso
  2018-06-01 16:01 ` [PATCH 1/5] lib/rhashtable: convert param sanitations to WARN_ON Davidlohr Bueso
  2018-06-01 16:01 ` [PATCH 2/5] lib/rhashtable: guarantee initial hashtable allocation Davidlohr Bueso
@ 2018-06-01 16:01 ` Davidlohr Bueso
  2018-06-01 16:01 ` [PATCH 4/5] ipc: simplify ipc initialization Davidlohr Bueso
  2018-06-01 16:01 ` [PATCH 5/5] lib/test_rhashtable: rhashtable_init() can no longer fail Davidlohr Bueso
  4 siblings, 0 replies; 15+ messages in thread
From: Davidlohr Bueso @ 2018-06-01 16:01 UTC (permalink / raw)
  To: akpm, torvalds
  Cc: tgraf, herbert, manfred, mhocko, guillaume.knispel, linux-api,
	linux-kernel, Davidlohr Bueso, Davidlohr Bueso

In sysvipc we have an ids->tables_initialized regarding the
rhashtable, introduced in:

    0cfb6aee70b (ipc: optimize semget/shmget/msgget for lots of keys).

It's there, specifically, to prevent nil pointer dereferences,
from using an uninitialized api. Considering how rhashtable_init()
can fail (probably due to ENOMEM, if anything), this made the
overall ipc initialization capable of failure as well. That alone
is ugly, but fine, however I've spotted a few issues regarding the
semantics of tables_initialized (however unlikely they may be):

- There is inconsistency in what we return to userspace: ipc_addid()
returns ENOSPC which is certainly _wrong_, while ipc_obtain_object_idr()
returns EINVAL.

- After we started using rhashtables, ipc_findkey() can return nil upon
!tables_initialized, but the caller expects nil for when the ipc structure
isn't found, and can therefore call into ipcget() callbacks.

Now that rhashtable initialization cannot fail, we can properly
get rid of the hack altogether.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 include/linux/ipc_namespace.h |  1 -
 ipc/util.c                    | 23 ++++++++---------------
 2 files changed, 8 insertions(+), 16 deletions(-)

diff --git a/include/linux/ipc_namespace.h b/include/linux/ipc_namespace.h
index b5630c8eb2f3..37f3a4b7c637 100644
--- a/include/linux/ipc_namespace.h
+++ b/include/linux/ipc_namespace.h
@@ -16,7 +16,6 @@ struct user_namespace;
 struct ipc_ids {
 	int in_use;
 	unsigned short seq;
-	bool tables_initialized;
 	struct rw_semaphore rwsem;
 	struct idr ipcs_idr;
 	int max_id;
diff --git a/ipc/util.c b/ipc/util.c
index 4e81182fa0ac..823e09e72c58 100644
--- a/ipc/util.c
+++ b/ipc/util.c
@@ -125,7 +125,6 @@ int ipc_init_ids(struct ipc_ids *ids)
 	if (err)
 		return err;
 	idr_init(&ids->ipcs_idr);
-	ids->tables_initialized = true;
 	ids->max_id = -1;
 #ifdef CONFIG_CHECKPOINT_RESTORE
 	ids->next_id = -1;
@@ -178,19 +177,16 @@ void __init ipc_init_proc_interface(const char *path, const char *header,
  */
 static struct kern_ipc_perm *ipc_findkey(struct ipc_ids *ids, key_t key)
 {
-	struct kern_ipc_perm *ipcp = NULL;
+	struct kern_ipc_perm *ipcp;
 
-	if (likely(ids->tables_initialized))
-		ipcp = rhashtable_lookup_fast(&ids->key_ht, &key,
+	ipcp = rhashtable_lookup_fast(&ids->key_ht, &key,
 					      ipc_kht_params);
+	if (!ipcp)
+		return NULL;
 
-	if (ipcp) {
-		rcu_read_lock();
-		ipc_lock_object(ipcp);
-		return ipcp;
-	}
-
-	return NULL;
+	rcu_read_lock();
+	ipc_lock_object(ipcp);
+	return ipcp;
 }
 
 #ifdef CONFIG_CHECKPOINT_RESTORE
@@ -255,7 +251,7 @@ int ipc_addid(struct ipc_ids *ids, struct kern_ipc_perm *new, int limit)
 	if (limit > IPCMNI)
 		limit = IPCMNI;
 
-	if (!ids->tables_initialized || ids->in_use >= limit)
+	if (ids->in_use >= limit)
 		return -ENOSPC;
 
 	idr_preload(GFP_KERNEL);
@@ -566,9 +562,6 @@ struct kern_ipc_perm *ipc_obtain_object_idr(struct ipc_ids *ids, int id)
 	struct kern_ipc_perm *out;
 	int lid = ipcid_to_idx(id);
 
-	if (unlikely(!ids->tables_initialized))
-		return ERR_PTR(-EINVAL);
-
 	out = idr_find(&ids->ipcs_idr, lid);
 	if (!out)
 		return ERR_PTR(-EINVAL);
-- 
2.16.3

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 4/5] ipc: simplify ipc initialization
  2018-06-01 16:01 [PATCH -next v2 0/5] rhashtable: guarantee first allocation Davidlohr Bueso
                   ` (2 preceding siblings ...)
  2018-06-01 16:01 ` [PATCH 3/5] ipc: get rid of ids->tables_initialized hack Davidlohr Bueso
@ 2018-06-01 16:01 ` Davidlohr Bueso
  2018-06-01 16:01 ` [PATCH 5/5] lib/test_rhashtable: rhashtable_init() can no longer fail Davidlohr Bueso
  4 siblings, 0 replies; 15+ messages in thread
From: Davidlohr Bueso @ 2018-06-01 16:01 UTC (permalink / raw)
  To: akpm, torvalds
  Cc: tgraf, herbert, manfred, mhocko, guillaume.knispel, linux-api,
	linux-kernel, Davidlohr Bueso, Davidlohr Bueso

Now that we know that rhashtable_init() will not fail, we
can get rid of a lot of the unnecessary cleanup paths when
the call errored out.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 ipc/msg.c       |  9 ++++-----
 ipc/namespace.c | 20 ++++----------------
 ipc/sem.c       | 10 ++++------
 ipc/shm.c       |  9 ++++-----
 ipc/util.c      | 18 +++++-------------
 ipc/util.h      | 18 +++++++++---------
 6 files changed, 30 insertions(+), 54 deletions(-)

diff --git a/ipc/msg.c b/ipc/msg.c
index 3b6545302598..62545ce19173 100644
--- a/ipc/msg.c
+++ b/ipc/msg.c
@@ -1228,7 +1228,7 @@ COMPAT_SYSCALL_DEFINE5(msgrcv, int, msqid, compat_uptr_t, msgp,
 }
 #endif
 
-int msg_init_ns(struct ipc_namespace *ns)
+void msg_init_ns(struct ipc_namespace *ns)
 {
 	ns->msg_ctlmax = MSGMAX;
 	ns->msg_ctlmnb = MSGMNB;
@@ -1236,7 +1236,7 @@ int msg_init_ns(struct ipc_namespace *ns)
 
 	atomic_set(&ns->msg_bytes, 0);
 	atomic_set(&ns->msg_hdrs, 0);
-	return ipc_init_ids(&ns->ids[IPC_MSG_IDS]);
+	ipc_init_ids(&ns->ids[IPC_MSG_IDS]);
 }
 
 #ifdef CONFIG_IPC_NS
@@ -1277,12 +1277,11 @@ static int sysvipc_msg_proc_show(struct seq_file *s, void *it)
 }
 #endif
 
-int __init msg_init(void)
+void __init msg_init(void)
 {
-	const int err = msg_init_ns(&init_ipc_ns);
+	msg_init_ns(&init_ipc_ns);
 
 	ipc_init_proc_interface("sysvipc/msg",
 				"       key      msqid perms      cbytes       qnum lspid lrpid   uid   gid  cuid  cgid      stime      rtime      ctime\n",
 				IPC_MSG_IDS, sysvipc_msg_proc_show);
-	return err;
 }
diff --git a/ipc/namespace.c b/ipc/namespace.c
index f59a89966f92..21607791d62c 100644
--- a/ipc/namespace.c
+++ b/ipc/namespace.c
@@ -55,28 +55,16 @@ static struct ipc_namespace *create_ipc_ns(struct user_namespace *user_ns,
 	ns->user_ns = get_user_ns(user_ns);
 	ns->ucounts = ucounts;
 
-	err = sem_init_ns(ns);
+	err = mq_init_ns(ns);
 	if (err)
 		goto fail_put;
-	err = msg_init_ns(ns);
-	if (err)
-		goto fail_destroy_sem;
-	err = shm_init_ns(ns);
-	if (err)
-		goto fail_destroy_msg;
 
-	err = mq_init_ns(ns);
-	if (err)
-		goto fail_destroy_shm;
+	sem_init_ns(ns);
+	msg_init_ns(ns);
+	shm_init_ns(ns);
 
 	return ns;
 
-fail_destroy_shm:
-	shm_exit_ns(ns);
-fail_destroy_msg:
-	msg_exit_ns(ns);
-fail_destroy_sem:
-	sem_exit_ns(ns);
 fail_put:
 	put_user_ns(ns->user_ns);
 	ns_free_inum(&ns->ns);
diff --git a/ipc/sem.c b/ipc/sem.c
index 20f649eed8c6..78ea913ee0d8 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -220,14 +220,14 @@ static int sysvipc_sem_proc_show(struct seq_file *s, void *it);
 #define sc_semopm	sem_ctls[2]
 #define sc_semmni	sem_ctls[3]
 
-int sem_init_ns(struct ipc_namespace *ns)
+void sem_init_ns(struct ipc_namespace *ns)
 {
 	ns->sc_semmsl = SEMMSL;
 	ns->sc_semmns = SEMMNS;
 	ns->sc_semopm = SEMOPM;
 	ns->sc_semmni = SEMMNI;
 	ns->used_sems = 0;
-	return ipc_init_ids(&ns->ids[IPC_SEM_IDS]);
+	ipc_init_ids(&ns->ids[IPC_SEM_IDS]);
 }
 
 #ifdef CONFIG_IPC_NS
@@ -239,14 +239,12 @@ void sem_exit_ns(struct ipc_namespace *ns)
 }
 #endif
 
-int __init sem_init(void)
+void __init sem_init(void)
 {
-	const int err = sem_init_ns(&init_ipc_ns);
-
+	sem_init_ns(&init_ipc_ns);
 	ipc_init_proc_interface("sysvipc/sem",
 				"       key      semid perms      nsems   uid   gid  cuid  cgid      otime      ctime\n",
 				IPC_SEM_IDS, sysvipc_sem_proc_show);
-	return err;
 }
 
 /**
diff --git a/ipc/shm.c b/ipc/shm.c
index 051a3e1fb8df..e602a6fc3991 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -95,14 +95,14 @@ static void shm_destroy(struct ipc_namespace *ns, struct shmid_kernel *shp);
 static int sysvipc_shm_proc_show(struct seq_file *s, void *it);
 #endif
 
-int shm_init_ns(struct ipc_namespace *ns)
+void shm_init_ns(struct ipc_namespace *ns)
 {
 	ns->shm_ctlmax = SHMMAX;
 	ns->shm_ctlall = SHMALL;
 	ns->shm_ctlmni = SHMMNI;
 	ns->shm_rmid_forced = 0;
 	ns->shm_tot = 0;
-	return ipc_init_ids(&shm_ids(ns));
+	ipc_init_ids(&shm_ids(ns));
 }
 
 /*
@@ -135,9 +135,8 @@ void shm_exit_ns(struct ipc_namespace *ns)
 
 static int __init ipc_ns_init(void)
 {
-	const int err = shm_init_ns(&init_ipc_ns);
-	WARN(err, "ipc: sysv shm_init_ns failed: %d\n", err);
-	return err;
+	shm_init_ns(&init_ipc_ns);
+	return 0;
 }
 
 pure_initcall(ipc_ns_init);
diff --git a/ipc/util.c b/ipc/util.c
index 823e09e72c58..06d7c575847c 100644
--- a/ipc/util.c
+++ b/ipc/util.c
@@ -87,16 +87,12 @@ struct ipc_proc_iface {
  */
 static int __init ipc_init(void)
 {
-	int err_sem, err_msg;
-
 	proc_mkdir("sysvipc", NULL);
-	err_sem = sem_init();
-	WARN(err_sem, "ipc: sysv sem_init failed: %d\n", err_sem);
-	err_msg = msg_init();
-	WARN(err_msg, "ipc: sysv msg_init failed: %d\n", err_msg);
+	sem_init();
+	msg_init();
 	shm_init();
 
-	return err_msg ? err_msg : err_sem;
+	return 0;
 }
 device_initcall(ipc_init);
 
@@ -115,21 +111,17 @@ static const struct rhashtable_params ipc_kht_params = {
  * Set up the sequence range to use for the ipc identifier range (limited
  * below IPCMNI) then initialise the keys hashtable and ids idr.
  */
-int ipc_init_ids(struct ipc_ids *ids)
+void ipc_init_ids(struct ipc_ids *ids)
 {
-	int err;
 	ids->in_use = 0;
 	ids->seq = 0;
 	init_rwsem(&ids->rwsem);
-	err = rhashtable_init(&ids->key_ht, &ipc_kht_params);
-	if (err)
-		return err;
+	rhashtable_init(&ids->key_ht, &ipc_kht_params);
 	idr_init(&ids->ipcs_idr);
 	ids->max_id = -1;
 #ifdef CONFIG_CHECKPOINT_RESTORE
 	ids->next_id = -1;
 #endif
-	return 0;
 }
 
 #ifdef CONFIG_PROC_FS
diff --git a/ipc/util.h b/ipc/util.h
index 0aba3230d007..65fad8a94da8 100644
--- a/ipc/util.h
+++ b/ipc/util.h
@@ -18,8 +18,8 @@
 #define IPCMNI 32768  /* <= MAX_INT limit for ipc arrays (including sysctl changes) */
 #define SEQ_MULTIPLIER	(IPCMNI)
 
-int sem_init(void);
-int msg_init(void);
+void sem_init(void);
+void msg_init(void);
 void shm_init(void);
 
 struct ipc_namespace;
@@ -34,17 +34,17 @@ static inline void mq_put_mnt(struct ipc_namespace *ns) { }
 #endif
 
 #ifdef CONFIG_SYSVIPC
-int sem_init_ns(struct ipc_namespace *ns);
-int msg_init_ns(struct ipc_namespace *ns);
-int shm_init_ns(struct ipc_namespace *ns);
+void sem_init_ns(struct ipc_namespace *ns);
+void msg_init_ns(struct ipc_namespace *ns);
+void shm_init_ns(struct ipc_namespace *ns);
 
 void sem_exit_ns(struct ipc_namespace *ns);
 void msg_exit_ns(struct ipc_namespace *ns);
 void shm_exit_ns(struct ipc_namespace *ns);
 #else
-static inline int sem_init_ns(struct ipc_namespace *ns) { return 0; }
-static inline int msg_init_ns(struct ipc_namespace *ns) { return 0; }
-static inline int shm_init_ns(struct ipc_namespace *ns) { return 0; }
+static inline void sem_init_ns(struct ipc_namespace *ns) { }
+static inline void msg_init_ns(struct ipc_namespace *ns) { }
+static inline void shm_init_ns(struct ipc_namespace *ns) { }
 
 static inline void sem_exit_ns(struct ipc_namespace *ns) { }
 static inline void msg_exit_ns(struct ipc_namespace *ns) { }
@@ -83,7 +83,7 @@ struct ipc_ops {
 struct seq_file;
 struct ipc_ids;
 
-int ipc_init_ids(struct ipc_ids *);
+void ipc_init_ids(struct ipc_ids *);
 #ifdef CONFIG_PROC_FS
 void __init ipc_init_proc_interface(const char *path, const char *header,
 		int ids, int (*show)(struct seq_file *, void *));
-- 
2.16.3

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 5/5] lib/test_rhashtable: rhashtable_init() can no longer fail
  2018-06-01 16:01 [PATCH -next v2 0/5] rhashtable: guarantee first allocation Davidlohr Bueso
                   ` (3 preceding siblings ...)
  2018-06-01 16:01 ` [PATCH 4/5] ipc: simplify ipc initialization Davidlohr Bueso
@ 2018-06-01 16:01 ` Davidlohr Bueso
  2018-06-02  4:42   ` Herbert Xu
  4 siblings, 1 reply; 15+ messages in thread
From: Davidlohr Bueso @ 2018-06-01 16:01 UTC (permalink / raw)
  To: akpm, torvalds
  Cc: tgraf, herbert, manfred, mhocko, guillaume.knispel, linux-api,
	linux-kernel, Davidlohr Bueso, Davidlohr Bueso

Update the test module as such.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 lib/test_rhashtable.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/lib/test_rhashtable.c b/lib/test_rhashtable.c
index f4000c137dbe..a894eb0407f0 100644
--- a/lib/test_rhashtable.c
+++ b/lib/test_rhashtable.c
@@ -182,11 +182,7 @@ static void test_bucket_stats(struct rhashtable *ht, unsigned int entries)
 	struct rhashtable_iter hti;
 	struct rhash_head *pos;
 
-	err = rhashtable_walk_init(ht, &hti, GFP_KERNEL);
-	if (err) {
-		pr_warn("Test failed: allocation error");
-		return;
-	}
+	rhashtable_walk_init(ht, &hti, GFP_KERNEL);
 
 	rhashtable_walk_start(&hti);
 
-- 
2.16.3

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/5] lib/rhashtable: convert param sanitations to WARN_ON
  2018-06-01 16:01 ` [PATCH 1/5] lib/rhashtable: convert param sanitations to WARN_ON Davidlohr Bueso
@ 2018-06-01 16:09   ` Herbert Xu
  2018-06-01 16:53     ` Davidlohr Bueso
  0 siblings, 1 reply; 15+ messages in thread
From: Herbert Xu @ 2018-06-01 16:09 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: akpm, torvalds, tgraf, manfred, mhocko, guillaume.knispel,
	linux-api, linux-kernel, Davidlohr Bueso

On Fri, Jun 01, 2018 at 09:01:21AM -0700, Davidlohr Bueso wrote:
> For the purpose of making rhashtable_init() unable to fail,
> we can replace the returning -EINVAL with WARN_ONs whenever
> the caller passes bogus parameters during initialization.
> 
> Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
> ---
>  lib/rhashtable.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/lib/rhashtable.c b/lib/rhashtable.c
> index 9427b5766134..05a4b1b8b8ce 100644
> --- a/lib/rhashtable.c
> +++ b/lib/rhashtable.c
> @@ -1024,12 +1024,11 @@ int rhashtable_init(struct rhashtable *ht,
>  
>  	size = HASH_DEFAULT_SIZE;
>  
> -	if ((!params->key_len && !params->obj_hashfn) ||
> -	    (params->obj_hashfn && !params->obj_cmpfn))
> -		return -EINVAL;
> +	WARN_ON((!params->key_len && !params->obj_hashfn) ||
> +		(params->obj_hashfn && !params->obj_cmpfn));
>  
> -	if (params->nulls_base && params->nulls_base < (1U << RHT_BASE_SHIFT))
> -		return -EINVAL;
> +	WARN_ON(params->nulls_base &&
> +		params->nulls_base < (1U << RHT_BASE_SHIFT));

I still don't like this.

Yes for your use-case you will never crash and a WARN_ON is fine.
However, rhashtable is used in all sorts of contexts and returning
an error makes sense for quite a number of them.

So if you really want just add the WARN_ON to your own code:

	err = rhashtable_init(...)
	WARN_ON(err);

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/5] lib/rhashtable: convert param sanitations to WARN_ON
  2018-06-01 16:09   ` Herbert Xu
@ 2018-06-01 16:53     ` Davidlohr Bueso
  2018-06-01 16:59       ` Herbert Xu
  0 siblings, 1 reply; 15+ messages in thread
From: Davidlohr Bueso @ 2018-06-01 16:53 UTC (permalink / raw)
  To: Herbert Xu
  Cc: akpm, torvalds, tgraf, manfred, mhocko, guillaume.knispel,
	linux-api, linux-kernel, Davidlohr Bueso

On Sat, 02 Jun 2018, Herbert Xu wrote:

>On Fri, Jun 01, 2018 at 09:01:21AM -0700, Davidlohr Bueso wrote:
>> For the purpose of making rhashtable_init() unable to fail,
>> we can replace the returning -EINVAL with WARN_ONs whenever
>> the caller passes bogus parameters during initialization.
>>
>> Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
>> ---
>>  lib/rhashtable.c | 9 ++++-----
>>  1 file changed, 4 insertions(+), 5 deletions(-)
>>
>> diff --git a/lib/rhashtable.c b/lib/rhashtable.c
>> index 9427b5766134..05a4b1b8b8ce 100644
>> --- a/lib/rhashtable.c
>> +++ b/lib/rhashtable.c
>> @@ -1024,12 +1024,11 @@ int rhashtable_init(struct rhashtable *ht,
>>
>>  	size = HASH_DEFAULT_SIZE;
>>
>> -	if ((!params->key_len && !params->obj_hashfn) ||
>> -	    (params->obj_hashfn && !params->obj_cmpfn))
>> -		return -EINVAL;
>> +	WARN_ON((!params->key_len && !params->obj_hashfn) ||
>> +		(params->obj_hashfn && !params->obj_cmpfn));
>>
>> -	if (params->nulls_base && params->nulls_base < (1U << RHT_BASE_SHIFT))
>> -		return -EINVAL;
>> +	WARN_ON(params->nulls_base &&
>> +		params->nulls_base < (1U << RHT_BASE_SHIFT));
>
>I still don't like this.
>
>Yes for your use-case you will never crash and a WARN_ON is fine.
>However, rhashtable is used in all sorts of contexts and returning
>an error makes sense for quite a number of them.

Curious, are these users setting up the param structure dynamically
or something that they can pass along bogus values?

If that's the case then yes, I definitely agree.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/5] lib/rhashtable: convert param sanitations to WARN_ON
  2018-06-01 16:53     ` Davidlohr Bueso
@ 2018-06-01 16:59       ` Herbert Xu
  2018-06-01 17:10         ` Davidlohr Bueso
  0 siblings, 1 reply; 15+ messages in thread
From: Herbert Xu @ 2018-06-01 16:59 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: akpm, torvalds, tgraf, manfred, mhocko, guillaume.knispel,
	linux-api, linux-kernel, Davidlohr Bueso

On Fri, Jun 01, 2018 at 09:53:47AM -0700, Davidlohr Bueso wrote:
>
> Curious, are these users setting up the param structure dynamically
> or something that they can pass along bogus values?
> 
> If that's the case then yes, I definitely agree.

It's just a quality of implementation issue.  This is a generic API.
Sure for early-boot users like yours it makes sense to just WARN_ON
rather than deal with the messy hash table allocation failure.

But for a driver author writing some kernel module it isn't nice
to WARN_ON and then crash on a NULL-pointer dereference when we
can cleanly fail the table init.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/5] lib/rhashtable: convert param sanitations to WARN_ON
  2018-06-01 16:59       ` Herbert Xu
@ 2018-06-01 17:10         ` Davidlohr Bueso
  0 siblings, 0 replies; 15+ messages in thread
From: Davidlohr Bueso @ 2018-06-01 17:10 UTC (permalink / raw)
  To: Herbert Xu
  Cc: akpm, torvalds, tgraf, manfred, mhocko, guillaume.knispel,
	linux-api, linux-kernel, Davidlohr Bueso

On Sat, 02 Jun 2018, Herbert Xu wrote:

>On Fri, Jun 01, 2018 at 09:53:47AM -0700, Davidlohr Bueso wrote:
>>
>> Curious, are these users setting up the param structure dynamically
>> or something that they can pass along bogus values?
>>
>> If that's the case then yes, I definitely agree.
>
>It's just a quality of implementation issue.  This is a generic API.
>Sure for early-boot users like yours it makes sense to just WARN_ON
>rather than deal with the messy hash table allocation failure.
>
>But for a driver author writing some kernel module it isn't nice
>to WARN_ON and then crash on a NULL-pointer dereference when we
>can cleanly fail the table init.

Fine, at least patch 2 applies without this one. So Andrew, if you
consider taking this series please drop patch 1 and 5 (which no
longer makes sense as rhashtable_init() won't be returning void
in the future).

If you want me to resend (assuming other issues are not pointed out),
I can do but I wanted to avoid spamming more the necessary.

Thanks,
Davidlohr

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/5] lib/rhashtable: guarantee initial hashtable allocation
  2018-06-01 16:01 ` [PATCH 2/5] lib/rhashtable: guarantee initial hashtable allocation Davidlohr Bueso
@ 2018-06-02  4:41   ` Herbert Xu
  2018-06-02  5:41     ` Davidlohr Bueso
  2018-06-04 11:46   ` Michal Hocko
  1 sibling, 1 reply; 15+ messages in thread
From: Herbert Xu @ 2018-06-02  4:41 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: akpm, torvalds, tgraf, manfred, mhocko, guillaume.knispel,
	linux-api, linux-kernel, Davidlohr Bueso

On Fri, Jun 01, 2018 at 09:01:22AM -0700, Davidlohr Bueso wrote:
>
> diff --git a/lib/rhashtable.c b/lib/rhashtable.c
> index 05a4b1b8b8ce..ae17da6f0c75 100644
> --- a/lib/rhashtable.c
> +++ b/lib/rhashtable.c
> @@ -175,7 +175,7 @@ static struct bucket_table *bucket_table_alloc(struct rhashtable *ht,
>  	int i;
>  
>  	size = sizeof(*tbl) + nbuckets * sizeof(tbl->buckets[0]);
> -	if (gfp != GFP_KERNEL)
> +	if ((gfp & ~__GFP_NOFAIL) != GFP_KERNEL)
>  		tbl = kzalloc(size, gfp | __GFP_NOWARN | __GFP_NORETRY);
>  	else
>  		tbl = kvzalloc(size, gfp);

There's another GFP_KERNEL check in this function that needs the
same treatment.

> @@ -1067,9 +1067,16 @@ int rhashtable_init(struct rhashtable *ht,
>  		}
>  	}
>  
> +	/*
> +	 * This is api initialization and thus we need to guarantee the
> +	 * initial rhashtable allocation. Upon failure, retry with the
> +	 * smallest possible size with __GFP_NOFAIL semantics.
> +	 */
>  	tbl = bucket_table_alloc(ht, size, GFP_KERNEL);
> -	if (tbl == NULL)
> -		return -ENOMEM;
> +	if (unlikely(tbl == NULL)) {
> +		size = min_t(u16, ht->p.min_size, HASH_MIN_SIZE);

You mean max_t?

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/5] lib/test_rhashtable: rhashtable_init() can no longer fail
  2018-06-01 16:01 ` [PATCH 5/5] lib/test_rhashtable: rhashtable_init() can no longer fail Davidlohr Bueso
@ 2018-06-02  4:42   ` Herbert Xu
  0 siblings, 0 replies; 15+ messages in thread
From: Herbert Xu @ 2018-06-02  4:42 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: akpm, torvalds, tgraf, manfred, mhocko, guillaume.knispel,
	linux-api, linux-kernel, Davidlohr Bueso

On Fri, Jun 01, 2018 at 09:01:25AM -0700, Davidlohr Bueso wrote:
> Update the test module as such.
> 
> Signed-off-by: Davidlohr Bueso <dbueso@suse.de>

Please drop this patch.

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/5] lib/rhashtable: guarantee initial hashtable allocation
  2018-06-02  4:41   ` Herbert Xu
@ 2018-06-02  5:41     ` Davidlohr Bueso
  2018-06-02 15:53       ` Herbert Xu
  0 siblings, 1 reply; 15+ messages in thread
From: Davidlohr Bueso @ 2018-06-02  5:41 UTC (permalink / raw)
  To: Herbert Xu
  Cc: akpm, torvalds, tgraf, manfred, mhocko, guillaume.knispel,
	linux-api, linux-kernel, Davidlohr Bueso

On Sat, 02 Jun 2018, Herbert Xu wrote:
>>  	tbl = bucket_table_alloc(ht, size, GFP_KERNEL);
>> -	if (tbl == NULL)
>> -		return -ENOMEM;
>> +	if (unlikely(tbl == NULL)) {
>> +		size = min_t(u16, ht->p.min_size, HASH_MIN_SIZE);
>
>You mean max_t?

Not really. I considered some of the users to set quite a large min_size
(such as 1024 buckets). The min() makes sense to me in that it's the smallest
possible value. If memory later becomes available and the hashtable is resized
to a more appropriate value, couldn't any issues regarding collisions not be dealt
with organically? And we've agreed that allocating a tiny table is the
least of our problems.

Thanks,
Davidlohr

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/5] lib/rhashtable: guarantee initial hashtable allocation
  2018-06-02  5:41     ` Davidlohr Bueso
@ 2018-06-02 15:53       ` Herbert Xu
  0 siblings, 0 replies; 15+ messages in thread
From: Herbert Xu @ 2018-06-02 15:53 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: akpm, torvalds, tgraf, manfred, mhocko, guillaume.knispel,
	linux-api, linux-kernel, Davidlohr Bueso

On Fri, Jun 01, 2018 at 10:41:31PM -0700, Davidlohr Bueso wrote:
> On Sat, 02 Jun 2018, Herbert Xu wrote:
> > >  	tbl = bucket_table_alloc(ht, size, GFP_KERNEL);
> > > -	if (tbl == NULL)
> > > -		return -ENOMEM;
> > > +	if (unlikely(tbl == NULL)) {
> > > +		size = min_t(u16, ht->p.min_size, HASH_MIN_SIZE);
> > 
> > You mean max_t?
> 
> Not really. I considered some of the users to set quite a large min_size
> (such as 1024 buckets). The min() makes sense to me in that it's the smallest
> possible value. If memory later becomes available and the hashtable is resized
> to a more appropriate value, couldn't any issues regarding collisions not be dealt
> with organically? And we've agreed that allocating a tiny table is the
> least of our problems.

Huh? The min_size is a floor for the hash table size.  Some users
may need it because they cannot tolerate the insert-time allocation
or failure.

Your use of min_t against min_size makes absolutely no sense.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/5] lib/rhashtable: guarantee initial hashtable allocation
  2018-06-01 16:01 ` [PATCH 2/5] lib/rhashtable: guarantee initial hashtable allocation Davidlohr Bueso
  2018-06-02  4:41   ` Herbert Xu
@ 2018-06-04 11:46   ` Michal Hocko
  1 sibling, 0 replies; 15+ messages in thread
From: Michal Hocko @ 2018-06-04 11:46 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: akpm, torvalds, tgraf, herbert, manfred, guillaume.knispel,
	linux-api, linux-kernel, Davidlohr Bueso

On Fri 01-06-18 09:01:22, Davidlohr Bueso wrote:
[...]
> diff --git a/lib/rhashtable.c b/lib/rhashtable.c
> index 05a4b1b8b8ce..ae17da6f0c75 100644
> --- a/lib/rhashtable.c
> +++ b/lib/rhashtable.c
> @@ -175,7 +175,7 @@ static struct bucket_table *bucket_table_alloc(struct rhashtable *ht,
>  	int i;
>  
>  	size = sizeof(*tbl) + nbuckets * sizeof(tbl->buckets[0]);
> -	if (gfp != GFP_KERNEL)
> +	if ((gfp & ~__GFP_NOFAIL) != GFP_KERNEL)
>  		tbl = kzalloc(size, gfp | __GFP_NOWARN | __GFP_NORETRY);
>  	else
>  		tbl = kvzalloc(size, gfp);

Just a heads up. This is not longer needed after
http://lkml.kernel.org/r/20180601115329.27807-1-mhocko@kernel.org
is applied. Moreover it will conflict at that spot.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2018-06-04 11:46 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-01 16:01 [PATCH -next v2 0/5] rhashtable: guarantee first allocation Davidlohr Bueso
2018-06-01 16:01 ` [PATCH 1/5] lib/rhashtable: convert param sanitations to WARN_ON Davidlohr Bueso
2018-06-01 16:09   ` Herbert Xu
2018-06-01 16:53     ` Davidlohr Bueso
2018-06-01 16:59       ` Herbert Xu
2018-06-01 17:10         ` Davidlohr Bueso
2018-06-01 16:01 ` [PATCH 2/5] lib/rhashtable: guarantee initial hashtable allocation Davidlohr Bueso
2018-06-02  4:41   ` Herbert Xu
2018-06-02  5:41     ` Davidlohr Bueso
2018-06-02 15:53       ` Herbert Xu
2018-06-04 11:46   ` Michal Hocko
2018-06-01 16:01 ` [PATCH 3/5] ipc: get rid of ids->tables_initialized hack Davidlohr Bueso
2018-06-01 16:01 ` [PATCH 4/5] ipc: simplify ipc initialization Davidlohr Bueso
2018-06-01 16:01 ` [PATCH 5/5] lib/test_rhashtable: rhashtable_init() can no longer fail Davidlohr Bueso
2018-06-02  4:42   ` Herbert Xu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.