linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] ipc subsystem refcounter conversions
@ 2017-02-20 11:29 Elena Reshetova
  2017-02-20 11:29 ` [PATCH 1/3] ipc: convert ipc_namespace.count from atomic_t to refcount_t Elena Reshetova
                   ` (5 more replies)
  0 siblings, 6 replies; 43+ messages in thread
From: Elena Reshetova @ 2017-02-20 11:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: peterz, gregkh, akpm, ebiederm, mingo, adobriyan, serge,
	arozansk, dave, Elena Reshetova

Now when new refcount_t type and API are finally merged
(see include/linux/refcount.h), the following
patches convert various refcounters in the ipc susystem from atomic_t
to refcount_t. By doing this we prevent intentional or accidental
underflows or overflows that can led to use-after-free vulnerabilities.

The below patches are fully independent and can be cherry-picked separately.
Since we convert all kernel subsystems in the same fashion, resulting
in about 300 patches, we have to group them for sending at least in some
fashion to be manageable. Please excuse the long cc list.

Elena Reshetova (3):
  ipc: convert ipc_namespace.count from atomic_t to refcount_t
  ipc: convert sem_undo_list.refcnt from atomic_t to refcount_t
  ipc: convert ipc_rcu.refcount from atomic_t to refcount_t

 include/linux/ipc_namespace.h | 5 +++--
 ipc/msgutil.c                 | 2 +-
 ipc/namespace.c               | 4 ++--
 ipc/sem.c                     | 8 ++++----
 ipc/util.c                    | 6 +++---
 ipc/util.h                    | 3 ++-
 6 files changed, 15 insertions(+), 13 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH 1/3] ipc: convert ipc_namespace.count from atomic_t to refcount_t
  2017-02-20 11:29 [PATCH 0/3] ipc subsystem refcounter conversions Elena Reshetova
@ 2017-02-20 11:29 ` Elena Reshetova
  2017-05-27 19:41   ` Kees Cook
  2017-02-20 11:29 ` [PATCH 2/3] ipc: convert sem_undo_list.refcnt " Elena Reshetova
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 43+ messages in thread
From: Elena Reshetova @ 2017-02-20 11:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: peterz, gregkh, akpm, ebiederm, mingo, adobriyan, serge,
	arozansk, dave, Elena Reshetova, Hans Liljestrand, Kees Cook,
	David Windsor

refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
---
 include/linux/ipc_namespace.h | 5 +++--
 ipc/msgutil.c                 | 2 +-
 ipc/namespace.c               | 4 ++--
 3 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/include/linux/ipc_namespace.h b/include/linux/ipc_namespace.h
index 848e579..7230638 100644
--- a/include/linux/ipc_namespace.h
+++ b/include/linux/ipc_namespace.h
@@ -7,6 +7,7 @@
 #include <linux/notifier.h>
 #include <linux/nsproxy.h>
 #include <linux/ns_common.h>
+#include <linux/refcount.h>
 
 struct user_namespace;
 
@@ -19,7 +20,7 @@ struct ipc_ids {
 };
 
 struct ipc_namespace {
-	atomic_t	count;
+	refcount_t	count;
 	struct ipc_ids	ids[3];
 
 	int		sem_ctls[4];
@@ -118,7 +119,7 @@ extern struct ipc_namespace *copy_ipcs(unsigned long flags,
 static inline struct ipc_namespace *get_ipc_ns(struct ipc_namespace *ns)
 {
 	if (ns)
-		atomic_inc(&ns->count);
+		refcount_inc(&ns->count);
 	return ns;
 }
 
diff --git a/ipc/msgutil.c b/ipc/msgutil.c
index bf74eaa..8459802 100644
--- a/ipc/msgutil.c
+++ b/ipc/msgutil.c
@@ -29,7 +29,7 @@ DEFINE_SPINLOCK(mq_lock);
  * and not CONFIG_IPC_NS.
  */
 struct ipc_namespace init_ipc_ns = {
-	.count		= ATOMIC_INIT(1),
+	.count		= REFCOUNT_INIT(1),
 	.user_ns = &init_user_ns,
 	.ns.inum = PROC_IPC_INIT_INO,
 #ifdef CONFIG_IPC_NS
diff --git a/ipc/namespace.c b/ipc/namespace.c
index 0abdea4..ed10bbc 100644
--- a/ipc/namespace.c
+++ b/ipc/namespace.c
@@ -48,7 +48,7 @@ static struct ipc_namespace *create_ipc_ns(struct user_namespace *user_ns,
 		goto fail_free;
 	ns->ns.ops = &ipcns_operations;
 
-	atomic_set(&ns->count, 1);
+	refcount_set(&ns->count, 1);
 	ns->user_ns = get_user_ns(user_ns);
 	ns->ucounts = ucounts;
 
@@ -142,7 +142,7 @@ static void free_ipc_ns(struct ipc_namespace *ns)
  */
 void put_ipc_ns(struct ipc_namespace *ns)
 {
-	if (atomic_dec_and_lock(&ns->count, &mq_lock)) {
+	if (refcount_dec_and_lock(&ns->count, &mq_lock)) {
 		mq_clear_sbinfo(ns);
 		spin_unlock(&mq_lock);
 		mq_put_mnt(ns);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 2/3] ipc: convert sem_undo_list.refcnt from atomic_t to refcount_t
  2017-02-20 11:29 [PATCH 0/3] ipc subsystem refcounter conversions Elena Reshetova
  2017-02-20 11:29 ` [PATCH 1/3] ipc: convert ipc_namespace.count from atomic_t to refcount_t Elena Reshetova
@ 2017-02-20 11:29 ` Elena Reshetova
  2017-05-27 19:44   ` Kees Cook
  2017-02-20 11:29 ` [PATCH 3/3] ipc: convert ipc_rcu.refcount " Elena Reshetova
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 43+ messages in thread
From: Elena Reshetova @ 2017-02-20 11:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: peterz, gregkh, akpm, ebiederm, mingo, adobriyan, serge,
	arozansk, dave, Elena Reshetova, Hans Liljestrand, Kees Cook,
	David Windsor

refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
---
 ipc/sem.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/ipc/sem.c b/ipc/sem.c
index e468cd1..9063ffa 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -139,7 +139,7 @@ struct sem_undo {
  * that may be shared among all a CLONE_SYSVSEM task group.
  */
 struct sem_undo_list {
-	atomic_t		refcnt;
+	refcount_t		refcnt;
 	spinlock_t		lock;
 	struct list_head	list_proc;
 };
@@ -1646,7 +1646,7 @@ static inline int get_undo_list(struct sem_undo_list **undo_listp)
 		if (undo_list == NULL)
 			return -ENOMEM;
 		spin_lock_init(&undo_list->lock);
-		atomic_set(&undo_list->refcnt, 1);
+		refcount_set(&undo_list->refcnt, 1);
 		INIT_LIST_HEAD(&undo_list->list_proc);
 
 		current->sysvsem.undo_list = undo_list;
@@ -2045,7 +2045,7 @@ int copy_semundo(unsigned long clone_flags, struct task_struct *tsk)
 		error = get_undo_list(&undo_list);
 		if (error)
 			return error;
-		atomic_inc(&undo_list->refcnt);
+		refcount_inc(&undo_list->refcnt);
 		tsk->sysvsem.undo_list = undo_list;
 	} else
 		tsk->sysvsem.undo_list = NULL;
@@ -2074,7 +2074,7 @@ void exit_sem(struct task_struct *tsk)
 		return;
 	tsk->sysvsem.undo_list = NULL;
 
-	if (!atomic_dec_and_test(&ulp->refcnt))
+	if (!refcount_dec_and_test(&ulp->refcnt))
 		return;
 
 	for (;;) {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 3/3] ipc: convert ipc_rcu.refcount from atomic_t to refcount_t
  2017-02-20 11:29 [PATCH 0/3] ipc subsystem refcounter conversions Elena Reshetova
  2017-02-20 11:29 ` [PATCH 1/3] ipc: convert ipc_namespace.count from atomic_t to refcount_t Elena Reshetova
  2017-02-20 11:29 ` [PATCH 2/3] ipc: convert sem_undo_list.refcnt " Elena Reshetova
@ 2017-02-20 11:29 ` Elena Reshetova
  2017-05-27 19:47   ` Kees Cook
  2017-02-20 11:42 ` [PATCH 0/3] ipc subsystem refcounter conversions Andy Shevchenko
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 43+ messages in thread
From: Elena Reshetova @ 2017-02-20 11:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: peterz, gregkh, akpm, ebiederm, mingo, adobriyan, serge,
	arozansk, dave, Elena Reshetova, Hans Liljestrand, Kees Cook,
	David Windsor

refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
---
 ipc/util.c | 6 +++---
 ipc/util.h | 3 ++-
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/ipc/util.c b/ipc/util.c
index 798cad1..24484a6 100644
--- a/ipc/util.c
+++ b/ipc/util.c
@@ -437,7 +437,7 @@ void *ipc_rcu_alloc(int size)
 	struct ipc_rcu *out = ipc_alloc(sizeof(struct ipc_rcu) + size);
 	if (unlikely(!out))
 		return NULL;
-	atomic_set(&out->refcount, 1);
+	refcount_set(&out->refcount, 1);
 	return out + 1;
 }
 
@@ -445,14 +445,14 @@ int ipc_rcu_getref(void *ptr)
 {
 	struct ipc_rcu *p = ((struct ipc_rcu *)ptr) - 1;
 
-	return atomic_inc_not_zero(&p->refcount);
+	return refcount_inc_not_zero(&p->refcount);
 }
 
 void ipc_rcu_putref(void *ptr, void (*func)(struct rcu_head *head))
 {
 	struct ipc_rcu *p = ((struct ipc_rcu *)ptr) - 1;
 
-	if (!atomic_dec_and_test(&p->refcount))
+	if (!refcount_dec_and_test(&p->refcount))
 		return;
 
 	call_rcu(&p->rcu, func);
diff --git a/ipc/util.h b/ipc/util.h
index 51f7ca5..274ec9b 100644
--- a/ipc/util.h
+++ b/ipc/util.h
@@ -12,6 +12,7 @@
 
 #include <linux/unistd.h>
 #include <linux/err.h>
+#include <linux/refcount.h>
 
 #define SEQ_MULTIPLIER	(IPCMNI)
 
@@ -49,7 +50,7 @@ static inline void shm_exit_ns(struct ipc_namespace *ns) { }
 
 struct ipc_rcu {
 	struct rcu_head rcu;
-	atomic_t refcount;
+	refcount_t refcount;
 } ____cacheline_aligned_in_smp;
 
 #define ipc_rcu_to_struct(p)  ((void *)(p+1))
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH 0/3] ipc subsystem refcounter conversions
  2017-02-20 11:29 [PATCH 0/3] ipc subsystem refcounter conversions Elena Reshetova
                   ` (2 preceding siblings ...)
  2017-02-20 11:29 ` [PATCH 3/3] ipc: convert ipc_rcu.refcount " Elena Reshetova
@ 2017-02-20 11:42 ` Andy Shevchenko
  2017-02-20 12:30   ` Reshetova, Elena
  2017-02-22 15:41 ` Davidlohr Bueso
  2017-03-04  0:23 ` Andrew Morton
  5 siblings, 1 reply; 43+ messages in thread
From: Andy Shevchenko @ 2017-02-20 11:42 UTC (permalink / raw)
  To: Elena Reshetova
  Cc: linux-kernel, Peter Zijlstra (Intel),
	Greg Kroah-Hartman, Andrew Morton, ebiederm, Ingo Molnar,
	Alexey Dobriyan, Serge E. Hallyn, arozansk, dave

On Mon, Feb 20, 2017 at 1:29 PM, Elena Reshetova
<elena.reshetova@intel.com> wrote:
> Now when new refcount_t type and API are finally merged
> (see include/linux/refcount.h), the following
> patches convert various refcounters in the ipc susystem from atomic_t
> to refcount_t. By doing this we prevent intentional or accidental
> underflows or overflows that can led to use-after-free vulnerabilities.
>
> The below patches are fully independent and can be cherry-picked separately.
> Since we convert all kernel subsystems in the same fashion, resulting
> in about 300 patches, we have to group them for sending at least in some
> fashion to be manageable. Please excuse the long cc list.

Is that done using coccinelle?

Can I see the semantic patch (sorry if I missed it earlier)?

-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 43+ messages in thread

* RE: [PATCH 0/3] ipc subsystem refcounter conversions
  2017-02-20 11:42 ` [PATCH 0/3] ipc subsystem refcounter conversions Andy Shevchenko
@ 2017-02-20 12:30   ` Reshetova, Elena
  0 siblings, 0 replies; 43+ messages in thread
From: Reshetova, Elena @ 2017-02-20 12:30 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: linux-kernel, Peter Zijlstra (Intel),
	Greg Kroah-Hartman, Andrew Morton, ebiederm, Ingo Molnar,
	Alexey Dobriyan, Serge E. Hallyn, arozansk, dave

[-- Attachment #1: Type: text/plain, Size: 1246 bytes --]

> On Mon, Feb 20, 2017 at 1:29 PM, Elena Reshetova
> <elena.reshetova@intel.com> wrote:
> > Now when new refcount_t type and API are finally merged
> > (see include/linux/refcount.h), the following
> > patches convert various refcounters in the ipc susystem from atomic_t
> > to refcount_t. By doing this we prevent intentional or accidental
> > underflows or overflows that can led to use-after-free vulnerabilities.
> >
> > The below patches are fully independent and can be cherry-picked separately.
> > Since we convert all kernel subsystems in the same fashion, resulting
> > in about 300 patches, we have to group them for sending at least in some
> > fashion to be manageable. Please excuse the long cc list.
> 
> Is that done using coccinelle?

Yes and no. 
The *finding* of cases that should be converted was done using coccinelle, but actual conversion was done manually for each case and not via semantic patch. 
There were many false-positives and all kind of other issues, so we had to analyse each variable separately to the extend we understand the code.  

> 
> Can I see the semantic patch (sorry if I missed it earlier)?

Attached is the one we used to initially find variables. 

Best Regards,
Elena.

[-- Attachment #2: atomic_as_refount.cocci --]
[-- Type: application/octet-stream, Size: 1339 bytes --]

@r1 exists@
identifier a, x, y;
position p1, p2;
identifier fname =~ ".*free.*";
identifier fname2 =~ ".*destroy.*";
identifier fname3 =~ ".*del.*";
identifier fname4 =~ ".*queue_work.*";
identifier fname5 =~ ".*schedule_work.*";
identifier fname6 =~ ".*call_rcu.*";


@@

(
 atomic_dec_and_test@p1(&(a)->x)
|
 atomic_dec_and_lock@p1(&(a)->x, ...)
|
 atomic_long_dec_and_lock@p1(&(a)->x, ...)
|
 atomic_long_dec_and_test@p1(&(a)->x)
|
 atomic64_dec_and_test@p1(&(a)->x)
|
 local_dec_and_test@p1(&(a)->x)
)
...
(
 fname@p2(a, ...);
|
 fname2@p2(...);
|
 fname3@p2(...);
|
 fname4@p2(...);
|
 fname5@p2(...);
|
 fname6@p2(...);
)


@script:python@
p1 << r1.p1;
p2 << r1.p2;
@@

print "* file: %s atomic_dec_and_test variation %s before object free %s" % (p1[0].file,p1[0].line,p2[0].line)

@r2 exists@
identifier a, x;
position p1;
@@

(
atomic_add_unless(&(a)->x,-1,1)@p1
|
atomic_long_add_unless(&(a)->x,-1,1)@p1
|
atomic64_add_unless(&(a)->x,-1,1)@p1
)

@script:python@
p1 << r2.p1;
@@

print "* file: %s atomic_add_unless %s" % (p1[0].file,p1[0].line)

@r3 exists@
identifier a, x;
position p1, p2;
@@

(
x = atomic_add_return@p1(-1, ...);
|
x = atomic_long_add_return@p1(-1, ...);
|
x = atomic64_add_return@p1(-1, ...);
)

@script:python@
p1 << r3.p1;
@@

print "* file: %s x = atomic_add_return(-1, ...) %s" % (p1[0].file,p1[0].line)


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 0/3] ipc subsystem refcounter conversions
  2017-02-20 11:29 [PATCH 0/3] ipc subsystem refcounter conversions Elena Reshetova
                   ` (3 preceding siblings ...)
  2017-02-20 11:42 ` [PATCH 0/3] ipc subsystem refcounter conversions Andy Shevchenko
@ 2017-02-22 15:41 ` Davidlohr Bueso
  2017-03-04  0:23 ` Andrew Morton
  5 siblings, 0 replies; 43+ messages in thread
From: Davidlohr Bueso @ 2017-02-22 15:41 UTC (permalink / raw)
  To: Elena Reshetova
  Cc: linux-kernel, peterz, gregkh, akpm, ebiederm, mingo, adobriyan,
	serge, arozansk

On Mon, 20 Feb 2017, Elena Reshetova wrote:

> include/linux/ipc_namespace.h | 5 +++--
> ipc/msgutil.c                 | 2 +-
> ipc/namespace.c               | 4 ++--
> ipc/sem.c                     | 8 ++++----
> ipc/util.c                    | 6 +++---
> ipc/util.h                    | 3 ++-
> 6 files changed, 15 insertions(+), 13 deletions(-)

The SoB list is a bit weird... otherwise, the conversion
obviously makes sense:

Acked-by: Davidlohr Bueso <dave@stgolabs.net>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 0/3] ipc subsystem refcounter conversions
  2017-02-20 11:29 [PATCH 0/3] ipc subsystem refcounter conversions Elena Reshetova
                   ` (4 preceding siblings ...)
  2017-02-22 15:41 ` Davidlohr Bueso
@ 2017-03-04  0:23 ` Andrew Morton
  2017-03-06  9:51   ` Reshetova, Elena
  2017-05-27 19:58   ` Kees Cook
  5 siblings, 2 replies; 43+ messages in thread
From: Andrew Morton @ 2017-03-04  0:23 UTC (permalink / raw)
  To: Elena Reshetova
  Cc: linux-kernel, peterz, gregkh, ebiederm, mingo, adobriyan, serge,
	arozansk, dave

On Mon, 20 Feb 2017 13:29:46 +0200 Elena Reshetova <elena.reshetova@intel.com> wrote:

> Now when new refcount_t type and API are finally merged
> (see include/linux/refcount.h), the following
> patches convert various refcounters in the ipc susystem from atomic_t
> to refcount_t. By doing this we prevent intentional or accidental
> underflows or overflows that can led to use-after-free vulnerabilities.
> 
> The below patches are fully independent and can be cherry-picked separately.
> Since we convert all kernel subsystems in the same fashion, resulting
> in about 300 patches, we have to group them for sending at least in some
> fashion to be manageable. Please excuse the long cc list.

Again, the refcount_t operations are much more expensive than the bare
atomic_t operations.  I'm reluctant to merge any of these conversions
without either

a) a convincing demonstration that the performance impact is
   sufficiently small (ie: unmeasurable) or

b) a compile-time option to disable the refcount_t operations (make
   them generate the same code as the bare atomic_t ops).  Along with
   some suitably reliable means of preventing people from accidentally
   enabling the debug code in production builds.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* RE: [PATCH 0/3] ipc subsystem refcounter conversions
  2017-03-04  0:23 ` Andrew Morton
@ 2017-03-06  9:51   ` Reshetova, Elena
  2017-05-27 19:58   ` Kees Cook
  1 sibling, 0 replies; 43+ messages in thread
From: Reshetova, Elena @ 2017-03-06  9:51 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, peterz, gregkh, ebiederm, mingo, adobriyan, serge,
	arozansk, dave

> On Mon, 20 Feb 2017 13:29:46 +0200 Elena Reshetova
> <elena.reshetova@intel.com> wrote:
> 
> > Now when new refcount_t type and API are finally merged
> > (see include/linux/refcount.h), the following
> > patches convert various refcounters in the ipc susystem from atomic_t
> > to refcount_t. By doing this we prevent intentional or accidental
> > underflows or overflows that can led to use-after-free vulnerabilities.
> >
> > The below patches are fully independent and can be cherry-picked separately.
> > Since we convert all kernel subsystems in the same fashion, resulting
> > in about 300 patches, we have to group them for sending at least in some
> > fashion to be manageable. Please excuse the long cc list.
> 
> Again, the refcount_t operations are much more expensive than the bare
> atomic_t operations.  I'm reluctant to merge any of these conversions
> without either
> 
> a) a convincing demonstration that the performance impact is
>    sufficiently small (ie: unmeasurable) or

Unfortunately pretty much any security options would have a performance impact. 
However, I think it would depend greatly on the actual usage of refcounter, so instead of providing you some generic numbers, I guess we would need to measure it by subsystem. 
Could you please suggest what would be the reasonable test setup/load for testing the ipc impact? 

 
> b) a compile-time option to disable the refcount_t operations (make
>    them generate the same code as the bare atomic_t ops). 

I don't think a compile option to disable refcount operations makes sense, because we already have atomic. 
For places that can't take a performance impact of refcount_t we just have to leave them as atomic_t and don't overcomplicate things IMO.

Along with
>    some suitably reliable means of preventing people from accidentally
>    enabling the debug code in production builds.

Hm... Currently it only does WARN in super rare cases when things don't go well at all (overflow, underflow, increment from zero).
It isn't really debug output anymore in a way that such things need to be detected as early as possible, because they mean bugs. 


Best Regards,
Elena.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 1/3] ipc: convert ipc_namespace.count from atomic_t to refcount_t
  2017-02-20 11:29 ` [PATCH 1/3] ipc: convert ipc_namespace.count from atomic_t to refcount_t Elena Reshetova
@ 2017-05-27 19:41   ` Kees Cook
  2017-05-28 12:10     ` Manfred Spraul
  0 siblings, 1 reply; 43+ messages in thread
From: Kees Cook @ 2017-05-27 19:41 UTC (permalink / raw)
  To: Andrew Morton, Manfred Spraul
  Cc: LKML, Elena Reshetova, Peter Zijlstra, Greg KH,
	Eric W. Biederman, Ingo Molnar, Alexey Dobriyan, Serge E. Hallyn,
	arozansk, Davidlohr Bueso, Hans Liljestrand, David Windsor

On Mon, Feb 20, 2017 at 3:29 AM, Elena Reshetova
<elena.reshetova@intel.com> wrote:
> refcount_t type and corresponding API should be
> used instead of atomic_t when the variable is used as
> a reference counter. This allows to avoid accidental
> refcounter overflows that might lead to use-after-free
> situations.
>
> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
> Signed-off-by: Kees Cook <keescook@chromium.org>
> Signed-off-by: David Windsor <dwindsor@gmail.com>

Manfred, is this okay by you? I think this should go via -mm...

-Kees

> ---
>  include/linux/ipc_namespace.h | 5 +++--
>  ipc/msgutil.c                 | 2 +-
>  ipc/namespace.c               | 4 ++--
>  3 files changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/ipc_namespace.h b/include/linux/ipc_namespace.h
> index 848e579..7230638 100644
> --- a/include/linux/ipc_namespace.h
> +++ b/include/linux/ipc_namespace.h
> @@ -7,6 +7,7 @@
>  #include <linux/notifier.h>
>  #include <linux/nsproxy.h>
>  #include <linux/ns_common.h>
> +#include <linux/refcount.h>
>
>  struct user_namespace;
>
> @@ -19,7 +20,7 @@ struct ipc_ids {
>  };
>
>  struct ipc_namespace {
> -       atomic_t        count;
> +       refcount_t      count;
>         struct ipc_ids  ids[3];
>
>         int             sem_ctls[4];
> @@ -118,7 +119,7 @@ extern struct ipc_namespace *copy_ipcs(unsigned long flags,
>  static inline struct ipc_namespace *get_ipc_ns(struct ipc_namespace *ns)
>  {
>         if (ns)
> -               atomic_inc(&ns->count);
> +               refcount_inc(&ns->count);
>         return ns;
>  }
>
> diff --git a/ipc/msgutil.c b/ipc/msgutil.c
> index bf74eaa..8459802 100644
> --- a/ipc/msgutil.c
> +++ b/ipc/msgutil.c
> @@ -29,7 +29,7 @@ DEFINE_SPINLOCK(mq_lock);
>   * and not CONFIG_IPC_NS.
>   */
>  struct ipc_namespace init_ipc_ns = {
> -       .count          = ATOMIC_INIT(1),
> +       .count          = REFCOUNT_INIT(1),
>         .user_ns = &init_user_ns,
>         .ns.inum = PROC_IPC_INIT_INO,
>  #ifdef CONFIG_IPC_NS
> diff --git a/ipc/namespace.c b/ipc/namespace.c
> index 0abdea4..ed10bbc 100644
> --- a/ipc/namespace.c
> +++ b/ipc/namespace.c
> @@ -48,7 +48,7 @@ static struct ipc_namespace *create_ipc_ns(struct user_namespace *user_ns,
>                 goto fail_free;
>         ns->ns.ops = &ipcns_operations;
>
> -       atomic_set(&ns->count, 1);
> +       refcount_set(&ns->count, 1);
>         ns->user_ns = get_user_ns(user_ns);
>         ns->ucounts = ucounts;
>
> @@ -142,7 +142,7 @@ static void free_ipc_ns(struct ipc_namespace *ns)
>   */
>  void put_ipc_ns(struct ipc_namespace *ns)
>  {
> -       if (atomic_dec_and_lock(&ns->count, &mq_lock)) {
> +       if (refcount_dec_and_lock(&ns->count, &mq_lock)) {
>                 mq_clear_sbinfo(ns);
>                 spin_unlock(&mq_lock);
>                 mq_put_mnt(ns);
> --
> 2.7.4
>



-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 2/3] ipc: convert sem_undo_list.refcnt from atomic_t to refcount_t
  2017-02-20 11:29 ` [PATCH 2/3] ipc: convert sem_undo_list.refcnt " Elena Reshetova
@ 2017-05-27 19:44   ` Kees Cook
  0 siblings, 0 replies; 43+ messages in thread
From: Kees Cook @ 2017-05-27 19:44 UTC (permalink / raw)
  To: Manfred Spraul, Andrew Morton
  Cc: LKML, Elena Reshetova, Peter Zijlstra, Greg KH,
	Eric W. Biederman, Ingo Molnar, Alexey Dobriyan, Serge E. Hallyn,
	arozansk, Davidlohr Bueso, Hans Liljestrand, David Windsor

On Mon, Feb 20, 2017 at 3:29 AM, Elena Reshetova
<elena.reshetova@intel.com> wrote:
> refcount_t type and corresponding API should be
> used instead of atomic_t when the variable is used as
> a reference counter. This allows to avoid accidental
> refcounter overflows that might lead to use-after-free
> situations.
>
> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
> Signed-off-by: Kees Cook <keescook@chromium.org>
> Signed-off-by: David Windsor <dwindsor@gmail.com>

Here's the 2/3 that should go via -mm, I think too.

-Kees

> ---
>  ipc/sem.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/ipc/sem.c b/ipc/sem.c
> index e468cd1..9063ffa 100644
> --- a/ipc/sem.c
> +++ b/ipc/sem.c
> @@ -139,7 +139,7 @@ struct sem_undo {
>   * that may be shared among all a CLONE_SYSVSEM task group.
>   */
>  struct sem_undo_list {
> -       atomic_t                refcnt;
> +       refcount_t              refcnt;
>         spinlock_t              lock;
>         struct list_head        list_proc;
>  };
> @@ -1646,7 +1646,7 @@ static inline int get_undo_list(struct sem_undo_list **undo_listp)
>                 if (undo_list == NULL)
>                         return -ENOMEM;
>                 spin_lock_init(&undo_list->lock);
> -               atomic_set(&undo_list->refcnt, 1);
> +               refcount_set(&undo_list->refcnt, 1);
>                 INIT_LIST_HEAD(&undo_list->list_proc);
>
>                 current->sysvsem.undo_list = undo_list;
> @@ -2045,7 +2045,7 @@ int copy_semundo(unsigned long clone_flags, struct task_struct *tsk)
>                 error = get_undo_list(&undo_list);
>                 if (error)
>                         return error;
> -               atomic_inc(&undo_list->refcnt);
> +               refcount_inc(&undo_list->refcnt);
>                 tsk->sysvsem.undo_list = undo_list;
>         } else
>                 tsk->sysvsem.undo_list = NULL;
> @@ -2074,7 +2074,7 @@ void exit_sem(struct task_struct *tsk)
>                 return;
>         tsk->sysvsem.undo_list = NULL;
>
> -       if (!atomic_dec_and_test(&ulp->refcnt))
> +       if (!refcount_dec_and_test(&ulp->refcnt))
>                 return;
>
>         for (;;) {
> --
> 2.7.4
>



-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 3/3] ipc: convert ipc_rcu.refcount from atomic_t to refcount_t
  2017-02-20 11:29 ` [PATCH 3/3] ipc: convert ipc_rcu.refcount " Elena Reshetova
@ 2017-05-27 19:47   ` Kees Cook
  0 siblings, 0 replies; 43+ messages in thread
From: Kees Cook @ 2017-05-27 19:47 UTC (permalink / raw)
  To: Elena Reshetova, Manfred Spraul
  Cc: LKML, Peter Zijlstra, Greg KH, Andrew Morton, Eric W. Biederman,
	Ingo Molnar, Alexey Dobriyan, Serge E. Hallyn, arozansk,
	Davidlohr Bueso, Hans Liljestrand, David Windsor

On Mon, Feb 20, 2017 at 3:29 AM, Elena Reshetova
<elena.reshetova@intel.com> wrote:
> refcount_t type and corresponding API should be
> used instead of atomic_t when the variable is used as
> a reference counter. This allows to avoid accidental
> refcounter overflows that might lead to use-after-free
> situations.
>
> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
> Signed-off-by: Kees Cook <keescook@chromium.org>
> Signed-off-by: David Windsor <dwindsor@gmail.com>

After the IPC RCU refactoring, this needs rework... I'll send a new patch...

-Kees

> ---
>  ipc/util.c | 6 +++---
>  ipc/util.h | 3 ++-
>  2 files changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/ipc/util.c b/ipc/util.c
> index 798cad1..24484a6 100644
> --- a/ipc/util.c
> +++ b/ipc/util.c
> @@ -437,7 +437,7 @@ void *ipc_rcu_alloc(int size)
>         struct ipc_rcu *out = ipc_alloc(sizeof(struct ipc_rcu) + size);
>         if (unlikely(!out))
>                 return NULL;
> -       atomic_set(&out->refcount, 1);
> +       refcount_set(&out->refcount, 1);
>         return out + 1;
>  }
>
> @@ -445,14 +445,14 @@ int ipc_rcu_getref(void *ptr)
>  {
>         struct ipc_rcu *p = ((struct ipc_rcu *)ptr) - 1;
>
> -       return atomic_inc_not_zero(&p->refcount);
> +       return refcount_inc_not_zero(&p->refcount);
>  }
>
>  void ipc_rcu_putref(void *ptr, void (*func)(struct rcu_head *head))
>  {
>         struct ipc_rcu *p = ((struct ipc_rcu *)ptr) - 1;
>
> -       if (!atomic_dec_and_test(&p->refcount))
> +       if (!refcount_dec_and_test(&p->refcount))
>                 return;
>
>         call_rcu(&p->rcu, func);
> diff --git a/ipc/util.h b/ipc/util.h
> index 51f7ca5..274ec9b 100644
> --- a/ipc/util.h
> +++ b/ipc/util.h
> @@ -12,6 +12,7 @@
>
>  #include <linux/unistd.h>
>  #include <linux/err.h>
> +#include <linux/refcount.h>
>
>  #define SEQ_MULTIPLIER (IPCMNI)
>
> @@ -49,7 +50,7 @@ static inline void shm_exit_ns(struct ipc_namespace *ns) { }
>
>  struct ipc_rcu {
>         struct rcu_head rcu;
> -       atomic_t refcount;
> +       refcount_t refcount;
>  } ____cacheline_aligned_in_smp;
>
>  #define ipc_rcu_to_struct(p)  ((void *)(p+1))
> --
> 2.7.4
>



-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 0/3] ipc subsystem refcounter conversions
  2017-03-04  0:23 ` Andrew Morton
  2017-03-06  9:51   ` Reshetova, Elena
@ 2017-05-27 19:58   ` Kees Cook
  2017-05-29  8:39     ` Christoph Hellwig
  1 sibling, 1 reply; 43+ messages in thread
From: Kees Cook @ 2017-05-27 19:58 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Elena Reshetova, Peter Zijlstra, Greg KH, Eric W. Biederman,
	Ingo Molnar, Alexey Dobriyan, Serge E. Hallyn, arozansk,
	Davidlohr Bueso, Manfred Spraul, Christoph Hellwig, axboe,
	James Bottomley, x86, Ingo Molnar, Arnd Bergmann,
	David S. Miller, Rik van Riel, linux-arch, kernel-hardening,
	LKML

On Fri, Mar 3, 2017 at 4:23 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
> On Mon, 20 Feb 2017 13:29:46 +0200 Elena Reshetova <elena.reshetova@intel.com> wrote:
>
>> Now when new refcount_t type and API are finally merged
>> (see include/linux/refcount.h), the following
>> patches convert various refcounters in the ipc susystem from atomic_t
>> to refcount_t. By doing this we prevent intentional or accidental
>> underflows or overflows that can led to use-after-free vulnerabilities.
>>
>> The below patches are fully independent and can be cherry-picked separately.
>> Since we convert all kernel subsystems in the same fashion, resulting
>> in about 300 patches, we have to group them for sending at least in some
>> fashion to be manageable. Please excuse the long cc list.
>
> Again, the refcount_t operations are much more expensive than the bare
> atomic_t operations.  I'm reluctant to merge any of these conversions

Since Manfred did a recent refactor of IPC RCU, and the refcount usage
is minimal, it seemed a good time to ask after these patches again.
(And send an updated one for the refactor.)

> without either
>
> a) a convincing demonstration that the performance impact is
>    sufficiently small (ie: unmeasurable) or

I've sent a few versions of a much faster refcount implementation
which has no measurable difference to atomic_t operations (based on
the PaX implementation). Getting more eyes on that would be nice; I'll
include you on CC when I send the next version.

> b) a compile-time option to disable the refcount_t operations (make
>    them generate the same code as the bare atomic_t ops).  Along with
>    some suitably reliable means of preventing people from accidentally
>    enabling the debug code in production builds.

Since the speed-ups will likely be arch-specific, should my proposed
CONFIG_FAST_REFCOUNT be made arch-indep when no arch-specific
implementation available (via totally unchecked atomic_t ops)? i.e.:

FAST_REFCOUNT=n: use function-based refcount_t with cmpxvhg and
full-verification
FAST_REFCOUNT=y without arch-specific implementation: use atomic_t
with no verification (i.e. no functional change from now)
FAST_REFCOUNT=y with arch-specific implementation: use atomic_t with
overflow protection

which means FAST_REFCOUNT would need to be default-on so that mm,
block, net users will remain happy.

Does that sound reasonable?

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 1/3] ipc: convert ipc_namespace.count from atomic_t to refcount_t
  2017-05-27 19:41   ` Kees Cook
@ 2017-05-28 12:10     ` Manfred Spraul
  0 siblings, 0 replies; 43+ messages in thread
From: Manfred Spraul @ 2017-05-28 12:10 UTC (permalink / raw)
  To: Kees Cook, Andrew Morton
  Cc: LKML, Elena Reshetova, Peter Zijlstra, Greg KH,
	Eric W. Biederman, Ingo Molnar, Alexey Dobriyan, Serge E. Hallyn,
	arozansk, Davidlohr Bueso, Hans Liljestrand, David Windsor

On 05/27/2017 09:41 PM, Kees Cook wrote:
> On Mon, Feb 20, 2017 at 3:29 AM, Elena Reshetova
> <elena.reshetova@intel.com> wrote:
>> refcount_t type and corresponding API should be
>> used instead of atomic_t when the variable is used as
>> a reference counter. This allows to avoid accidental
>> refcounter overflows that might lead to use-after-free
>> situations.
>>
>> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
>> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
>> Signed-off-by: Kees Cook <keescook@chromium.org>
>> Signed-off-by: David Windsor <dwindsor@gmail.com>
> Manfred, is this okay by you? I think this should go via -mm...
The patch is ok, the refcounters are not in the critical path.
I'll try to test the patches and then I would send an 'Acked-by', but 
this will take some days.

--
     Manfred

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 0/3] ipc subsystem refcounter conversions
  2017-05-27 19:58   ` Kees Cook
@ 2017-05-29  8:39     ` Christoph Hellwig
  2017-05-29  9:11       ` Eric W. Biederman
  0 siblings, 1 reply; 43+ messages in thread
From: Christoph Hellwig @ 2017-05-29  8:39 UTC (permalink / raw)
  To: Kees Cook
  Cc: Andrew Morton, Elena Reshetova, Peter Zijlstra, Greg KH,
	Eric W. Biederman, Ingo Molnar, Alexey Dobriyan, Serge E. Hallyn,
	arozansk, Davidlohr Bueso, Manfred Spraul, Christoph Hellwig,
	axboe, James Bottomley, x86, Ingo Molnar, Arnd Bergmann,
	David S. Miller, Rik van Riel, linux-arch, kernel-hardening,
	LKML

On Sat, May 27, 2017 at 12:58:14PM -0700, Kees Cook wrote:
> FAST_REFCOUNT=n: use function-based refcount_t with cmpxvhg and
> full-verification
> FAST_REFCOUNT=y without arch-specific implementation: use atomic_t
> with no verification (i.e. no functional change from now)
> FAST_REFCOUNT=y with arch-specific implementation: use atomic_t with
> overflow protection
> 
> which means FAST_REFCOUNT would need to be default-on so that mm,
> block, net users will remain happy.
> 
> Does that sound reasonable?

I'd rather turn the options around so that the atomic_t or fast
arch implementations are the defaul.  But either way it needs to
be configurable.  Once that is done we can spread refcount_t everywhere
and everyone will be better off, if only for the documentation value
of the type when they use the atomic_t based implementation.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 0/3] ipc subsystem refcounter conversions
  2017-05-29  8:39     ` Christoph Hellwig
@ 2017-05-29  9:11       ` Eric W. Biederman
  2017-05-29 10:24         ` Peter Zijlstra
  0 siblings, 1 reply; 43+ messages in thread
From: Eric W. Biederman @ 2017-05-29  9:11 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Kees Cook, Andrew Morton, Elena Reshetova, Peter Zijlstra,
	Greg KH, Ingo Molnar, Alexey Dobriyan, Serge E. Hallyn, arozansk,
	Davidlohr Bueso, Manfred Spraul, axboe, James Bottomley, x86,
	Ingo Molnar, Arnd Bergmann, David S. Miller, Rik van Riel,
	linux-arch, kernel-hardening, LKML

Christoph Hellwig <hch@infradead.org> writes:

> On Sat, May 27, 2017 at 12:58:14PM -0700, Kees Cook wrote:
>> FAST_REFCOUNT=n: use function-based refcount_t with cmpxvhg and
>> full-verification
>> FAST_REFCOUNT=y without arch-specific implementation: use atomic_t
>> with no verification (i.e. no functional change from now)
>> FAST_REFCOUNT=y with arch-specific implementation: use atomic_t with
>> overflow protection
>> 
>> which means FAST_REFCOUNT would need to be default-on so that mm,
>> block, net users will remain happy.
>> 
>> Does that sound reasonable?
>
> I'd rather turn the options around so that the atomic_t or fast
> arch implementations are the defaul.  But either way it needs to
> be configurable.  Once that is done we can spread refcount_t everywhere
> and everyone will be better off, if only for the documentation value
> of the type when they use the atomic_t based implementation.

Agreed on the opposite default as opting into common library
implementations is how we typically handle things in linux.

Kees I I have a concern:

__must_check bool refcount_add_not_zero(unsigned int i, refcount_t *r)
{
        unsigned int new, val = atomic_read(&r->refs);

        do {
                if (!val)
                        return false;

                if (unlikely(val == UINT_MAX))
                        return true;

                new = val + i;
                if (new < val)
                        new = UINT_MAX;

        } while (!atomic_try_cmpxchg_relaxed(&r->refs, &val, new));

        WARN_ONCE(new == UINT_MAX, "refcount_t: saturated; leaking memory.\n");

        return true;
}

Why in the world do you succeed when you the value saturates????

>From a code perspective that is bizarre.   The code already has to handle
the case when the counter does not increment.

So since we already have to have that code to handle the failure to
increment I think it would make much more sense if the counters did not
only saturate but report failure when the counter can not increment.

Right now I am inclined to NACK refcount_t conversions because their
semantics don't make sense.  Which would seem to make the code less
correct by introducing bizarre corner cases instead of letting the code
use the it's existing handling of the failure to increment or decrement
the counter.

Fixing the return value would move refcount_t into the realm of
something that is desirable because it has bettern semantics and
is more useful just on a day to day correctness point of view.  Even
ignoring the security implications.

I suspect that would also make it easier for refcount_t implementations
to produce efficient code.

Eric

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 0/3] ipc subsystem refcounter conversions
  2017-05-29  9:11       ` Eric W. Biederman
@ 2017-05-29 10:24         ` Peter Zijlstra
  2017-05-29 10:49           ` Eric W. Biederman
  0 siblings, 1 reply; 43+ messages in thread
From: Peter Zijlstra @ 2017-05-29 10:24 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Christoph Hellwig, Kees Cook, Andrew Morton, Elena Reshetova,
	Greg KH, Ingo Molnar, Alexey Dobriyan, Serge E. Hallyn, arozansk,
	Davidlohr Bueso, Manfred Spraul, axboe, James Bottomley, x86,
	Ingo Molnar, Arnd Bergmann, David S. Miller, Rik van Riel,
	linux-arch, kernel-hardening, LKML

On Mon, May 29, 2017 at 04:11:13AM -0500, Eric W. Biederman wrote:

> Kees I I have a concern:
> 
> __must_check bool refcount_add_not_zero(unsigned int i, refcount_t *r)
> {
>         unsigned int new, val = atomic_read(&r->refs);
> 
>         do {
>                 if (!val)
>                         return false;
> 
>                 if (unlikely(val == UINT_MAX))
>                         return true;
> 
>                 new = val + i;
>                 if (new < val)
>                         new = UINT_MAX;
> 
>         } while (!atomic_try_cmpxchg_relaxed(&r->refs, &val, new));
> 
>         WARN_ONCE(new == UINT_MAX, "refcount_t: saturated; leaking memory.\n");
> 
>         return true;
> }
> 
> Why in the world do you succeed when you the value saturates????

Why not? On saturation the object will leak and returning a reference to
it is always good.

> From a code perspective that is bizarre.   The code already has to handle
> the case when the counter does not increment.

I don't see it as bizarre, we turned an overflow/use-after-free into a
leak. That's the primary mechanism here.

As long as we have a reference to a leaked object, we might as well use
it, its not going anywhere.

> Fixing the return value would move refcount_t into the realm of
> something that is desirable because it has bettern semantics and
> is more useful just on a day to day correctness point of view.  Even
> ignoring the security implications.

It changes the semantics between inc_not_zero() and inc(). It also
complicates the semantics of inc_not_zero(), where currently the failure
implies the count is 0 and means no-such-object, you complicate matters
by basically returning 'busy'.

That is a completely new class of failure that is actually hard to deal
with, not to mention that it completely destroys refcount_inc_not_zero()
being a 'simple' replacement for atomic_inc_not_zero().

In case of the current failure, the no-such-object, we can fix that by
creating said object. But what to do on 'busy' ? Surely you don't want
to create another. You'd have to somehow retrofit something to wait on
in every user.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 0/3] ipc subsystem refcounter conversions
  2017-05-29 10:24         ` Peter Zijlstra
@ 2017-05-29 10:49           ` Eric W. Biederman
  2017-05-29 11:30             ` Eric W. Biederman
  2017-05-29 12:13             ` Peter Zijlstra
  0 siblings, 2 replies; 43+ messages in thread
From: Eric W. Biederman @ 2017-05-29 10:49 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Christoph Hellwig, Kees Cook, Andrew Morton, Elena Reshetova,
	Greg KH, Ingo Molnar, Alexey Dobriyan, Serge E. Hallyn, arozansk,
	Davidlohr Bueso, Manfred Spraul, axboe, James Bottomley, x86,
	Ingo Molnar, Arnd Bergmann, David S. Miller, Rik van Riel,
	linux-arch, kernel-hardening, LKML

Peter Zijlstra <peterz@infradead.org> writes:

> On Mon, May 29, 2017 at 04:11:13AM -0500, Eric W. Biederman wrote:
>
>> Kees I I have a concern:
>> 
>> __must_check bool refcount_add_not_zero(unsigned int i, refcount_t *r)
>> {
>>         unsigned int new, val = atomic_read(&r->refs);
>> 
>>         do {
>>                 if (!val)
>>                         return false;
>> 
>>                 if (unlikely(val == UINT_MAX))
>>                         return true;
>> 
>>                 new = val + i;
>>                 if (new < val)
>>                         new = UINT_MAX;
>> 
>>         } while (!atomic_try_cmpxchg_relaxed(&r->refs, &val, new));
>> 
>>         WARN_ONCE(new == UINT_MAX, "refcount_t: saturated; leaking memory.\n");
>> 
>>         return true;
>> }
>> 
>> Why in the world do you succeed when you the value saturates????
>
> Why not? On saturation the object will leak and returning a reference to
> it is always good.
>
>> From a code perspective that is bizarre.   The code already has to handle
>> the case when the counter does not increment.
>
> I don't see it as bizarre, we turned an overflow/use-after-free into a
> leak. That's the primary mechanism here.
>
> As long as we have a reference to a leaked object, we might as well use
> it, its not going anywhere.
>
>> Fixing the return value would move refcount_t into the realm of
>> something that is desirable because it has bettern semantics and
>> is more useful just on a day to day correctness point of view.  Even
>> ignoring the security implications.
>
> It changes the semantics between inc_not_zero() and inc(). It also
> complicates the semantics of inc_not_zero(), where currently the failure
> implies the count is 0 and means no-such-object, you complicate matters
> by basically returning 'busy'.

Busy is not a state of a reference count.

It is true I am suggesting treating something with a saturated reference
as not available.  If that is what you mean by busy.  But if it's
reference is zero it is also not available.  So there is no practical
difference.

> That is a completely new class of failure that is actually hard to deal
> with, not to mention that it completely destroys refcount_inc_not_zero()
> being a 'simple' replacement for atomic_inc_not_zero().
>
> In case of the current failure, the no-such-object, we can fix that by
> creating said object. But what to do on 'busy' ? Surely you don't want
> to create another. You'd have to somehow retrofit something to wait on
> in every user.

Using little words.

A return of true from inc_not_zero means we took a reference.
A return of false means we did not take a reference.

The code already handles I took a reference or I did not take a
reference.

Therefore lying with refcount_t is not helpful.  It takes failures
the code could easily handle and turns them into leaks.

At least that is how I have seen reference counts used.  And those
are definitely the plane obivous semantics.

Your changes are definitely not drop in replacements for atomic_t in my
code.

Eric

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 0/3] ipc subsystem refcounter conversions
  2017-05-29 10:49           ` Eric W. Biederman
@ 2017-05-29 11:30             ` Eric W. Biederman
  2017-05-29 11:39               ` Eric W. Biederman
  2017-05-29 12:13             ` Peter Zijlstra
  1 sibling, 1 reply; 43+ messages in thread
From: Eric W. Biederman @ 2017-05-29 11:30 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Christoph Hellwig, Kees Cook, Andrew Morton, Elena Reshetova,
	Greg KH, Ingo Molnar, Alexey Dobriyan, Serge E. Hallyn, arozansk,
	Davidlohr Bueso, Manfred Spraul, axboe, James Bottomley, x86,
	Ingo Molnar, Arnd Bergmann, David S. Miller, Rik van Riel,
	linux-arch, kernel-hardening, LKML

ebiederm@xmission.com (Eric W. Biederman) writes:

> Peter Zijlstra <peterz@infradead.org> writes:
>
>> On Mon, May 29, 2017 at 04:11:13AM -0500, Eric W. Biederman wrote:
>>
>>> Kees I I have a concern:
>>> 
>>> __must_check bool refcount_add_not_zero(unsigned int i, refcount_t *r)
>>> {
>>>         unsigned int new, val = atomic_read(&r->refs);
>>> 
>>>         do {
>>>                 if (!val)
>>>                         return false;
>>> 
>>>                 if (unlikely(val == UINT_MAX))
>>>                         return true;
>>> 
>>>                 new = val + i;
>>>                 if (new < val)
>>>                         new = UINT_MAX;
>>> 
>>>         } while (!atomic_try_cmpxchg_relaxed(&r->refs, &val, new));
>>> 
>>>         WARN_ONCE(new == UINT_MAX, "refcount_t: saturated; leaking memory.\n");
>>> 
>>>         return true;
>>> }
>>> 
>>> Why in the world do you succeed when you the value saturates????
>>
>> Why not? On saturation the object will leak and returning a reference to
>> it is always good.
>>
>>> From a code perspective that is bizarre.   The code already has to handle
>>> the case when the counter does not increment.
>>
>> I don't see it as bizarre, we turned an overflow/use-after-free into a
>> leak. That's the primary mechanism here.
>>
>> As long as we have a reference to a leaked object, we might as well use
>> it, its not going anywhere.
>>
>>> Fixing the return value would move refcount_t into the realm of
>>> something that is desirable because it has bettern semantics and
>>> is more useful just on a day to day correctness point of view.  Even
>>> ignoring the security implications.
>>
>> It changes the semantics between inc_not_zero() and inc(). It also
>> complicates the semantics of inc_not_zero(), where currently the failure
>> implies the count is 0 and means no-such-object, you complicate matters
>> by basically returning 'busy'.
>
> Busy is not a state of a reference count.
>
> It is true I am suggesting treating something with a saturated reference
> as not available.  If that is what you mean by busy.  But if it's
> reference is zero it is also not available.  So there is no practical
> difference.
>
>> That is a completely new class of failure that is actually hard to deal
>> with, not to mention that it completely destroys refcount_inc_not_zero()
>> being a 'simple' replacement for atomic_inc_not_zero().
>>
>> In case of the current failure, the no-such-object, we can fix that by
>> creating said object. But what to do on 'busy' ? Surely you don't want
>> to create another. You'd have to somehow retrofit something to wait on
>> in every user.
>
> Using little words.
>
> A return of true from inc_not_zero means we took a reference.
> A return of false means we did not take a reference.
>
> The code already handles I took a reference or I did not take a
> reference.
>
> Therefore lying with refcount_t is not helpful.  It takes failures
> the code could easily handle and turns them into leaks.
>
> At least that is how I have seen reference counts used.  And those
> are definitely the plane obivous semantics.
>
> Your changes are definitely not drop in replacements for atomic_t in my
> code.

To clarify.

If my code uses atomic_inc it does not expect a failure of any sort
and saturate semantics are a fine replacement.

If my code uses atomic_inc_not_zero it knows how to handle a failure
to take a reference count.  Making hiding the failure really bizarre.

A must check function that hides a case I can handle and requires
checking in a case where my code is built not to check is a drop in
replacement for neither.

So anyone who is proposing a refcount_t change as a drop in replacement
for any code I maintain I will nack on sight because refcount_t is not
currently a no-brain drop in replacement.

Eric

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 0/3] ipc subsystem refcounter conversions
  2017-05-29 11:30             ` Eric W. Biederman
@ 2017-05-29 11:39               ` Eric W. Biederman
  2017-05-29 12:23                 ` Peter Zijlstra
  0 siblings, 1 reply; 43+ messages in thread
From: Eric W. Biederman @ 2017-05-29 11:39 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Christoph Hellwig, Kees Cook, Andrew Morton, Elena Reshetova,
	Greg KH, Ingo Molnar, Alexey Dobriyan, Serge E. Hallyn, arozansk,
	Davidlohr Bueso, Manfred Spraul, axboe, James Bottomley, x86,
	Ingo Molnar, Arnd Bergmann, David S. Miller, Rik van Riel,
	linux-arch, kernel-hardening, LKML

ebiederm@xmission.com (Eric W. Biederman) writes:

> ebiederm@xmission.com (Eric W. Biederman) writes:
>
>> Peter Zijlstra <peterz@infradead.org> writes:
>>
>>> On Mon, May 29, 2017 at 04:11:13AM -0500, Eric W. Biederman wrote:
>>>
>>>> Kees I I have a concern:
>>>> 
>>>> __must_check bool refcount_add_not_zero(unsigned int i, refcount_t *r)
>>>> {
>>>>         unsigned int new, val = atomic_read(&r->refs);
>>>> 
>>>>         do {
>>>>                 if (!val)
>>>>                         return false;
>>>> 
>>>>                 if (unlikely(val == UINT_MAX))
>>>>                         return true;
>>>> 
>>>>                 new = val + i;
>>>>                 if (new < val)
>>>>                         new = UINT_MAX;
>>>> 
>>>>         } while (!atomic_try_cmpxchg_relaxed(&r->refs, &val, new));
>>>> 
>>>>         WARN_ONCE(new == UINT_MAX, "refcount_t: saturated; leaking memory.\n");
>>>> 
>>>>         return true;
>>>> }
>>>> 
>>>> Why in the world do you succeed when you the value saturates????
>>>
>>> Why not? On saturation the object will leak and returning a reference to
>>> it is always good.
>>>
>>>> From a code perspective that is bizarre.   The code already has to handle
>>>> the case when the counter does not increment.
>>>
>>> I don't see it as bizarre, we turned an overflow/use-after-free into a
>>> leak. That's the primary mechanism here.
>>>
>>> As long as we have a reference to a leaked object, we might as well use
>>> it, its not going anywhere.
>>>
>>>> Fixing the return value would move refcount_t into the realm of
>>>> something that is desirable because it has bettern semantics and
>>>> is more useful just on a day to day correctness point of view.  Even
>>>> ignoring the security implications.
>>>
>>> It changes the semantics between inc_not_zero() and inc(). It also
>>> complicates the semantics of inc_not_zero(), where currently the failure
>>> implies the count is 0 and means no-such-object, you complicate matters
>>> by basically returning 'busy'.
>>
>> Busy is not a state of a reference count.
>>
>> It is true I am suggesting treating something with a saturated reference
>> as not available.  If that is what you mean by busy.  But if it's
>> reference is zero it is also not available.  So there is no practical
>> difference.
>>
>>> That is a completely new class of failure that is actually hard to deal
>>> with, not to mention that it completely destroys refcount_inc_not_zero()
>>> being a 'simple' replacement for atomic_inc_not_zero().
>>>
>>> In case of the current failure, the no-such-object, we can fix that by
>>> creating said object. But what to do on 'busy' ? Surely you don't want
>>> to create another. You'd have to somehow retrofit something to wait on
>>> in every user.
>>
>> Using little words.
>>
>> A return of true from inc_not_zero means we took a reference.
>> A return of false means we did not take a reference.
>>
>> The code already handles I took a reference or I did not take a
>> reference.
>>
>> Therefore lying with refcount_t is not helpful.  It takes failures
>> the code could easily handle and turns them into leaks.
>>
>> At least that is how I have seen reference counts used.  And those
>> are definitely the plane obivous semantics.
>>
>> Your changes are definitely not drop in replacements for atomic_t in my
>> code.
>
> To clarify.
>
> If my code uses atomic_inc it does not expect a failure of any sort
> and saturate semantics are a fine replacement.
>
> If my code uses atomic_inc_not_zero it knows how to handle a failure
> to take a reference count.  Making hiding the failure really bizarre.
>
> A must check function that hides a case I can handle and requires
> checking in a case where my code is built not to check is a drop in
> replacement for neither.
>
> So anyone who is proposing a refcount_t change as a drop in replacement
> for any code I maintain I will nack on sight because refcount_t is not
> currently a no-brain drop in replacement.

*Blink*

I failed to see that there is a refcount_inc.  Too much noise in
the header file I suppose.

But implementing refcount_inc in terms of refcount_inc_not_zero is
totally broken.  The two operations are not the same and the go to
different assumptions the code is making.

That explains why you think refcount_inc_not_zero should lie because
you are implementing refcount_inc with it.  They are semantically very
different operations.  Please separate them.

Eric

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 0/3] ipc subsystem refcounter conversions
  2017-05-29 10:49           ` Eric W. Biederman
  2017-05-29 11:30             ` Eric W. Biederman
@ 2017-05-29 12:13             ` Peter Zijlstra
  1 sibling, 0 replies; 43+ messages in thread
From: Peter Zijlstra @ 2017-05-29 12:13 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Christoph Hellwig, Kees Cook, Andrew Morton, Elena Reshetova,
	Greg KH, Ingo Molnar, Alexey Dobriyan, Serge E. Hallyn, arozansk,
	Davidlohr Bueso, Manfred Spraul, axboe, James Bottomley, x86,
	Ingo Molnar, Arnd Bergmann, David S. Miller, Rik van Riel,
	linux-arch, kernel-hardening, LKML

On Mon, May 29, 2017 at 05:49:53AM -0500, Eric W. Biederman wrote:

> > It changes the semantics between inc_not_zero() and inc(). It also
> > complicates the semantics of inc_not_zero(), where currently the failure
> > implies the count is 0 and means no-such-object, you complicate matters
> > by basically returning 'busy'.
> 
> Busy is not a state of a reference count.
> 
> It is true I am suggesting treating something with a saturated reference
> as not available.  If that is what you mean by busy.  But if it's
> reference is zero it is also not available.  So there is no practical
> difference.

There is. Previously when inc_not_zero() failed, you _knew_ it was 0 and
therefore know the object no longer 'exists'.

Similarly, if you know you're serialized against 1->0 you can then
assume it will not fail.

That goes out the window the moment you fail for any other condition.

> > That is a completely new class of failure that is actually hard to deal
> > with, not to mention that it completely destroys refcount_inc_not_zero()
> > being a 'simple' replacement for atomic_inc_not_zero().
> >
> > In case of the current failure, the no-such-object, we can fix that by
> > creating said object. But what to do on 'busy' ? Surely you don't want
> > to create another. You'd have to somehow retrofit something to wait on
> > in every user.
> 
> Using little words.
> 
> A return of true from inc_not_zero means we took a reference.
> A return of false means we did not take a reference.
> 
> The code already handles I took a reference or I did not take a
> reference.

I can well imagine code relying on the fact that failing to take a
reference means 0, see below. And once you start to fail for more
conditions, the actual value you failed on becomes interesting in order
to determine how to deal with it.

> Therefore lying with refcount_t is not helpful.

It is _NOT_ lying. It does as promised, it increments when it is not
zero. The fact that that increment can saturate is irrelevant. A
saturated reference is still a valid reference. Sure it causes a leak,
but who bloody cares, it shouldn't happen in the first place.

> It takes failures
> the code could easily handle and turns them into leaks.

That is the 'feature', we get to have leaks. Also those leaks _should_
not happen. They are a result of 'broken' code. So I don't see how
exposing them to the wider world helps anything but spread the pain of
the failure.

Please explain how the below is not subtly broken by changing
inc_not_zero.

struct obj *__obj_lookup(key)
{
	obj = rcu_based_lookup(key);
	if (refcount_inc_not_zero(&obj->ref))
		return obj;

	return NULL;
}

struct obj *obj_find_acquire(key)
{
	/* fast path, lockless lookup */

	rcu_read_lock()
	obj = __obj_lookup(key);
	rcu_read_unlock();

	if (obj)
		return obj;

	/* slow path, serialize */

	lock(&obj_lock);
	/* we're serialized, if it exists we must get a ref */
	obj = __obj_lookup(key);
	if (obj)
		goto unlock;

	/* allocate a new object and insert it */
	obj = obj_alloc();
	obj_init(obj, key);

unlock:
	unlock(&obj_lock);

	return obj;
}

> At least that is how I have seen reference counts used.  And those
> are definitely the plane obivous semantics.

I very strongly disagree. The one thing this primitive does is change
add/sub to be saturating, *consistently*.

You argue to expose the failure case, leading to more error paths
leading to more complication. I'll argue that nobody wants more error
handling, esp. for something that should not happen to begin with.

> Your changes are definitely not drop in replacements for atomic_t in my
> code.

refcount_t was never meant to be a drop-in replacement for atomic_t.
It is meant to be a fairly painless replacement for reference
counts that were previously implemented using atomic_t.

There is also the fairly common usage count scenario, that does not fit
well with refcount_t. I'm not sure we want to shoehorn that into
refcount_t either, we could create yet another type for that.

And there are of course a lot of less common things we're not wanting to
replace at all. atomic_t isn't broken we don't need to fix it.

I have no idea what you do so I cannot comment further.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 0/3] ipc subsystem refcounter conversions
  2017-05-29 11:39               ` Eric W. Biederman
@ 2017-05-29 12:23                 ` Peter Zijlstra
  2017-05-29 15:43                   ` Peter Zijlstra
  0 siblings, 1 reply; 43+ messages in thread
From: Peter Zijlstra @ 2017-05-29 12:23 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Christoph Hellwig, Kees Cook, Andrew Morton, Elena Reshetova,
	Greg KH, Ingo Molnar, Alexey Dobriyan, Serge E. Hallyn, arozansk,
	Davidlohr Bueso, Manfred Spraul, axboe, James Bottomley, x86,
	Ingo Molnar, Arnd Bergmann, David S. Miller, Rik van Riel,
	linux-arch, kernel-hardening, LKML

On Mon, May 29, 2017 at 06:39:44AM -0500, Eric W. Biederman wrote:
> I failed to see that there is a refcount_inc.  Too much noise in
> the header file I suppose.
> 
> But implementing refcount_inc in terms of refcount_inc_not_zero is
> totally broken.  The two operations are not the same and the go to
> different assumptions the code is making.
> 
> That explains why you think refcount_inc_not_zero should lie because
> you are implementing refcount_inc with it.  They are semantically very
> different operations.  Please separate them.

There has been much debate about this. And the best I'll do is add a
comment and/or retain these exact semantics.

What is done is:

	refcount_inc() := WARN_ON(!refcount_inc_not_zero())

Because incrementing a zero reference count is a use-after-free and
something we should not do ever.

This is where the whole usage count vs reference count pain comes from.

Once there are no more _references_ to an object, a reference count
frees the object. Therefore a zero reference count means a dead object
and incrementing from that is fail.

The usage count model otoh counts how many (active) users there are of
an object, and no active users is a good and expected situation. But it
is very explicitly not a reference count. Because even in the no users
case do we have a reference to the object (we've not leaked it after
all, we just don't track all references).


Similarly, refcount_dec() is implemented using dec_and_test() and will
WARN when it hits 0, because this is a leak and we don't want those
either.

A usage count variant otoh would be fine with hitting 0.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 0/3] ipc subsystem refcounter conversions
  2017-05-29 12:23                 ` Peter Zijlstra
@ 2017-05-29 15:43                   ` Peter Zijlstra
  0 siblings, 0 replies; 43+ messages in thread
From: Peter Zijlstra @ 2017-05-29 15:43 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Christoph Hellwig, Kees Cook, Andrew Morton, Elena Reshetova,
	Greg KH, Ingo Molnar, Alexey Dobriyan, Serge E. Hallyn, arozansk,
	Davidlohr Bueso, Manfred Spraul, axboe, James Bottomley, x86,
	Ingo Molnar, Arnd Bergmann, David S. Miller, Rik van Riel,
	linux-arch, kernel-hardening, LKML

On Mon, May 29, 2017 at 02:23:16PM +0200, Peter Zijlstra wrote:
> On Mon, May 29, 2017 at 06:39:44AM -0500, Eric W. Biederman wrote:
> > I failed to see that there is a refcount_inc.  Too much noise in
> > the header file I suppose.
> > 
> > But implementing refcount_inc in terms of refcount_inc_not_zero is
> > totally broken.  The two operations are not the same and the go to
> > different assumptions the code is making.
> > 
> > That explains why you think refcount_inc_not_zero should lie because
> > you are implementing refcount_inc with it.  They are semantically very
> > different operations.  Please separate them.
> 
> There has been much debate about this. And the best I'll do is add a
> comment and/or retain these exact semantics.
> 
> What is done is:
> 
> 	refcount_inc() := WARN_ON(!refcount_inc_not_zero())
> 
> Because incrementing a zero reference count is a use-after-free and
> something we should not do ever.
> 
> This is where the whole usage count vs reference count pain comes from.
> 
> Once there are no more _references_ to an object, a reference count
> frees the object. Therefore a zero reference count means a dead object
> and incrementing from that is fail.
> 
> The usage count model otoh counts how many (active) users there are of
> an object, and no active users is a good and expected situation. But it
> is very explicitly not a reference count. Because even in the no users
> case do we have a reference to the object (we've not leaked it after
> all, we just don't track all references).
> 
> 
> Similarly, refcount_dec() is implemented using dec_and_test() and will
> WARN when it hits 0, because this is a leak and we don't want those
> either.
> 
> A usage count variant otoh would be fine with hitting 0.

A typical pattern for the usage count is caches, where objects are kept
in a data structure (hash/tree and/or list) and we count how many users
there are of said objects. A GC or shrinker will then iterate the
structure and prune those objects that have 0 users.

It is fairly straight forward to convert those to refcount_t by adding
one reference for the data structure itself. The GC/shrinker will then
have to use something like refcount_dec_if_one() to drop the reference
from 1->0 (and we could easily add something like dec_and_lock_if_one if
so desired).

Not all of them mind you, but simple cases can certainly be done without
too much pain.

But clearly there have been conversions of less than desired quality /
clarity though ...

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 1/3] ipc: convert ipc_namespace.count from atomic_t to refcount_t
  2017-07-20 12:34             ` Eric W. Biederman
@ 2017-07-20 15:12               ` Kees Cook
  0 siblings, 0 replies; 43+ messages in thread
From: Kees Cook @ 2017-07-20 15:12 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Ingo Molnar, Andrew Morton, Davidlohr Bueso, Elena Reshetova,
	LKML, Peter Zijlstra, Greg KH, Ingo Molnar, Alexey Dobriyan,
	Serge E. Hallyn, arozansk, Hans Liljestrand, David Windsor

On Thu, Jul 20, 2017 at 5:34 AM, Eric W. Biederman
<ebiederm@xmission.com> wrote:
> Ingo Molnar <mingo@kernel.org> writes:
>
>> * Andrew Morton <akpm@linux-foundation.org> wrote:
>>
>>> On Wed, 19 Jul 2017 15:54:27 -0700 Davidlohr Bueso <dave@stgolabs.net> wrote:
>>>
>>> > On Wed, 19 Jul 2017, Andrew Morton wrote:
>>> >
>>> > >I do rather dislike these conversions from the point of view of
>>> > >performance overhead and general code bloat.  But I seem to have lost
>>> > >that struggle and I don't think any of these are fastpath(?).
>>> >
>>> > Well, since we now have fd25d19 (locking/refcount: Create unchecked atomic_t
>>> > implementation), performance is supposed to be ok.
>>>
>>> Sure, things are OK for people who disable the feature.
>>
>> So with the WIP fast-refcount series from Kees:
>>
>>       [PATCH v6 0/2] x86: Implement fast refcount overflow protection
>>
>> I believe the robustness difference between optimized-refcount_t and
>> full-refcount_t will be marginal.
>>
>> I.e. we'll be able to have both higher API safety _and_ performance.
>>
>>> But for people who want to enable the feature we really should minimize the cost
>>> by avoiding blindly converting sites which simply don't need it: simple, safe,
>>> old, well-tested code.  Why go and slow down such code?  Need to apply some
>>> common sense here...
>>
>> It's old, well-tested code _for existing, sane parameters_, until someone finds a
>> decade old bug in one of these with an insane parameters no-one stumbled upon so
>> far, and builds an exploit on top of it.
>>
>> Only by touching all these places do we have a chance to improve things measurably
>> in terms of reducing the probability of bugs.
>
> The more I hear people pushing the upsides of refcount_t without
> considering the downsides the more I dislike it.
>
> - refcount_t is really the wrong thing because it uses saturation
>   semantics.  So by definition it includes a bug.

This is a feature, not a bug. :) If the kernel has a refcount overflow
flaw (which, in the pantheon of exploitable kernel bugs, is
_common_[1], as I've referenced earlier), then we're downgrading an
exploitable use-after-free to a harmless memory allocation leak. Even
if you don't include malicious attackers in the consideration, this
changes a memory corruption of unknown results into a memory leak.
That's actually an _improvement_ to availability and integrity.

> - refcount_t will only really prevent something if there is an extra
>   increment.  That is not the kind of bug people are likely to make.

Like I've said, this is common. This is usually a mistake in error
handling which forgets (or misplaces) a "put".

> - refcount_t won't help if you have an extra decrement.  The bad
>   use-after-free will still happen.

Yes, and not having a protected refcount_t will also allow a
use-after-free. There is no change here, so it's not a "downside" of
refcount_t. In fact, having gained the implicit annotation of
refcount_t being a refcounter (rather than a simple atomic_t) means
that auditing users is easier and more focused. This could reduce the
chance people make mistakes in the first place, especially since the
API is more constrained than atomic_t.

> - refcount_t won't help if there is a memory stomp.  As with an extra
>   decrement the bad use-after-free will still happen.

A stomp of the refcount_t value itself? Sure, and this remains as
vulnerable as atomic_t. This isn't a downside to refcount_t. And
again, since there _is_ checking of the value in places, it's possible
an actionable warning will be produced (though, yes, after the
use-after-free has been exposed), which is a benefit over simple
atomic_t. I mention this in the commit log ("better to maybe produce
the warning than be universally silent").

> So all I see is a huge amount of code churn to implement a buggy (by
> definition) refcounting API, that risks adding new bugs and only truly
> helps with bugs that are unlikely in the first place.

Given that the conversions alone have been uncovering refcount bugs
and that the implementation isn't "buggy" (it provides a specific set
of protections), I strongly disagree with your assessment.

> I really don't think this is an obvious slam dunk.

It entirely blocks a commonly exploitable flaw in the kernel. This
isn't a probabilistic mitigation, either. While I'm not sure I'd ever
describe a security protection as a slam dunk, I think this is up
there. :)

-Kees

[1] When I say "common", I'm speaking from the perspective of security
flaw frequency. The kernel sees about 1-2 high severity security flaws
a year (with an average lifetime of 5 years), and the
refcount-overflow use-after-free class of flaw is normally reliable
for attackers (and I'd classify as high severity). With 2016 seeing
two known separate refcount-overflow use-after-free flaws, this could
be better described as an epidemic, but I'll try to be less
inflammatory and just say "common".

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 1/3] ipc: convert ipc_namespace.count from atomic_t to refcount_t
  2017-07-20  9:34           ` Ingo Molnar
@ 2017-07-20 12:34             ` Eric W. Biederman
  2017-07-20 15:12               ` Kees Cook
  0 siblings, 1 reply; 43+ messages in thread
From: Eric W. Biederman @ 2017-07-20 12:34 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andrew Morton, Davidlohr Bueso, Elena Reshetova, linux-kernel,
	peterz, gregkh, mingo, adobriyan, serge, arozansk, keescook,
	Hans Liljestrand, David Windsor

Ingo Molnar <mingo@kernel.org> writes:

> * Andrew Morton <akpm@linux-foundation.org> wrote:
>
>> On Wed, 19 Jul 2017 15:54:27 -0700 Davidlohr Bueso <dave@stgolabs.net> wrote:
>> 
>> > On Wed, 19 Jul 2017, Andrew Morton wrote:
>> > 
>> > >I do rather dislike these conversions from the point of view of
>> > >performance overhead and general code bloat.  But I seem to have lost
>> > >that struggle and I don't think any of these are fastpath(?).
>> > 
>> > Well, since we now have fd25d19 (locking/refcount: Create unchecked atomic_t
>> > implementation), performance is supposed to be ok.
>> 
>> Sure, things are OK for people who disable the feature.
>
> So with the WIP fast-refcount series from Kees:
>
> 	[PATCH v6 0/2] x86: Implement fast refcount overflow protection
>
> I believe the robustness difference between optimized-refcount_t and 
> full-refcount_t will be marginal.
>
> I.e. we'll be able to have both higher API safety _and_ performance.
>
>> But for people who want to enable the feature we really should minimize the cost 
>> by avoiding blindly converting sites which simply don't need it: simple, safe, 
>> old, well-tested code.  Why go and slow down such code?  Need to apply some 
>> common sense here...
>
> It's old, well-tested code _for existing, sane parameters_, until someone finds a 
> decade old bug in one of these with an insane parameters no-one stumbled upon so 
> far, and builds an exploit on top of it.
>
> Only by touching all these places do we have a chance to improve things measurably 
> in terms of reducing the probability of bugs.

The more I hear people pushing the upsides of refcount_t without
considering the downsides the more I dislike it.

- refcount_t is really the wrong thing because it uses saturation
  semantics.  So by definition it includes a bug.

- refcount_t will only really prevent something if there is an extra
  increment.  That is not the kind of bug people are likely to make.

- refcount_t won't help if you have an extra decrement.  The bad
  use-after-free will still happen.

- refcount_t won't help if there is a memory stomp.  As with an extra
  decrement the bad use-after-free will still happen.
  
So all I see is a huge amount of code churn to implement a buggy (by
definition) refcounting API, that risks adding new bugs and only truly
helps with bugs that are unlikely in the first place.

I really don't think this is an obvious slam dunk.

Eric





  
 

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 1/3] ipc: convert ipc_namespace.count from atomic_t to refcount_t
  2017-07-19 22:58         ` Andrew Morton
  2017-07-19 23:11           ` Davidlohr Bueso
@ 2017-07-20  9:34           ` Ingo Molnar
  2017-07-20 12:34             ` Eric W. Biederman
  1 sibling, 1 reply; 43+ messages in thread
From: Ingo Molnar @ 2017-07-20  9:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Davidlohr Bueso, Eric W. Biederman, Elena Reshetova,
	linux-kernel, peterz, gregkh, mingo, adobriyan, serge, arozansk,
	keescook, Hans Liljestrand, David Windsor


* Andrew Morton <akpm@linux-foundation.org> wrote:

> On Wed, 19 Jul 2017 15:54:27 -0700 Davidlohr Bueso <dave@stgolabs.net> wrote:
> 
> > On Wed, 19 Jul 2017, Andrew Morton wrote:
> > 
> > >I do rather dislike these conversions from the point of view of
> > >performance overhead and general code bloat.  But I seem to have lost
> > >that struggle and I don't think any of these are fastpath(?).
> > 
> > Well, since we now have fd25d19 (locking/refcount: Create unchecked atomic_t
> > implementation), performance is supposed to be ok.
> 
> Sure, things are OK for people who disable the feature.

So with the WIP fast-refcount series from Kees:

	[PATCH v6 0/2] x86: Implement fast refcount overflow protection

I believe the robustness difference between optimized-refcount_t and 
full-refcount_t will be marginal.

I.e. we'll be able to have both higher API safety _and_ performance.

> But for people who want to enable the feature we really should minimize the cost 
> by avoiding blindly converting sites which simply don't need it: simple, safe, 
> old, well-tested code.  Why go and slow down such code?  Need to apply some 
> common sense here...

It's old, well-tested code _for existing, sane parameters_, until someone finds a 
decade old bug in one of these with an insane parameters no-one stumbled upon so 
far, and builds an exploit on top of it.

Only by touching all these places do we have a chance to improve things measurably 
in terms of reducing the probability of bugs.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 1/3] ipc: convert ipc_namespace.count from atomic_t to refcount_t
  2017-07-19 23:20             ` Kees Cook
@ 2017-07-20  0:32               ` Kees Cook
  0 siblings, 0 replies; 43+ messages in thread
From: Kees Cook @ 2017-07-20  0:32 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Andrew Morton, Eric W. Biederman, Elena Reshetova, LKML,
	Peter Zijlstra, Greg KH, Ingo Molnar, Alexey Dobriyan,
	Serge E. Hallyn, arozansk, Hans Liljestrand, David Windsor

On Wed, Jul 19, 2017 at 4:20 PM, Kees Cook <keescook@chromium.org> wrote:
> On Wed, Jul 19, 2017 at 4:11 PM, Davidlohr Bueso <dave@stgolabs.net> wrote:
>> May I suggest using mmtests with the following config file:
>>
>> https://github.com/gormanm/mmtests/blob/7e070a810bc0af92e592e5121d0ea75fada51aeb/configs/config-global-dhp__workload-ipc-scale-short
>>
>> It will run two of Manfred's ipcscale sem benchmarks.
>
> I'll see if I can figure out how to use this for testing the fast
> refcount protection:
> https://lkml.org/lkml/2017/7/18/1223
>
> Then we could see:
>
> before conversion
> after conversion
> with CONFIG_REFCOUNT_FULL
> with fast refcount protection

I have no idea how to read this report. It seems to be mostly noise
(multiple baseline runs seem to show greater variability than compared
against the other possible results).

Test runs were atomic_t, atomic_t-2, refcount_t, refcount-full, and
refcount-fast. (Two baselines, refcount_t conversion, with FULL, and
with the fast implementation.) Output here:
http://pastebin.ubuntu.com/25129382/

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 1/3] ipc: convert ipc_namespace.count from atomic_t to refcount_t
  2017-07-19 23:11           ` Davidlohr Bueso
@ 2017-07-19 23:20             ` Kees Cook
  2017-07-20  0:32               ` Kees Cook
  0 siblings, 1 reply; 43+ messages in thread
From: Kees Cook @ 2017-07-19 23:20 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Andrew Morton, Eric W. Biederman, Elena Reshetova, LKML,
	Peter Zijlstra, Greg KH, Ingo Molnar, Alexey Dobriyan,
	Serge E. Hallyn, arozansk, Hans Liljestrand, David Windsor

On Wed, Jul 19, 2017 at 4:11 PM, Davidlohr Bueso <dave@stgolabs.net> wrote:
> On Wed, 19 Jul 2017, Andrew Morton wrote:
>
>> On Wed, 19 Jul 2017 15:54:27 -0700 Davidlohr Bueso <dave@stgolabs.net>
>> wrote:
>>
>>> On Wed, 19 Jul 2017, Andrew Morton wrote:
>>>
>>> >I do rather dislike these conversions from the point of view of
>>> >performance overhead and general code bloat.  But I seem to have lost
>>> >that struggle and I don't think any of these are fastpath(?).
>>>
>>> Well, since we now have fd25d19 (locking/refcount: Create unchecked
>>> atomic_t
>>> implementation), performance is supposed to be ok.
>>
>>
>> Sure, things are OK for people who disable the feature.

FWIW, it's off by default.

>>
>> But for people who want to enable the feature we really should minimize
>> the cost by avoiding blindly converting sites which simply don't need
>> it: simple, safe, old, well-tested code.  Why go and slow down such
>> code?  Need to apply some common sense here...

These are the very code paths we'd want to make sure are well
protected since people may never expect them to misbehave when some
"small change" goes in.

> Fair points.
>
>>> It would be lovely to have
>>> some actual numbers nonetheless.
>>
>> Very much so.
>
> May I suggest using mmtests with the following config file:
>
> https://github.com/gormanm/mmtests/blob/7e070a810bc0af92e592e5121d0ea75fada51aeb/configs/config-global-dhp__workload-ipc-scale-short
>
> It will run two of Manfred's ipcscale sem benchmarks.

I'll see if I can figure out how to use this for testing the fast
refcount protection:
https://lkml.org/lkml/2017/7/18/1223

Then we could see:

before conversion
after conversion
with CONFIG_REFCOUNT_FULL
with fast refcount protection

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 1/3] ipc: convert ipc_namespace.count from atomic_t to refcount_t
  2017-07-19 22:58         ` Andrew Morton
@ 2017-07-19 23:11           ` Davidlohr Bueso
  2017-07-19 23:20             ` Kees Cook
  2017-07-20  9:34           ` Ingo Molnar
  1 sibling, 1 reply; 43+ messages in thread
From: Davidlohr Bueso @ 2017-07-19 23:11 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Eric W. Biederman, Elena Reshetova, linux-kernel, peterz, gregkh,
	mingo, adobriyan, serge, arozansk, keescook, Hans Liljestrand,
	David Windsor

On Wed, 19 Jul 2017, Andrew Morton wrote:

>On Wed, 19 Jul 2017 15:54:27 -0700 Davidlohr Bueso <dave@stgolabs.net> wrote:
>
>> On Wed, 19 Jul 2017, Andrew Morton wrote:
>>
>> >I do rather dislike these conversions from the point of view of
>> >performance overhead and general code bloat.  But I seem to have lost
>> >that struggle and I don't think any of these are fastpath(?).
>>
>> Well, since we now have fd25d19 (locking/refcount: Create unchecked atomic_t
>> implementation), performance is supposed to be ok.
>
>Sure, things are OK for people who disable the feature.
>
>But for people who want to enable the feature we really should minimize
>the cost by avoiding blindly converting sites which simply don't need
>it: simple, safe, old, well-tested code.  Why go and slow down such
>code?  Need to apply some common sense here...

Fair points.

>
>> It would be lovely to have
>> some actual numbers nonetheless.
>
>Very much so.

May I suggest using mmtests with the following config file:

https://github.com/gormanm/mmtests/blob/7e070a810bc0af92e592e5121d0ea75fada51aeb/configs/config-global-dhp__workload-ipc-scale-short

It will run two of Manfred's ipcscale sem benchmarks.

Thanks,
Davidlohr

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 1/3] ipc: convert ipc_namespace.count from atomic_t to refcount_t
  2017-07-19 22:54       ` Davidlohr Bueso
@ 2017-07-19 22:58         ` Andrew Morton
  2017-07-19 23:11           ` Davidlohr Bueso
  2017-07-20  9:34           ` Ingo Molnar
  0 siblings, 2 replies; 43+ messages in thread
From: Andrew Morton @ 2017-07-19 22:58 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Eric W. Biederman, Elena Reshetova, linux-kernel, peterz, gregkh,
	mingo, adobriyan, serge, arozansk, keescook, Hans Liljestrand,
	David Windsor

On Wed, 19 Jul 2017 15:54:27 -0700 Davidlohr Bueso <dave@stgolabs.net> wrote:

> On Wed, 19 Jul 2017, Andrew Morton wrote:
> 
> >I do rather dislike these conversions from the point of view of
> >performance overhead and general code bloat.  But I seem to have lost
> >that struggle and I don't think any of these are fastpath(?).
> 
> Well, since we now have fd25d19 (locking/refcount: Create unchecked atomic_t
> implementation), performance is supposed to be ok.

Sure, things are OK for people who disable the feature.

But for people who want to enable the feature we really should minimize
the cost by avoiding blindly converting sites which simply don't need
it: simple, safe, old, well-tested code.  Why go and slow down such
code?  Need to apply some common sense here...

> It would be lovely to have
> some actual numbers nonetheless.

Very much so.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 1/3] ipc: convert ipc_namespace.count from atomic_t to refcount_t
  2017-07-19 22:35     ` Andrew Morton
@ 2017-07-19 22:54       ` Davidlohr Bueso
  2017-07-19 22:58         ` Andrew Morton
  0 siblings, 1 reply; 43+ messages in thread
From: Davidlohr Bueso @ 2017-07-19 22:54 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Eric W. Biederman, Elena Reshetova, linux-kernel, peterz, gregkh,
	mingo, adobriyan, serge, arozansk, keescook, Hans Liljestrand,
	David Windsor

On Wed, 19 Jul 2017, Andrew Morton wrote:

>I do rather dislike these conversions from the point of view of
>performance overhead and general code bloat.  But I seem to have lost
>that struggle and I don't think any of these are fastpath(?).

Well, since we now have fd25d19 (locking/refcount: Create unchecked atomic_t
implementation), performance is supposed to be ok. It would be lovely to have
some actual numbers nonetheless.

Thanks,
Davidlohr

>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 1/3] ipc: convert ipc_namespace.count from atomic_t to refcount_t
  2017-07-09 21:59   ` Eric W. Biederman
  2017-07-10  6:48     ` Reshetova, Elena
@ 2017-07-19 22:35     ` Andrew Morton
  2017-07-19 22:54       ` Davidlohr Bueso
  1 sibling, 1 reply; 43+ messages in thread
From: Andrew Morton @ 2017-07-19 22:35 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Elena Reshetova, linux-kernel, peterz, gregkh, mingo, adobriyan,
	serge, arozansk, dave, keescook, Hans Liljestrand, David Windsor

On Sun, 09 Jul 2017 16:59:55 -0500 ebiederm@xmission.com (Eric W. Biederman) wrote:

> Elena Reshetova <elena.reshetova@intel.com> writes:
> 
> > refcount_t type and corresponding API should be
> > used instead of atomic_t when the variable is used as
> > a reference counter. This allows to avoid accidental
> > refcounter overflows that might lead to use-after-free
> > situations.
> 
> In this patch you can see all of the uses of the count.
> What accidental refcount overflows are possible?

I do rather dislike these conversions from the point of view of
performance overhead and general code bloat.  But I seem to have lost
that struggle and I don't think any of these are fastpath(?).

^ permalink raw reply	[flat|nested] 43+ messages in thread

* RE: [PATCH 1/3] ipc: convert ipc_namespace.count from atomic_t to refcount_t
  2017-07-10 20:32               ` Eric W. Biederman
@ 2017-07-12  9:21                 ` Reshetova, Elena
  0 siblings, 0 replies; 43+ messages in thread
From: Reshetova, Elena @ 2017-07-12  9:21 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-kernel, peterz, gregkh, akpm, mingo, adobriyan, serge,
	arozansk, dave, keescook, Hans Liljestrand, David Windsor


 
> "Reshetova, Elena" <elena.reshetova@intel.com> writes:
> 
> >> "Reshetova, Elena" <elena.reshetova@intel.com> writes:
> >>
> >> >> "Reshetova, Elena" <elena.reshetova@intel.com> writes:
> >> >>
> >> >> 2>> Elena Reshetova <elena.reshetova@intel.com> writes:
> >> >> >>
> >> >> >> > refcount_t type and corresponding API should be
> >> >> >> > used instead of atomic_t when the variable is used as
> >> >> >> > a reference counter. This allows to avoid accidental
> >> >> >> > refcounter overflows that might lead to use-after-free
> >> >> >> > situations.
> >> >> >>
> >> >> >> In this patch you can see all of the uses of the count.
> >> >> >> What accidental refcount overflows are possible?
> >> >> >
> >> >> > Even if one can guarantee and prove that in the current implementation
> >> >> > there are no overflows possible, we can't say that for
> >> >> > sure for any future implementation. Bugs might always happen
> >> >> > unfortunately, but if we convert the refcounter to a safer
> >> >> > type we can be sure that overflows are not possible.
> >> >> >
> >> >> > Does it make sense to you?
> >> >>
> >> >> Not for code that is likely to remain unchanged for a decade no.
> >> >
> >> > Can we really be sure for any kernel code about this? And does it make
> >> > sense to trust our security on a fact like this?
> >>
> >> But refcount_t doesn't fix anything.  At best it changes a bad bug to a
> >> less bad bug.  So now my machine OOMS instead of allows a memory
> >> overwrite.   It still doesn't work.
> >
> > Well, it is a step forward from security standpoint. OOMS is really hard
> > to exploit vs. memory overwrites. Pretty much all exploits need either
> > memory write or memory read, out of memory is really much harder to
> > exploit.
> 
> OOM in production is a denial of service attack.  From a serverity point
> of view an OOM can be considered equivalent to a kernel panic and
> something that requires a box to reboot.

Denial of service is usually (unless the system is designed poorly) not the
highest security concern. It is one thing to get your system down for a
certain period of time vs. telling your customers that all your database
of their data and credit cards info is now public in the Internet. 
So, I would still stand by the statement that it makes the situation better.

> 
> From a long term perspective I expect we will need to change all
> reference counters to a type where that is not saturating but instead
> fails to increment and returns an error.  If we want to keep a system
> functioning in the face of maxing out a reference count that is the only
> way it can truly be done.

I think the work on refcount_t internals will continue. We have a number
of things that need to be improved further. Kees has first set of patches
in work now, and no one expects it to be the last. However, we also do
need to make conversions in the kernel, otherwise we are not moving
anywhere and it is also easier to test implications of changes inside
refcount_t  on as many converted cases as possible when they are merged.

> 
> >> Plus refcount_t does not provide any safety on the architectures where
> >> it is a noop.
> >
> > Not sure I understood this. What do you mean by "noop"?
> > refcount_t is currently architecture independent.
> 
> noop being short for no operation.
> 
> I believe there were some/most archicture implementations that define
> refcount_t to atomic_t out of performance concerns.  I know I saw
> patches fly by to that effect.  On an architecture where refcount_t is
> equivalent to atomic_t the change in these patches is a noop.

This choice isn't architecture-dependent now. It is defaults to atomic_t for all
architectures unless CONFIG_REFCOUNT_FULL is enabled
(then it is arch. independent refcount_t implementation). Next set of
patches (still under review) provides x86 arch. dependent fast
refcount_t implementation. 
> 
> >> >> This looks like a large set of unautomated changes without any real
> >> >> thought put into it.
> >> >
> >> > We are soon into the end of the first year that we started to look into
> >> > refcounter overflow/underflow problem and coming up this far was
> >> > not easy enough (just check all the millions of emails on kernel-hardening
> >> > mailing list). Each refcount_t conversion candidate was first found by
> Coccinelle
> >> > analysis and then manually checked and converted. The story of
> >> > refcount_t API and all discussions go even further.
> >> > So you can't really claim that there is no " thought put into it " :)
> >>
> >> But the conversion of the instance happens without thought and manually.
> >> Which is a good recipe for typos.  Which is what I am saying.
> >>
> >> There have been lots of conversions like that in the kernel and
> >> practically every one has introduced at least one typo.
> >
> > What do you exactly mean by "typo"? Typos should be detected at these
> > stages:
> 
> An unintentional mistake.  The term "thinko" might be more what I am
> thinking of.   Many human mistakes are not slips of the fingers but
> accidentally giving the wrong command at some level.
> 
> > 1) typos like wrong function name etc. can be found at compile time
> >     (trust me I have found a number of these on the very first iteration with
> patches)
> > 2) "typos" (not sure if it is correct to call them typos) like usage of wrong
> refcount_t
> >       API vs. original atomic_t API can be found during internal reviews or reviews
> by maintainers
> 
> This I worry about most as the mental distance from xxx_inc to xxx_dec
> can be very short.

I would say the distance is huge :) Can't think of possible way one would
make such a mistake. 

> 
> > 3) much bigger problem is actually not any typos, but hidden issues that show
> up only
> >     in run-time that detect underflows/overflows or inability to increment from
> zero.
> >  These only are nasty, but given that refcount_t WARNs left and right about
> them,
> >   we can detect them fast.
> 
> Which means I worry about those less.
> 
> > I don't know what is a better recipe for doing API changes like this?
> > Do you have any suggestions?
> 
> I would think a semantic patch targeting a specific lock would be less
> error prone.  I would think that the same semantic patch could be used
> from lock to lock with just a change of the lock that is being targeted.

I was considering in the beginning of writing a full cocci semantic patch, 
but based on how spread the code for refcounters might be (take smth like
socket's refcounter), how many different edge cases we had,
the semantic patch would be so complex (and I
would even not really trust the output personally), needs manual verification
anyway etc. So, at the end we end up with just using .cocci file to find
occurrences, but do conversions manually. And when I say manually, it
didn't mean that I manually typed each function name, the process was
roughly like this:

1. take a look on the reported occurrence. Determine if it should be
converted at the first place
2. If needs conversion, run substitute atomic_* to refcount_* (not this
only changes prefixes)
3. resolve cases that need manual addressing (i.e refcount_* analog
function doesn't simply exists)
4. check the conversion overall if everything makes sense together, 
fix other issues (remove unneeded WARNs, change inc to set for
initializing refcounting etc.)

Note the 4th step has many of these angles.
> 
> I strongly suspect that would reduce the chance of accident when dealing
> with a particular API and being scripted would increase the confidence
> in the changes.

Even if I made a wrong decision with it, I guess now it doesn't matter so much
because we have our conversions now and I am very confident that if there
are mistakes there, they could not be caught by semantic patch. Unless we
write now a script to check the validity of each patch, which is probably
doable, but I don't think I have energy for this left. Any volunteers? :) 
> 
> >> So from an engineering standpoint it is a very valid question to ask
> >> about.  And I find the apparent insistence that you don't make typos
> >> very disturbing.
> >>
> >> >  That almost always results in a typo somewhere
> >> >> that breaks things.
> >> >>
> >> >> So there is no benefit to the code, and a non-zero chance that there
> >> >> will be a typo breaking the code.
> >> >
> >> > The code is very active on issuing WARNs when anything goes wrong.
> >> > Using this feature we have not only found errors in conversions, but
> >> > sometimes errors in code itself. So, any bug would be actually much
> >> > faster visible than using old atomic_t interface.
> >> >
> >> > In addition by default refcount_t equals to atomic, which also gives a
> >> > possibility to make a softer transition and catch all related bugs in couple
> >> > of cycles when enabling CONFIG_REFCOUNT_FULL.
> >>
> >> But if you make a typo and change one operation for another I don't see
> >> how any of that applies.
> >
> > It is hard to make a "typo" to change one operation to another. It is not a
> > one-two char mismatch error.
> 
> Those might be more properly "thinkos" but they do happen.
> 
> > When doing these patches we followed the
> > logic of "less code changes - better" (since less chances of making mistake),
> > so if in some cases functions are changed (like from atomic_sub to
> > refcount_sub_and_test(), or from atomic_inc_not_zero() to atomic_inc() etc.)
> > there was a reason for making it and the change wasn't automatic and without
> > thinking at all. Again, we do have our maintainers also to catch if a change that
> > we did doesn't actually work for them right?
> 
> In general a maintainers job is make certain the appropriate code review
> and due dilligence has happened not nececessarily to perform that code
> review themselves.
> 
> Changing from atomoic_inc_not_zero to a simple refcount_inc really
> troubles me because it makes it harder to switch to something fully
> correct.

I actually meant case like this (just came up recently, so stayed in my head):

static void get_pi_state(struct futex_pi_state *pi_state)
 {
-	WARN_ON_ONCE(!atomic_inc_not_zero(&pi_state->refcount));
+	refcount_inc(&pi_state->refcount);

Just to avoid extra layer of unneeded warning. But even with cases like this
as I said we were super conservative and changing as little as possible unless
It is 100% clear case or people suggest the change during patch review. 

Best Regards,
Elena.


> 
> Eric

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 1/3] ipc: convert ipc_namespace.count from atomic_t to refcount_t
  2017-07-10 12:11             ` Reshetova, Elena
@ 2017-07-10 20:32               ` Eric W. Biederman
  2017-07-12  9:21                 ` Reshetova, Elena
  0 siblings, 1 reply; 43+ messages in thread
From: Eric W. Biederman @ 2017-07-10 20:32 UTC (permalink / raw)
  To: Reshetova, Elena
  Cc: linux-kernel, peterz, gregkh, akpm, mingo, adobriyan, serge,
	arozansk, dave, keescook, Hans Liljestrand, David Windsor

"Reshetova, Elena" <elena.reshetova@intel.com> writes:

>> "Reshetova, Elena" <elena.reshetova@intel.com> writes:
>> 
>> >> "Reshetova, Elena" <elena.reshetova@intel.com> writes:
>> >>
>> >> 2>> Elena Reshetova <elena.reshetova@intel.com> writes:
>> >> >>
>> >> >> > refcount_t type and corresponding API should be
>> >> >> > used instead of atomic_t when the variable is used as
>> >> >> > a reference counter. This allows to avoid accidental
>> >> >> > refcounter overflows that might lead to use-after-free
>> >> >> > situations.
>> >> >>
>> >> >> In this patch you can see all of the uses of the count.
>> >> >> What accidental refcount overflows are possible?
>> >> >
>> >> > Even if one can guarantee and prove that in the current implementation
>> >> > there are no overflows possible, we can't say that for
>> >> > sure for any future implementation. Bugs might always happen
>> >> > unfortunately, but if we convert the refcounter to a safer
>> >> > type we can be sure that overflows are not possible.
>> >> >
>> >> > Does it make sense to you?
>> >>
>> >> Not for code that is likely to remain unchanged for a decade no.
>> >
>> > Can we really be sure for any kernel code about this? And does it make
>> > sense to trust our security on a fact like this?
>> 
>> But refcount_t doesn't fix anything.  At best it changes a bad bug to a
>> less bad bug.  So now my machine OOMS instead of allows a memory
>> overwrite.   It still doesn't work.
>
> Well, it is a step forward from security standpoint. OOMS is really hard
> to exploit vs. memory overwrites. Pretty much all exploits need either
> memory write or memory read, out of memory is really much harder to
> exploit.

OOM in production is a denial of service attack.  From a serverity point
of view an OOM can be considered equivalent to a kernel panic and
something that requires a box to reboot.

>From a long term perspective I expect we will need to change all
reference counters to a type where that is not saturating but instead
fails to increment and returns an error.  If we want to keep a system
functioning in the face of maxing out a reference count that is the only
way it can truly be done.

>> Plus refcount_t does not provide any safety on the architectures where
>> it is a noop.
>
> Not sure I understood this. What do you mean by "noop"? 
> refcount_t is currently architecture independent.

noop being short for no operation.

I believe there were some/most archicture implementations that define
refcount_t to atomic_t out of performance concerns.  I know I saw
patches fly by to that effect.  On an architecture where refcount_t is
equivalent to atomic_t the change in these patches is a noop.

>> >> This looks like a large set of unautomated changes without any real
>> >> thought put into it.
>> >
>> > We are soon into the end of the first year that we started to look into
>> > refcounter overflow/underflow problem and coming up this far was
>> > not easy enough (just check all the millions of emails on kernel-hardening
>> > mailing list). Each refcount_t conversion candidate was first found by Coccinelle
>> > analysis and then manually checked and converted. The story of
>> > refcount_t API and all discussions go even further.
>> > So you can't really claim that there is no " thought put into it " :)
>> 
>> But the conversion of the instance happens without thought and manually.
>> Which is a good recipe for typos.  Which is what I am saying.
>> 
>> There have been lots of conversions like that in the kernel and
>> practically every one has introduced at least one typo.
>
> What do you exactly mean by "typo"? Typos should be detected at these
> stages:

An unintentional mistake.  The term "thinko" might be more what I am
thinking of.   Many human mistakes are not slips of the fingers but
accidentally giving the wrong command at some level.  

> 1) typos like wrong function name etc. can be found at compile time
>     (trust me I have found a number of these on the very first iteration with patches)
> 2) "typos" (not sure if it is correct to call them typos) like usage of wrong refcount_t
>       API vs. original atomic_t API can be found during internal reviews or reviews by maintainers

This I worry about most as the mental distance from xxx_inc to xxx_dec
can be very short.

> 3) much bigger problem is actually not any typos, but hidden issues that show up only
>     in run-time that detect underflows/overflows or inability to increment from zero. 
>  These only are nasty, but given that refcount_t WARNs left and right about them,
>   we can detect them fast.

Which means I worry about those less.

> I don't know what is a better recipe for doing API changes like this?
> Do you have any suggestions?

I would think a semantic patch targeting a specific lock would be less
error prone.  I would think that the same semantic patch could be used
from lock to lock with just a change of the lock that is being targeted.

I strongly suspect that would reduce the chance of accident when dealing
with a particular API and being scripted would increase the confidence
in the changes.

>> So from an engineering standpoint it is a very valid question to ask
>> about.  And I find the apparent insistence that you don't make typos
>> very disturbing.
>> 
>> >  That almost always results in a typo somewhere
>> >> that breaks things.
>> >>
>> >> So there is no benefit to the code, and a non-zero chance that there
>> >> will be a typo breaking the code.
>> >
>> > The code is very active on issuing WARNs when anything goes wrong.
>> > Using this feature we have not only found errors in conversions, but
>> > sometimes errors in code itself. So, any bug would be actually much
>> > faster visible than using old atomic_t interface.
>> >
>> > In addition by default refcount_t equals to atomic, which also gives a
>> > possibility to make a softer transition and catch all related bugs in couple
>> > of cycles when enabling CONFIG_REFCOUNT_FULL.
>> 
>> But if you make a typo and change one operation for another I don't see
>> how any of that applies.
>
> It is hard to make a "typo" to change one operation to another. It is not a
> one-two char mismatch error.

Those might be more properly "thinkos" but they do happen.

> When doing these patches we followed the
> logic of "less code changes - better" (since less chances of making mistake),
> so if in some cases functions are changed (like from atomic_sub to
> refcount_sub_and_test(), or from atomic_inc_not_zero() to atomic_inc() etc.)
> there was a reason for making it and the change wasn't automatic and without
> thinking at all. Again, we do have our maintainers also to catch if a change that
> we did doesn't actually work for them right?

In general a maintainers job is make certain the appropriate code review
and due dilligence has happened not nececessarily to perform that code
review themselves.

Changing from atomoic_inc_not_zero to a simple refcount_inc really
troubles me because it makes it harder to switch to something fully
correct.

Eric

^ permalink raw reply	[flat|nested] 43+ messages in thread

* RE: [PATCH 1/3] ipc: convert ipc_namespace.count from atomic_t to refcount_t
  2017-07-10 11:26           ` Eric W. Biederman
@ 2017-07-10 12:11             ` Reshetova, Elena
  2017-07-10 20:32               ` Eric W. Biederman
  0 siblings, 1 reply; 43+ messages in thread
From: Reshetova, Elena @ 2017-07-10 12:11 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-kernel, peterz, gregkh, akpm, mingo, adobriyan, serge,
	arozansk, dave, keescook, Hans Liljestrand, David Windsor

> "Reshetova, Elena" <elena.reshetova@intel.com> writes:
> 
> >> "Reshetova, Elena" <elena.reshetova@intel.com> writes:
> >>
> >> 2>> Elena Reshetova <elena.reshetova@intel.com> writes:
> >> >>
> >> >> > refcount_t type and corresponding API should be
> >> >> > used instead of atomic_t when the variable is used as
> >> >> > a reference counter. This allows to avoid accidental
> >> >> > refcounter overflows that might lead to use-after-free
> >> >> > situations.
> >> >>
> >> >> In this patch you can see all of the uses of the count.
> >> >> What accidental refcount overflows are possible?
> >> >
> >> > Even if one can guarantee and prove that in the current implementation
> >> > there are no overflows possible, we can't say that for
> >> > sure for any future implementation. Bugs might always happen
> >> > unfortunately, but if we convert the refcounter to a safer
> >> > type we can be sure that overflows are not possible.
> >> >
> >> > Does it make sense to you?
> >>
> >> Not for code that is likely to remain unchanged for a decade no.
> >
> > Can we really be sure for any kernel code about this? And does it make
> > sense to trust our security on a fact like this?
> 
> But refcount_t doesn't fix anything.  At best it changes a bad bug to a
> less bad bug.  So now my machine OOMS instead of allows a memory
> overwrite.   It still doesn't work.

Well, it is a step forward from security standpoint. OOMS is really hard
to exploit vs. memory overwrites. Pretty much all exploits need either
memory write or memory read, out of memory is really much harder to
exploit. 

> 
> Plus refcount_t does not provide any safety on the architectures where
> it is a noop.

Not sure I understood this. What do you mean by "noop"? 
refcount_t is currently architecture independent. 


> 
> >> This looks like a large set of unautomated changes without any real
> >> thought put into it.
> >
> > We are soon into the end of the first year that we started to look into
> > refcounter overflow/underflow problem and coming up this far was
> > not easy enough (just check all the millions of emails on kernel-hardening
> > mailing list). Each refcount_t conversion candidate was first found by Coccinelle
> > analysis and then manually checked and converted. The story of
> > refcount_t API and all discussions go even further.
> > So you can't really claim that there is no " thought put into it " :)
> 
> But the conversion of the instance happens without thought and manually.
> Which is a good recipe for typos.  Which is what I am saying.
> 
> There have been lots of conversions like that in the kernel and
> practically every one has introduced at least one typo.

What do you exactly mean by "typo"? Typos should be detected at these stages:

1) typos like wrong function name etc. can be found at compile time
    (trust me I have found a number of these on the very first iteration with patches)
2) "typos" (not sure if it is correct to call them typos) like usage of wrong refcount_t
      API vs. original atomic_t API can be found during internal reviews or reviews by maintainers
3) much bigger problem is actually not any typos, but hidden issues that show up only
    in run-time that detect underflows/overflows or inability to increment from zero. 
 These only are nasty, but given that refcount_t WARNs left and right about them,
  we can detect them fast. 

I don't know what is a better recipe for doing API changes like this?
Do you have any suggestions? 

> 
> So from an engineering standpoint it is a very valid question to ask
> about.  And I find the apparent insistence that you don't make typos
> very disturbing.
> 
> >  That almost always results in a typo somewhere
> >> that breaks things.
> >>
> >> So there is no benefit to the code, and a non-zero chance that there
> >> will be a typo breaking the code.
> >
> > The code is very active on issuing WARNs when anything goes wrong.
> > Using this feature we have not only found errors in conversions, but
> > sometimes errors in code itself. So, any bug would be actually much
> > faster visible than using old atomic_t interface.
> >
> > In addition by default refcount_t equals to atomic, which also gives a
> > possibility to make a softer transition and catch all related bugs in couple
> > of cycles when enabling CONFIG_REFCOUNT_FULL.
> 
> But if you make a typo and change one operation for another I don't see
> how any of that applies.

It is hard to make a "typo" to change one operation to another. It is not a
one-two char mismatch error. When doing these patches we followed the
logic of "less code changes - better" (since less chances of making mistake),
so if in some cases functions are changed (like from atomic_sub to
refcount_sub_and_test(), or from atomic_inc_not_zero() to atomic_inc() etc.)
there was a reason for making it and the change wasn't automatic and without
thinking at all. Again, we do have our maintainers also to catch if a change that
we did doesn't actually work for them right?

Best Regards,
Elena. 
> 
> And that is what it looks like I we are looking at here.
> 
> Eric

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 1/3] ipc: convert ipc_namespace.count from atomic_t to refcount_t
  2017-07-10  9:56         ` Reshetova, Elena
@ 2017-07-10 11:26           ` Eric W. Biederman
  2017-07-10 12:11             ` Reshetova, Elena
  0 siblings, 1 reply; 43+ messages in thread
From: Eric W. Biederman @ 2017-07-10 11:26 UTC (permalink / raw)
  To: Reshetova, Elena
  Cc: linux-kernel, peterz, gregkh, akpm, mingo, adobriyan, serge,
	arozansk, dave, keescook, Hans Liljestrand, David Windsor

"Reshetova, Elena" <elena.reshetova@intel.com> writes:

>> "Reshetova, Elena" <elena.reshetova@intel.com> writes:
>> 
>> 2>> Elena Reshetova <elena.reshetova@intel.com> writes:
>> >>
>> >> > refcount_t type and corresponding API should be
>> >> > used instead of atomic_t when the variable is used as
>> >> > a reference counter. This allows to avoid accidental
>> >> > refcounter overflows that might lead to use-after-free
>> >> > situations.
>> >>
>> >> In this patch you can see all of the uses of the count.
>> >> What accidental refcount overflows are possible?
>> >
>> > Even if one can guarantee and prove that in the current implementation
>> > there are no overflows possible, we can't say that for
>> > sure for any future implementation. Bugs might always happen
>> > unfortunately, but if we convert the refcounter to a safer
>> > type we can be sure that overflows are not possible.
>> >
>> > Does it make sense to you?
>> 
>> Not for code that is likely to remain unchanged for a decade no.
>
> Can we really be sure for any kernel code about this? And does it make
> sense to trust our security on a fact like this?

But refcount_t doesn't fix anything.  At best it changes a bad bug to a
less bad bug.  So now my machine OOMS instead of allows a memory
overwrite.   It still doesn't work.

Plus refcount_t does not provide any safety on the architectures where
it is a noop.

>> This looks like a large set of unautomated changes without any real
>> thought put into it. 
>
> We are soon into the end of the first year that we started to look into
> refcounter overflow/underflow problem and coming up this far was 
> not easy enough (just check all the millions of emails on kernel-hardening
> mailing list). Each refcount_t conversion candidate was first found by Coccinelle
> analysis and then manually checked and converted. The story of 
> refcount_t API and all discussions go even further. 
> So you can't really claim that there is no " thought put into it " :)

But the conversion of the instance happens without thought and manually.
Which is a good recipe for typos.  Which is what I am saying.

There have been lots of conversions like that in the kernel and
practically every one has introduced at least one typo.

So from an engineering standpoint it is a very valid question to ask
about.  And I find the apparent insistence that you don't make typos
very disturbing.

>  That almost always results in a typo somewhere
>> that breaks things.
>> 
>> So there is no benefit to the code, and a non-zero chance that there
>> will be a typo breaking the code.
>
> The code is very active on issuing WARNs when anything goes wrong. 
> Using this feature we have not only found errors in conversions, but
> sometimes errors in code itself. So, any bug would be actually much
> faster visible than using old atomic_t interface. 
>
> In addition by default refcount_t equals to atomic, which also gives a
> possibility to make a softer transition and catch all related bugs in couple
> of cycles when enabling CONFIG_REFCOUNT_FULL.

But if you make a typo and change one operation for another I don't see
how any of that applies.

And that is what it looks like I we are looking at here.

Eric

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 1/3] ipc: convert ipc_namespace.count from atomic_t to refcount_t
  2017-07-10  9:34         ` Alexey Dobriyan
@ 2017-07-10 11:19           ` Eric W. Biederman
  0 siblings, 0 replies; 43+ messages in thread
From: Eric W. Biederman @ 2017-07-10 11:19 UTC (permalink / raw)
  To: Alexey Dobriyan
  Cc: Reshetova, Elena, linux-kernel, peterz, gregkh, akpm, mingo,
	serge, arozansk, dave, keescook, Hans Liljestrand, David Windsor

Alexey Dobriyan <adobriyan@gmail.com> writes:

> On Mon, Jul 10, 2017 at 11:37 AM, Eric W. Biederman
> <ebiederm@xmission.com> wrote:
>> "Reshetova, Elena" <elena.reshetova@intel.com> writes:
>>
>> 2>> Elena Reshetova <elena.reshetova@intel.com> writes:
>>>>
>>>> > refcount_t type and corresponding API should be
>>>> > used instead of atomic_t when the variable is used as
>>>> > a reference counter. This allows to avoid accidental
>>>> > refcounter overflows that might lead to use-after-free
>>>> > situations.
>>>>
>>>> In this patch you can see all of the uses of the count.
>>>> What accidental refcount overflows are possible?
>>>
>>> Even if one can guarantee and prove that in the current implementation
>>> there are no overflows possible, we can't say that for
>>> sure for any future implementation. Bugs might always happen
>>> unfortunately, but if we convert the refcounter to a safer
>>> type we can be sure that overflows are not possible.
>>>
>>> Does it make sense to you?
>>
>> Not for code that is likely to remain unchanged for a decade no.
>>
>> This looks like a large set of unautomated changes without any real
>> thought put into it.  That almost always results in a typo somewhere
>> that breaks things.
>
> This is nonsense. The wrong code would simply emit a warning
> which are caught very quickly.

That depends on the typo.

Eric

^ permalink raw reply	[flat|nested] 43+ messages in thread

* RE: [PATCH 1/3] ipc: convert ipc_namespace.count from atomic_t to refcount_t
  2017-07-10  8:37       ` Eric W. Biederman
  2017-07-10  9:34         ` Alexey Dobriyan
@ 2017-07-10  9:56         ` Reshetova, Elena
  2017-07-10 11:26           ` Eric W. Biederman
  1 sibling, 1 reply; 43+ messages in thread
From: Reshetova, Elena @ 2017-07-10  9:56 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-kernel, peterz, gregkh, akpm, mingo, adobriyan, serge,
	arozansk, dave, keescook, Hans Liljestrand, David Windsor

> "Reshetova, Elena" <elena.reshetova@intel.com> writes:
> 
> 2>> Elena Reshetova <elena.reshetova@intel.com> writes:
> >>
> >> > refcount_t type and corresponding API should be
> >> > used instead of atomic_t when the variable is used as
> >> > a reference counter. This allows to avoid accidental
> >> > refcounter overflows that might lead to use-after-free
> >> > situations.
> >>
> >> In this patch you can see all of the uses of the count.
> >> What accidental refcount overflows are possible?
> >
> > Even if one can guarantee and prove that in the current implementation
> > there are no overflows possible, we can't say that for
> > sure for any future implementation. Bugs might always happen
> > unfortunately, but if we convert the refcounter to a safer
> > type we can be sure that overflows are not possible.
> >
> > Does it make sense to you?
> 
> Not for code that is likely to remain unchanged for a decade no.

Can we really be sure for any kernel code about this? And does it make
sense to trust our security on a fact like this?

> 
> This looks like a large set of unautomated changes without any real
> thought put into it. 

We are soon into the end of the first year that we started to look into
refcounter overflow/underflow problem and coming up this far was 
not easy enough (just check all the millions of emails on kernel-hardening
mailing list). Each refcount_t conversion candidate was first found by Coccinelle
analysis and then manually checked and converted. The story of 
refcount_t API and all discussions go even further. 
So you can't really claim that there is no " thought put into it " :)


 That almost always results in a typo somewhere
> that breaks things.
> 
> So there is no benefit to the code, and a non-zero chance that there
> will be a typo breaking the code.

The code is very active on issuing WARNs when anything goes wrong. 
Using this feature we have not only found errors in conversions, but
sometimes errors in code itself. So, any bug would be actually much
faster visible than using old atomic_t interface. 

In addition by default refcount_t equals to atomic, which also gives a
possibility to make a softer transition and catch all related bugs in couple
of cycles when enabling CONFIG_REFCOUNT_FULL. 


Best Regards,
Elena.

 
> 
> All to harden the code for an unlikely future when the code is
> updated with a full test cycle and people paying attention.
> 
> Introduce a bug now to avoid a bug in the future.  That seems like a
> very poor engineering trade off.
> 
> Eric

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 1/3] ipc: convert ipc_namespace.count from atomic_t to refcount_t
  2017-07-10  8:37       ` Eric W. Biederman
@ 2017-07-10  9:34         ` Alexey Dobriyan
  2017-07-10 11:19           ` Eric W. Biederman
  2017-07-10  9:56         ` Reshetova, Elena
  1 sibling, 1 reply; 43+ messages in thread
From: Alexey Dobriyan @ 2017-07-10  9:34 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Reshetova, Elena, linux-kernel, peterz, gregkh, akpm, mingo,
	serge, arozansk, dave, keescook, Hans Liljestrand, David Windsor

On Mon, Jul 10, 2017 at 11:37 AM, Eric W. Biederman
<ebiederm@xmission.com> wrote:
> "Reshetova, Elena" <elena.reshetova@intel.com> writes:
>
> 2>> Elena Reshetova <elena.reshetova@intel.com> writes:
>>>
>>> > refcount_t type and corresponding API should be
>>> > used instead of atomic_t when the variable is used as
>>> > a reference counter. This allows to avoid accidental
>>> > refcounter overflows that might lead to use-after-free
>>> > situations.
>>>
>>> In this patch you can see all of the uses of the count.
>>> What accidental refcount overflows are possible?
>>
>> Even if one can guarantee and prove that in the current implementation
>> there are no overflows possible, we can't say that for
>> sure for any future implementation. Bugs might always happen
>> unfortunately, but if we convert the refcounter to a safer
>> type we can be sure that overflows are not possible.
>>
>> Does it make sense to you?
>
> Not for code that is likely to remain unchanged for a decade no.
>
> This looks like a large set of unautomated changes without any real
> thought put into it.  That almost always results in a typo somewhere
> that breaks things.

This is nonsense. The wrong code would simply emit a warning
which are caught very quickly.

> So there is no benefit to the code, and a non-zero chance that there
> will be a typo breaking the code.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 1/3] ipc: convert ipc_namespace.count from atomic_t to refcount_t
  2017-07-10  6:48     ` Reshetova, Elena
@ 2017-07-10  8:37       ` Eric W. Biederman
  2017-07-10  9:34         ` Alexey Dobriyan
  2017-07-10  9:56         ` Reshetova, Elena
  0 siblings, 2 replies; 43+ messages in thread
From: Eric W. Biederman @ 2017-07-10  8:37 UTC (permalink / raw)
  To: Reshetova, Elena
  Cc: linux-kernel, peterz, gregkh, akpm, mingo, adobriyan, serge,
	arozansk, dave, keescook, Hans Liljestrand, David Windsor

"Reshetova, Elena" <elena.reshetova@intel.com> writes:

2>> Elena Reshetova <elena.reshetova@intel.com> writes:
>> 
>> > refcount_t type and corresponding API should be
>> > used instead of atomic_t when the variable is used as
>> > a reference counter. This allows to avoid accidental
>> > refcounter overflows that might lead to use-after-free
>> > situations.
>> 
>> In this patch you can see all of the uses of the count.
>> What accidental refcount overflows are possible?
>
> Even if one can guarantee and prove that in the current implementation
> there are no overflows possible, we can't say that for
> sure for any future implementation. Bugs might always happen
> unfortunately, but if we convert the refcounter to a safer
> type we can be sure that overflows are not possible. 
>
> Does it make sense to you?

Not for code that is likely to remain unchanged for a decade no.

This looks like a large set of unautomated changes without any real
thought put into it.  That almost always results in a typo somewhere
that breaks things.

So there is no benefit to the code, and a non-zero chance that there
will be a typo breaking the code.

All to harden the code for an unlikely future when the code is
updated with a full test cycle and people paying attention.

Introduce a bug now to avoid a bug in the future.  That seems like a
very poor engineering trade off.

Eric

^ permalink raw reply	[flat|nested] 43+ messages in thread

* RE: [PATCH 1/3] ipc: convert ipc_namespace.count from atomic_t to refcount_t
  2017-07-09 21:59   ` Eric W. Biederman
@ 2017-07-10  6:48     ` Reshetova, Elena
  2017-07-10  8:37       ` Eric W. Biederman
  2017-07-19 22:35     ` Andrew Morton
  1 sibling, 1 reply; 43+ messages in thread
From: Reshetova, Elena @ 2017-07-10  6:48 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-kernel, peterz, gregkh, akpm, mingo, adobriyan, serge,
	arozansk, dave, keescook, Hans Liljestrand, David Windsor


> Elena Reshetova <elena.reshetova@intel.com> writes:
> 
> > refcount_t type and corresponding API should be
> > used instead of atomic_t when the variable is used as
> > a reference counter. This allows to avoid accidental
> > refcounter overflows that might lead to use-after-free
> > situations.
> 
> In this patch you can see all of the uses of the count.
> What accidental refcount overflows are possible?

Even if one can guarantee and prove that in the current implementation
there are no overflows possible, we can't say that for
sure for any future implementation. Bugs might always happen
unfortunately, but if we convert the refcounter to a safer
type we can be sure that overflows are not possible. 

Does it make sense to you?

Best Regards,
Elena.

> 
> Eric
> 
> > Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
> > Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
> > Signed-off-by: Kees Cook <keescook@chromium.org>
> > Signed-off-by: David Windsor <dwindsor@gmail.com>
> > ---
> >  include/linux/ipc_namespace.h | 5 +++--
> >  ipc/msgutil.c                 | 2 +-
> >  ipc/namespace.c               | 4 ++--
> >  3 files changed, 6 insertions(+), 5 deletions(-)
> >
> > diff --git a/include/linux/ipc_namespace.h b/include/linux/ipc_namespace.h
> > index 65327ee..e81445c 100644
> > --- a/include/linux/ipc_namespace.h
> > +++ b/include/linux/ipc_namespace.h
> > @@ -7,6 +7,7 @@
> >  #include <linux/notifier.h>
> >  #include <linux/nsproxy.h>
> >  #include <linux/ns_common.h>
> > +#include <linux/refcount.h>
> >
> >  struct user_namespace;
> >
> > @@ -19,7 +20,7 @@ struct ipc_ids {
> >  };
> >
> >  struct ipc_namespace {
> > -	atomic_t	count;
> > +	refcount_t	count;
> >  	struct ipc_ids	ids[3];
> >
> >  	int		sem_ctls[4];
> > @@ -118,7 +119,7 @@ extern struct ipc_namespace *copy_ipcs(unsigned long
> flags,
> >  static inline struct ipc_namespace *get_ipc_ns(struct ipc_namespace *ns)
> >  {
> >  	if (ns)
> > -		atomic_inc(&ns->count);
> > +		refcount_inc(&ns->count);
> >  	return ns;
> >  }
> >
> > diff --git a/ipc/msgutil.c b/ipc/msgutil.c
> > index bf74eaa..8459802 100644
> > --- a/ipc/msgutil.c
> > +++ b/ipc/msgutil.c
> > @@ -29,7 +29,7 @@ DEFINE_SPINLOCK(mq_lock);
> >   * and not CONFIG_IPC_NS.
> >   */
> >  struct ipc_namespace init_ipc_ns = {
> > -	.count		= ATOMIC_INIT(1),
> > +	.count		= REFCOUNT_INIT(1),
> >  	.user_ns = &init_user_ns,
> >  	.ns.inum = PROC_IPC_INIT_INO,
> >  #ifdef CONFIG_IPC_NS
> > diff --git a/ipc/namespace.c b/ipc/namespace.c
> > index b4d80f9..7af6e6b 100644
> > --- a/ipc/namespace.c
> > +++ b/ipc/namespace.c
> > @@ -50,7 +50,7 @@ static struct ipc_namespace *create_ipc_ns(struct
> user_namespace *user_ns,
> >  		goto fail_free;
> >  	ns->ns.ops = &ipcns_operations;
> >
> > -	atomic_set(&ns->count, 1);
> > +	refcount_set(&ns->count, 1);
> >  	ns->user_ns = get_user_ns(user_ns);
> >  	ns->ucounts = ucounts;
> >
> > @@ -144,7 +144,7 @@ static void free_ipc_ns(struct ipc_namespace *ns)
> >   */
> >  void put_ipc_ns(struct ipc_namespace *ns)
> >  {
> > -	if (atomic_dec_and_lock(&ns->count, &mq_lock)) {
> > +	if (refcount_dec_and_lock(&ns->count, &mq_lock)) {
> >  		mq_clear_sbinfo(ns);
> >  		spin_unlock(&mq_lock);
> >  		mq_put_mnt(ns);

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 1/3] ipc: convert ipc_namespace.count from atomic_t to refcount_t
  2017-07-07  8:59 ` [PATCH 1/3] ipc: convert ipc_namespace.count from atomic_t to refcount_t Elena Reshetova
@ 2017-07-09 21:59   ` Eric W. Biederman
  2017-07-10  6:48     ` Reshetova, Elena
  2017-07-19 22:35     ` Andrew Morton
  0 siblings, 2 replies; 43+ messages in thread
From: Eric W. Biederman @ 2017-07-09 21:59 UTC (permalink / raw)
  To: Elena Reshetova
  Cc: linux-kernel, peterz, gregkh, akpm, mingo, adobriyan, serge,
	arozansk, dave, keescook, Hans Liljestrand, David Windsor

Elena Reshetova <elena.reshetova@intel.com> writes:

> refcount_t type and corresponding API should be
> used instead of atomic_t when the variable is used as
> a reference counter. This allows to avoid accidental
> refcounter overflows that might lead to use-after-free
> situations.

In this patch you can see all of the uses of the count.
What accidental refcount overflows are possible?

Eric

> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
> Signed-off-by: Kees Cook <keescook@chromium.org>
> Signed-off-by: David Windsor <dwindsor@gmail.com>
> ---
>  include/linux/ipc_namespace.h | 5 +++--
>  ipc/msgutil.c                 | 2 +-
>  ipc/namespace.c               | 4 ++--
>  3 files changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/ipc_namespace.h b/include/linux/ipc_namespace.h
> index 65327ee..e81445c 100644
> --- a/include/linux/ipc_namespace.h
> +++ b/include/linux/ipc_namespace.h
> @@ -7,6 +7,7 @@
>  #include <linux/notifier.h>
>  #include <linux/nsproxy.h>
>  #include <linux/ns_common.h>
> +#include <linux/refcount.h>
>  
>  struct user_namespace;
>  
> @@ -19,7 +20,7 @@ struct ipc_ids {
>  };
>  
>  struct ipc_namespace {
> -	atomic_t	count;
> +	refcount_t	count;
>  	struct ipc_ids	ids[3];
>  
>  	int		sem_ctls[4];
> @@ -118,7 +119,7 @@ extern struct ipc_namespace *copy_ipcs(unsigned long flags,
>  static inline struct ipc_namespace *get_ipc_ns(struct ipc_namespace *ns)
>  {
>  	if (ns)
> -		atomic_inc(&ns->count);
> +		refcount_inc(&ns->count);
>  	return ns;
>  }
>  
> diff --git a/ipc/msgutil.c b/ipc/msgutil.c
> index bf74eaa..8459802 100644
> --- a/ipc/msgutil.c
> +++ b/ipc/msgutil.c
> @@ -29,7 +29,7 @@ DEFINE_SPINLOCK(mq_lock);
>   * and not CONFIG_IPC_NS.
>   */
>  struct ipc_namespace init_ipc_ns = {
> -	.count		= ATOMIC_INIT(1),
> +	.count		= REFCOUNT_INIT(1),
>  	.user_ns = &init_user_ns,
>  	.ns.inum = PROC_IPC_INIT_INO,
>  #ifdef CONFIG_IPC_NS
> diff --git a/ipc/namespace.c b/ipc/namespace.c
> index b4d80f9..7af6e6b 100644
> --- a/ipc/namespace.c
> +++ b/ipc/namespace.c
> @@ -50,7 +50,7 @@ static struct ipc_namespace *create_ipc_ns(struct user_namespace *user_ns,
>  		goto fail_free;
>  	ns->ns.ops = &ipcns_operations;
>  
> -	atomic_set(&ns->count, 1);
> +	refcount_set(&ns->count, 1);
>  	ns->user_ns = get_user_ns(user_ns);
>  	ns->ucounts = ucounts;
>  
> @@ -144,7 +144,7 @@ static void free_ipc_ns(struct ipc_namespace *ns)
>   */
>  void put_ipc_ns(struct ipc_namespace *ns)
>  {
> -	if (atomic_dec_and_lock(&ns->count, &mq_lock)) {
> +	if (refcount_dec_and_lock(&ns->count, &mq_lock)) {
>  		mq_clear_sbinfo(ns);
>  		spin_unlock(&mq_lock);
>  		mq_put_mnt(ns);

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH 1/3] ipc: convert ipc_namespace.count from atomic_t to refcount_t
  2017-07-07  8:59 [PATCH 0/3] v2 ipc subsystem refcount coversions Elena Reshetova
@ 2017-07-07  8:59 ` Elena Reshetova
  2017-07-09 21:59   ` Eric W. Biederman
  0 siblings, 1 reply; 43+ messages in thread
From: Elena Reshetova @ 2017-07-07  8:59 UTC (permalink / raw)
  To: linux-kernel
  Cc: peterz, gregkh, akpm, ebiederm, mingo, adobriyan, serge,
	arozansk, dave, keescook, Elena Reshetova, Hans Liljestrand,
	David Windsor

refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
---
 include/linux/ipc_namespace.h | 5 +++--
 ipc/msgutil.c                 | 2 +-
 ipc/namespace.c               | 4 ++--
 3 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/include/linux/ipc_namespace.h b/include/linux/ipc_namespace.h
index 65327ee..e81445c 100644
--- a/include/linux/ipc_namespace.h
+++ b/include/linux/ipc_namespace.h
@@ -7,6 +7,7 @@
 #include <linux/notifier.h>
 #include <linux/nsproxy.h>
 #include <linux/ns_common.h>
+#include <linux/refcount.h>
 
 struct user_namespace;
 
@@ -19,7 +20,7 @@ struct ipc_ids {
 };
 
 struct ipc_namespace {
-	atomic_t	count;
+	refcount_t	count;
 	struct ipc_ids	ids[3];
 
 	int		sem_ctls[4];
@@ -118,7 +119,7 @@ extern struct ipc_namespace *copy_ipcs(unsigned long flags,
 static inline struct ipc_namespace *get_ipc_ns(struct ipc_namespace *ns)
 {
 	if (ns)
-		atomic_inc(&ns->count);
+		refcount_inc(&ns->count);
 	return ns;
 }
 
diff --git a/ipc/msgutil.c b/ipc/msgutil.c
index bf74eaa..8459802 100644
--- a/ipc/msgutil.c
+++ b/ipc/msgutil.c
@@ -29,7 +29,7 @@ DEFINE_SPINLOCK(mq_lock);
  * and not CONFIG_IPC_NS.
  */
 struct ipc_namespace init_ipc_ns = {
-	.count		= ATOMIC_INIT(1),
+	.count		= REFCOUNT_INIT(1),
 	.user_ns = &init_user_ns,
 	.ns.inum = PROC_IPC_INIT_INO,
 #ifdef CONFIG_IPC_NS
diff --git a/ipc/namespace.c b/ipc/namespace.c
index b4d80f9..7af6e6b 100644
--- a/ipc/namespace.c
+++ b/ipc/namespace.c
@@ -50,7 +50,7 @@ static struct ipc_namespace *create_ipc_ns(struct user_namespace *user_ns,
 		goto fail_free;
 	ns->ns.ops = &ipcns_operations;
 
-	atomic_set(&ns->count, 1);
+	refcount_set(&ns->count, 1);
 	ns->user_ns = get_user_ns(user_ns);
 	ns->ucounts = ucounts;
 
@@ -144,7 +144,7 @@ static void free_ipc_ns(struct ipc_namespace *ns)
  */
 void put_ipc_ns(struct ipc_namespace *ns)
 {
-	if (atomic_dec_and_lock(&ns->count, &mq_lock)) {
+	if (refcount_dec_and_lock(&ns->count, &mq_lock)) {
 		mq_clear_sbinfo(ns);
 		spin_unlock(&mq_lock);
 		mq_put_mnt(ns);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2017-07-20 15:12 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-20 11:29 [PATCH 0/3] ipc subsystem refcounter conversions Elena Reshetova
2017-02-20 11:29 ` [PATCH 1/3] ipc: convert ipc_namespace.count from atomic_t to refcount_t Elena Reshetova
2017-05-27 19:41   ` Kees Cook
2017-05-28 12:10     ` Manfred Spraul
2017-02-20 11:29 ` [PATCH 2/3] ipc: convert sem_undo_list.refcnt " Elena Reshetova
2017-05-27 19:44   ` Kees Cook
2017-02-20 11:29 ` [PATCH 3/3] ipc: convert ipc_rcu.refcount " Elena Reshetova
2017-05-27 19:47   ` Kees Cook
2017-02-20 11:42 ` [PATCH 0/3] ipc subsystem refcounter conversions Andy Shevchenko
2017-02-20 12:30   ` Reshetova, Elena
2017-02-22 15:41 ` Davidlohr Bueso
2017-03-04  0:23 ` Andrew Morton
2017-03-06  9:51   ` Reshetova, Elena
2017-05-27 19:58   ` Kees Cook
2017-05-29  8:39     ` Christoph Hellwig
2017-05-29  9:11       ` Eric W. Biederman
2017-05-29 10:24         ` Peter Zijlstra
2017-05-29 10:49           ` Eric W. Biederman
2017-05-29 11:30             ` Eric W. Biederman
2017-05-29 11:39               ` Eric W. Biederman
2017-05-29 12:23                 ` Peter Zijlstra
2017-05-29 15:43                   ` Peter Zijlstra
2017-05-29 12:13             ` Peter Zijlstra
2017-07-07  8:59 [PATCH 0/3] v2 ipc subsystem refcount coversions Elena Reshetova
2017-07-07  8:59 ` [PATCH 1/3] ipc: convert ipc_namespace.count from atomic_t to refcount_t Elena Reshetova
2017-07-09 21:59   ` Eric W. Biederman
2017-07-10  6:48     ` Reshetova, Elena
2017-07-10  8:37       ` Eric W. Biederman
2017-07-10  9:34         ` Alexey Dobriyan
2017-07-10 11:19           ` Eric W. Biederman
2017-07-10  9:56         ` Reshetova, Elena
2017-07-10 11:26           ` Eric W. Biederman
2017-07-10 12:11             ` Reshetova, Elena
2017-07-10 20:32               ` Eric W. Biederman
2017-07-12  9:21                 ` Reshetova, Elena
2017-07-19 22:35     ` Andrew Morton
2017-07-19 22:54       ` Davidlohr Bueso
2017-07-19 22:58         ` Andrew Morton
2017-07-19 23:11           ` Davidlohr Bueso
2017-07-19 23:20             ` Kees Cook
2017-07-20  0:32               ` Kees Cook
2017-07-20  9:34           ` Ingo Molnar
2017-07-20 12:34             ` Eric W. Biederman
2017-07-20 15:12               ` Kees Cook

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).