linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] perf: Incorrect use of snprintf results in SEGV
@ 2012-03-07  0:42 Anton Blanchard
  2012-03-07  0:49 ` Peter Seebach
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Anton Blanchard @ 2012-03-07  0:42 UTC (permalink / raw)
  To: acme, paulus, peterz, mingo, dsahern, fweisbec, yanmin_zhang, emunson
  Cc: linux-kernel


I have a workload where perf top scribbles over the stack and we
SEGV. What makes it interesting is that an snprintf is causing this.

The workload is a c++ gem that has method names over 3000 characters
long, but snprintf is designed to avoid overrunning buffers. So what
went wrong?

The problem is we assume snprintf returns the number of characters
written:

    ret += repsep_snprintf(bf + ret, size - ret, "[%c] ", self->level);
...
    ret += repsep_snprintf(bf + ret, size - ret, "%s", self->ms.sym->name);

Unfortunately this is not how snprintf works. snprintf returns the
number of characters that would have been written if there was enough
space. In the above case, if the first snprintf returns a value larger
than size, we pass a negative size into the second snprintf and
happily scribble over the stack. If you have 3000 character c++
methods thats a lot of stack to trample.

This patch fixes repsep_snprintf by clamping the value at size - 1
which is the maximum snprintf can write before adding the NULL
terminator.

I get the sinking feeling that there are a lot of other uses of
snprintf that have this same bug, we should audit them all.

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: stable@kernel.org
---

Index: linux-build/tools/perf/util/sort.c
===================================================================
--- linux-build.orig/tools/perf/util/sort.c	2012-03-07 10:58:57.502318907 +1100
+++ linux-build/tools/perf/util/sort.c	2012-03-07 11:00:58.812423271 +1100
@@ -33,6 +33,9 @@ static int repsep_snprintf(char *bf, siz
 		}
 	}
 	va_end(ap);
+
+	if (n >= (int)size)
+		return size - 1;
 	return n;
 }
 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] perf: Incorrect use of snprintf results in SEGV
  2012-03-07  0:42 [PATCH] perf: Incorrect use of snprintf results in SEGV Anton Blanchard
@ 2012-03-07  0:49 ` Peter Seebach
  2012-03-07  1:09 ` Arnaldo Carvalho de Melo
  2012-03-14 19:59 ` [tip:perf/urgent] perf tools: " tip-bot for Anton Blanchard
  2 siblings, 0 replies; 17+ messages in thread
From: Peter Seebach @ 2012-03-07  0:49 UTC (permalink / raw)
  To: Anton Blanchard
  Cc: acme, paulus, peterz, mingo, dsahern, fweisbec, yanmin_zhang,
	emunson, linux-kernel

On Wed, 7 Mar 2012 11:42:49 +1100
Anton Blanchard <anton@samba.org> wrote:

> This patch fixes repsep_snprintf by clamping the value at size - 1
> which is the maximum snprintf can write before adding the NULL
> terminator.

I would be concerned by this, simply because I at least sometimes
use snprintf-like functions with the understanding that I can check
for overflow by comparing the return value to the size.

... Of course, I think I also make this mistake you describe in other
code, so I'm gonna go look for that.

But simply clamping the value might break code which is relying
on the existing semantics.  (And of course, any snprintf-related
crash or misbehavior is likely to happen only when the planets are
aligned just so...)

Possible alternative:  Check for a provided size value which is
unreasonably large, and if you get one, assume that it's probably
intended to be negative and refuse to write anything.  I don't know
what unreasonably large is, but "large enough that it would have been
negative had it been a signed type" might be a good starting point --
no one should be writing strings that long anyway*.

-s
[*] I am totally ready for someone in twenty years to throw that quote
in my face contemptuously as it shows that I was hopelessly
short-sighted.
-- 
Listen, get this.  Nobody with a good compiler needs to be justified.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] perf: Incorrect use of snprintf results in SEGV
  2012-03-07  0:42 [PATCH] perf: Incorrect use of snprintf results in SEGV Anton Blanchard
  2012-03-07  0:49 ` Peter Seebach
@ 2012-03-07  1:09 ` Arnaldo Carvalho de Melo
  2012-03-07  1:29   ` Peter Seebach
  2012-03-14 19:59 ` [tip:perf/urgent] perf tools: " tip-bot for Anton Blanchard
  2 siblings, 1 reply; 17+ messages in thread
From: Arnaldo Carvalho de Melo @ 2012-03-07  1:09 UTC (permalink / raw)
  To: Anton Blanchard
  Cc: paulus, peterz, mingo, dsahern, fweisbec, yanmin_zhang, emunson,
	linux-kernel

Em Wed, Mar 07, 2012 at 11:42:49AM +1100, Anton Blanchard escreveu:
> 
> I have a workload where perf top scribbles over the stack and we
> SEGV. What makes it interesting is that an snprintf is causing this.
> 
> The workload is a c++ gem that has method names over 3000 characters
> long, but snprintf is designed to avoid overrunning buffers. So what
> went wrong?
> 
> The problem is we assume snprintf returns the number of characters
> written:
> 
>     ret += repsep_snprintf(bf + ret, size - ret, "[%c] ", self->level);
> ...
>     ret += repsep_snprintf(bf + ret, size - ret, "%s", self->ms.sym->name);
> 
> Unfortunately this is not how snprintf works. snprintf returns the
> number of characters that would have been written if there was enough
> space. In the above case, if the first snprintf returns a value larger
> than size, we pass a negative size into the second snprintf and
> happily scribble over the stack. If you have 3000 character c++
> methods thats a lot of stack to trample.
> 
> This patch fixes repsep_snprintf by clamping the value at size - 1
> which is the maximum snprintf can write before adding the NULL
> terminator.
> 
> I get the sinking feeling that there are a lot of other uses of
> snprintf that have this same bug, we should audit them all.

Indeed, I wonder what kind of crack pipe I was smoking when I read the
snprintf man page.

Or what kind of such pipe the people who designed snprintf were using
:-(

> 
> Signed-off-by: Anton Blanchard <anton@samba.org>
> Cc: stable@kernel.org
> ---
> 
> Index: linux-build/tools/perf/util/sort.c
> ===================================================================
> --- linux-build.orig/tools/perf/util/sort.c	2012-03-07 10:58:57.502318907 +1100
> +++ linux-build/tools/perf/util/sort.c	2012-03-07 11:00:58.812423271 +1100
> @@ -33,6 +33,9 @@ static int repsep_snprintf(char *bf, siz
>  		}
>  	}
>  	va_end(ap);
> +
> +	if (n >= (int)size)
> +		return size - 1;
>  	return n;
>  }
>  

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] perf: Incorrect use of snprintf results in SEGV
  2012-03-07  1:09 ` Arnaldo Carvalho de Melo
@ 2012-03-07  1:29   ` Peter Seebach
  2012-03-07 18:44     ` Nick Bowler
  2012-03-07 20:37     ` Ingo Molnar
  0 siblings, 2 replies; 17+ messages in thread
From: Peter Seebach @ 2012-03-07  1:29 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Anton Blanchard, paulus, peterz, mingo, dsahern, fweisbec,
	yanmin_zhang, emunson, linux-kernel

On Tue, 6 Mar 2012 22:09:04 -0300
Arnaldo Carvalho de Melo <acme@redhat.com> wrote:

> Or what kind of such pipe the people who designed snprintf were using
> :-(

I wasn't there for the original 4.4BSD implementation, but I was on the
ISO committee when we adopted it, and I regret to say, while the food
was lovely, the hosting organization didn't offer us any drugs at all.
But I can explain the rationale of the choice.

If snprintf returns the size it needed, and you know the size you gave
it, you have a choice of what to do, and you have all the information
you need to make an informed choice.

If it returns the amount it wrote, or possibly an error indicator (such
as -1) when out of space, you *don't* have the information you need to
make an informed choice, and one possible choice ("reallocate with the
right amount") is not available to you. We had also seen other functions
which made that implementation choice, and consistently, people
disliked them more.

To frame it another way:  Imagine an alternative function, called
slenprintf(), which is just like snprintf except that it returns the
number of bytes written instead of the number it would have liked to
write in the event that the buffer isn't big enough.  And also
vslenprintf(), analogous to vsnprintf().

Now consider what happens if you have one and want the semantics of the
other:

size_t
myslenprintf(char *buffer, size_t len, char *fmt, ...) {
    size_t ret;
    va_list ap;
    va_start(ap, fmt)
    ret = vsnprintf(buffer, len, fmt, ap);
    /* easy to check for this possibility */
    if (ret >= size)
        return size - 1;
    return ret;
}

size_t
mysnprintf(char *buffer, size_t len, char *fmt, ...) {
    size_t ret;
    va_list ap;
    va_start(ap, fmt)
    ret = vslenprintf(buffer, len, fmt, ap);
    /* now what? */
    ...
}

-s
-- 
Listen, get this.  Nobody with a good compiler needs to be justified.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] perf: Incorrect use of snprintf results in SEGV
  2012-03-07  1:29   ` Peter Seebach
@ 2012-03-07 18:44     ` Nick Bowler
  2012-03-07 20:24       ` Peter Seebach
  2012-03-07 20:37     ` Ingo Molnar
  1 sibling, 1 reply; 17+ messages in thread
From: Nick Bowler @ 2012-03-07 18:44 UTC (permalink / raw)
  To: Peter Seebach
  Cc: Arnaldo Carvalho de Melo, Anton Blanchard, paulus, peterz, mingo,
	dsahern, fweisbec, yanmin_zhang, emunson, linux-kernel

On 2012-03-06 19:29 -0600, Peter Seebach wrote:
> To frame it another way:  Imagine an alternative function, called
> slenprintf(), which is just like snprintf except that it returns the
> number of bytes written instead of the number it would have liked to
> write in the event that the buffer isn't big enough.  And also
> vslenprintf(), analogous to vsnprintf().
> 
> Now consider what happens if you have one and want the semantics of the
> other:
[...]
> size_t
> mysnprintf(char *buffer, size_t len, char *fmt, ...) {
>     size_t ret;
>     va_list ap;
>     va_start(ap, fmt)
>     ret = vslenprintf(buffer, len, fmt, ap);
>     /* now what? */

To answer the question, one "solution" here is to run in a loop
allocating larger and larger buffers until ret is strictly less
than len, then (for this function) free the allocated buffer.

There are a couple functions in POSIX that work this way (*cough*
readlink *cough*), and it's *ugly*.

Cheers,
-- 
Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] perf: Incorrect use of snprintf results in SEGV
  2012-03-07 18:44     ` Nick Bowler
@ 2012-03-07 20:24       ` Peter Seebach
  0 siblings, 0 replies; 17+ messages in thread
From: Peter Seebach @ 2012-03-07 20:24 UTC (permalink / raw)
  To: Nick Bowler
  Cc: Arnaldo Carvalho de Melo, Anton Blanchard, paulus, peterz, mingo,
	dsahern, fweisbec, yanmin_zhang, emunson, linux-kernel

On Wed, 7 Mar 2012 13:44:55 -0500
Nick Bowler <nbowler@elliptictech.com> wrote:

> To answer the question, one "solution" here is to run in a loop
> allocating larger and larger buffers until ret is strictly less
> than len, then (for this function) free the allocated buffer.

Strictly speaking, I am obliged to concede that this does, in fact,
result in either success or knowledge that success is impossible in a
finite number of iterations.  However, the number-of-iterations vs.
wasted-space tradeoff is horrible.  I appreciate the use of scare
quotes around the word "solution".  :)

> There are a couple functions in POSIX that work this way (*cough*
> readlink *cough*), and it's *ugly*.

The other thing we looked at, I believe, was Microsoft's sprintf_s(),
which is the "secure" version.  I can't honestly say from reading their
docs whether "ran out of space" is an error (resulting in returning -1)
or whether it returns the number of bytes written.  Either way, it has
that same basic problem.  Also there's strftime(), which has the
brilliant design choice that if it runs out of space, it returns a
value which could in fact be a correct return value for at least some
possible inputs, and the contents of the buffer are indeterminate.  A
true feat of software engineering, that.

-s
-- 
Listen, get this.  Nobody with a good compiler needs to be justified.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] perf: Incorrect use of snprintf results in SEGV
  2012-03-07  1:29   ` Peter Seebach
  2012-03-07 18:44     ` Nick Bowler
@ 2012-03-07 20:37     ` Ingo Molnar
  2012-03-07 20:59       ` Peter Zijlstra
  2012-03-07 21:19       ` Peter Seebach
  1 sibling, 2 replies; 17+ messages in thread
From: Ingo Molnar @ 2012-03-07 20:37 UTC (permalink / raw)
  To: Peter Seebach
  Cc: Arnaldo Carvalho de Melo, Anton Blanchard, paulus, peterz,
	dsahern, fweisbec, yanmin_zhang, emunson, linux-kernel


* Peter Seebach <peter.seebach@windriver.com> wrote:

> On Tue, 6 Mar 2012 22:09:04 -0300
> Arnaldo Carvalho de Melo <acme@redhat.com> wrote:
> 
> > Or what kind of such pipe the people who designed snprintf 
> > were using
> > :-(
> 
> I wasn't there for the original 4.4BSD implementation, but I 
> was on the ISO committee when we adopted it, and I regret to 
> say, while the food was lovely, the hosting organization 
> didn't offer us any drugs at all. But I can explain the 
> rationale of the choice.

;-)

> If snprintf returns the size it needed, and you know the size 
> you gave it, you have a choice of what to do, and you have all 
> the information you need to make an informed choice.
> 
> If it returns the amount it wrote, or possibly an error 
> indicator (such as -1) when out of space, you *don't* have the 
> information you need to make an informed choice, and one 
> possible choice ("reallocate with the right amount") is not 
> available to you. We had also seen other functions which made 
> that implementation choice, and consistently, people disliked 
> them more.

You are missing two important aspects:

1) Dynamic reallocation on snprintf() failure is an utterly rare
   thing - it is used in less than 1% of snprintf() invocations. 
   (Yes, I just checked a couple of codebases.)

   We *DONT* want to make APIs more fragile just to accomodate a
   rare, esoteric usecase! Doing that you are introducing very
   real bugs in very real code. You are hurting the 99% for the
   sake of the 1%, and needlessly so:

2) It's not even true that should some code want to
   dynamically allocate the 'required' number of bytes is not
   available. Some oddball side API could be added for that 1%:

	size_needed = snprintf_size(...);

So this API could have been designed right but it was messed up 
out of concern for an insane 1% case - FAIL.

This is a case study for how insane semantics are created ...

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] perf: Incorrect use of snprintf results in SEGV
  2012-03-07 20:37     ` Ingo Molnar
@ 2012-03-07 20:59       ` Peter Zijlstra
  2012-03-07 21:28         ` Peter Seebach
  2012-03-08  7:34         ` Ingo Molnar
  2012-03-07 21:19       ` Peter Seebach
  1 sibling, 2 replies; 17+ messages in thread
From: Peter Zijlstra @ 2012-03-07 20:59 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Seebach, Arnaldo Carvalho de Melo, Anton Blanchard, paulus,
	dsahern, fweisbec, yanmin_zhang, emunson, linux-kernel

On Wed, 2012-03-07 at 21:37 +0100, Ingo Molnar wrote:
>         size_needed = snprintf_size(...);

This would require 3 passes over the fmt+args, first to find the
allocated size is insufficient, 2nd to compute the size, 3rd to fill
buffer.

Whereas with the current "creative" API only 2 passes are needed.

I can imagine that back in the day of small memory and small CPU this
was deemed important enough.

Anyway, its all moot, this API exists and has been out in the wild for
several decades now, its not like we can actually change it :-)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] perf: Incorrect use of snprintf results in SEGV
  2012-03-07 20:37     ` Ingo Molnar
  2012-03-07 20:59       ` Peter Zijlstra
@ 2012-03-07 21:19       ` Peter Seebach
  2012-03-08  0:58         ` Arnaldo Carvalho de Melo
  2012-03-08  7:48         ` Ingo Molnar
  1 sibling, 2 replies; 17+ messages in thread
From: Peter Seebach @ 2012-03-07 21:19 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Arnaldo Carvalho de Melo, Anton Blanchard, paulus, peterz,
	dsahern, fweisbec, yanmin_zhang, emunson, linux-kernel

On Wed, 7 Mar 2012 21:37:25 +0100
Ingo Molnar <mingo@elte.hu> wrote:

> You are missing two important aspects:
> 
> 1) Dynamic reallocation on snprintf() failure is an utterly rare
>    thing - it is used in less than 1% of snprintf() invocations. 
>    (Yes, I just checked a couple of codebases.)

I would agree that it's very rare.  But then, using the return value at
all isn't especially common in my experience -- the only interesting
part, most of the time, is "we're sure this didn't overrun the buffer".
 
>    We *DONT* want to make APIs more fragile just to accomodate a
>    rare, esoteric usecase!

I would view snprintf as an API which already exists.  If it's the
wrong API, by all means, write a different one -- but I would suggest
not using the same name for it.  If a function is going to be called
snprintf, IMO it should have the semantics of snprintf.  If those are
the wrong semantics (and they may well be), then I would say use a
function which has the right semantics, and isn't named snprintf.
 
> 2) It's not even true that should some code want to
>    dynamically allocate the 'required' number of bytes is not
>    available. Some oddball side API could be added for that 1%:
 
> 	size_needed = snprintf_size(...);

That's where the "can write one in terms of the other" argument comes
into play.

If you have snprintf_needed(), it's easy to write snprintf_written()
in terms of it.  If you have only snprintf_written(), it is unreasonably
ugly and/or inefficient to write snprintf_needed() in terms of it.

> So this API could have been designed right but it was messed up 
> out of concern for an insane 1% case - FAIL.

Well, the thing is.  If there *exists* a reallocation case, those
semantics end up being needed.

I do agree that this is a source of errors in usage (and a quick audit
shows that I have at least one use which falls prey to this, as well as
several which check for it correctly).  In practice, I'd guess that
treating probably-negative sizes as an error would likely resolve
things, although that's also a semantics change -- it's just that it's
a semantics change which only affects single snprintf calls that were
expected to write to half the address space.

-s
-- 
Listen, get this.  Nobody with a good compiler needs to be justified.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] perf: Incorrect use of snprintf results in SEGV
  2012-03-07 20:59       ` Peter Zijlstra
@ 2012-03-07 21:28         ` Peter Seebach
  2012-03-08  7:34         ` Ingo Molnar
  1 sibling, 0 replies; 17+ messages in thread
From: Peter Seebach @ 2012-03-07 21:28 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Anton Blanchard, paulus,
	dsahern, fweisbec, yanmin_zhang, emunson, linux-kernel

On Wed, 7 Mar 2012 21:59:24 +0100
Peter Zijlstra <peterz@infradead.org> wrote:

> On Wed, 2012-03-07 at 21:37 +0100, Ingo Molnar wrote:
> >         size_needed = snprintf_size(...);
> 
> This would require 3 passes over the fmt+args, first to find the
> allocated size is insufficient, 2nd to compute the size, 3rd to fill
> buffer.
> 
> Whereas with the current "creative" API only 2 passes are needed.
> 
> I can imagine that back in the day of small memory and small CPU this
> was deemed important enough.

It occurs to me that I have seen this discussion, or a variant, before:

http://coding.derkeiler.com/Archive/C_CPP/comp.lang.c/2004-11/2332.html

In which people discuss the possible alternative return values at some
length.

-s
-- 
Listen, get this.  Nobody with a good compiler needs to be justified.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] perf: Incorrect use of snprintf results in SEGV
  2012-03-07 21:19       ` Peter Seebach
@ 2012-03-08  0:58         ` Arnaldo Carvalho de Melo
  2012-03-08  7:48         ` Ingo Molnar
  1 sibling, 0 replies; 17+ messages in thread
From: Arnaldo Carvalho de Melo @ 2012-03-08  0:58 UTC (permalink / raw)
  To: Peter Seebach
  Cc: Ingo Molnar, Anton Blanchard, paulus, peterz, dsahern, fweisbec,
	yanmin_zhang, emunson, linux-kernel

Em Wed, Mar 07, 2012 at 03:19:51PM -0600, Peter Seebach escreveu:
> On Wed, 7 Mar 2012 21:37:25 +0100 Ingo Molnar <mingo@elte.hu> wrote:
> >    We *DONT* want to make APIs more fragile just to accomodate a
> >    rare, esoteric usecase!
> 
> I would view snprintf as an API which already exists.  If it's the
> wrong API, by all means, write a different one -- but I would suggest
> not using the same name for it.  If a function is going to be called
> snprintf, IMO it should have the semantics of snprintf.  If those are
> the wrong semantics (and they may well be), then I would say use a
> function which has the right semantics, and isn't named snprintf.

Right, its more a case of: Don't assume people do things you think are
reasonable, read carefully and follow the instructions.

At least it is not as  long as other EULAs, it is much, much shorter,
but managed to be just as non intuitive.

- Arnaldo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] perf: Incorrect use of snprintf results in SEGV
  2012-03-07 20:59       ` Peter Zijlstra
  2012-03-07 21:28         ` Peter Seebach
@ 2012-03-08  7:34         ` Ingo Molnar
  2012-03-08  8:51           ` Peter Seebach
  1 sibling, 1 reply; 17+ messages in thread
From: Ingo Molnar @ 2012-03-08  7:34 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Peter Seebach, Arnaldo Carvalho de Melo, Anton Blanchard, paulus,
	dsahern, fweisbec, yanmin_zhang, emunson, linux-kernel


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Wed, 2012-03-07 at 21:37 +0100, Ingo Molnar wrote:
> >         size_needed = snprintf_size(...);
> 
> This would require 3 passes over the fmt+args, first to find 
> the allocated size is insufficient, 2nd to compute the size, 
> 3rd to fill buffer.

No. The 1% case would use this separate API with its quirky 
return value. snprintf_size() works like today's snprintf, it's 
just *named* clearly to signal that it returns something not 
quite intuitive and results in bugs even in code that *tries* to 
be aware of the corner cases.

> Whereas with the current "creative" API only 2 passes are 
> needed.
> 
> I can imagine that back in the day of small memory and small 
> CPU this was deemed important enough.
> 
> Anyway, its all moot, this API exists and has been out in the 
> wild for several decades now, its not like we can actually 
> change it :-)

Of course it is moot - I am not arguing for a change in the API.

But the self-justification, as outlined in the mail I replied 
to, is brutally wrong, and nowhere in this discussion did I see 
the important notion mentioned that the *common case matters* - 
so maybe reading this will keep others from committing the same 
mistake, with newly introduced APIs.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] perf: Incorrect use of snprintf results in SEGV
  2012-03-07 21:19       ` Peter Seebach
  2012-03-08  0:58         ` Arnaldo Carvalho de Melo
@ 2012-03-08  7:48         ` Ingo Molnar
  2012-03-08  7:52           ` Ingo Molnar
  2012-03-09 19:00           ` Peter Seebach
  1 sibling, 2 replies; 17+ messages in thread
From: Ingo Molnar @ 2012-03-08  7:48 UTC (permalink / raw)
  To: Peter Seebach
  Cc: Arnaldo Carvalho de Melo, Anton Blanchard, paulus, peterz,
	dsahern, fweisbec, yanmin_zhang, emunson, linux-kernel


* Peter Seebach <peter.seebach@windriver.com> wrote:

> On Wed, 7 Mar 2012 21:37:25 +0100
> Ingo Molnar <mingo@elte.hu> wrote:
> 
> > You are missing two important aspects:
> > 
> > 1) Dynamic reallocation on snprintf() failure is an utterly rare
> >    thing - it is used in less than 1% of snprintf() invocations. 
> >    (Yes, I just checked a couple of codebases.)
> 
> I would agree that it's very rare.  But then, using the return 
> value at all isn't especially common in my experience -- the 
> only interesting part, most of the time, is "we're sure this 
> didn't overrun the buffer".

Erm. Doing:

	+= snprintf(...);

is a *very* common pattern within the kernel. It occurs more 
than a thousand times - i.e. about 25% of all snprintf uses 
(~5000 instances) within the kernel does care about the return 
value.

I found only a single case that did a reallocation if the buffer 
did not fit. Lets assume that I missed some and there's 4 
altogether.

I.e. the API usage proportion, within the kernel project, looks 
like this, approximately:

  snprintf() call site that:

   does not care about the return value:          75.0%
   uses the return value as a 'written' count:    24.9%
   wants to dynamically reallocate:                0.1%

> >    We *DONT* want to make APIs more fragile just to accomodate a
> >    rare, esoteric usecase!
> 
> I would view snprintf as an API which already exists.

Changing it is obviously not possible anymore.

I was just countering your justification for it - which is still 
wrong. People might read that and use it to justify newly 
introduced, crappy APIs.

The 0.1% usecase is absolutely not a valid excuse to make an API 
less robust - *especially* when a separate API could serve that 
0.1% case just fine.

When designing APIs it is of utmost importance how average 
developers intuitively *think* it works - not how the designer 
thinks it should work ... Any severe mismatch between the two is 
a serious design FAIL that should not be repeated in new code.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] perf: Incorrect use of snprintf results in SEGV
  2012-03-08  7:48         ` Ingo Molnar
@ 2012-03-08  7:52           ` Ingo Molnar
  2012-03-09 19:00           ` Peter Seebach
  1 sibling, 0 replies; 17+ messages in thread
From: Ingo Molnar @ 2012-03-08  7:52 UTC (permalink / raw)
  To: Peter Seebach
  Cc: Arnaldo Carvalho de Melo, Anton Blanchard, paulus, peterz,
	dsahern, fweisbec, yanmin_zhang, emunson, linux-kernel


* Ingo Molnar <mingo@elte.hu> wrote:

> [...]
> 
> When designing APIs it is of utmost importance how average 
> developers intuitively *think* it works - not how the designer 
> thinks it should work ... Any severe mismatch between the two 
> is a serious design FAIL that should not be repeated in new 
> code.

Btw., I'm not picking on you, in the last 15 years I have added 
my own sad share of brown paperbag API mis-designs to the Linux 
kernel - most [but not all] of which could fortunately be fixed 
within the kernel.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] perf: Incorrect use of snprintf results in SEGV
  2012-03-08  7:34         ` Ingo Molnar
@ 2012-03-08  8:51           ` Peter Seebach
  0 siblings, 0 replies; 17+ messages in thread
From: Peter Seebach @ 2012-03-08  8:51 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Arnaldo Carvalho de Melo, Anton Blanchard,
	paulus, dsahern, fweisbec, yanmin_zhang, emunson, linux-kernel

On Thu, 8 Mar 2012 08:34:54 +0100
Ingo Molnar <mingo@elte.hu> wrote:

> But the self-justification, as outlined in the mail I replied 
> to, is brutally wrong,

Consider, if you will, an interface which has both snprintf_needed()
and snprintf_written().

Under the hood, any reasonably sane implementation of these functions
needs to have at least one function with the semantics of
snprintf_needed.  APIs which omit that create baroque monstrosities on
the (admittedly relatively rare) occasions when people try to get things
right.

Here, I think, the habit of C programmers of not writing trivial
five-line wrappers because "you could just write that, it's only five
lines", rather bites us.  It might have been better to put both of
these designs in the spec, but at the time when the decision was
originally made, it was clear that there was a need for *a*
length-limited version of snprintf, and if you're only going to have
one, there are significant technical advantages to making the one that
makes the other trivial to provide if you'd prefer it.  It may be
easy to misuse snprintf, but it's nearly impossible to use sprintf_s
correctly if you actually care about your data.

> When designing APIs it is of utmost importance how average 
> developers intuitively *think* it works - not how the designer 
> thinks it should work ...

There is perhaps a subtle clash between the Principle of Least
Astonishment and the Spirit of C.  C has often chosen design decisions
that are intuitively appealing to skilled programmers who have heard
the rationale for them, but which may surprise average programmers or
programmers relying purely on intuition.

I usually think the right thing to do in designing an API is to
provide the smallest and cleanest spanning set possible, and trust
users to know whether they need wrapper functions to provide more
congenial semantics.

The snprintf() interface might well have been done differently if
there'd been any way to get data on how it would be used before it was
specified.  At the time, though, I think the only available criterion
was the observation that one specification could be trivially
implemented in terms of the other.  I suspect, though, that even with
that information, the C committee's response would have been to provide
the function which could predictably produce correct output when used
correctly, rather than the function which couldn't.

We did look at the existing practice, and we heard feedback from people
on what they found with it.  Multiple people present had been badly
burned by one or another API that did not provide a way to find out how
much space was needed.  None of us had, at the time, encountered
this problem with the snprintf() specification, or if we had, it seemed
obvious enough that the fix was trivial.  (Possibly even a Simple
Matter of Programming.)

It may be a decision which produced problems in retrospect, but I think
it was a pretty reasonable decision at the time.

... And I say this even noting that I have at least one hunk of code
which I'm pretty sure makes this bad assumption.

-s
-- 
Listen, get this.  Nobody with a good compiler needs to be justified.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] perf: Incorrect use of snprintf results in SEGV
  2012-03-08  7:48         ` Ingo Molnar
  2012-03-08  7:52           ` Ingo Molnar
@ 2012-03-09 19:00           ` Peter Seebach
  1 sibling, 0 replies; 17+ messages in thread
From: Peter Seebach @ 2012-03-09 19:00 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Arnaldo Carvalho de Melo, Anton Blanchard, paulus, peterz,
	dsahern, fweisbec, yanmin_zhang, emunson, linux-kernel

On Thu, 8 Mar 2012 08:48:37 +0100
Ingo Molnar <mingo@elte.hu> wrote:
> Erm. Doing:
> 
> 	+= snprintf(...);
> 
> is a *very* common pattern within the kernel. It occurs more 
> than a thousand times - i.e. about 25% of all snprintf uses 
> (~5000 instances) within the kernel does care about the return 
> value.
> 
> I found only a single case that did a reallocation if the buffer 
> did not fit. Lets assume that I missed some and there's 4 
> altogether.

I came back to this, because I totally thought I saw one of the horse's
legs twitch out of the corner of my eye.

See, after this conversation, I decided to go audit some of my code,
and sure enough, I found a similar pattern -- I found a couple of
places where my code had this idiom, and did not reallocate.

And in both cases, it was *a bug* that the code did not attempt to
reallocate.

This led me to reevaluate my assumptions.  It appears to me that if you
have a buffer of a given size, and for some reason you have more data
than will fit in the buffer, you have four options:

1.  Scribble outside the buffer.
2.  Truncate the data.
3.  Reallocate a larger buffer.
4.  Report a failure.

In userspace code, I think it is probably nearly always an error to
truncate rather than reallocating.  I can't think of a case where
truncating would be better.  Reporting a failure may be reasonable in
some cases.

In the kernel, that's going to be a harder call -- there are probably
circumstances where reallocating is impractical.  But casually browsing
through the many cases where no attempt is made to reallocate, I
frequently find myself thinking "boy, truncating that would sorta suck."

The assumption that code reflects intended use is a completely
reasonable one, but on evaluating the code not for what it does, but
for what it probably ought to do, I find that there are many more cases
where the correct answer should be "get a larger buffer" rather than
"discard data or possibly cut off halfway through a word".

And in userspace, I don't think I've yet found a case where
reallocating isn't the right call.  (The C library was not really
designed with kernel requirements, such as the possibility that
allocation might not be an option, in mind.)

So I think a sample of how snprintf() *is* used is not the right way to
determine the "common" use case to design for.  So far as I can tell,
in nearly all cases you need to know whether snprintf needed more
space, and you probably *want* to know how much more it needed;
otherwise, you are silently producing corrupted data.  And indeed, most
of the APIs I checked that aren't correctly checking whether snprintf
wanted to overflow are, as a result, potentially returning bogus data.

For a specific example, consider mm/mempolicy.c:

        if (flags & MPOL_MODE_FLAGS) {
                if (buffer + maxlen < p + 2)
                        return -ENOSPC;
                *p++ = '=';
                /*
                 * Currently, the only defined flags are mutually
                exclusive */
                if (flags & MPOL_F_STATIC_NODES)
                        p += snprintf(p, buffer + maxlen - p, "static");
	...

This code clearly takes great care to ensure that it returns -ENOSPC
if there isn't enough space, but does not check that the result of the
snprintf call was in range.  Failure to check this is a bug.  Because
the code is full of other careful checks, it won't result in
scribbling, but it could result in returning a buffer which ends with
"=sta" and reporting too high a length for it.

So studying the code has left me convinced that the snprintf()
interface is the right interface, and that the usage errors
you've identified are bugs, not because snprintf() returns the wrong
value, but because they are failures to check for a condition which
needs to be handled.

-s
-- 
Listen, get this.  Nobody with a good compiler needs to be justified.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [tip:perf/urgent] perf tools: Incorrect use of snprintf results in SEGV
  2012-03-07  0:42 [PATCH] perf: Incorrect use of snprintf results in SEGV Anton Blanchard
  2012-03-07  0:49 ` Peter Seebach
  2012-03-07  1:09 ` Arnaldo Carvalho de Melo
@ 2012-03-14 19:59 ` tip-bot for Anton Blanchard
  2 siblings, 0 replies; 17+ messages in thread
From: tip-bot for Anton Blanchard @ 2012-03-14 19:59 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, linux-kernel, paulus, anton, hpa, mingo, yanmin_zhang,
	peterz, emunson, fweisbec, dsahern, tglx, mingo

Commit-ID:  b832796caa1fda8516464a003c8c7cc547bc20c2
Gitweb:     http://git.kernel.org/tip/b832796caa1fda8516464a003c8c7cc547bc20c2
Author:     Anton Blanchard <anton@samba.org>
AuthorDate: Wed, 7 Mar 2012 11:42:49 +1100
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 14 Mar 2012 12:36:19 -0300

perf tools: Incorrect use of snprintf results in SEGV

I have a workload where perf top scribbles over the stack and we SEGV.
What makes it interesting is that an snprintf is causing this.

The workload is a c++ gem that has method names over 3000 characters
long, but snprintf is designed to avoid overrunning buffers. So what
went wrong?

The problem is we assume snprintf returns the number of characters
written:

    ret += repsep_snprintf(bf + ret, size - ret, "[%c] ", self->level);
...
    ret += repsep_snprintf(bf + ret, size - ret, "%s", self->ms.sym->name);

Unfortunately this is not how snprintf works. snprintf returns the
number of characters that would have been written if there was enough
space. In the above case, if the first snprintf returns a value larger
than size, we pass a negative size into the second snprintf and happily
scribble over the stack. If you have 3000 character c++ methods thats a
lot of stack to trample.

This patch fixes repsep_snprintf by clamping the value at size - 1 which
is the maximum snprintf can write before adding the NULL terminator.

I get the sinking feeling that there are a lot of other uses of snprintf
that have this same bug, we should audit them all.

Cc: David Ahern <dsahern@gmail.com>
Cc: Eric B Munson <emunson@mgebm.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Cc: stable@kernel.org
Link: http://lkml.kernel.org/r/20120307114249.44275ca3@kryten
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/sort.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 16da30d..076c9d4 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -33,6 +33,9 @@ static int repsep_snprintf(char *bf, size_t size, const char *fmt, ...)
 		}
 	}
 	va_end(ap);
+
+	if (n >= (int)size)
+		return size - 1;
 	return n;
 }
 

^ permalink raw reply related	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2012-03-14 20:01 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-03-07  0:42 [PATCH] perf: Incorrect use of snprintf results in SEGV Anton Blanchard
2012-03-07  0:49 ` Peter Seebach
2012-03-07  1:09 ` Arnaldo Carvalho de Melo
2012-03-07  1:29   ` Peter Seebach
2012-03-07 18:44     ` Nick Bowler
2012-03-07 20:24       ` Peter Seebach
2012-03-07 20:37     ` Ingo Molnar
2012-03-07 20:59       ` Peter Zijlstra
2012-03-07 21:28         ` Peter Seebach
2012-03-08  7:34         ` Ingo Molnar
2012-03-08  8:51           ` Peter Seebach
2012-03-07 21:19       ` Peter Seebach
2012-03-08  0:58         ` Arnaldo Carvalho de Melo
2012-03-08  7:48         ` Ingo Molnar
2012-03-08  7:52           ` Ingo Molnar
2012-03-09 19:00           ` Peter Seebach
2012-03-14 19:59 ` [tip:perf/urgent] perf tools: " tip-bot for Anton Blanchard

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).