All of lore.kernel.org
 help / color / mirror / Atom feed
* RFC for a new string-copy function, using mixtures of strlcpy and strscpy
@ 2021-11-08  3:16 Ajay Garg
       [not found] ` <CAHp75VcVZ6dDDm-k=Njo-jDq81bL4BTwrtkkAnm24b23qWKB_g@mail.gmail.com>
  2021-11-08 12:22 ` Greg KH
  0 siblings, 2 replies; 10+ messages in thread
From: Ajay Garg @ 2021-11-08  3:16 UTC (permalink / raw)
  To: andy, Kees Cook, akpm, adobriyan, Nick Desaulniers,
	Linux Kernel Mailing List, linux-hardening

Hello everyone.

I recently came across the functions strlcpy and strscpy, and it seems
something is missing/unclean.

This email is a RFC for another function, by the name "strlscpy".
Following is a small .ko, which demonstrates the differences in
behaviours of strlcpy/strscpy/strlscpy :

######################################################
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/slab.h>
#include <linux/string.h>

char *__MODULE__ = "KERN-ajay-hello-world";

char a[10] = {0};
char b[] = "1234567890123456";

typedef size_t (*func_type_1)(char *dest, const char *src, size_t size);
typedef ssize_t (*func_type_2)(char *dest, const char *src, size_t size);

size_t strlscpy(char *dest, const char *src, size_t size)
{
    size_t len = strlen(src);

    if (size) {
            if(len >= size)
                len = size - 1;

        memcpy(dest, src, len);
        dest[len] = '\0';
    }

    return len;
}

void common(const char *fname, int len)
{
        printk("\n\nUsing [%s] ::  \n\n", fname);
        printk("len = [%d]\n", len);
        printk("a = [%s]\n", a);
}

void run_test_type_1(func_type_1 fn, const char *fname)
{
        common(fname, fn(a, b, sizeof(a)));
}

void run_test_type_2(func_type_2 fn, const char *fname)
{
        common(fname, fn(a, b, sizeof(a)));
}

static int __init hello_init(void)
{
        run_test_type_1(strlcpy, "strlcpy");
        run_test_type_2(strscpy, "strscpy");
        run_test_type_1(strlscpy, "strlscpy");

        return 0;
}

static void __exit hello_exit(void)
{
}

module_init(hello_init);
module_exit(hello_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Ajay Garg <ajaygargnsit@gmail.com>");
MODULE_DESCRIPTION("Hello-World Driver");
######################################################




The output is as follows :

######################################################
Using [strlcpy] ::

len = [16]
a = [123456789]


Using [strscpy] ::

len = [-7]
a = [123456789]


Using [strlscpy] ::

len = [9]
a = [123456789]
######################################################


As is seen, the intended strlscpy function returns the number of bytes
truly copied, whereas the other two functions return values that are
prone to causing overflows/underflows.


== Motivation for the intended new function ==

Currently, strlcpy is being used in the kernel in following ways :

i)
Return-value is not consumed by the client.

ii)
Return-value is consumed by the client, and the overflow-check done on
the return-value.

iii)
Return-value is consumed by the client, but no overflow-check done on
the return-value.


Case i) is error-free, Case ii) is also acceptable.
However, Case iii) is error-prone to overflows (thus defeating the
purpose of using strlcpy at the first place).


Thus, kindly let know your thoughts, on this proposal for the addition
of this new strlscpy function in lib/string.c, so that return-value is
exact and not prone to any overflows/underflows.


Thanks and Regards,
Ajay

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: RFC for a new string-copy function, using mixtures of strlcpy and strscpy
       [not found] ` <CAHp75VcVZ6dDDm-k=Njo-jDq81bL4BTwrtkkAnm24b23qWKB_g@mail.gmail.com>
@ 2021-11-08  7:42   ` Ajay Garg
  2021-11-08  8:33     ` Andy Shevchenko
  0 siblings, 1 reply; 10+ messages in thread
From: Ajay Garg @ 2021-11-08  7:42 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: andy, Kees Cook, akpm, adobriyan, Nick Desaulniers,
	Linux Kernel Mailing List, linux-hardening

>>
>> The output is as follows :
>>
>> ######################################################
>> Using [strlcpy] ::
>>
>> len = [16]
>> a = [123456789]
>>
>>
>> Using [strscpy] ::
>>
>> len = [-7]
>> a = [123456789]
>>
>
> My gosh, this is error code that you must check. We do not need more string copy functions.
>
>

Hmm, having the additional function would make things a lot easier.

For example, in file fs/kernfs/dir.c, there are methods like
"kernfs_name_locked", "kernfs_path_from_node_locked" which simply
consume the return-value without any checks.

All the above functions have a simple motive : copy as much bytes as
possible in the destination buffer, and then consume/return the number
of bytes actually copied (minus the null-terminator byte of course).

If checks are to be put in-place, it would be too much code/churn,
adding if checks all over the place.
If, instead we do a simple replace of "strlcpy" with "strlscpy" at
these places, we would be good to go (while *really* meeting the
requirements of these clients at the same time).


Anyhow, may be we can wait for some more opinions.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: RFC for a new string-copy function, using mixtures of strlcpy and strscpy
  2021-11-08  7:42   ` Ajay Garg
@ 2021-11-08  8:33     ` Andy Shevchenko
  2021-11-08  8:46       ` Ajay Garg
  0 siblings, 1 reply; 10+ messages in thread
From: Andy Shevchenko @ 2021-11-08  8:33 UTC (permalink / raw)
  To: Ajay Garg
  Cc: andy, Kees Cook, akpm, adobriyan, Nick Desaulniers,
	Linux Kernel Mailing List, linux-hardening

On Mon, Nov 8, 2021 at 9:42 AM Ajay Garg <ajaygargnsit@gmail.com> wrote:

...

> >> len = [-7]
> >> a = [123456789]
> >
> > My gosh, this is error code that you must check. We do not need more string copy functions.
>
> Hmm, having the additional function would make things a lot easier.
>
> For example, in file fs/kernfs/dir.c, there are methods like
> "kernfs_name_locked", "kernfs_path_from_node_locked" which simply
> consume the return-value without any checks.
>
> All the above functions have a simple motive : copy as much bytes as
> possible in the destination buffer, and then consume/return the number
> of bytes actually copied (minus the null-terminator byte of course).

Nope. Read the comment WRT strscpy().

> If checks are to be put in-place, it would be too much code/churn,
> adding if checks all over the place.

Yep, that's why in some cases where we know that there can't be
overflow the checks are not present. In some cases it's historically
like this, in some cases checks might be useful and so on. But no, we
do not need more chaos in the string functions.

> If, instead we do a simple replace of "strlcpy" with "strlscpy" at
> these places, we would be good to go (while *really* meeting the
> requirements of these clients at the same time).

-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: RFC for a new string-copy function, using mixtures of strlcpy and strscpy
  2021-11-08  8:33     ` Andy Shevchenko
@ 2021-11-08  8:46       ` Ajay Garg
  0 siblings, 0 replies; 10+ messages in thread
From: Ajay Garg @ 2021-11-08  8:46 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: andy, Kees Cook, akpm, adobriyan, Nick Desaulniers,
	Linux Kernel Mailing List, linux-hardening

On Mon, Nov 8, 2021 at 2:04 PM Andy Shevchenko <andy.shevchenko
> >
> > For example, in file fs/kernfs/dir.c, there are methods like
> > "kernfs_name_locked", "kernfs_path_from_node_locked" which simply
> > consume the return-value without any checks.
> >
> > All the above functions have a simple motive : copy as much bytes as
> > possible in the destination buffer, and then consume/return the number
> > of bytes actually copied (minus the null-terminator byte of course).
>
> Nope. Read the comment WRT strscpy().

Seems there is a confusion.

I meant the functions "kernfs_name_locked" and others in
fs/kernfs/dir.c, that use strlcpy and then simply consume/propogate
the return-value without any checks.


>
> > If checks are to be put in-place, it would be too much code/churn,
> > adding if checks all over the place.
>
> Yep, that's why in some cases where we know that there can't be
> overflow the checks are not present. In some cases it's historically
> like this, in some cases checks might be useful and so on. But no, we
> do not need more chaos in the string functions.
>

If the client knows that overflow cannot be there, it is better to use
the simple vanilla strcpy.
Using strlcpy means that the client believes there might be case when
src-buffer might be bigger.

Again, functions like "kernfs_name_locked" and others in
fs/kernfs/dir.c demonstrate that the client cannot be sure whether the
src-buffer is small enough to be fit into the dest-buffer.


Thanks and Regards,
Ajay

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: RFC for a new string-copy function, using mixtures of strlcpy and strscpy
  2021-11-08  3:16 RFC for a new string-copy function, using mixtures of strlcpy and strscpy Ajay Garg
       [not found] ` <CAHp75VcVZ6dDDm-k=Njo-jDq81bL4BTwrtkkAnm24b23qWKB_g@mail.gmail.com>
@ 2021-11-08 12:22 ` Greg KH
  2021-11-08 12:45   ` Ajay Garg
  1 sibling, 1 reply; 10+ messages in thread
From: Greg KH @ 2021-11-08 12:22 UTC (permalink / raw)
  To: Ajay Garg
  Cc: andy, Kees Cook, akpm, adobriyan, Nick Desaulniers,
	Linux Kernel Mailing List, linux-hardening

On Mon, Nov 08, 2021 at 08:46:01AM +0530, Ajay Garg wrote:
> Hello everyone.
> 
> I recently came across the functions strlcpy and strscpy, and it seems
> something is missing/unclean.
> 
> This email is a RFC for another function, by the name "strlscpy".

Wait, why?  We have recently added new string copy functions to resolve
the type of issues you have pointed out.  The kernel is not yet
converted to use them, so why add yet-another-function that also is not
used?

I think you should work with the functions we have _first_ and help
convert the kernel to use them, before adding another one please.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: RFC for a new string-copy function, using mixtures of strlcpy and strscpy
  2021-11-08 12:22 ` Greg KH
@ 2021-11-08 12:45   ` Ajay Garg
  2021-11-08 13:01     ` Greg KH
  2021-11-08 18:05     ` Kees Cook
  0 siblings, 2 replies; 10+ messages in thread
From: Ajay Garg @ 2021-11-08 12:45 UTC (permalink / raw)
  To: Greg KH
  Cc: andy, Kees Cook, akpm, adobriyan, Nick Desaulniers,
	Linux Kernel Mailing List, linux-hardening

Hi Greg,

Thanks for your time.

>
> Wait, why?  We have recently added new string copy functions to resolve
> the type of issues you have pointed out.  The kernel is not yet
> converted to use them, so why add yet-another-function that also is not
> used?

Greg, would request your couple of minutes more, in suggesting a
concrete way forward, by working through an example as below.


In the file fs/kernfs/dir.c, there is a statement

        return strlcpy(buf, kn->parent ? kn->name : "/", buflen);

Here, there is no information of how long kn->name might be, so there
is a definite chance of overflow happening. Thus, accordingly, strlcpy
is used, to bound copying of upto buflen bytes. Of course, buf
(destination-buffer) is safe from any overflow now.

However, iffff strlen(kn->name) is greater than (buflen - 1), then
strlcpy would return a different value than the number of bytes
actually copied. Since there is no check, this (wrong) return value
will be propagated to the clients down the stack.


Now, in the current situation, following is the remedy :

########################################
        int ret = strlcpy(buf, kn->parent ? kn->name : "/", buflen);
        if(ret >= buflen)
            ret = buflen - 1;

        return ret;
########################################

Lines changed : 4



On the other hand, with the proposed new function strlscpy, the fix
would simply be :

########################################
       return strlscpy(buf, kn->parent ? kn->name : "/", buflen);
########################################

Lines changed : 1


Personally, as a "generic third-party", I would prefer investing my
time which makes lives of everyone - present and future - easier, so
the RFC for the new method is a step in that direction. This new
method would reduce effort from 4-lines-change to 1-line-change.



>
> I think you should work with the functions we have _first_ and help
> convert the kernel to use them, before adding another one please.
>

I tried looking a similar function in lib/string.c, but could not find any.
If there indeed is already a function that behaves like the new
intended method, kindly point me to it, case would be closed
immediately :D


Once again, many thanks for your time.


Thanks and Regards,
Ajay

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: RFC for a new string-copy function, using mixtures of strlcpy and strscpy
  2021-11-08 12:45   ` Ajay Garg
@ 2021-11-08 13:01     ` Greg KH
  2021-11-08 13:19       ` Ajay Garg
  2021-11-08 18:05     ` Kees Cook
  1 sibling, 1 reply; 10+ messages in thread
From: Greg KH @ 2021-11-08 13:01 UTC (permalink / raw)
  To: Ajay Garg
  Cc: andy, Kees Cook, akpm, adobriyan, Nick Desaulniers,
	Linux Kernel Mailing List, linux-hardening

On Mon, Nov 08, 2021 at 06:15:54PM +0530, Ajay Garg wrote:
> Hi Greg,
> 
> Thanks for your time.
> 
> >
> > Wait, why?  We have recently added new string copy functions to resolve
> > the type of issues you have pointed out.  The kernel is not yet
> > converted to use them, so why add yet-another-function that also is not
> > used?
> 
> Greg, would request your couple of minutes more, in suggesting a
> concrete way forward, by working through an example as below.
> 
> 
> In the file fs/kernfs/dir.c, there is a statement
> 
>         return strlcpy(buf, kn->parent ? kn->name : "/", buflen);
> 
> Here, there is no information of how long kn->name might be, so there
> is a definite chance of overflow happening. Thus, accordingly, strlcpy
> is used, to bound copying of upto buflen bytes. Of course, buf
> (destination-buffer) is safe from any overflow now.

We "know" that kn->name will not overflow here based on the callers of
this function, right?  I can not find any caller that passes in a buffer
that would overflow, or am I missing a callpath somewhere?

> However, iffff strlen(kn->name) is greater than (buflen - 1), then
> strlcpy would return a different value than the number of bytes
> actually copied. Since there is no check, this (wrong) return value
> will be propagated to the clients down the stack.

But in the existing kernel, can that happen today?  I can't find any
caller that would be using this in that manner.

And that's what matters.  Not the theoretical callers in the possible
future, but the in-kernel users of the functions today.

If these calls are wrong, then they should be fixed, but adding another
string function to the heap of ones we have should only be done after
considering the ones that we currently have, why they were added, and
why existing code has not yet been converted to use them.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: RFC for a new string-copy function, using mixtures of strlcpy and strscpy
  2021-11-08 13:01     ` Greg KH
@ 2021-11-08 13:19       ` Ajay Garg
  0 siblings, 0 replies; 10+ messages in thread
From: Ajay Garg @ 2021-11-08 13:19 UTC (permalink / raw)
  To: Greg KH
  Cc: andy, Kees Cook, akpm, adobriyan, Nick Desaulniers,
	Linux Kernel Mailing List, linux-hardening

On Mon, Nov 8, 2021 at 6:31 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Mon, Nov 08, 2021 at 06:15:54PM +0530, Ajay Garg wrote:
> > Hi Greg,
> >
> > Thanks for your time.
> >
> > >
> > > Wait, why?  We have recently added new string copy functions to resolve
> > > the type of issues you have pointed out.  The kernel is not yet
> > > converted to use them, so why add yet-another-function that also is not
> > > used?
> >
> > Greg, would request your couple of minutes more, in suggesting a
> > concrete way forward, by working through an example as below.
> >
> >
> > In the file fs/kernfs/dir.c, there is a statement
> >
> >         return strlcpy(buf, kn->parent ? kn->name : "/", buflen);
> >
> > Here, there is no information of how long kn->name might be, so there
> > is a definite chance of overflow happening. Thus, accordingly, strlcpy
> > is used, to bound copying of upto buflen bytes. Of course, buf
> > (destination-buffer) is safe from any overflow now.
>
> We "know" that kn->name will not overflow here based on the callers of
> this function, right?  I can not find any caller that passes in a buffer
> that would overflow, or am I missing a callpath somewhere?
>
> > However, iffff strlen(kn->name) is greater than (buflen - 1), then
> > strlcpy would return a different value than the number of bytes
> > actually copied. Since there is no check, this (wrong) return value
> > will be propagated to the clients down the stack.
>
> But in the existing kernel, can that happen today?  I can't find any
> caller that would be using this in that manner.
>
> And that's what matters.  Not the theoretical callers in the possible
> future, but the in-kernel users of the functions today.
>
> If these calls are wrong, then they should be fixed, but adding another
> string function to the heap of ones we have should only be done after
> considering the ones that we currently have, why they were added, and
> why existing code has not yet been converted to use them.
>

Got it, thanks Greg !


Thanks and Regards,
Ajay

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: RFC for a new string-copy function, using mixtures of strlcpy and strscpy
  2021-11-08 12:45   ` Ajay Garg
  2021-11-08 13:01     ` Greg KH
@ 2021-11-08 18:05     ` Kees Cook
  2021-11-08 19:20       ` Ajay Garg
  1 sibling, 1 reply; 10+ messages in thread
From: Kees Cook @ 2021-11-08 18:05 UTC (permalink / raw)
  To: Ajay Garg
  Cc: Greg KH, andy, akpm, adobriyan, Nick Desaulniers,
	Linux Kernel Mailing List, linux-hardening

On Mon, Nov 08, 2021 at 06:15:54PM +0530, Ajay Garg wrote:
> Hi Greg,
> 
> Thanks for your time.
> 
> >
> > Wait, why?  We have recently added new string copy functions to resolve
> > the type of issues you have pointed out.  The kernel is not yet
> > converted to use them, so why add yet-another-function that also is not
> > used?
> 
> Greg, would request your couple of minutes more, in suggesting a
> concrete way forward, by working through an example as below.
> 
> 
> In the file fs/kernfs/dir.c, there is a statement
> 
>         return strlcpy(buf, kn->parent ? kn->name : "/", buflen);
> 
> Here, there is no information of how long kn->name might be, so there
> is a definite chance of overflow happening. Thus, accordingly, strlcpy
> is used, to bound copying of upto buflen bytes. Of course, buf
> (destination-buffer) is safe from any overflow now.
> 
> However, iffff strlen(kn->name) is greater than (buflen - 1), then
> strlcpy would return a different value than the number of bytes
> actually copied. Since there is no check, this (wrong) return value
> will be propagated to the clients down the stack.

The propagation issues are the real problem here, IMO. strlcpy returns
strlen(src), which is bound to cause problems if the result in used to
adjust string index, and strscpy returns -E2BIG on overflow, which can
cause different problems if the result in used to adjust string indexes.

strlcpy has the added problem of potentially performing OOB reads when
src is unterminated, so it needs to be entirely removed from the kernel
in favor of strscpy[1]. But yes, as you say, it isn't a drop-in
replacement, but that's because it shouldn't be -- the return values
need to be checked in both cases.

> Now, in the current situation, following is the remedy :
> 
> ########################################
>         int ret = strlcpy(buf, kn->parent ? kn->name : "/", buflen);
>         if(ret >= buflen)
>             ret = buflen - 1;
> 
>         return ret;
> ########################################

We do have other functions that "hide" the overflow (e.g. scnprintf),
but those are usually more common in large string constructions, or
chains of copies. When doing a single string copy, I think it's
important that the caller decide explicitly what to do in the overflow
case.

For the specific fs/kerfs/dir.c case, I don't see any problems --
nothing uses the result (cgroup_name() is the only caller of
kernfs_name() that I see).

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: RFC for a new string-copy function, using mixtures of strlcpy and strscpy
  2021-11-08 18:05     ` Kees Cook
@ 2021-11-08 19:20       ` Ajay Garg
  0 siblings, 0 replies; 10+ messages in thread
From: Ajay Garg @ 2021-11-08 19:20 UTC (permalink / raw)
  To: Kees Cook
  Cc: Greg KH, andy, akpm, adobriyan, Nick Desaulniers,
	Linux Kernel Mailing List, linux-hardening

Thanks Keen for your time.

>
> For the specific fs/kerfs/dir.c case, I don't see any problems --
> nothing uses the result (cgroup_name() is the only caller of
> kernfs_name() that I see).
>

I am not worried about this single case as per say.

My intention is to make the lives easier for clients in general, who
have the simple motive : to copy as many bytes as possible, and then
consume/propogate the return-value containing number of bytes
*actually* copied, without having to resort to the identical
4-lines-per-check-fix everywhere.

I think you and me agree on the pain-points of using strlcpy/strscpy.

The general consensus is that no new string-functions should be added
as of now, so I guess every client would require 4-lines-per-check-fix
as of now (wherever applicable of course).

Maybe, the RFC for new function could be discussed in the next
opportune moment, which would then be a simple drop-in replacement,
resulting in 1-lines-per-check-fix (wherever applicable of course).


Thanks and Regards,
Ajay

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-11-08 19:21 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-08  3:16 RFC for a new string-copy function, using mixtures of strlcpy and strscpy Ajay Garg
     [not found] ` <CAHp75VcVZ6dDDm-k=Njo-jDq81bL4BTwrtkkAnm24b23qWKB_g@mail.gmail.com>
2021-11-08  7:42   ` Ajay Garg
2021-11-08  8:33     ` Andy Shevchenko
2021-11-08  8:46       ` Ajay Garg
2021-11-08 12:22 ` Greg KH
2021-11-08 12:45   ` Ajay Garg
2021-11-08 13:01     ` Greg KH
2021-11-08 13:19       ` Ajay Garg
2021-11-08 18:05     ` Kees Cook
2021-11-08 19:20       ` Ajay Garg

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.