From: Douglas McIlroy <douglas.mcilroy@dartmouth.edu>
To: Alejandro Colomar <alx.manpages@gmail.com>
Cc: linux-man@vger.kernel.org, Alejandro Colomar <alx@kernel.org>,
	Martin Sebor <msebor@redhat.com>,
	"G. Branden Robinson" <g.branden.robinson@gmail.com>,
	Jakub Wilk <jwilk@jwilk.net>
Subject: Re: [PATCH v3 1/1] strcpy.3: Rewrite page to document all string-copying functions
Date: Wed, 14 Dec 2022 11:22:05 -0500
Message-ID: <CAKH6PiUrQzb7vRZxUs0742WnfaLpcUec0QfdJQJ5Di8LqFg+NA@mail.gmail.com>
In-Reply-To: <20221214000341.39846-2-alx@kernel.org>

> a sequence of zero or more non-null characters followed by a null byte

Varying terminology (character vs byte) is poor style in technical writing.

> concatenate

We began fighting this pomposity before v7. There has only been
backsliding since.
"Catenate" is crisper, means the same thing, and concurs with the "cat" command.
I invite you to join the battle for simplicity.

> chain copy

This term is never overtly defined. The definition might be inferred
from "To chain copy functions, they need to return a pointer to the
end", but the problematic grammar of the sentence diverts attention
from its content.
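
For concreteness, here is a minimal sketch of what I take "chain copy"
to mean, assuming a libc that provides stpcpy() (POSIX.1-2008): each
call returns a pointer to the terminating null byte it just wrote, and
the next call starts copying there. The helper name build_greeting() is
hypothetical and is not part of the patch.

```c
#define _POSIX_C_SOURCE 200809L  /* Expose stpcpy() on glibc. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only.
 * Builds "Hello world!" in buf by chaining stpcpy() calls and returns
 * the length of the result.  The caller must supply at least 13 bytes;
 * nothing here truncates.
 */
static size_t
build_greeting(char *buf)
{
    char *p;

    assert(buf != NULL);

    p = buf;
    p = stpcpy(p, "Hello ");  /* p points at the null byte just written. */
    p = stpcpy(p, "world");   /* The next copy overwrites that null byte. */
    p = stpcpy(p, "!");

    return (size_t) (p - buf);
}
```

No call has to search for the end of what is already in the buffer,
which is the difference from a sequence of strcat() calls.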

> strscpy

Doesn't it muddy the waters to include a non-library function in man3?

Doug

On Tue, Dec 13, 2022 at 7:03 PM Alejandro Colomar
<alx.manpages@gmail.com> wrote:
>
> This is an opportunity to use consistent language across the
> documentation for all string-copying functions.
>
> It is also easier to show the similarities and differences between all
> of the functions, so that a reader can use this page to know which
> function is needed for a given task.
>
> Many functions that are inferior to another one, have been marked as
> deprecated, notwithstanding the deprecation status in C libraries or
> any standards.  Alternatives have been given in the same page, with
> reference implementations.
>
> Cc: Martin Sebor <msebor@redhat.com>
> Cc: "G. Branden Robinson" <g.branden.robinson@gmail.com>
> Cc: Douglas McIlroy <douglas.mcilroy@dartmouth.edu>
> Cc: Jakub Wilk <jwilk@jwilk.net>
> Signed-off-by: Alejandro Colomar <alx@kernel.org>
> ---
>  man3/strcpy.3 | 1058 +++++++++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 970 insertions(+), 88 deletions(-)
>
> diff --git a/man3/strcpy.3 b/man3/strcpy.3
> index 74c3180ae..e04a7b149 100644
> --- a/man3/strcpy.3
> +++ b/man3/strcpy.3
> @@ -1,48 +1,767 @@
> -.\" Copyright (C) 1993 David Metcalfe (david@prism.demon.co.uk)
> +.\" Copyright 2022 Alejandro Colomar <alx@kernel.org>
>  .\"
> -.\" SPDX-License-Identifier: Linux-man-pages-copyleft
> -.\"
> -.\" References consulted:
> -.\"     Linux libc source code
> -.\"     Lewine's _POSIX Programmer's Guide_ (O'Reilly & Associates, 1991)
> -.\"     386BSD man pages
> -.\" Modified Sat Jul 24 18:06:49 1993 by Rik Faith (faith@cs.unc.edu)
> -.\" Modified Fri Aug 25 23:17:51 1995 by Andries Brouwer (aeb@cwi.nl)
> -.\" Modified Wed Dec 18 00:47:18 1996 by Andries Brouwer (aeb@cwi.nl)
> -.\" 2007-06-15, Marc Boyer <marc.boyer@enseeiht.fr> + mtk
> -.\"     Improve discussion of strncpy().
> +.\" SPDX-License-Identifier: BSD-3-Clause
>  .\"
>  .TH strcpy 3 (date) "Linux man-pages (unreleased)"
> +.\" ----- NAME :: -----------------------------------------------------/
>  .SH NAME
> -strcpy \- copy a string
> +stpcpy,
> +strcpy, strcat,
> +stpecpy, stpecpyx,
> +strlcpy, strlcat,
> +strscpy,
> +stpncpy,
> +strncpy,
> +ustr2stp,
> +strncat,
> +mempcpy
> +\- copy strings and character sequences
> +.\" ----- LIBRARY :: --------------------------------------------------/
>  .SH LIBRARY
> +.TP
> +.BR stpcpy (3)
> +.TQ
> +.BR strcpy "(3), \c"
> +.BR strcat (3)
> +.TQ
> +.BR stpncpy (3)
> +.TQ
> +.BR strncpy (3)
> +.TQ
> +.BR strncat (3)
> +.TQ
> +.BR mempcpy (3)
>  Standard C library
>  .RI ( libc ", " \-lc )
> +.TP
> +.BR stpecpy "(3), \c"
> +.BR stpecpyx (3)
> +Not provided by any library.
> +.TP
> +.BR strlcpy "(3), \c"
> +.BR strlcat (3)
> +Utility functions from BSD systems
> +.RI ( libbsd ", " \-lbsd )
> +.TP
> +.BR strscpy (3)
> +Not provided by any library.
> +It is a Linux kernel internal function.
> +.\" ----- SYNOPSIS :: -------------------------------------------------/
>  .SH SYNOPSIS
>  .nf
>  .B #include <string.h>
> +.fi
> +.\" ----- SYNOPSIS :: (Null-terminated) strings -----------------------/
> +.SS Strings
> +.nf
> +// Chain-copy a string.
> +.BI "char *stpcpy(char *restrict " dst ", const char *restrict " src );
>  .PP
> -.BI "char *strcpy(char *restrict " dest ", const char *restrict " src );
> +// Copy/concatenate a string.
> +.BI "char *strcpy(char *restrict " dst ", const char *restrict " src );
> +.BI "char *strcat(char *restrict " dst ", const char *restrict " src );
> +.PP
> +// Chain-copy a string with truncation.
> +.BI "char *stpecpy(char *" dst ", char " past_end "[0], \
> +const char *restrict " src );
> +.PP
> +// Chain-copy a string with truncation and SIGSEGV on UB.
> +.BI "char *stpecpyx(char *" dst ", char " past_end "[0], \
> +const char *restrict " src );
> +.PP
> +// Copy/concatenate a string with truncation and SIGSEGV on UB.
> +.BI "size_t strlcpy(char " dst "[restrict ." sz "], \
> +const char *restrict " src ,
> +.BI "               size_t " sz );
> +.BI "size_t strlcat(char " dst "[restrict ." sz "], \
> +const char *restrict " src ,
> +.BI "               size_t " sz );
> +.PP
> +// Copy a string with truncation.
> +.BI "ssize_t strscpy(char " dst "[restrict ." sz "], \
> +const char " src "[restrict ." sz ],
> +.BI "               size_t " sz );
> +.fi
> +.\" ----- SYNOPSIS :: Null-padded character sequences --------/
> +.SS Null-padded character sequences
> +.nf
> +// Zero a fixed-width buffer, and
> +// copy a string with truncation into a character sequence.
> +.BI "char *stpncpy(char " dst "[restrict ." sz "], \
> +const char *restrict " src ,
> +.BI "               size_t " sz );
> +.PP
> +// Zero a fixed-width buffer, and
> +// copy a string with truncation into a character sequence.
> +.BI "char *strncpy(char " dst "[restrict ." sz "], \
> +const char *restrict " src ,
> +.BI "               size_t " sz );
> +.PP
> +// Chain-copy a null-padded character sequence into a string.
> +.BI "char *ustr2stp(char *restrict " dst ", \
> +const char " src "[restrict ." sz ],
> +.BI "               size_t " sz );
> +.PP
> +// Concatenate a null-padded character sequence into a string.
> +.BI "char *strncat(char *restrict " dst ", const char " src "[restrict ." sz ],
> +.BI "               size_t " sz );
> +.fi
> +.\" ----- SYNOPSIS :: Measured character sequences --------------------/
> +.SS Measured character sequences
> +.nf
> +// Chain-copy a measured character sequence.
> +.BI "void *mempcpy(void *restrict " dst ", \
> +const void " src "[restrict ." len ],
> +.BI "               size_t " len );
> +.fi
> +.PP
> +.RS -4
> +Feature Test Macro Requirements for glibc (see
> +.BR feature_test_macros (7)):
> +.RE
> +.PP
> +.BR stpcpy (3),
> +.BR stpncpy (3):
> +.nf
> +    Since glibc 2.10:
> +        _POSIX_C_SOURCE >= 200809L
> +    Before glibc 2.10:
> +        _GNU_SOURCE
> +.fi
> +.PP
> +.BR mempcpy (3):
> +.nf
> +    _GNU_SOURCE
>  .fi
>  .SH DESCRIPTION
> -The
> -.BR strcpy ()
> -function copies the string pointed to by
> -.IR src ,
> -including the terminating null byte (\(aq\e0\(aq),
> -to the buffer pointed to by
> -.IR dest .
> -The strings may not overlap, and the destination string
> -.I dest
> -must be large enough to receive the copy.
> -.I Beware of buffer overruns!
> -(See BUGS.)
> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: -----------------/
> +.SS Terms (and abbreviations)
> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: string (str) ----/
> +.TP
> +.IR "string " ( str )
> +is a sequence of zero or more non-null characters followed by a null byte.
> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: null-padded character seq
> +.TP
> +.IR "character sequence " ( ustr )
> +is a sequence of zero or more non-null characters.
> +A program should never use a character sequence where a string is required.
> +However, with appropriate care,
> +a string can be used in the place of a character sequence.
> +.RS
> +.TP
> +.I null-padded character sequence
> +Character sequences can be contained in fixed-width buffers,
> +which contain padding null bytes after the character sequence,
> +to fill the rest of the buffer
> +without affecting the character sequence;
> +however, those padding null bytes are not part of the character sequence.
> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: measured character sequence
> +.TP
> +.I measured character sequence
> +Character sequence delimited by its length.
> +.RE
> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: length (len) ----/
> +.TP
> +.IR "length " ( len )
> +is the number of non-null characters in a string or character sequence.
> +It is the return value of
> +.I strlen(str)
> +and of
> +.IR "strnlen(ustr, sz)" .
> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: size (sz) -------/
> +.TP
> +.IR "size " ( sz )
> +refers to the entire buffer
> +where the string or character sequence is contained.
> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: end -------------/
> +.TP
> +.I end
> +is the name of a pointer to the terminating null byte of a string,
> +or a pointer to one past the last character of a character sequence.
> +This is the return value of functions that allow chaining.
> +It is equivalent to
> +.IR &str[len] .
> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: past_end --------/
> +.TP
> +.I past_end
> +is the name of a pointer to one past the end of the buffer
> +that contains a string or character sequence.
> +It is equivalent to
> +.IR &str[sz] .
> +It is used as a sentinel value,
> +to be able to truncate strings or character sequences
> +instead of overrunning the containing buffer.
> +.\" ----- DESCRIPTION :: Copy, concatenate, and chain-copy ------------/
> +.SS Copy, concatenate, and chain-copy
> +Originally,
> +there was a distinction between functions that copy and those that concatenate.
> +However, newer functions that copy while allowing chaining
> +cover both use cases with a single API.
> +They are also algorithmically faster,
> +since they don't need to search for the end of the existing string.
> +However, functions that concatenate have a much simpler use,
> +so if performance is not important,
> +it can make sense to use them for improving readability.
> +.PP
> +To chain copy functions,
> +they need to return a pointer to the
> +.IR end .
> +That's a byproduct of the copy operation,
> +so it has no performance costs.
> +Functions that return such a pointer,
> +and thus can be chained,
> +have names of the form
> +.RB * stp *()
> +or
> +.RB * memp *(),
> +since it's also common to name the pointer just
> +.IR p .
> +.PP
> +Chain-copying functions that truncate
> +should accept a pointer to one past the end of the destination buffer,
> +and have names of the form
> +.RB * stpe *().
> +This allows not having to recalculate the remaining size after each call.
> +.\" ----- DESCRIPTION :: Truncate or not? -----------------------------/
> +.SS Truncate or not?
> +The first thing to note is that programmers should be careful with buffers,
> +so they always have the correct size,
> +and truncation is not necessary.
> +.PP
> +In most cases,
> +truncation is not desired,
> +and it is simpler to just do the copy.
> +Simpler code is safer code.
> +Programming against programming mistakes by adding more code
> +just adds more points where mistakes can be made.
> +.PP
> +Nowadays,
> +compilers can detect most programmer errors with features like
> +compiler warnings,
> +static analyzers, and
> +.BR \%_FORTIFY_SOURCE
> +(see
> +.BR ftm (7)).
> +Keeping the code simple
> +helps these overflow-detection features be more precise.
> +.PP
> +When validating user input,
> +however,
> +it makes sense to truncate.
> +Remember to check the return value of such function calls.
> +.PP
> +Functions that truncate:
> +.IP \(bu 3
> +.BR stpecpy (3)
> +is the most efficient string copy function that performs truncation.
> +It requires checking for truncation only once, after all chained calls.
> +.IP \(bu
> +.BR stpecpyx (3)
> +is a variant of
> +.BR stpecpy (3)
> +that consumes the entire source string,
> +to catch bugs in the program
> +by forcing a segmentation fault (as
> +.BR strlcpy (3bsd)
> +and
> +.BR strlcat (3bsd)
> +do).
> +.IP \(bu
> +.BR strlcpy (3bsd)
> +and
> +.BR strlcat (3bsd)
> +are designed to crash if the input string is invalid
> +(doesn't contain a terminating null byte).
> +.IP \(bu
> +.BR strscpy (3)
> +reports an error instead of crashing (similar to
> +.BR stpecpy (3)).
> +.IP \(bu
> +.BR stpncpy (3)
> +and
> +.BR strncpy (3)
> +also truncate, but they don't write strings,
> +but rather null-padded character sequences.
> +.\" ----- DESCRIPTION :: Null-padded character sequences --------------/
> +.SS Null-padded character sequences
> +For historic reasons,
> +some standard APIs,
> +such as
> +.BR utmpx (5),
> +use null-padded character sequences in fixed-width buffers.
> +To interface with them,
> +specialized functions need to be used.
> +.PP
> +To copy strings into them, use
> +.BR stpncpy (3).
> +.PP
> +To copy from an unterminated string within a fixed-width buffer into a string,
> +ignoring any trailing null bytes in the source fixed-width buffer,
> +you should use
> +.BR ustr2stp (3)
> +or
> +.BR strncat (3).
> +.\" ----- DESCRIPTION :: Measured character sequences -----------------/
> +.SS Measured character sequences
> +The simplest character sequence copying function is
> +.BR mempcpy (3).
> +It requires always knowing the length of your character sequences,
> +for which structures can be used.
> +It makes the code much faster,
> +since you always know the length of your character sequences,
> +and can do the minimal copies and length measurements.
> +.BR mempcpy (3)
> +copies character sequences,
> +so you need to explicitly set the terminating null byte if you need a string.
> +.PP
> +The following code can be used to
> +chain-copy from a measured character sequence into a string:
> +.PP
> +.in +4n
> +.EX
> +p = mempcpy(p, foo\->ustr, foo\->len);
> +*p = \(aq\e0\(aq;
> +.EE
> +.in
> +.PP
> +The following code can be used to
> +chain-copy from a measured character sequence into an unterminated string:
> +.PP
> +.in +4n
> +.EX
> +p = mempcpy(p, bar\->ustr, bar\->len);
> +.EE
> +.in
> +.PP
> +In programs that make considerable use of strings or character sequences,
> +and need the best performance,
> +using overlapping character sequences can make a big difference.
> +It allows holding subsequences of a larger character sequence,
> +while not duplicating memory
> +nor using time to do a copy.
> +.PP
> +However, this is delicate,
> +since it requires using character sequences.
> +C library APIs use strings,
> +so programs that use character sequences
> +will have to take care of differentiating strings from character sequences.
> +.\" ----- DESCRIPTION :: String vs character sequence -----------------/
> +.SS String vs character sequence
> +Some functions only operate on strings.
> +Those require that the input
> +.I src
> +is a string,
> +and guarantee an output string
> +(even when truncation occurs).
> +Functions that concatenate
> +also require that
> +.I dst
> +holds a string before the call.
> +List of functions:
> +.IP \(bu 3
> +.PD 0
> +.BR stpcpy (3)
> +.IP \(bu
> +.BR strcpy "(3), \c"
> +.BR strcat (3)
> +.IP \(bu
> +.BR stpecpy "(3), \c"
> +.BR stpecpyx (3)
> +.IP \(bu
> +.BR strlcpy "(3bsd), \c"
> +.BR strlcat (3bsd)
> +.IP \(bu
> +.BR strscpy (3)
> +.PD
> +.PP
> +Other functions require an input string,
> +but create a character sequence as output.
> +These functions have confusing names,
> +and have a long history of misuse.
> +List of functions:
> +.IP \(bu 3
> +.PD 0
> +.BR stpncpy (3)
> +.IP \(bu
> +.BR strncpy (3)
> +.PD
> +.PP
> +Other functions operate on an input character sequence,
> +and create an output string.
> +Functions that concatenate
> +also require that
> +.I dst
> +holds a string before the call.
> +.BR strncat (3)
> +has an even more misleading name than the functions above.
> +List of functions:
> +.IP \(bu 3
> +.PD 0
> +.BR ustr2stp (3)
> +.IP \(bu
> +.BR strncat (3)
> +.PD
> +.PP
> +And the last one,
> +operates on an input character sequence
> +to create an output character sequence.
> +But because it asks for the length,
> +and a string is by nature composed of a character sequence of the same length
> +plus a terminating null byte,
> +a string is also accepted as input.
> +Function:
> +.IP \(bu 3
> +.BR mempcpy (3)
> +.\" ----- DESCRIPTION :: Functions :: ---------------------------------/
> +.SS Functions
> +.\" ----- DESCRIPTION :: Functions :: stpcpy(3) -----------------------/
> +.TP
> +.BR stpcpy (3)
> +This function copies the input string into a destination string.
> +The programmer is responsible for allocating a buffer large enough.
> +It returns a pointer suitable for chaining.
> +.IP
> +An implementation of this function might be:
> +.IP
> +.in +4n
> +.EX
> +char *
> +stpcpy(char *restrict dst, const char *restrict src)
> +{
> +    return (char *) mempcpy(dst, src, strlen(src) + 1) \- 1;
> +}
> +.EE
> +.in
> +.\" ----- DESCRIPTION :: Functions :: strcpy(3), strcat(3) ------------/
> +.TP
> +.BR strcpy (3)
> +.TQ
> +.BR strcat (3)
> +These functions copy the input string into a destination string.
> +The programmer is responsible for allocating a buffer large enough.
> +The return value is useless.
> +.IP
> +.BR stpcpy (3)
> +is a faster alternative to these functions.
> +.IP
> +An implementation of these functions might be:
> +.IP
> +.in +4n
> +.EX
> +char *
> +strcpy(char *restrict dst, const char *restrict src)
> +{
> +    stpcpy(dst, src);
> +    return dst;
> +}
> +
> +char *
> +strcat(char *restrict dst, const char *restrict src)
> +{
> +    stpcpy(dst + strlen(dst), src);
> +    return dst;
> +}
> +.EE
> +.in
> +.\" ----- DESCRIPTION :: Functions :: stpecpy(3), stpecpyx(3) ---------/
> +.TP
> +.BR stpecpy (3)
> +.TQ
> +.BR stpecpyx (3)
> +These functions copy the input string into a destination string.
> +If the destination buffer,
> +limited by a pointer to one past the end of it,
> +isn't large enough to hold the copy,
> +the resulting string is truncated
> +(but it is guaranteed to be null-terminated).
> +They return a pointer suitable for chaining.
> +Truncation needs to be detected only once after the last chained call.
> +.BR stpecpyx (3)
> +has identical semantics to
> +.BR stpecpy (3),
> +except that it forces a SIGSEGV if the
> +.I src
> +pointer is not a string.
> +.IP
> +These functions are not provided by any library,
> +but you can define them with the following reference implementations:
> +.IP
> +.in +4n
> +.EX
> +/* This code is in the public domain. */
> +char *
> +stpecpy(char *dst, char past_end[0],
> +        const char *restrict src)
> +{
> +    char *p;
> +
> +    if (dst == past_end)
> +        return past_end;
> +
> +    p = memccpy(dst, src, \(aq\e0\(aq, past_end \- dst);
> +    if (p != NULL)
> +        return p \- 1;
> +
> +    /* truncation detected */
> +    past_end[\-1] = \(aq\e0\(aq;
> +    return past_end;
> +}
> +
> +/* This code is in the public domain. */
> +char *
> +stpecpyx(char *dst, char past_end[0],
> +         const char *restrict src)
> +{
> +    if (src[strlen(src)] != \(aq\e0\(aq)
> +        raise(SIGSEGV);
> +
> +    return stpecpy(dst, past_end, src);
> +}
> +.EE
> +.in
> +.\" ----- DESCRIPTION :: Functions :: strlcpy(3bsd), strlcat(3bsd) ----/
> +.TP
> +.BR strlcpy (3bsd)
> +.TQ
> +.BR strlcat (3bsd)
> +These functions copy the input string into a destination string.
> +If the destination buffer,
> +limited by its size,
> +isn't large enough to hold the copy,
> +the resulting string is truncated
> +(but it is guaranteed to be null-terminated).
> +They return the length of the total string they tried to create.
> +These functions force a SIGSEGV if the
> +.I src
> +pointer is not a string.
> +.IP
> +.BR stpecpyx (3)
> +is a faster alternative to these functions.
> +.\" ----- DESCRIPTION :: Functions :: strscpy(3) ----------------------/
> +.TP
> +.BR strscpy (3)
> +This function copies the input string into a destination string.
> +If the destination buffer,
> +limited by its size,
> +isn't large enough to hold the copy,
> +the resulting string is truncated
> +(but it is guaranteed to be null-terminated).
> +It returns the length of the destination string, or
> +.B \-E2BIG
> +on truncation.
> +.IP
> +.BR stpecpy (3)
> +is a simpler and faster alternative to this function.
> +.RE
> +.\" ----- DESCRIPTION :: Functions :: stpncpy(3) ----------------------/
> +.TP
> +.BR stpncpy (3)
> +This function copies the input string into
> +a destination null-padded character sequence in a fixed-width buffer.
> +If the destination buffer,
> +limited by its size,
> +isn't large enough to hold the copy,
> +the resulting character sequence is truncated.
> +Since it creates a character sequence,
> +it doesn't need to write a terminating null byte.
> +It returns a pointer suitable for chaining,
> +but it's not ideal for that.
> +Truncation needs to be detected only once after the last chained call.
> +.IP
> +If you're going to use this function in chained calls,
> +it would be useful to develop a similar function
> +that accepts a pointer to one past the end of the buffer instead of a size.
> +.IP
> +An implementation of this function might be:
> +.IP
> +.in +4n
> +.EX
> +char *
> +stpncpy(char *restrict dst, const char *restrict src,
> +        size_t sz)
> +{
> +    char  *p;
> +
> +    bzero(dst, sz);
> +    p = memccpy(dst, src, \(aq\e0\(aq, sz);
> +    if (p == NULL)
> +        return dst + sz;
> +
> +    return p \- 1;
> +}
> +.EE
> +.in
> +.\" ----- DESCRIPTION :: Functions :: ustr2stp(3) ---------------------/
> +.TP
> +.BR ustr2stp (3)
> +This function copies the input character sequence
> +contained in a null-padded fixed-width buffer,
> +into a destination string.
> +The programmer is responsible for allocating a buffer large enough.
> +It returns a pointer suitable for chaining.
> +.IP
> +A truncating version of this function doesn't exist,
> +since the size of the original character sequence is always known,
> +so it wouldn't be very useful.
> +.IP
> +This function is not provided by any library,
> +but you can define it with the following reference implementation:
> +.IP
> +.in +4n
> +.EX
> +/* This code is in the public domain. */
> +char *
> +ustr2stp(char *restrict dst, const char *restrict src,
> +         size_t sz)
> +{
> +    char  *end;
> +
> +    end = mempcpy(dst, src, strnlen(src, sz));
> +    *end = \(aq\e0\(aq;
> +
> +    return end;
> +}
> +.EE
> +.in
> +.\" ----- DESCRIPTION :: Functions :: strncpy(3) ----------------------/
> +.TP
> +.BR strncpy (3)
> +This function is identical to
> +.BR stpncpy (3)
> +except for the useless return value.
> +Due to the return value,
> +with this function it's hard to correctly check for truncation.
> +.IP
> +.BR stpncpy (3)
> +is a simpler alternative to this function.
> +.IP
> +An implementation of this function might be:
> +.IP
> +.in +4n
> +.EX
> +char *
> +strncpy(char *restrict dst, const char *restrict src,
> +        size_t sz)
> +{
> +    stpncpy(dst, src, sz);
> +    return dst;
> +}
> +.EE
> +.in
> +.\" ----- DESCRIPTION :: Functions :: strncat(3) ----------------------/
> +.TP
> +.BR strncat (3)
> +Do not confuse this function with
> +.BR strncpy (3);
> +they are not related at all.
> +.IP
> +This function concatenates the input character sequence
> +contained in a null-padded fixed-width buffer,
> +into a destination string.
> +The programmer is responsible for allocating a buffer large enough.
> +The return value is useless.
> +.IP
> +.BR ustr2stp (3)
> +is a faster alternative to this function.
> +.IP
> +An implementation of this function might be:
> +.IP
> +.in +4n
> +.EX
> +char *
> +strncat(char *restrict dst, const char *restrict src,
> +        size_t sz)
> +{
> +    ustr2stp(dst + strlen(dst), src, sz);
> +    return dst;
> +}
> +.EE
> +.in
> +.\" ----- DESCRIPTION :: Functions :: mempcpy(3) ----------------------/
> +.TP
> +.BR mempcpy (3)
> +This function copies the input character sequence,
> +limited by its length,
> +into a destination character sequence.
> +The programmer is responsible for allocating a buffer large enough.
> +It returns a pointer suitable for chaining.
> +.IP
> +An implementation of this function might be:
> +.IP
> +.in +4n
> +.EX
> +void *
> +mempcpy(void *restrict dst, const void *restrict src,
> +        size_t len)
> +{
> +    return (char *) memcpy(dst, src, len) + len;
> +}
> +.EE
> +.in
> +.\" ----- RETURN VALUE :: ---------------------------------------------/
>  .SH RETURN VALUE
> -The
> -.BR strcpy ()
> -function returns a pointer to
> -the destination string
> -.IR dest .
> +The following functions return
> +a pointer to the terminating null byte in the destination string.
> +.IP \(bu 3
> +.PD 0
> +.BR stpcpy (3)
> +.IP \(bu
> +.BR ustr2stp (3)
> +.PD
> +.PP
> +The following functions return
> +a pointer to the terminating null byte in the destination string,
> +except when truncation occurs;
> +if truncation occurs,
> +they return a pointer to one past the end of the destination buffer
> +.RI ( past_end ).
> +.IP \(bu 3
> +.BR stpecpy (3),
> +.BR stpecpyx (3)
> +.PP
> +The following function returns
> +a pointer to one after the last character
> +in the destination character sequence;
> +if truncation occurs,
> +that pointer is equivalent to
> +a pointer to one past the end of the destination buffer.
> +.IP \(bu 3
> +.BR stpncpy (3)
> +.PP
> +The following function returns
> +a pointer to one after the last character
> +in the destination character sequence.
> +.IP \(bu 3
> +.BR mempcpy (3)
> +.PP
> +The following functions return
> +the length of the total string that they tried to create
> +(as if truncation didn't occur).
> +.IP \(bu 3
> +.BR strlcpy (3bsd),
> +.BR strlcat (3bsd)
> +.PP
> +The following function returns
> +the length of the destination string, or
> +.B \-E2BIG
> +on truncation.
> +.IP \(bu 3
> +.BR strscpy (3)
> +.PP
> +The following functions return the
> +.I dst
> +pointer,
> +which is useless.
> +.IP \(bu 3
> +.PD 0
> +.BR strcpy (3),
> +.BR strcat (3)
> +.IP \(bu
> +.BR strncpy (3)
> +.IP \(bu
> +.BR strncat (3)
> +.PD
> +.\" ----- ATTRIBUTES :: -----------------------------------------------/
>  .SH ATTRIBUTES
>  For an explanation of the terms used in this section, see
>  .BR attributes (7).
> @@ -54,73 +773,236 @@ .SH ATTRIBUTES
>  l l l.
>  Interface      Attribute       Value
>  T{
> -.BR strcpy ()
> +.BR stpcpy (),
> +.BR strcpy (),
> +.BR strcat (),
> +.BR stpecpy (),
> +.BR stpecpyx (),
> +.BR strlcpy (),
> +.BR strlcat (),
> +.BR strscpy (),
> +.BR stpncpy (),
> +.BR strncpy (),
> +.BR ustr2stp (),
> +.BR strncat (),
> +.BR mempcpy ()
>  T}     Thread safety   MT-Safe
>  .TE
>  .hy
>  .ad
>  .sp 1
> +.\" ----- STANDARDS :: ------------------------------------------------/
>  .SH STANDARDS
> -POSIX.1-2001, POSIX.1-2008, C89, C99, SVr4, 4.3BSD.
> -.SH NOTES
> -.SS strlcpy()
> -Some systems (the BSDs, Solaris, and others) provide the following function:
> +.TP
> +.BR strcpy "(3), \c"
> +.BR strcat (3)
> +.TQ
> +.BR strncpy (3)
> +.TQ
> +.BR strncat (3)
> +POSIX.1-2001, POSIX.1-2008, C89, C99, SVr4, 4.3BSD.
> +.TP
> +.BR stpcpy (3)
> +.\" This function was added to POSIX.1-2008.
> +.\" Before that, it was not part of
> +.\" the C or POSIX.1 standards, nor customary on UNIX systems.
> +.\" It first appeared at least as early as 1986,
> +.\" in the Lattice C AmigaDOS compiler,
> +.\" then in the GNU fileutils and GNU textutils in 1989,
> +.\" and in the GNU C library by 1992.
> +.\" It is also present on the BSDs.
> +.TQ
> +.BR stpncpy (3)
> +.\" This function was added to POSIX.1-2008.
> +.\" Before that, it was a GNU extension.
> +.\" It first appeared in glibc 1.07 in 1993.
> +POSIX.1-2008.
> +.TP
> +.BR strlcpy "(3bsd), \c"
> +.BR strlcat (3bsd)
> +Functions originated in OpenBSD and present in some Unix systems.
> +.TP
> +.BR mempcpy (3)
> +This function is a GNU extension.
> +.TP
> +.BR strscpy (3)
> +Linux kernel internal function.
> +.TP
> +.BR stpecpy "(3), \c"
> +.BR stpecpyx (3)
> +.TQ
> +.BR ustr2stp (3)
> +Not defined by any standards nor libraries.
> +.\" ----- CAVEATS :: --------------------------------------------------/
> +.SH CAVEATS
> +Don't mix chain calls to truncating and non-truncating functions.
> +It is conceptually wrong
> +unless you know that the first part of a copy will always fit.
> +Anyway, the performance difference will probably be negligible,
> +so it will probably be more clear if you use consistent semantics:
> +either truncating or non-truncating.
> +Calling a non-truncating function after a truncating one is necessarily wrong.
>  .PP
> +Some of the functions described here are not provided by any library;
> +you should write your own copy if you want to use them.
> +See STANDARDS.
> +.\" ----- BUGS :: -----------------------------------------------------/
> +.SH BUGS
> +All concatenation
> +.RB (* cat ())
> +functions share the same performance problem:
> +.UR https://www.joelonsoftware.com/\:2001/12/11/\:back\-to\-basics/
> +Shlemiel the painter
> +.UE .
> +.\" ----- EXAMPLES :: -------------------------------------------------/
> +.SH EXAMPLES
> +The following are examples of correct use of each of these functions.
> +.\" ----- EXAMPLES :: stpcpy(3) ---------------------------------------/
> +.TP
> +.BR stpcpy (3)
>  .in +4n
>  .EX
> -size_t strlcpy(char *dest, const char *src, size_t size);
> +p = buf;
> +p = stpcpy(p, "Hello ");
> +p = stpcpy(p, "world");
> +p = stpcpy(p, "!");
> +len = p \- buf;
> +puts(buf);
>  .EE
>  .in
> -.PP
> -.\" http://static.usenix.org/event/usenix99/full_papers/millert/millert_html/index.html
> -.\"     "strlcpy and strlcat - consistent, safe, string copy and concatenation"
> -.\"     1999 USENIX Annual Technical Conference
> -This function is similar to
> -.BR strcpy (),
> -but it copies at most
> -.I size\-1
> -bytes to
> -.IR dest ,
> -truncating the string as necessary.
> -It always adds a terminating null byte.
> -This function fixes some of the problems of
> -.BR strcpy ()
> -but the caller must still handle the possibility of data loss if
> -.I size
> -is too small.
> -The return value of the function is the length of
> -.IR src ,
> -which allows truncation to be easily detected:
> -if the return value is greater than or equal to
> -.IR size ,
> -truncation occurred.
> -If loss of data matters, the caller
> -.I must
> -either check the arguments before the call,
> -or test the function return value.
> -.BR strlcpy ()
> -is not present in glibc and is not standardized by POSIX,
> -.\" https://lwn.net/Articles/506530/
> -but is available on Linux via the
> -.I libbsd
> -library.
> -.SH BUGS
> -If the destination string of a
> -.BR strcpy ()
> -is not large enough, then anything might happen.
> -Overflowing fixed-length string buffers is a favorite cracker technique
> -for taking complete control of the machine.
> -Any time a program reads or copies data into a buffer,
> -the program first needs to check that there's enough space.
> -This may be unnecessary if you can show that overflow is impossible,
> -but be careful: programs can get changed over time,
> -in ways that may make the impossible possible.
> +.\" ----- EXAMPLES :: strcpy(3), strcat(3) ----------------------------/
> +.TP
> +.BR strcpy (3)
> +.TQ
> +.BR strcat (3)
> +.in +4n
> +.EX
> +strcpy(buf, "Hello ");
> +strcat(buf, "world");
> +strcat(buf, "!");
> +len = strlen(buf);
> +puts(buf);
> +.EE
> +.in
> +.\" ----- EXAMPLES :: stpecpy(3), stpecpyx(3) -------------------------/
> +.TP
> +.BR stpecpy (3)
> +.TQ
> +.BR stpecpyx (3)
> +.in +4n
> +.EX
> +past_end = buf + sizeof(buf);
> +p = buf;
> +p = stpecpy(p, past_end, "Hello ");
> +p = stpecpy(p, past_end, "world");
> +p = stpecpy(p, past_end, "!");
> +if (p == past_end) {
> +    p\-\-;
> +    goto toolong;
> +}
> +len = p \- buf;
> +puts(buf);
> +.EE
> +.in
> +.\" ----- EXAMPLES :: strlcpy(3bsd), strlcat(3bsd) --------------------/
> +.TP
> +.BR strlcpy (3bsd)
> +.TQ
> +.BR strlcat (3bsd)
> +.in +4n
> +.EX
> +if (strlcpy(buf, "Hello ", sizeof(buf)) >= sizeof(buf))
> +    goto toolong;
> +if (strlcat(buf, "world", sizeof(buf)) >= sizeof(buf))
> +    goto toolong;
> +len = strlcat(buf, "!", sizeof(buf));
> +if (len >= sizeof(buf))
> +    goto toolong;
> +puts(buf);
> +.EE
> +.in
> +.\" ----- EXAMPLES :: strscpy(3) --------------------------------------/
> +.TP
> +.BR strscpy (3)
> +.in +4n
> +.EX
> +len = strscpy(buf, "Hello world!", sizeof(buf));
> +if (len == \-E2BIG)
> +    goto toolong;
> +puts(buf);
> +.EE
> +.in
> +.\" ----- EXAMPLES :: stpncpy(3) --------------------------------------/
> +.TP
> +.BR stpncpy (3)
> +.in +4n
> +.EX
> +past_end = buf + sizeof(buf);
> +end = stpncpy(buf, "Hello world!", sizeof(buf));
> +if (end == past_end)
> +    goto toolong;
> +len = end \- buf;
> +for (size_t i = 0; i < sizeof(buf); i++)
> +    putchar(buf[i]);
> +.EE
> +.in
> +.\" ----- EXAMPLES :: strncpy(3) --------------------------------------/
> +.TP
> +.BR strncpy (3)
> +.in +4n
> +.EX
> +strncpy(buf, "Hello world!", sizeof(buf));
> +if (buf[sizeof(buf) \- 1] != \(aq\e0\(aq)
> +    goto toolong;
> +len = strnlen(buf, sizeof(buf));
> +for (size_t i = 0; i < sizeof(buf); i++)
> +    putchar(buf[i]);
> +.EE
> +.in
> +.\" ----- EXAMPLES :: ustr2stp(3) -------------------------------------/
> +.TP
> +.BR ustr2stp (3)
> +.in +4n
> +.EX
> +p = buf;
> +p = ustr2stp(p, "Hello ", 6);
> +p = ustr2stp(p, "world", 42);  // Padding null bytes ignored.
> +p = ustr2stp(p, "!", 1);
> +len = p \- buf;
> +puts(buf);
> +.EE
> +.in
> +.\" ----- EXAMPLES :: strncat(3) --------------------------------------/
> +.TP
> +.BR strncat (3)
> +.in +4n
> +.EX
> +buf[0] = \(aq\e0\(aq;  // There's no 'cpy' function to this 'cat'.
> +strncat(buf, "Hello ", 6);
> +strncat(buf, "world", 42);  // Padding null bytes ignored.
> +strncat(buf, "!", 1);
> +len = strlen(buf);
> +puts(buf);
> +.EE
> +.in
> +.\" ----- EXAMPLES :: mempcpy(3) --------------------------------------/
> +.TP
> +.BR mempcpy (3)
> +.in +4n
> +.EX
> +p = buf;
> +p = mempcpy(p, "Hello ", 6);
> +p = mempcpy(p, "world", 5);
> +p = mempcpy(p, "!", 1);
> +*p = \(aq\e0\(aq;
> +len = p \- buf;
> +puts(buf);
> +.EE
> +.in
> +.\" ----- SEE ALSO :: -------------------------------------------------/
>  .SH SEE ALSO
> -.BR bcopy (3),
> -.BR memccpy (3),
> +.BR bzero (3),
>  .BR memcpy (3),
> -.BR memmove (3),
> -.BR stpcpy (3),
> -.BR strdup (3),
> -.BR string (3),
> -.BR wcscpy (3)
> +.BR memccpy (3),
> +.BR mempcpy (3),
> +.BR string (3)
> --
> 2.38.1
>
