* string_copy(7): New manual page documenting string copying functions.
@ 2022-12-11 23:59 Alejandro Colomar
From: Alejandro Colomar @ 2022-12-11 23:59 UTC
To: linux-man
Hi all!
I'm planning to add a new manual page that documents all string copying
functions. It covers more detail than any of the existing manual pages (and in
fact, I've discovered some properties of the functions documented while working
on this page). The intention is to remove the existing separate manual pages
for all string copying functions, and make them links to this new page. It
intends to be the only reference documentation for copying strings in C, and
hopefully fix the half century of suboptimal string copying library with which
we've lived. (Say goodbye to std::string, here come back C strings ;)
The formatted manual page is below.
Alex
P.S.: I'm sorry for your beloved string copying function(s); it has high chances
of being dreaded by the page below. Not sorry. Oh well, at least I justified
it, or I tried :-)
---
string_copy(7) Miscellaneous Information Manual string_copy(7)
NAME
stpcpy, stpecpy, stpecpyx, strlcpy, strlcat, strscpy, strcpy, strcat,
stpncpy, ustr2stp, strncpy, strncat, mempcpy - copy strings
SYNOPSIS
(Null‐terminated) strings
// Chain‐copy a string.
char *stpcpy(char *restrict dst, const char *restrict src);
// Chain‐copy a string with truncation (not in libc).
char *stpecpy(char *dst, char past_end[0], const char *restrict src);
// Chain‐copy a string with truncation and SIGSEGV on invalid input.
char *stpecpyx(char *dst, char past_end[0], const char *restrict src);
// Copy a string with truncation and SIGSEGV on invalid input.
[[deprecated]] // Use stpecpyx() instead.
size_t strlcpy(char dst[restrict .sz], const char *restrict src,
size_t sz);
// Concatenate a string with truncation.
[[deprecated]] // Use stpecpyx() instead.
size_t strlcat(char dst[restrict .sz], const char *restrict src,
size_t sz);
// Copy a string with truncation (not in libc).
[[deprecated]] // Use stpecpy() instead.
ssize_t strscpy(char dst[restrict .sz], const char src[restrict .sz],
size_t sz);
// Copy a string.
[[deprecated]] // Use stpcpy(3) instead.
char *strcpy(char *restrict dst, const char *restrict src);
// Concatenate a string.
[[deprecated]] // Use stpcpy(3) instead.
char *strcat(char *restrict dst, const char *restrict src);
Unterminated strings (null‐padded fixed‐width buffers)
// Zero a fixed‐width buffer, and
// copy a string with truncation into an unterminated string.
char *stpncpy(char dst[restrict .sz], const char *restrict src,
size_t sz);
// Chain‐copy an unterminated string into a string (not in libc).
char *ustr2stp(char *restrict dst, const char src[restrict .sz],
size_t sz);
// Zero a fixed‐width buffer, and
// copy a string with truncation into an unterminated string
[[deprecated]] // Use stpncpy(3) instead.
char *strncpy(char dest[restrict .sz], const char *restrict src,
size_t sz);
// Concatenate an unterminated string into a string.
[[deprecated]] // Use ustr2stp() instead.
char *strncat(char *restrict dst, const char src[restrict .sz],
size_t sz);
String structures
// (Null‐terminated) string structure.
struct str_s {
size_t len;
char *str;
};
// Unterminated string structure (overlapping strings).
struct ustr_s {
size_t len;
char *ustr;
};
// Chain‐copy a string structure into an unterminated string.
void *mempcpy(void *restrict dst, const void src[restrict len],
size_t len);
DESCRIPTION
Terms (and abbreviations)
string (str)
is a sequence of zero or more non‐null characters, followed by a
null byte.
unterminated string (ustr)
is a sequence of zero or more non‐null characters. They are
sometimes contained in fixed‐width buffers, which usually con‐
tain padding null bytes after the unterminated string, to fill
the rest of the buffer without affecting the unterminated
string; however, those padding null bytes are not part of the
unterminated string.
length (len)
is the number of non‐null characters in a string. It is the re‐
turn value of strlen(str) and of strnlen(ustr, sz).
size (sz)
refers to the entire buffer where the string is contained.
end is the name of a pointer to the terminating null byte of a
string, or a pointer to one past the last character of an unter‐
minated string. This is the return value of functions that al‐
low chaining. It is equivalent to &str[len].
past_end
is the name of a pointer to one past the end of the buffer that
contains a string. It is equivalent to &str[sz]. It is used as
a sentinel value, to be able to truncate strings instead of
overrunning a buffer.
string structure
unterminated string structure
Structure that contains the length of a string, as well as the
string or the unterminated string.
Types of functions
Copy, concatenate, and chain‐copy
Originally, there was a distinction between functions that copy
and those that concatenate. However, newer functions that copy
while allowing chaining cover both use cases with a single API.
They are also algorithmically faster, since they don’t need to
search for the end of the existing string.
To chain copy functions, they need to return a pointer to the
end. That’s a byproduct of the copy operation, so it has no
performance costs. These functions are preferred over copy or
concatenation functions. Functions that return such a pointer,
and thus can be chained, have names of the form *stp*(), since
it’s also common to name the pointer just p.
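As a rough illustration of the chaining idea, the sketch below joins three strings with stpcpy(3). The helper name join3() is hypothetical, and the caller is assumed to provide a large enough buffer; each call resumes from the pointer returned by the previous one, so the destination is never rescanned.

```c
#define _POSIX_C_SOURCE 200809L  /* for stpcpy(3) */
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: writes "<a><b><c>" into dst and returns the
 * length of the result.  The caller guarantees that dst has room for
 * strlen(a) + strlen(b) + strlen(c) + 1 bytes.
 */
size_t
join3(char *dst, const char *a, const char *b, const char *c)
{
    char *p;

    p = dst;
    p = stpcpy(p, a);  /* p now points to the terminating null byte */
    p = stpcpy(p, b);  /* continues from there; dst is not rescanned */
    p = stpcpy(p, c);

    return (size_t) (p - dst);  /* the length needs no strlen(3) call */
}
```

The equivalent strcpy(3) plus strcat(3) sequence would search for the end of dst on every strcat(3) call, and would need a final strlen(3) to learn the length.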
Truncate or not?
The first thing to note is that programmers should be careful
with buffers, so they always have the correct size, and trunca‐
tion is not necessary.
In most cases, truncation is not desired, and it is simpler to
just do the copy. Simpler code is safer code. Programming
against programming mistakes by adding more code just adds more
points where mistakes can be made.
Nowadays, compilers can detect most programmer errors with fea‐
tures like compiler warnings, static analyzers, and
_FORTIFY_SOURCE (see ftm(7)). Keeping the code simple helps
these error‐detection features be more precise.
When validating user input, however, it makes sense to truncate.
Remember to check the return value of such function calls.
Functions that truncate:
• stpecpy() is the most efficient string copy function that
performs truncation. Truncation needs to be checked only
once, after all chained calls.
• stpecpyx() is a variant of stpecpy() that consumes the entire
source string, to catch bugs in the program by forcing a seg‐
mentation fault (as strlcpy(3bsd) and strlcat(3bsd) do).
• strlcpy(3bsd) and strlcat(3bsd), which originated in OpenBSD,
are designed to crash if the input string is invalid (doesn’t
contain a null byte).
• strscpy(9) is a function in the Linux kernel which reports an
error instead of crashing.
• stpncpy(3) and strncpy(3) also truncate, but they don’t write
strings, but rather unterminated strings.
Unterminated strings (null‐padded fixed‐width buffers)
For historic reasons, some standard APIs, such as utmpx(5), use unter‐
minated strings in fixed‐width buffers. To interface with them, spe‐
cialized functions need to be used.
To copy strings into them, use stpncpy(3).
To copy from an unterminated string within a fixed‐width buffer into a
string, ignoring any trailing null bytes in the source fixed‐width
buffer, you should use ustr2stp().
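A minimal sketch of writing into such a buffer follows. The structure and helper names are hypothetical (they only imitate the kind of field found in utmpx(5)), and the truncation test is one possible design: unlike comparing the return value against the end of the buffer, it treats a source that fits exactly as not truncated.

```c
#define _POSIX_C_SOURCE 200809L  /* for stpncpy(3) and strnlen(3) */
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record with a null-padded fixed-width field. */
struct record {
    char name[8];  /* unterminated string; may lack a null byte */
};

/*
 * Fills rec->name from the string src, null-padding the rest of
 * the field.  Returns false if src had to be truncated to fit.
 */
bool
record_set_name(struct record *rec, const char *src)
{
    stpncpy(rec->name, src, sizeof(rec->name));

    /* Looking one byte beyond the field width tells us whether
     * src was longer than the field. */
    return strnlen(src, sizeof(rec->name) + 1) <= sizeof(rec->name);
}

/* Length of the unterminated string stored in the field. */
size_t
record_name_len(const struct record *rec)
{
    return strnlen(rec->name, sizeof(rec->name));
}
```

Note that the field must never be passed to functions that expect a string, such as strlen(3) or puts(3), because a full field has no terminating null byte.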
String structures
The simplest string copying function is mempcpy(3). It requires always
knowing the length of your strings, for which string structures can be
used. It makes the code simpler, since you always know the length of
your strings, and it’s also faster, since it doesn’t need to repeatedly
calculate those lengths. mempcpy(3) always creates an unterminated
string, so you need to explicitly set the terminating null byte.
String structure
The following code can be used to chain‐copy from a string
structure into a string:
p = mempcpy(p, src->str, src->len);
*p = '\0';
The following code can be used to chain‐copy from a string
structure into an unterminated string:
p = mempcpy(p, src->str, src->len);
Unterminated string structure (overlapping strings)
In programs that make considerable use of strings, and need the
best performance, using overlapping strings can make a big dif‐
ference. It allows holding substrings of a bigger string while
not duplicating memory nor using time to do a copy.
However, this is delicate, since it requires using unterminated
strings. C library APIs use strings, so programs that use un‐
terminated strings will have to take care to differentiate
strings from unterminated strings.
The following code can be used to chain‐copy from an untermi‐
nated string structure to a string:
p = mempcpy(p, src->ustr, src->len);
*p = '\0';
The following code can be used to chain‐copy from an untermi‐
nated string structure to an unterminated string:
p = mempcpy(p, src->ustr, src->len);
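The fragments above can be put together as in the following sketch. It is only an illustration: substr() and ustr_join() are hypothetical helpers, and mempcpy_sketch() is a portable stand-in for mempcpy(3), which is a GNU extension and may not be declared everywhere.

```c
#include <stddef.h>
#include <string.h>

/* Same layout as the structures in the SYNOPSIS. */
struct str_s  { size_t len; char *str;  };
struct ustr_s { size_t len; char *ustr; };

/* Portable stand-in for mempcpy(3); the name is hypothetical. */
void *
mempcpy_sketch(void *restrict dst, const void *restrict src, size_t len)
{
    return (char *) memcpy(dst, src, len) + len;
}

/* A substring that overlaps its parent: no allocation, no copy.
 * The caller guarantees that pos + len <= s->len. */
struct ustr_s
substr(const struct str_s *s, size_t pos, size_t len)
{
    return (struct ustr_s) { .len = len, .ustr = s->str + pos };
}

/* Writes "<a><b>" as a string into dst, which must have room for
 * a->len + b->len + 1 bytes.  Returns a pointer to the null byte. */
char *
ustr_join(char *dst, const struct ustr_s *a, const struct ustr_s *b)
{
    char *p;

    p = dst;
    p = mempcpy_sketch(p, a->ustr, a->len);
    p = mempcpy_sketch(p, b->ustr, b->len);
    *p = '\0';  /* the copy itself never writes a terminator */

    return p;
}
```

Because the substrings point into the parent buffer, they remain valid only as long as the parent string is neither freed nor modified.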
Functions
stpcpy(3)
This function copies the input string into a destination string.
The programmer is responsible for allocating a buffer large
enough. It returns a pointer suitable for chaining.
stpecpy()
stpecpyx()
These functions copy the input string into a destination string.
If the destination buffer, limited by a pointer to one past the
end of it, isn’t large enough to hold the copy, the resulting
string is truncated (but it is guaranteed to be null‐termi‐
nated). They return a pointer suitable for chaining. Trunca‐
tion needs to be detected only once after the last chained call.
stpecpyx() has identical semantics to stpecpy(), except that it
forces a SIGSEGV on Undefined Behavior.
These functions are not provided by any library, but you can de‐
fine them with the following reference implementations:
/* This code is in the public domain. */
char *
stpecpy(char *dst, char past_end[0],
        const char *restrict src)
{
    char *p;

    if (dst == past_end)
        return past_end;

    p = memccpy(dst, src, '\0', past_end - dst);
    if (p != NULL)
        return p - 1;

    /* truncation detected */
    past_end[-1] = '\0';
    return past_end;
}

/* This code is in the public domain. */
char *
stpecpyx(char *dst, char past_end[0],
         const char *restrict src)
{
    if (src[strlen(src)] != '\0')
        raise(SIGSEGV);

    return stpecpy(dst, past_end, src);
}
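To show how a single truncation check covers a whole chain, here is a small self-contained sketch. It repeats the stpecpy() logic with past_end spelled as a plain pointer (an assumption made so that the code is strictly ISO C), and build_path() is a hypothetical helper, not part of any library.

```c
#define _XOPEN_SOURCE 700  /* for memccpy(3) */
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Same logic as the reference implementation of stpecpy(). */
char *
stpecpy(char *dst, char *past_end, const char *restrict src)
{
    char *p;

    if (dst == past_end)
        return past_end;  /* an earlier call already truncated */

    p = memccpy(dst, src, '\0', past_end - dst);
    if (p != NULL)
        return p - 1;     /* pointer to the copied null byte */

    /* truncation detected */
    past_end[-1] = '\0';
    return past_end;
}

/*
 * Hypothetical helper: builds "<dir>/<file>" in buf.  Returns false
 * if the result was truncated; buf holds a null-terminated string
 * either way (bufsz must be at least 1).
 */
bool
build_path(char *buf, size_t bufsz, const char *dir, const char *file)
{
    char *p, *past_end;

    past_end = buf + bufsz;
    p = buf;
    p = stpecpy(p, past_end, dir);
    p = stpecpy(p, past_end, "/");
    p = stpecpy(p, past_end, file);

    return p != past_end;  /* one check covers the whole chain */
}
```

Once any call in the chain truncates, every later call returns past_end immediately, which is why the intermediate return values need no checking.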
stpncpy(3)
This function copies the input string into a destination null‐
padded fixed‐width unterminated string. If the destination
buffer, limited by its size, isn’t large enough to hold the
copy, the resulting string is truncated. Since it creates an
unterminated string, it doesn’t need to write a terminating null
byte. It returns a pointer suitable for chaining, but it’s not
ideal for that. Truncation needs to be detected only once after
the last chained call.
If you’re going to use this function in chained calls, it would
probably be useful to develop a function similar to stpecpy().
ustr2stp()
This function copies the input unterminated string contained in
a null‐padded fixed‐width buffer, into a destination (null‐terminated)
string. The programmer is responsible for allocating a
buffer large enough. It returns a pointer suitable for chain‐
ing.
A truncating version of this function doesn’t exist, since the
size of the original string is always known, so it wouldn’t be
very useful.
This function is not provided by any library, but you can define
it with the following reference implementation:
/* This code is in the public domain. */
char *
ustr2stp(char *restrict dst, const char *restrict src,
         size_t sz)
{
    char *end;

    end = memccpy(dst, src, '\0', sz);
    if (end != NULL)
        return end - 1;  /* points to the copied null byte */

    end = dst + sz;
    *end = '\0';

    return end;
}
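The sketch below illustrates chaining ustr2stp() while reading fixed‐width fields. It is an illustration under stated assumptions: ustr2stp() is rewritten here with memchr(3) and memcpy(3) so that only ISO C calls are needed, and struct login and login_format() are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Copy at most sz bytes, stopping at the first null byte, then
 * terminate the result.  Returns a pointer to the null byte. */
char *
ustr2stp(char *restrict dst, const char *restrict src, size_t sz)
{
    const char *nul;
    size_t     len;

    nul = memchr(src, '\0', sz);
    len = (nul != NULL) ? (size_t) (nul - src) : sz;

    memcpy(dst, src, len);
    dst[len] = '\0';

    return dst + len;
}

/* Hypothetical record with two null-padded fixed-width fields. */
struct login {
    char user[4];
    char host[8];
};

/* Writes "<user>@<host>" into dst, which must have room for
 * sizeof(l->user) + 1 + sizeof(l->host) + 1 bytes.
 * Returns the length of the resulting string. */
size_t
login_format(char *dst, const struct login *l)
{
    char *p;

    p = dst;
    p = ustr2stp(p, l->user, sizeof(l->user));
    *p++ = '@';  /* overwrites the null byte; the next call re-terminates */
    p = ustr2stp(p, l->host, sizeof(l->host));

    return (size_t) (p - dst);
}
```

The destination size in the comment is the worst case, where both fields are completely full and therefore contain no null byte at all.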
mempcpy(3)
This function copies the input string, limited by its length,
into a destination unterminated string. The programmer is re‐
sponsible for allocating a buffer large enough. It returns a
pointer suitable for chaining.
Deprecated functions
strlcpy(3bsd)
strlcat(3bsd)
Deprecated. These functions copy the input string into a desti‐
nation string. If the destination buffer, limited by its size,
isn’t large enough to hold the copy, the resulting string is
truncated (but it is guaranteed to be null‐terminated). They
return the length of the total string they tried to create.
These functions force a SIGSEGV on Undefined Behavior.
stpecpyx() is a better replacement for these functions for the
following reasons:
• Better performance (chain copy instead of concatenating).
• Only requires detecting truncation once per chain of calls.
strscpy(9)
Deprecated. This function copies the input string into a desti‐
nation string. If the destination buffer, limited by its size,
isn’t large enough to hold the copy, the resulting string is
truncated (but it is guaranteed to be null‐terminated). It re‐
turns the length of the destination string, or -E2BIG on trunca‐
tion.
stpecpy() is a better replacement for this function, since it
has a much simpler interface.
strcpy(3)
strcat(3)
Deprecated. These functions copy the input string into a desti‐
nation string. The programmer is responsible for allocating a
buffer large enough. The return value is useless.
strcpy(3) is identical to stpcpy(3) except for the useless re‐
turn value.
stpcpy(3) is a better replacement for these functions for the
following reasons:
• Better performance (chain copy instead of concatenating).
• No need to call strlen(3), thanks to the useful return value.
strncpy(3)
Deprecated. strncpy(3) is identical to stpncpy(3) except for
the useless return value. Due to the return value, with this
function it’s hard to correctly check for truncation. Use stp‐
ncpy(3) instead.
strncat(3)
Deprecated. Do not confuse this function with strncpy(3); they
are not related at all.
This function concatenates the input unterminated string contained
in a null‐padded fixed‐width buffer, into a destination
(null‐terminated) string. The programmer is responsible for al‐
locating a buffer large enough. The return value is useless.
ustr2stp() is a better replacement for this function for the
following reasons:
• Better performance (chain copy instead of concatenating).
• No need to call strlen(3), thanks to the useful return value.
• Function name that is not actively confusing.
RETURN VALUE
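The following functions return a pointer to the end of the destination
(they never truncate). For stpcpy(3) and ustr2stp(), that is the
terminating null byte; for mempcpy(3), which writes no terminator, it is
one past the last byte copied.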
• stpcpy(3)
• ustr2stp()
• mempcpy(3)
The following functions return a pointer to the terminating null byte
in the destination string, except when truncation occurs; if truncation
occurs, they return a pointer to one past the end of the destination
buffer.
• stpecpy()
• stpecpyx()
The following function returns a pointer to one after the last charac‐
ter in the destination unterminated string; if truncation occurs, that
pointer is equivalent to a pointer to one past the end of the destina‐
tion buffer.
• stpncpy(3)
Deprecated
The following functions return the length of the total string that they
tried to create (as if truncation didn’t occur).
• strlcpy(3bsd)
• strlcat(3bsd)
The following function returns the length of the destination string, or
-E2BIG on truncation.
• strscpy(9)
The following functions return the dst pointer, which is useless.
• strcpy(3)
• strcat(3)
• strncpy(3)
• strncat(3)
CAVEATS
Some of the functions described here are not provided by any library;
you should write your own copy if you want to use them.
The deprecated status of these functions varies from system to system.
This page declares as deprecated those functions that have a better re‐
placement documented in this same page.
EXAMPLES
The following are examples of correct use of each of these functions.
stpcpy(3)
p = buf;
p = stpcpy(p, "Hello ");
p = stpcpy(p, "world");
p = stpcpy(p, "!");
len = p - buf;
puts(buf);
stpecpy()
stpecpyx()
past_end = buf + sizeof(buf);
p = buf;
p = stpecpy(p, past_end, "Hello ");
p = stpecpy(p, past_end, "world");
p = stpecpy(p, past_end, "!");
if (p == past_end) {
p--;
goto toolong;
}
len = p - buf;
puts(buf);
stpncpy(3)
past_end = buf + sizeof(buf);
end = stpncpy(buf, "Hello world!", sizeof(buf));
if (end == past_end)
goto toolong;
len = end - buf;
for (size_t i = 0; i < sizeof(buf); i++)
putchar(buf[i]);
ustr2stp()
p = buf;
p = ustr2stp(p, "Hello ", 6);
p = ustr2stp(p, "world", 42); // Padding null bytes ignored.
p = ustr2stp(p, "!", 1);
len = p - buf;
puts(buf);
mempcpy(3)
p = buf;
p = mempcpy(p, "Hello ", 6);
p = mempcpy(p, "world", 5);
p = mempcpy(p, "!", 1);
*p = '\0';
len = p - buf;
puts(buf);
Deprecated
strlcpy(3bsd)
strlcat(3bsd)
if (strlcpy(buf, "Hello ", sizeof(buf)) >= sizeof(buf))
goto toolong;
if (strlcat(buf, "world", sizeof(buf)) >= sizeof(buf))
goto toolong;
len = strlcat(buf, "!", sizeof(buf));
if (len >= sizeof(buf))
goto toolong;
puts(buf);
strscpy(9)
len = strscpy(buf, "Hello world!", sizeof(buf));
if (len == -E2BIG)
goto toolong;
puts(buf);
strcpy(3)
strcat(3)
strcpy(buf, "Hello ");
strcat(buf, "world");
strcat(buf, "!");
len = strlen(buf);
puts(buf);
strncpy(3)
strncpy(buf, "Hello world!", sizeof(buf));
if (buf[sizeof(buf) - 1] != '\0')
goto toolong;
len = strnlen(buf, sizeof(buf));
for (size_t i = 0; i < sizeof(buf); i++)
putchar(buf[i]);
strncat(3)
buf[0] = '\0'; // strncat(3) needs an existing string to append to.
strncat(buf, "Hello ", 6);
strncat(buf, "world", 42); // Padding null bytes ignored.
strncat(buf, "!", 1);
puts(buf);
SEE ALSO
memcpy(3), memccpy(3), mempcpy(3), string(3)
Linux man‐pages (unreleased) (date) string_copy(7)
--
<http://www.alejandro-colomar.es/>
* Re: string_copy(7): New manual page documenting string copying functions.
From: Alejandro Colomar @ 2022-12-12 0:17 UTC
To: linux-man
On 12/12/22 00:59, Alejandro Colomar wrote:
> Hi all!
>
> I'm planning to add a new manual page that documents all string copying
> functions. It covers more detail than any of the existing manual pages (and in
> fact, I've discovered some properties of the functions documented while working
> on this page). The intention is to remove the existing separate manual pages
> for all string copying functions, and make them links to this new page. It
> intends to be the only reference documentation for copying strings in C, and
> hopefully fix the half century of suboptimal string copying library with which
> we've lived. (Say goodbye to std::string, here come back C strings ;)
>
> The formatted manual page is below.
>
> Alex
>
> P.S.: I'm sorry for your beloved string copying function(s); it has high chances
> of being dreaded by the page below. Not sorry. Oh well, at least I justified
> it, or I tried :-)
>
> ---
>
> string_copy(7) Miscellaneous Information Manual string_copy(7)
>
> NAME
> stpcpy, stpecpy, stpecpyx, strlcpy, strlcat, strscpy, strcpy, strcat,
> stpncpy, ustr2stp, strncpy, strncat, mempcpy - copy strings
>
> SYNOPSIS
> (Null‐terminated) strings
> // Chain‐copy a string.
> char *stpcpy(char *restrict dst, const char *restrict src);
>
> // Chain‐copy a string with truncation (not in libc).
> char *stpecpy(char *dst, char past_end[0], const char *restrict src);
>
> // Chain‐copy a string with truncation and SIGSEGV on invalid input.
> char *stpecpyx(char *dst, char past_end[0], const char *restrict src);
>
> // Copy a string with truncation and SIGSEGV on invalid input.
> [[deprecated]] // Use stpecpyx() instead.
> size_t strlcpy(char dst[restrict .sz], const char *restrict src,
> size_t sz);
>
> // Concatenate a string with truncation.
> [[deprecated]] // Use stpecpyx() instead.
> size_t strlcat(char dst[restrict .sz], const char *restrict src,
> size_t sz);
>
> // Copy a string with truncation (not in libc).
> [[deprecated]] // Use stpecpy() instead.
> ssize_t strscpy(char dst[restrict .sz], const char src[restrict .sz],
> size_t sz);
>
> // Copy a string.
> [[deprecated]] // Use stpcpy(3) instead.
> char *strcpy(char *restrict dst, const char *restrict src);
>
> // Concatenate a string.
> [[deprecated]] // Use stpcpy(3) instead.
> char *strcat(char *restrict dst, const char *restrict src);
>
> Unterminated strings (null‐padded fixed‐width buffers)
> // Zero a fixed‐width buffer, and
> // copy a string with truncation into an unterminated string.
> char *stpncpy(char dst[restrict .sz], const char *restrict src,
> size_t sz);
>
> // Chain‐copy an unterminated string into a string (not in libc).
> char *ustr2stp(char *restrict dst, const char src[restrict .sz],
> size_t sz);
>
> // Zero a fixed‐width buffer, and
> // copy a string with truncation into an unterminated string
> [[deprecated]] // Use stpncpy(3) instead.
> char *strncpy(char dest[restrict .sz], const char *restrict src,
> size_t sz);
>
> // Concatenate an unterminated string into a string.
> [[deprecated]] // Use ustr2stp() instead.
> char *strncat(char *restrict dst, const char src[restrict .sz],
> size_t sz);
>
> String structures
> // (Null‐terminated) string structure.
> struct str_s {
> size_t len;
> char *str;
> };
>
> // Unterminated string structure (overlapping strings).
> struct ustr_s {
> size_t len;
> char *ustr;
> };
>
> // Chain‐copy a string structure into an unterminated string.
> void *mempcpy(void *restrict dst, const void src[restrict len],
> size_t len);
>
> DESCRIPTION
> Terms (and abbreviations)
> string (str)
> is a sequence of zero or more non‐null characters, followed by a
> null byte.
>
> unterminated string (ustr)
> is a sequence of zero or more non‐null characters. They are
> sometimes contained in fixed‐width buffers, which usually con‐
> tain padding null bytes after the unterminated string, to fill
> the rest of the buffer without affecting the unterminated
> string; however, those padding null bytes are not part of the
> unterminated string.
>
> length (len)
> is the number of non‐null characters in a string. It is the re‐
> turn value of strlen(str) and of strnlen(ustr, sz).
>
> size (sz)
> refers to the entire buffer where the string is contained.
>
> end is the name of a pointer to the terminating null byte of a
> string, or a pointer to one past the last character of an unter‐
> minated string. This is the return value of functions that al‐
> low chaining. It is equivalent to &str[len].
>
> past_end
> is the name of a pointer to one past the end of the buffer that
> contains a string. It is equivalent to &str[sz]. It is used as
> a sentinel value, to be able to truncate strings instead of
> overrunning a buffer.
>
> string structure
> unterminated string structure
> Structure that contains the length of a string, as well as the
> string or the unterminated string.
>
> Types of functions
> Copy, concatenate, and chain‐copy
> Originally, there was a distinction between functions that copy
> and those that concatenate. However, newer functions that copy
> while allowing chaining cover both use cases with a single API.
> They are also algorithmically faster, since they don’t need to
> search for the end of the existing string.
>
> To chain copy functions, they need to return a pointer to the
> end. That’s a byproduct of the copy operation, so it has no
> performance costs. These functions are preferred over copy or
> concatenation functions. Functions that return such a pointer,
> and thus can be chained, have names of the form *stp*(), since
> it’s also common to name the pointer just p.
>
> Truncate or not?
> The first thing to note is that programmers should be careful
> with buffers, so they always have the correct size, and trunca‐
> tion is not necessary.
>
> In most cases, truncation is not desired, and it is simpler to
> just do the copy. Simpler code is safer code. Programming
> against programming mistakes by adding more code just adds more
> points where mistakes can be made.
>
> Nowadays, compilers can detect most programmer errors with fea‐
> tures like compiler warnings, static analyzers, and
> _FORTIFY_SOURCE (see ftm(7)). Keeping the code simple helps
> these error‐detection features be more precise.
>
> When validating user input, however, it makes sense to truncate.
> Remember to check the return value of such function calls.
>
> Functions that truncate:
>
> • stpecpy() is the most efficient string copy function that
> performs truncation. It only requires to check for trunca‐
> tion once after all chained calls.
>
> • stpecpyx() is a variant of stpecpy() that consumes the entire
> source string, to catch bugs in the program by forcing a seg‐
> mentation fault (as strlcpy(3bsd) and strlcat(3bsd) do).
>
> • strlcpy(3bsd) and strlcat(3bsd), which originated in OpenBSD,
> are designed to crash if the input string is invalid (doesn’t
> contain a null byte).
>
> • strscpy(9) is a function in the Linux kernel which reports an
> error instead of crashing.
>
> • stpncpy(3) and strncpy(3) also truncate, but they don’t write
> strings, but rather unterminated strings.
>
> Unterminated strings (null‐padded fixed‐width buffers)
> For historic reasons, some standard APIs, such as utmpx(5), use unter‐
> minated strings in fixed‐width buffers. To interface with them, spe‐
> cialized functions need to be used.
>
> To copy strings into them, use stpncpy(3).
>
> To copy from an unterminated string within a fixed‐width buffer into a
> string, ignoring any trailing null bytes in the source fixed‐width
> buffer, you should use ustr2stp().
>
> String structures
> The simplest string copying function is mempcpy(3). It requires always
> knowing the length of your strings, for which string structures can be
> used. It makes the code simpler, since you always know the length of
> your strings, and it’s also faster, since it doesn’t need to repeatedly
> calculate those lengths. mempcpy(3) always creates an unterminated
> string, so you need to explicitly set the terminating null byte.
>
> String structure
> The following code can be used to chain‐copy from a string
> structure into a string:
>
> p = mempcpy(p, src->str, src->len);
> *p = '\0';
>
> The following code can be used to chain‐copy from a string
> structure into an unterminated string:
>
> p = mempcpy(p, src->str, src->len);
>
> Unterminated string structure (overlapping strings)
> In programs that make considerable use of strings, and need the
> best performance, using overlapping strings can make a big dif‐
> ference. It allows holding substrings of a bigger string while
> not duplicating memory nor using time to do a copy.
>
> However, this is delicate, since it requires using unterminated
> strings. C library APIs use strings, so programs that use un‐
> terminated strings will have to take care to differentiate
> strings from unterminated strings.
>
> The following code can be used to chain‐copy from an untermi‐
> nated string structure to a string:
>
> p = mempcpy(p, src->ustr, src->len);
> *p = '\0';
>
> The following code can be used to chain‐copy from an untermi‐
> nated string structure to an unterminated string:
>
> p = mempcpy(p, src->ustr, src->len);
>
> Functions
> stpcpy(3)
> This function copies the input string into a destination string.
> The programmer is responsible for allocating a buffer large
> enough. It returns a pointer suitable for chaining.
>
> stpecpy()
> stpecpyx()
> These functions copy the input string into a destination string.
> If the destination buffer, limited by a pointer to one past the
> end of it, isn’t large enough to hold the copy, the resulting
> string is truncated (but it is guaranteed to be null‐termi‐
> nated). They return a pointer suitable for chaining. Trunca‐
> tion needs to be detected only once after the last chained call.
> stpecpyx() has identical semantics to stpecpy(), except that it
> forces a SIGSEGV on Undefined Behavior.
>
> These functions are not provided by any library, but you can de‐
> fine them with the following reference implementations:
>
> /* This code is in the public domain. */
> char *
> stpecpy(char *dst, char past_end[0],
> const char *restrict src)
> {
> char *p;
>
> if (dst == past_end)
> return past_end;
>
> p = memccpy(dst, src, '\0', past_end - dst);
> if (p != NULL)
> return p - 1;
>
> /* truncation detected */
> past_end[-1] = '\0';
> return past_end;
> }
>
> /* This code is in the public domain. */
> char *
> stpecpyx(char *dst, char past_end[0],
> const char *restrict src)
> {
> if (src[strlen(src)] != '\0')
> raise(SIGSEGV);
>
> return stpecpy(dst, past_end, src);
> }
>
> stpncpy(3)
> This function copies the input string into a destination null‐
> padded fixed‐width unterminated string. If the destination
> buffer, limited by its size, isn’t large enough to hold the
> copy, the resulting string is truncated. Since it creates an
> unterminated string, it doesn’t need to write a terminating null
> byte. It returns a pointer suitable for chaining, but it’s not
> ideal for that. Truncation needs to be detected only once after
> the last chained call.
>
> If you’re going to use this function in chained calls, it would
> probably be useful to develop a function similar to stpecpy().
>
> ustr2stp()
> This function copies the input unterminated string contained in
> a null‐padded wixed‐width buffer, into a destination (null‐ter‐
> minated) string. The programmer is responsible for allocating a
> buffer large enough. It returns a pointer suitable for chain‐
> ing.
>
> This function is not provided by any library, but you can write
> it with the definition above in this page.
>
> A truncating version of this function doesn’t exist, since the
> size of the original string is always known, so it wouldn’t be
> very useful.
>
> This function is not provided by any library, but you can define
> it with the following reference implementation:
>
> /* This code is in the public domain. */
> char *
> ustr2stp(char *restrict dst, const char *restrict src,
> size_t sz)
> {
> char *end;
>
> end = memccpy(dst, src, '\0', sz)) ?: dst + sz;
> *end = '\0';
>
> return end;
> }
>
> mempcpy(3)
> This function copies the input string, limited by its length,
> into a destination unterminated string. The programmer is re‐
> sponsible for allocating a buffer large enough. It returns a
> pointer suitable for chaining.
>
> Deprecated functions
> strlcpy(3bsd)
> strlcat(3bsd)
> Deprecated. These functions copy the input string into a desti‐
> nation string. If the destination buffer, limited by its size,
> isn’t large enough to hold the copy, the resulting string is
> truncated (but it is guaranteed to be null‐terminated). They
> return the length of the total string they tried to create.
> These functions force a SIGSEGV on Undefined Behavior.
>
> stpecpyx() is a better replacement for these functions for the
> following reasons:
>
> • Better performance (chain copy instead of concatenating).
>
> • Only requires detecting truncation once per chain of calls.
>
> strscpy(9)
> Deprecated. This function copies the input string into a desti‐
> nation string. If the destination buffer, limited by its size,
> isn’t large enough to hold the copy, the resulting string is
> truncated (but it is guaranteed to be null‐terminated). It re‐
> turns the length of the destination string, or -E2BIG on trunca‐
> tion.
>
> stpecpy() is a better replacement for this function, since it
> has a much simpler interface.
>
> strcpy(3)
> strcat(3)
> Deprecated. These functions copy the input string into a desti‐
> nation string. The programmer is responsible for allocating a
> buffer large enough. The return value is useless.
>
> strcpy(3) is identical to stpcpy(3) except for the useless re‐
> turn value.
>
> stpcpy(3) is a better replacement for these functions for the
> following reasons:
>
> • Better performance (chain copy instead of concatenating).
>
> • No need to call strlen(3), thanks to the useful return value.
>
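
To make both points concrete, here is a minimal sketch that joins several words with stpcpy(3). The function name join_words() and its parameters are hypothetical; the feature test macro is an assumption about what a strict compilation mode needs in order to declare stpcpy(3):

```c
#define _POSIX_C_SOURCE 200809L   /* for stpcpy(3) */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Concatenate n words into buf, separated by sep.
   The caller guarantees that buf is large enough. */
static size_t
join_words(char *buf, const char *const words[], size_t n, const char *sep)
{
    char *p = buf;

    assert(buf != NULL && sep != NULL);

    *p = '\0';                          /* result is "" when n == 0 */
    for (size_t i = 0; i < n; i++) {
        if (i > 0)
            p = stpcpy(p, sep);
        p = stpcpy(p, words[i]);        /* resumes where the last copy ended */
    }
    return (size_t)(p - buf);           /* length, without calling strlen(3) */
}
```

An equivalent loop built on strcat(3) would rescan buf from the start on every iteration, so its cost grows quadratically with the length of the result.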
> strncpy(3)
> Deprecated. strncpy(3) is identical to stpncpy(3) except for
> the useless return value. Due to the return value, with this
> function it’s hard to correctly check for truncation. Use stp‐
> ncpy(3) instead.
>
> strncat(3)
> Deprecated. Do not confuse this function with strncpy(3); they
> are not related at all.
>
> This function concatenates the input unterminated string con‐
> tained in a null‐padded fixed‐width buffer, into a destination
> (null‐terminated) string. The programmer is responsible for al‐
> locating a buffer large enough. The return value is useless.
>
> ustr2stp() is a better replacement for this function for the
> following reasons:
>
> • Better performance (chain copy instead of concatenating).
>
> • No need to call strlen(3), thanks to the useful return value.
>
> • Function name that is not actively confusing.
>
> RETURN VALUE
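> The following functions return a pointer to the end of the destination
> string (they never truncate). For stpcpy(3) and ustr2stp(), that is the
> terminating null byte; mempcpy(3) writes no null byte and returns a
> pointer to one past the last byte copied.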
>
> • stpcpy(3)
>
> • ustr2stp()
>
> • mempcpy(3)
>
> The following functions return a pointer to the terminating null byte
> in the destination string, except when truncation occurs; if truncation
> occurs, they return a pointer to one past the end of the destination
> buffer.
>
> • stpecpy()
>
> • stpecpyx()
>
> The following function returns a pointer to one after the last charac‐
> ter in the destination unterminated string; if truncation occurs, that
> pointer is equivalent to a pointer to one past the end of the destina‐
> tion buffer.
>
> • stpncpy(3)
>
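
A short sketch of how this return value might be used. struct record and set_user() are hypothetical names. Note one assumption that goes beyond the text: stpncpy(3) also returns a pointer to one past the end of the buffer when the input fills the field exactly, so this sketch looks at the next input byte to tell an exact fit from a truncation:

```c
#define _POSIX_C_SOURCE 200809L   /* for stpncpy(3) */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record with a null-padded fixed-width field. */
struct record {
    char user[8];
};

/* Store name in the field and report the stored length in *len.
   Returns false if name had to be truncated. */
static bool
set_user(struct record *r, const char *name, size_t *len)
{
    char *end;

    assert(r != NULL && name != NULL && len != NULL);

    end = stpncpy(r->user, name, sizeof(r->user));
    *len = (size_t)(end - r->user);

    /* A full field is a truncation only if more input follows. */
    return !(*len == sizeof(r->user) && name[*len] != '\0');
}
```

Reading name[*len] is safe in the full-field case, because the first sizeof(r->user) bytes of name were all non-null, so the string extends at least that far.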
> Deprecated
> The following functions return the length of the total string that they
> tried to create (as if truncation didn’t occur).
>
> • strlcpy(3bsd)
>
> • strlcat(3bsd)
>
> The following function returns the length of the destination string, or
> -E2BIG on truncation.
>
> • strscpy(9)
>
> The following functions return the dst pointer, which is useless.
>
> • strcpy(3)
>
> • strcat(3)
>
> • strncpy(3)
>
> • strncat(3)
And here goes the STANDARDS section:
STANDARDS
stpcpy(3)
POSIX.1‐2008.
stpecpy()
stpecpyx()
ustr2stp()
Not defined by any standard, and not provided by any library.
stpncpy(3)
POSIX.1‐2008.
mempcpy(3)
This function is a GNU extension.
strlcpy(3bsd)
strlcat(3bsd)
These functions originated in OpenBSD and are present in some Unix sys‐
tems. On GNU/Linux systems, they are provided by libbsd.
strscpy(9)
Linux kernel internal function.
strcpy(3)
strcat(3)
POSIX.1‐2001, POSIX.1‐2008, C89, C99, SVr4, 4.3BSD.
strncpy(3)
POSIX.1‐2001, POSIX.1‐2008, C89, C99, SVr4, 4.3BSD.
strncat(3)
POSIX.1‐2001, POSIX.1‐2008, C89, C99, SVr4, 4.3BSD.
>
> CAVEATS
> Some of the functions described here are not provided by any library;
> you should write your own copy if you want to use them.
>
> The deprecated status of these functions varies from system to system.
> This page declares as deprecated those functions that have a better re‐
> placement documented in this same page.
>
> EXAMPLES
> The following are examples of correct use of each of these functions.
>
> stpcpy(3)
> p = buf;
> p = stpcpy(p, "Hello ");
> p = stpcpy(p, "world");
> p = stpcpy(p, "!");
> len = p - buf;
> puts(buf);
>
> stpecpy()
> stpecpyx()
> past_end = buf + sizeof(buf);
> p = buf;
> p = stpecpy(p, past_end, "Hello ");
> p = stpecpy(p, past_end, "world");
> p = stpecpy(p, past_end, "!");
> if (p == past_end) {
> p--;
> goto toolong;
> }
> len = p - buf;
> puts(buf);
>
> stpncpy(3)
> past_end = buf + sizeof(buf);
> end = stpncpy(buf, "Hello world!", sizeof(buf));
> if (end == past_end)
> goto toolong;
> len = end - buf;
> for (size_t i = 0; i < sizeof(buf); i++)
> putchar(buf[i]);
>
> ustr2stp()
> p = buf;
> p = ustr2stp(p, "Hello ", 6);
> p = ustr2stp(p, "world", 42); // Padding null bytes ignored.
> p = ustr2stp(p, "!", 1);
> len = p - buf;
> puts(buf);
>
> mempcpy(3)
> p = buf;
> p = mempcpy(p, "Hello ", 6);
> p = mempcpy(p, "world", 5);
> p = mempcpy(p, "!", 1);
> *p = '\0';
> len = p - buf;
> puts(buf);
>
> Deprecated
> strlcpy(3bsd)
> strlcat(3bsd)
> if (strlcpy(buf, "Hello ", sizeof(buf)) >= sizeof(buf))
> goto toolong;
> if (strlcat(buf, "world", sizeof(buf)) >= sizeof(buf))
> goto toolong;
> len = strlcat(buf, "!", sizeof(buf));
> if (len >= sizeof(buf))
> goto toolong;
> puts(buf);
>
> strscpy(9)
> len = strscpy(buf, "Hello world!", sizeof(buf));
> if (len == -E2BIG)
> goto toolong;
> puts(buf);
>
> strcpy(3)
> strcat(3)
> strcpy(buf, "Hello ");
> strcat(buf, "world");
> strcat(buf, "!");
> len = strlen(buf);
> puts(buf);
>
> strncpy(3)
> strncpy(buf, "Hello world!", sizeof(buf));
> if (buf[sizeof(buf) - 1] != '\0')
> goto toolong;
> len = strnlen(buf, sizeof(buf));
> for (size_t i = 0; i < sizeof(buf); i++)
> putchar(buf[i]);
>
> strncat(3)
> strncpy(buf, "Hello ", 6);
> strncat(buf, "world", 42); // Padding null bytes ignored.
> strncat(buf, "!", 1);
> puts(buf);
>
> SEE ALSO
> memcpy(3), memccpy(3), mempcpy(3), string(3)
>
> Linux man‐pages (unreleased) (date) string_copy(7)
>
>
>
--
<http://www.alejandro-colomar.es/>
* Re: string_copy(7): New manual page documenting string copying functions.
From: Alejandro Colomar @ 2022-12-12 0:25 UTC (permalink / raw)
To: linux-man
On 12/12/22 00:59, Alejandro Colomar wrote:
> strncat(3)
> strncpy(buf, "Hello ", 6);
> strncat(buf, "world", 42); // Padding null bytes ignored.
> strncat(buf, "!", 1);
> puts(buf);
Oops, that example was mistaken; too much cut and paste.
strncat(3)
buf[0] = '\0';
strncat(buf, "Hello ", 6);
strncat(buf, "world", 42); // Padding null bytes ignored.
strncat(buf, "!", 1);
len = strlen(buf);
puts(buf);
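
For context, a minimal sketch of the same pattern applied to real fixed-width fields rather than string literals. struct id and format_id() are hypothetical names chosen for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record with two null-padded fixed-width fields. */
struct id {
    char prefix[4];
    char serial[6];     /* unterminated when all 6 bytes are used */
};

/* Build "<prefix><serial>" in buf, which must hold
   sizeof(prefix) + sizeof(serial) + 1 bytes. Returns the length. */
static size_t
format_id(char *buf, const struct id *id)
{
    assert(buf != NULL && id != NULL);

    buf[0] = '\0';      /* strncat(3) needs a valid string to append to */
    strncat(buf, id->prefix, sizeof(id->prefix));
    strncat(buf, id->serial, sizeof(id->serial));
    return strlen(buf); /* the return value of strncat(3) does not help */
}
```

Each strncat(3) call stops at the first padding null byte or at the field width, whichever comes first, and always terminates the result.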
--
<http://www.alejandro-colomar.es/>
* Re: string_copy(7): New manual page documenting string copying functions.
From: Alejandro Colomar @ 2022-12-12 0:32 UTC (permalink / raw)
To: linux-man
On 12/12/22 00:59, Alejandro Colomar wrote:
> Hi all!
>
> I'm planning to add a new manual page that documents all string copying
> functions. It covers more detail than any of the existing manual pages (and in
> fact, I've discovered some properties of the functions documented while working
> on this page). The intention is to remove the existing separate manual pages
> for all string copying functions, and make them links to this new page. It
> intends to be the only reference documentation for copying strings in C, and
> hopefully fix the half century of suboptimal string copying library with which
> we've lived. (Say goodbye to std::string, here come back C strings ;)
>
> The formatted manual page is below.
>
> Alex
>
> P.S.: I'm sorry for your beloved string copying function(s); it has high chances
> of being dreaded by the page below. Not sorry. Oh well, at least I justified
> it, or I tried :-)
>
> ---
>
> string_copy(7) Miscellaneous Information Manual string_copy(7)
>
> NAME
> stpcpy, stpecpy, stpecpyx, strlcpy, strlcat, strscpy, strcpy, strcat,
> stpncpy, ustr2stp, strncpy, strncat, mempcpy - copy strings
>
> SYNOPSIS
> (Null‐terminated) strings
> // Chain‐copy a string.
> char *stpcpy(char *restrict dst, const char *restrict src);
>
> // Chain‐copy a string with truncation (not in libc).
> char *stpecpy(char *dst, char past_end[0], const char *restrict src);
>
> // Chain‐copy a string with truncation and SIGSEGV on invalid input.
> char *stpecpyx(char *dst, char past_end[0], const char *restrict src);
>
> // Copy a string with truncation and SIGSEGV on invalid input.
> [[deprecated]] // Use stpecpyx() instead.
> size_t strlcpy(char dst[restrict .sz], const char *restrict src,
> size_t sz);
>
> // Concatenate a string with truncation.
> [[deprecated]] // Use stpecpyx() instead.
> size_t strlcat(char dst[restrict .sz], const char *restrict src,
> size_t sz);
>
> // Copy a string with truncation (not in libc).
> [[deprecated]] // Use stpecpy() instead.
> ssize_t strscpy(char dst[restrict .sz], const char src[restrict .sz],
> size_t sz);
>
> // Copy a string.
> [[deprecated]] // Use stpcpy(3) instead.
> char *strcpy(char *restrict dst, const char *restrict src);
>
> // Concatenate a string.
> [[deprecated]] // Use stpcpy(3) instead.
> char *strcat(char *restrict dst, const char *restrict src);
>
> Unterminated strings (null‐padded fixed‐width buffers)
> // Zero a fixed‐width buffer, and
> // copy a string with truncation into an unterminated string.
> char *stpncpy(char dst[restrict .sz], const char *restrict src,
> size_t sz);
>
> // Chain‐copy an unterminated string into a string (not in libc).
> char *ustr2stp(char *restrict dst, const char src[restrict .sz],
> size_t sz);
>
> // Zero a fixed‐width buffer, and
> // copy a string with truncation into an unterminated string
> [[deprecated]] // Use stpncpy(3) instead.
> char *strncpy(char dest[restrict .sz], const char *restrict src,
> size_t sz);
>
> // Concatenate an unterminated string into a string.
> [[deprecated]] // Use ustr2stp() instead.
> char *strncat(char *restrict dst, const char src[restrict .sz],
> size_t sz);
>
> String structures
> // (Null‐terminated) string structure.
> struct str_s {
> size_t len;
> char *str;
> };
>
> // Unterminated string structure (overlapping strings).
> struct ustr_s {
> size_t len;
> char *ustr;
> };
>
> // Chain‐copy a string structure into an unterminated string.
> void *mempcpy(void *restrict dst, const void src[restrict len],
> size_t len);
>
> DESCRIPTION
> Terms (and abbreviations)
> string (str)
> is a sequence of zero or more non‐null characters, followed by a
> null byte.
>
> unterminated string (ustr)
> is a sequence of zero or more non‐null characters. They are
> sometimes contained in fixed‐width buffers, which usually con‐
> tain padding null bytes after the unterminated string, to fill
> the rest of the buffer without affecting the unterminated
> string; however, those padding null bytes are not part of the
> unterminated string.
>
> length (len)
> is the number of non‐null characters in a string. It is the re‐
> turn value of strlen(str) and of strnlen(ustr, sz).
>
> size (sz)
> refers to the entire buffer where the string is contained.
>
> end is the name of a pointer to the terminating null byte of a
> string, or a pointer to one past the last character of an unter‐
> minated string. This is the return value of functions that al‐
> low chaining. It is equivalent to &str[len].
>
> past_end
> is the name of a pointer to one past the end of the buffer that
> contains a string. It is equivalent to &str[sz]. It is used as
> a sentinel value, to be able to truncate strings instead of
> overrunning a buffer.
>
> string structure
> unterminated string structure
> Structure that contains the length of a string, as well as the
> string or the unterminated string.
>
> Types of functions
> Copy, concatenate, and chain‐copy
> Originally, there was a distinction between functions that copy
> and those that concatenate. However, newer functions that copy
> while allowing chaining cover both use cases with a single API.
> They are also algorithmically faster, since they don’t need to
> search for the end of the existing string.
>
> To chain copy functions, they need to return a pointer to the
> end. That’s a byproduct of the copy operation, so it has no
> performance costs. These functions are preferred over copy or
> concatenation functions. Functions that return such a pointer,
> and thus can be chained, have names of the form *stp*(), since
> it’s also common to name the pointer just p.
>
> Truncate or not?
> The first thing to note is that programmers should be careful
> with buffers, so they always have the correct size, and trunca‐
> tion is not necessary.
>
> In most cases, truncation is not desired, and it is simpler to
> just do the copy. Simpler code is safer code. Programming
> against programming mistakes by adding more code just adds more
> points where mistakes can be made.
>
> Nowadays, compilers can detect most programmer errors with fea‐
> tures like compiler warnings, static analyzers, and
> _FORTIFY_SOURCE (see ftm(7)). Keeping the code simple helps
> these error‐detection features be more precise.
>
> When validating user input, however, it makes sense to truncate.
> Remember to check the return value of such function calls.
>
> Functions that truncate:
>
> • stpecpy() is the most efficient string copy function that
> performs truncation. It only requires to check for trunca‐
> tion once after all chained calls.
>
> • stpecpyx() is a variant of stpecpy() that consumes the entire
> source string, to catch bugs in the program by forcing a seg‐
> mentation fault (as strlcpy(3bsd) and strlcat(3bsd) do).
>
> • strlcpy(3bsd) and strlcat(3bsd), which originated in OpenBSD,
> are designed to crash if the input string is invalid (doesn’t
> contain a null byte).
>
> • strscpy(9) is a function in the Linux kernel which reports an
> error instead of crashing.
>
> • stpncpy(3) and strncpy(3) also truncate, but they write
> unterminated strings rather than strings.
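
To make the "check once" property concrete, here is a sketch that reuses the logic of the stpecpy() reference implementation from this page, with a plain pointer parameter instead of the zero-length array so that it compiles without GNU extensions. build_path() is a hypothetical helper invented for the example, and it assumes a non-empty buffer:

```c
#define _XOPEN_SOURCE 700
#include <stddef.h>
#include <string.h>

/* Same logic as the reference stpecpy() in this page. */
static char *
stpecpy(char *dst, char *past_end, const char *restrict src)
{
    char *p;

    if (dst == past_end)
        return past_end;

    p = memccpy(dst, src, '\0', past_end - dst);
    if (p != NULL)
        return p - 1;

    /* truncation detected */
    past_end[-1] = '\0';
    return past_end;
}

/* Hypothetical helper: build "dir/file" in buf (sz must be > 0).
   Returns 0 on success, or -1 if the result was truncated. */
int
build_path(char *buf, size_t sz, const char *dir, const char *file)
{
    char *p, *past_end;

    past_end = buf + sz;
    p = buf;
    p = stpecpy(p, past_end, dir);
    p = stpecpy(p, past_end, "/");
    p = stpecpy(p, past_end, file);

    return (p == past_end) ? -1 : 0;    /* Single check for the chain. */
}
```

Once a call truncates, every later call in the chain returns past_end immediately, which is why the intermediate return values don't need to be inspected.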
>
> Unterminated strings (null‐padded fixed‐width buffers)
> For historic reasons, some standard APIs, such as utmpx(5), use unter‐
> minated strings in fixed‐width buffers. To interface with them, spe‐
> cialized functions need to be used.
>
> To copy strings into them, use stpncpy(3).
>
> To copy from an unterminated string within a fixed‐width buffer into a
> string, ignoring any trailing null bytes in the source fixed‐width
> buffer, you should use ustr2stp().
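
A sketch of a round trip through such a buffer. The 8-byte name field and the helpers name_set() and name_get() are assumptions made for this example (utmpx fields are wider), and ustr2stp() is written here with strnlen(3) and memcpy(3) as one possible definition:

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <string.h>

#define NAME_SZ  8  /* Width of the hypothetical fixed-width field. */

/* Possible ustr2stp(): copy at most sz non-null bytes and terminate.
   dst must have room for sz + 1 bytes. */
static char *
ustr2stp(char *restrict dst, const char *restrict src, size_t sz)
{
    size_t len;

    len = strnlen(src, sz);
    memcpy(dst, src, len);
    dst[len] = '\0';

    return dst + len;
}

/* String -> null-padded field.  Long names are silently truncated. */
void
name_set(char field[NAME_SZ], const char *name)
{
    stpncpy(field, name, NAME_SZ);
}

/* Field -> string.  dst needs NAME_SZ + 1 bytes.  Returns the length. */
size_t
name_get(char dst[NAME_SZ + 1], const char field[NAME_SZ])
{
    return ustr2stp(dst, field, NAME_SZ) - dst;
}
```

Note that the destination of name_get() is one byte larger than the field, since a full field holds NAME_SZ characters and no terminator.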
>
> String structures
> The simplest string copying function is mempcpy(3). It requires always
> knowing the length of your strings, for which string structures can be
> used. It makes the code simpler, since you always know the length of
> your strings, and it’s also faster, since it doesn’t need to repeatedly
> calculate those lengths. mempcpy(3) always creates an unterminated
> string, so you need to explicitly set the terminating null byte.
>
> String structure
> The following code can be used to chain‐copy from a string
> structure into a string:
>
> p = mempcpy(p, src->str, src->len);
> *p = '\0';
>
> The following code can be used to chain‐copy from a string
> structure into an unterminated string:
>
> p = mempcpy(p, src->str, src->len);
>
> Unterminated string structure (overlapping strings)
> In programs that make considerable use of strings, and need the
> best performance, using overlapping strings can make a big dif‐
> ference. It allows holding substrings of a bigger string while
> not duplicating memory nor using time to do a copy.
>
> However, this is delicate, since it requires using unterminated
> strings. C library APIs use strings, so programs that use un‐
> terminated strings will have to take care to differentiate
> strings from unterminated strings.
>
> The following code can be used to chain‐copy from an untermi‐
> nated string structure to a string:
>
> p = mempcpy(p, src->ustr, src->len);
> *p = '\0';
>
> The following code can be used to chain‐copy from an untermi‐
> nated string structure to an unterminated string:
>
> p = mempcpy(p, src->ustr, src->len);
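
The snippets above could be combined roughly as follows. This is only a sketch: str_s_join() is a hypothetical helper, the struct mirrors the str_s layout from the SYNOPSIS, and copy_bytes() is a portable stand-in for mempcpy(3) (a GNU extension) so that the example builds without _GNU_SOURCE:

```c
#include <stddef.h>
#include <string.h>

/* String structure: the length travels with the string. */
struct str_s {
    size_t  len;
    char    *str;
};

/* Portable stand-in for mempcpy(3), which is a GNU extension. */
static char *
copy_bytes(char *restrict dst, const char *restrict src, size_t len)
{
    return (char *) memcpy(dst, src, len) + len;
}

/* Hypothetical helper: join n string structures into dst, which must hold
   the sum of the lengths plus one byte.  Returns the resulting length. */
size_t
str_s_join(char *dst, const struct str_s *v, size_t n)
{
    char *p;

    p = dst;
    for (size_t i = 0; i < n; i++)
        p = copy_bytes(p, v[i].str, v[i].len);
    *p = '\0';      /* Terminate once, after the last chained copy. */

    return p - dst;
}
```

No strlen(3) call appears anywhere: the lengths come from the structures, and the final length falls out of the pointer arithmetic.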
>
> Functions
> stpcpy(3)
> This function copies the input string into a destination string.
> The programmer is responsible for allocating a buffer large
> enough. It returns a pointer suitable for chaining.
>
> stpecpy()
> stpecpyx()
> These functions copy the input string into a destination string.
> If the destination buffer, limited by a pointer to one past the
> end of it, isn’t large enough to hold the copy, the resulting
> string is truncated (but it is guaranteed to be null‐termi‐
> nated). They return a pointer suitable for chaining. Trunca‐
> tion needs to be detected only once after the last chained call.
> stpecpyx() has identical semantics to stpecpy(), except that it
> forces a SIGSEGV on Undefined Behavior.
>
> These functions are not provided by any library, but you can de‐
> fine them with the following reference implementations:
>
> /* This code is in the public domain. */
> char *
> stpecpy(char *dst, char past_end[0],
> const char *restrict src)
> {
> char *p;
>
> if (dst == past_end)
> return past_end;
>
> p = memccpy(dst, src, '\0', past_end - dst);
> if (p != NULL)
> return p - 1;
>
> /* truncation detected */
> past_end[-1] = '\0';
> return past_end;
> }
>
> /* This code is in the public domain. */
> char *
> stpecpyx(char *dst, char past_end[0],
> const char *restrict src)
> {
> if (src[strlen(src)] != '\0')
> raise(SIGSEGV);
>
> return stpecpy(dst, past_end, src);
> }
>
> stpncpy(3)
> This function copies the input string into a destination null‐
> padded fixed‐width unterminated string. If the destination
> buffer, limited by its size, isn’t large enough to hold the
> copy, the resulting string is truncated. Since it creates an
> unterminated string, it doesn’t need to write a terminating null
> byte. It returns a pointer suitable for chaining, but it’s not
> ideal for that. Truncation needs to be detected only once after
> the last chained call.
>
> If you’re going to use this function in chained calls, it would
> probably be useful to develop a function similar to stpecpy().
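
One possible shape for such a function, as a sketch only: the name stpencpy() is invented here and nothing like it exists in any library. It takes a pointer to one past the end of the buffer instead of a size, and otherwise defers to stpncpy(3):

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>

/* Hypothetical stpecpy()-like wrapper around stpncpy(3).
   Copies src into the unterminated string at dst, null-pads up to
   past_end, and returns a pointer to one after the last character
   written (past_end if the buffer is now full). */
char *
stpencpy(char *dst, char *past_end, const char *restrict src)
{
    return stpncpy(dst, src, past_end - dst);
}
```

As with stpncpy(3) itself, a return value equal to past_end means either truncation or an exact fit, so a caller that compares against past_end after the last chained call treats both the same way.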
>
> ustr2stp()
> This function copies the input unterminated string contained in
> a null‐padded fixed‐width buffer, into a destination (null‐ter‐
> minated) string. The programmer is responsible for allocating a
> buffer large enough. It returns a pointer suitable for chain‐
> ing.
>
> A truncating version of this function doesn’t exist, since the
> size of the original string is always known, so it wouldn’t be
> very useful.
>
> This function is not provided by any library, but you can define
> it with the following reference implementation:
>
> /* This code is in the public domain. */
> char *
> ustr2stp(char *restrict dst, const char *restrict src,
> size_t sz)
> {
> char *end;
>
> end = mempcpy(dst, src, strnlen(src, sz));
> *end = '\0';
>
> return end;
> }
>
> mempcpy(3)
> This function copies the input string, limited by its length,
> into a destination unterminated string. The programmer is re‐
> sponsible for allocating a buffer large enough. It returns a
> pointer suitable for chaining.
>
> Deprecated functions
> strlcpy(3bsd)
> strlcat(3bsd)
> Deprecated. These functions copy the input string into a desti‐
> nation string. If the destination buffer, limited by its size,
> isn’t large enough to hold the copy, the resulting string is
> truncated (but it is guaranteed to be null‐terminated). They
> return the length of the total string they tried to create.
> These functions force a SIGSEGV on Undefined Behavior.
>
> stpecpyx() is a better replacement for these functions for the
> following reasons:
>
> • Better performance (chain copy instead of concatenating).
>
> • Only requires detecting truncation once per chain of calls.
>
> strscpy(9)
> Deprecated. This function copies the input string into a desti‐
> nation string. If the destination buffer, limited by its size,
> isn’t large enough to hold the copy, the resulting string is
> truncated (but it is guaranteed to be null‐terminated). It re‐
> turns the length of the destination string, or -E2BIG on trunca‐
> tion.
>
> stpecpy() is a better replacement for this function, since it
> has a much simpler interface.
>
> strcpy(3)
> strcat(3)
> Deprecated. These functions copy the input string into a desti‐
> nation string. The programmer is responsible for allocating a
> buffer large enough. The return value is useless.
>
> strcpy(3) is identical to stpcpy(3) except for the useless re‐
> turn value.
>
> stpcpy(3) is a better replacement for these functions for the
> following reasons:
>
> • Better performance (chain copy instead of concatenating).
>
> • No need to call strlen(3), thanks to the useful return value.
>
> strncpy(3)
> Deprecated. strncpy(3) is identical to stpncpy(3) except for
> the useless return value. Due to the return value, with this
> function it’s hard to correctly check for truncation. Use stp‐
> ncpy(3) instead.
>
> strncat(3)
> Deprecated. Do not confuse this function with strncpy(3); they
> are not related at all.
>
> This function concatenates the input unterminated string con‐
> tained in a null‐padded fixed‐width buffer, into a destination
> (null‐terminated) string. The programmer is responsible for al‐
> locating a buffer large enough. The return value is useless.
>
> ustr2stp() is a better replacement for this function for the
> following reasons:
>
> • Better performance (chain copy instead of concatenating).
>
> • No need to call strlen(3), thanks to the useful return value.
>
> • Function name that is not actively confusing.
>
> RETURN VALUE
> The following functions return a pointer to the terminating null byte
> in the destination string (they never truncate).
>
> • stpcpy(3)
>
> • ustr2stp()
>
> • mempcpy(3)
>
> The following functions return a pointer to the terminating null byte
> in the destination string, except when truncation occurs; if truncation
> occurs, they return a pointer to one past the end of the destination
> buffer.
>
> • stpecpy()
>
> • stpecpyx()
>
> The following function returns a pointer to one after the last charac‐
> ter in the destination unterminated string; if truncation occurs, that
> pointer is equivalent to a pointer to one past the end of the destina‐
> tion buffer.
>
> • stpncpy(3)
>
> Deprecated
> The following functions return the length of the total string that they
> tried to create (as if truncation didn’t occur).
>
> • strlcpy(3bsd)
>
> • strlcat(3bsd)
>
> The following function returns the length of the destination string, or
> -E2BIG on truncation.
>
> • strscpy(9)
>
> The following functions return the dst pointer, which is useless.
>
> • strcpy(3)
>
> • strcat(3)
>
> • strncpy(3)
>
> • strncat(3)
>
> CAVEATS
And a new caveat. I think it's obvious, but better safe than sorry.
Don’t chain calls to truncating and non‐truncating functions. It is
conceptually wrong unless you know that the first part of a copy will
always fit. Anyway, the performance difference will probably be negli‐
gible, so it will probably be more clear if you use consistent seman‐
tics: either truncating or non‐truncating. Calling a non‐truncating
function after a truncating one is necessarily wrong.
> Some of the functions described here are not provided by any library;
> you should write your own copy if you want to use them.
>
> The deprecated status of these functions varies from system to system.
> This page declares as deprecated those functions that have a better re‐
> placement documented in this same page.
>
> EXAMPLES
> The following are examples of correct use of each of these functions.
>
> stpcpy(3)
> p = buf;
> p = stpcpy(p, "Hello ");
> p = stpcpy(p, "world");
> p = stpcpy(p, "!");
> len = p - buf;
> puts(buf);
>
> stpecpy()
> stpecpyx()
> past_end = buf + sizeof(buf);
> p = buf;
> p = stpecpy(p, past_end, "Hello ");
> p = stpecpy(p, past_end, "world");
> p = stpecpy(p, past_end, "!");
> if (p == past_end) {
> p--;
> goto toolong;
> }
> len = p - buf;
> puts(buf);
>
> stpncpy(3)
> past_end = buf + sizeof(buf);
> end = stpncpy(buf, "Hello world!", sizeof(buf));
> if (end == past_end)
> goto toolong;
> len = end - buf;
> for (size_t i = 0; i < sizeof(buf); i++)
> putchar(buf[i]);
>
> ustr2stp()
> p = buf;
> p = ustr2stp(p, "Hello ", 6);
> p = ustr2stp(p, "world", 42); // Padding null bytes ignored.
> p = ustr2stp(p, "!", 1);
> len = p - buf;
> puts(buf);
>
> mempcpy(3)
> p = buf;
> p = mempcpy(p, "Hello ", 6);
> p = mempcpy(p, "world", 5);
> p = mempcpy(p, "!", 1);
> *p = '\0';
> len = p - buf;
> puts(buf);
>
> Deprecated
> strlcpy(3bsd)
> strlcat(3bsd)
> if (strlcpy(buf, "Hello ", sizeof(buf)) >= sizeof(buf))
> goto toolong;
> if (strlcat(buf, "world", sizeof(buf)) >= sizeof(buf))
> goto toolong;
> len = strlcat(buf, "!", sizeof(buf));
> if (len >= sizeof(buf))
> goto toolong;
> puts(buf);
>
> strscpy(9)
> len = strscpy(buf, "Hello world!", sizeof(buf));
> if (len == -E2BIG)
> goto toolong;
> puts(buf);
>
> strcpy(3)
> strcat(3)
> strcpy(buf, "Hello ");
> strcat(buf, "world");
> strcat(buf, "!");
> len = strlen(buf);
> puts(buf);
>
> strncpy(3)
> strncpy(buf, "Hello world!", sizeof(buf));
> if (buf[sizeof(buf) - 1] != '\0')
> goto toolong;
> len = strnlen(buf, sizeof(buf));
> for (size_t i = 0; i < sizeof(buf); i++)
> putchar(buf[i]);
>
> strncat(3)
> buf[0] = '\0'; // strncat(3) needs dst to be a string.
> strncat(buf, "Hello ", 6);
> strncat(buf, "world", 42); // Padding null bytes ignored.
> strncat(buf, "!", 1);
> puts(buf);
>
> SEE ALSO
> memcpy(3), memccpy(3), mempcpy(3), string(3)
>
> Linux man‐pages (unreleased) (date) string_copy(7)
>
>
>
--
<http://www.alejandro-colomar.es/>
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH 1/3] strcpy.3: Rewrite page to document all string-copying functions
2022-12-11 23:59 string_copy(7): New manual page documenting string copying functions Alejandro Colomar
` (2 preceding siblings ...)
2022-12-12 0:32 ` Alejandro Colomar
@ 2022-12-12 14:24 ` Alejandro Colomar
2022-12-12 17:33 ` Alejandro Colomar
` (4 more replies)
2022-12-12 14:24 ` [PATCH 2/3] stpcpy.3, stpncpy.3, strcat.3, strncat.3, strncpy.3: Transform the old pages into " Alejandro Colomar
2022-12-12 14:24 ` [PATCH 3/3] stpecpy.3, stpecpyx.3, strlcat.3, strlcpy.3, strscpy.3: Add new " Alejandro Colomar
5 siblings, 5 replies; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-12 14:24 UTC (permalink / raw)
To: linux-man; +Cc: Alejandro Colomar
This is an opportunity to use consistent language across the
documentation for all string-copying functions.
It is also easier to show the similarities and differences between all
of the functions, so that a reader can use this page to know which
function is needed for a given task.
Many functions that are inferior to another one, have been marked as
deprecated, notwithstanding the deprecation status in C libraries or
any standards. Alternatives have been given in the same page, with
reference implementations.
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
man3/strcpy.3 | 1053 ++++++++++++++++++++++++++++++++++++++++++++-----
1 file changed, 965 insertions(+), 88 deletions(-)
diff --git a/man3/strcpy.3 b/man3/strcpy.3
index 74c3180ae..661319f0d 100644
--- a/man3/strcpy.3
+++ b/man3/strcpy.3
@@ -1,48 +1,764 @@
-.\" Copyright (C) 1993 David Metcalfe (david@prism.demon.co.uk)
+.\" Copyright 2022 Alejandro Colomar <alx@kernel.org>
.\"
-.\" SPDX-License-Identifier: Linux-man-pages-copyleft
-.\"
-.\" References consulted:
-.\" Linux libc source code
-.\" Lewine's _POSIX Programmer's Guide_ (O'Reilly & Associates, 1991)
-.\" 386BSD man pages
-.\" Modified Sat Jul 24 18:06:49 1993 by Rik Faith (faith@cs.unc.edu)
-.\" Modified Fri Aug 25 23:17:51 1995 by Andries Brouwer (aeb@cwi.nl)
-.\" Modified Wed Dec 18 00:47:18 1996 by Andries Brouwer (aeb@cwi.nl)
-.\" 2007-06-15, Marc Boyer <marc.boyer@enseeiht.fr> + mtk
-.\" Improve discussion of strncpy().
+.\" SPDX-License-Identifier: BSD-3-Clause
.\"
.TH strcpy 3 (date) "Linux man-pages (unreleased)"
+.\" ----- NAME :: -----------------------------------------------------/
.SH NAME
-strcpy \- copy a string
+stpcpy,
+stpecpy, stpecpyx,
+strlcpy, strlcat,
+strscpy,
+strcpy, strcat,
+stpncpy,
+ustr2stp,
+strncpy,
+strncat,
+mempcpy
+\- copy strings
+.\" ----- LIBRARY :: --------------------------------------------------/
.SH LIBRARY
+.TP
+.BR stpcpy (3)
+.TQ
+.BR stpncpy (3)
+.TQ
+.BR mempcpy (3)
+.TQ
+.BR strcpy "(3), \c"
+.BR strcat (3)
+.TQ
+.BR strncpy (3)
+.TQ
+.BR strncat (3)
Standard C library
.RI ( libc ", " \-lc )
+.TP
+.BR stpecpy "(3), \c"
+.BR stpecpyx (3)
+Not provided by any library.
+.TP
+.BR strlcpy "(3), \c"
+.BR strlcat (3)
+Utility functions from BSD systems
+.RI ( libbsd ", " \-lbsd )
+.TP
+.BR strscpy (9)
+Not provided by any library.
+It is a Linux kernel internal function.
+.\" ----- SYNOPSIS :: -------------------------------------------------/
.SH SYNOPSIS
+.\" ----- SYNOPSIS :: (Null-terminated) strings :: --------------------/
.nf
.B #include <string.h>
+.fi
+.SS (Null-terminated) strings
+.nf
+// Chain-copy a string.
+.BI "char *stpcpy(char *restrict " dst ", const char *restrict " src );
.PP
-.BI "char *strcpy(char *restrict " dest ", const char *restrict " src );
+// Chain-copy a string with truncation.
+// Not defined in libc.
+.BI "char *stpecpy(char *" dst ", char " past_end "[0], \
+const char *restrict " src );
+.PP
+// Chain-copy a string with truncation and SIGSEGV on invalid input.
+// Not defined in libc.
+.BI "char *stpecpyx(char *" dst ", char " past_end "[0], \
+const char *restrict " src );
+.PP
+// Copy a string with truncation and SIGSEGV on invalid input.
+.BR [[deprecated]] " // Use stpecpyx(3) instead."
+.BI "size_t strlcpy(char " dst "[restrict ." sz "], \
+const char *restrict " src ,
+.BI " size_t " sz );
+.PP
+// Concatenate a string with truncation.
+.BR [[deprecated]] " // Use stpecpyx(3) instead."
+.BI "size_t strlcat(char " dst "[restrict ." sz "], \
+const char *restrict " src ,
+.BI " size_t " sz );
+.PP
+// Copy a string with truncation.
+// Not defined in libc.
+.BR [[deprecated]] " // Use stpecpy(3) instead."
+.BI "ssize_t strscpy(char " dst "[restrict ." sz "], \
+const char " src "[restrict ." sz ],
+.BI " size_t " sz );
+.PP
+// Copy a string.
+.BR [[deprecated]] " // Use stpcpy(3) instead."
+.BI "char *strcpy(char *restrict " dst ", const char *restrict " src );
+.PP
+// Concatenate a string.
+.BR [[deprecated]] " // Use stpcpy(3) instead."
+.BI "char *strcat(char *restrict " dst ", const char *restrict " src );
+.fi
+.\" ----- SYNOPSIS :: Unterminated strings (null-padded fixed-width buffers)
+.SS Unterminated strings (null-padded fixed-width buffers)
+.nf
+// Zero a fixed-width buffer, and
+// copy a string with truncation into an unterminated string.
+.BI "char *stpncpy(char " dst "[restrict ." sz "], \
+const char *restrict " src ,
+.BI " size_t " sz );
+.PP
+// Chain-copy an unterminated string into a string.
+// Not defined in libc.
+.BI "char *ustr2stp(char *restrict " dst ", \
+const char " src "[restrict ." sz ],
+.BI " size_t " sz );
+.PP
+// Zero a fixed-width buffer, and
+// copy a string with truncation into an unterminated string
+.BR [[deprecated]] " // Use stpncpy(3) instead."
+.BI "char *strncpy(char " dest "[restrict ." sz "], \
+const char *restrict " src ,
+.BI " size_t " sz );
+.PP
+// Concatenate an unterminated string into a string.
+.BR [[deprecated]] " // Use ustr2stp(3) instead."
+.BI "char *strncat(char *restrict " dst ", const char " src "[restrict ." sz ],
+.BI " size_t " sz );
+.fi
+.\" ----- SYNOPSIS :: String structures :: ----------------------------/
+.SS String structures
+.nf
+// (Null-terminated) string structure.
+// Not defined in libc.
+.B struct str_s {
+.B " size_t len;"
+.B " char *str;"
+.B };
+.PP
+// Unterminated string structure (overlapping strings).
+// Not defined in libc.
+.B struct ustr_s {
+.B " size_t len;"
+.B " char *ustr;"
+.B };
+.PP
+// Chain-copy a string structure into an unterminated string.
+.BI "void *mempcpy(void *restrict " dst ", \
+const void " src "[restrict ." len ],
+.BI " size_t " len );
+.fi
+.PP
+.RS -4
+Feature Test Macro Requirements for glibc (see
+.BR feature_test_macros (7)):
+.RE
+.PP
+.BR stpcpy (3),
+.BR stpncpy (3):
+.nf
+ Since glibc 2.10:
+ _POSIX_C_SOURCE >= 200809L
+ Before glibc 2.10:
+ _GNU_SOURCE
+.fi
+.PP
+.BR mempcpy (3):
+.nf
+ _GNU_SOURCE
.fi
.SH DESCRIPTION
-The
-.BR strcpy ()
-function copies the string pointed to by
-.IR src ,
-including the terminating null byte (\(aq\e0\(aq),
-to the buffer pointed to by
-.IR dest .
-The strings may not overlap, and the destination string
-.I dest
-must be large enough to receive the copy.
-.I Beware of buffer overruns!
-(See BUGS.)
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: -----------------/
+.SS Terms (and abbreviations)
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: string (str) ----/
+.TP
+.IR "string " ( str )
+is a sequence of zero or more non-null characters, followed by a null byte.
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: unterminated string (ustr)
+.TP
+.IR "unterminated string " ( ustr )
+is a sequence of zero or more non-null characters.
+They are sometimes contained in fixed-width buffers,
+which usually contain padding null bytes after the unterminated string,
+to fill the rest of the buffer
+without affecting the unterminated string;
+however, those padding null bytes are not part of the unterminated string.
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: length (len) ----/
+.TP
+.IR "length " ( len )
+is the number of non-null characters in a string.
+It is the return value of
+.I strlen(str)
+and of
+.IR "strnlen(ustr, sz)" .
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: size (sz) -------/
+.TP
+.IR "size " ( sz )
+refers to the entire buffer where the string is contained.
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: end -------------/
+.TP
+.I end
+is the name of a pointer to the terminating null byte of a string,
+or a pointer to one past the last character of an unterminated string.
+This is the return value of functions that allow chaining.
+It is equivalent to
+.IR &str[len] .
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: past_end --------/
+.TP
+.I past_end
+is the name of a pointer to one past the end of the buffer
+that contains a string.
+It is equivalent to
+.IR &str[sz] .
+It is used as a sentinel value,
+to be able to truncate strings instead of overrunning a buffer.
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: string structures
+.TP
+.I string structure
+.TQ
+.I unterminated string structure
+Structure that contains the length of a string,
+as well as the string or the unterminated string.
+.\" ----- DESCRIPTION :: Types of functions :: ------------------------/
+.SS Types of functions
+.\" ----- DESCRIPTION :: Types of functions :: Copy, concatenate, and chain-copy
+.TP
+Copy, concatenate, and chain-copy
+Originally,
+there was a distinction between functions that copy and those that concatenate.
+However, newer functions that copy while allowing chaining
+cover both use cases with a single API.
+They are also algorithmically faster,
+since they don't need to search for the end of the existing string.
+.IP
+To chain copy functions,
+they need to return a pointer to the
+.IR end .
+That's a byproduct of the copy operation,
+so it has no performance costs.
+These functions are preferred over copy or concatenation functions.
+Functions that return such a pointer,
+and thus can be chained,
+have names of the form
+.RB * stp *(),
+since it's also common to name the pointer just
+.IR p .
+.IP
+Chain-copying functions that truncate
+should accept a pointer to one past the end of the destination buffer.
+This allows not having to recalculate the remaining size after each call.
+.\" ----- DESCRIPTION :: Types of functions :: Truncate or not? -------/
+.TP
+Truncate or not?
+The first thing to note is that programmers should be careful with buffers,
+so they always have the correct size,
+and truncation is not necessary.
+.IP
+In most cases,
+truncation is not desired,
+and it is simpler to just do the copy.
+Simpler code is safer code.
+Programming against programming mistakes by adding more code
+just adds more points where mistakes can be made.
+.IP
+Nowadays,
+compilers can detect most programmer errors with features like
+compiler warnings,
+static analyzers, and
+.BR \%_FORTIFY_SOURCE
+(see
+.BR ftm (7)).
+Keeping the code simple
+helps these error-detection features be more precise.
+.IP
+When validating user input,
+however,
+it makes sense to truncate.
+Remember to check the return value of such function calls.
+.IP
+Functions that truncate:
+.RS
+.IP \(bu 3
+.BR stpecpy (3)
+is the most efficient string copy function that performs truncation.
+Truncation needs to be checked only once, after all chained calls.
+.IP \(bu
+.BR stpecpyx (3)
+is a variant of
+.BR stpecpy (3)
+that consumes the entire source string,
+to catch bugs in the program
+by forcing a segmentation fault (as
+.BR strlcpy (3bsd)
+and
+.BR strlcat (3bsd)
+do).
+.IP \(bu
+.BR strlcpy (3bsd)
+and
+.BR strlcat (3bsd)
+are designed to crash if the input string is invalid
+(doesn't contain a null byte).
+.IP \(bu
+.BR strscpy (9)
+reports an error instead of crashing (similar to
+.BR stpecpy (3)).
+.IP \(bu
+.BR stpncpy (3)
+and
+.BR strncpy (3)
+also truncate, but they write unterminated strings
+rather than strings.
+.RE
+.\" ----- DESCRIPTION :: Unterminated strings :: ----------------------/
+.SS Unterminated strings (null-padded fixed-width buffers)
+For historic reasons,
+some standard APIs,
+such as
+.BR utmpx (5),
+use unterminated strings in fixed-width buffers.
+To interface with them,
+specialized functions need to be used.
+.PP
+To copy strings into them, use
+.BR stpncpy (3).
+.PP
+To copy from an unterminated string within a fixed-width buffer into a string,
+ignoring any trailing null bytes in the source fixed-width buffer,
+you should use
+.BR ustr2stp (3).
+.\" ----- DESCRIPTION :: String structures :: -------------------------/
+.SS String structures
+The simplest string copying function is
+.BR mempcpy (3).
+It requires always knowing the length of your strings,
+for which string structures can be used.
+It makes the code simpler,
+since you always know the length of your strings,
+and it's also faster,
+since it doesn't need to repeatedly calculate those lengths.
+.BR mempcpy (3)
+always creates an unterminated string,
+so you need to explicitly set the terminating null byte.
+.PP
+.\" ----- DESCRIPTION :: String structures :: String structure --------/
+.TP
+String structure
+The following code can be used to
+chain-copy from a string structure into a string:
+.IP
+.in +4n
+.EX
+p = mempcpy(p, src\->str, src\->len);
+*p = \(aq\e0\(aq;
+.EE
+.in
+.IP
+The following code can be used to
+chain-copy from a string structure into an unterminated string:
+.IP
+.in +4n
+.EX
+p = mempcpy(p, src\->str, src\->len);
+.EE
+.in
+.\" ----- DESCRIPTION :: String structures :: Unterminated string structure
+.TP
+Unterminated string structure (overlapping strings)
+In programs that make considerable use of strings,
+and need the best performance,
+using overlapping strings can make a big difference.
+It allows holding substrings of a bigger string
+while not duplicating memory
+nor using time to do a copy.
+.IP
+However, this is delicate,
+since it requires using unterminated strings.
+C library APIs use strings,
+so programs that use unterminated strings
+will have to take care to differentiate strings from unterminated strings.
+.IP
+The following code can be used to
+chain-copy from an unterminated string structure to a string:
+.IP
+.in +4n
+.EX
+p = mempcpy(p, src\->ustr, src\->len);
+*p = \(aq\e0\(aq;
+.EE
+.in
+.IP
+The following code can be used to
+chain-copy from an unterminated string structure to an unterminated string:
+.IP
+.in +4n
+.EX
+p = mempcpy(p, src\->ustr, src\->len);
+.EE
+.in
+.\" ----- DESCRIPTION :: Functions :: ---------------------------------/
+.SS Functions
+.\" ----- DESCRIPTION :: Functions :: stpcpy(3) -----------------------/
+.TP
+.BR stpcpy (3)
+This function copies the input string into a destination string.
+The programmer is responsible for allocating a buffer large enough.
+It returns a pointer suitable for chaining.
+.IP
+A simple implementation of
+.BR stpcpy (3)
+might be:
+.IP
+.in +4n
+.EX
+char *
+stpcpy(char *restrict dst, const char *restrict src)
+{
+ return (char *) mempcpy(dst, src, strlen(src) + 1) \- 1;
+}
+.EE
+.in
+.\" ----- DESCRIPTION :: Functions :: stpecpy(3), stpecpyx(3) ---------/
+.TP
+.BR stpecpy (3)
+.TQ
+.BR stpecpyx (3)
+These functions copy the input string into a destination string.
+If the destination buffer,
+limited by a pointer to one past the end of it,
+isn't large enough to hold the copy,
+the resulting string is truncated
+(but it is guaranteed to be null-terminated).
+They return a pointer suitable for chaining.
+Truncation needs to be detected only once after the last chained call.
+.BR stpecpyx (3)
+has identical semantics to
+.BR stpecpy (3),
+except that it forces a SIGSEGV on Undefined Behavior.
+.IP
+These functions are not provided by any library,
+but you can define them with the following reference implementations:
+.IP
+.in +4n
+.EX
+/* This code is in the public domain. */
+char *
+stpecpy(char *dst, char past_end[0],
+ const char *restrict src)
+{
+ char *p;
+
+ if (dst == past_end)
+ return past_end;
+
+ p = memccpy(dst, src, \(aq\e0\(aq, past_end \- dst);
+ if (p != NULL)
+ return p \- 1;
+
+ /* truncation detected */
+ past_end[\-1] = \(aq\e0\(aq;
+ return past_end;
+}
+
+/* This code is in the public domain. */
+char *
+stpecpyx(char *dst, char past_end[0],
+ const char *restrict src)
+{
+ if (src[strlen(src)] != \(aq\e0\(aq)
+ raise(SIGSEGV);
+
+ return stpecpy(dst, past_end, src);
+}
+.EE
+.in
+.\" ----- DESCRIPTION :: Functions :: stpncpy(3) ----------------------/
+.TP
+.BR stpncpy (3)
+This function copies the input string into
+a destination null-padded fixed-width unterminated string.
+If the destination buffer,
+limited by its size,
+isn't large enough to hold the copy,
+the resulting string is truncated.
+Since it creates an unterminated string,
+it doesn't need to write a terminating null byte.
+It returns a pointer suitable for chaining,
+but it's not ideal for that.
+Truncation needs to be detected only once after the last chained call.
+.IP
+If you're going to use this function in chained calls,
+it would be useful to develop a similar function
+that accepts a pointer to one past the end of the buffer instead of a size.
+.IP
+A simple implementation of
+.BR stpncpy (3)
+might be:
+.IP
+.in +4n
+.EX
+char *
+stpncpy(char *restrict dst, const char *restrict src,
+ size_t sz)
+{
+ char *p;
+
+ bzero(dst, sz);
+ p = memccpy(dst, src, \(aq\e0\(aq, sz);
+ if (p == NULL)
+ return dst + sz;
+
+ return p \- 1;
+}
+.EE
+.in
+.\" ----- DESCRIPTION :: Functions :: ustr2stp(3) ---------------------/
+.TP
+.BR ustr2stp (3)
+This function copies the input unterminated string
+contained in a null-padded fixed-width buffer,
+into a destination (null-terminated) string.
+The programmer is responsible for allocating a buffer large enough.
+It returns a pointer suitable for chaining.
+.IP
+A truncating version of this function doesn't exist,
+since the size of the original string is always known,
+so it wouldn't be very useful.
+.IP
+This function is not provided by any library,
+but you can define it with the following reference implementation:
+.IP
+.in +4n
+.EX
+/* This code is in the public domain. */
+char *
+ustr2stp(char *restrict dst, const char *restrict src,
+ size_t sz)
+{
+ char *end;
+
+ end = mempcpy(dst, src, strnlen(src, sz));
+ *end = \(aq\e0\(aq;
+
+ return end;
+}
+.EE
+.in
+.\" ----- DESCRIPTION :: Functions :: mempcpy(3) ----------------------/
+.TP
+.BR mempcpy (3)
+This function copies the input string,
+limited by its length,
+into a destination unterminated string.
+The programmer is responsible for allocating a buffer large enough.
+It returns a pointer suitable for chaining.
+.IP
+A simple implementation of
+.BR mempcpy (3)
+might be:
+.IP
+.in +4n
+.EX
+void *
+mempcpy(void *restrict dst, const void *restrict src,
+ size_t len)
+{
+ return (char *) memcpy(dst, src, len) + len;
+}
+.EE
+.in
+.\" ----- DESCRIPTION :: Deprecated functions :: ----------------------/
+.SS Deprecated functions
+.\" ----- DESCRIPTION :: Deprecated functions :: strlcpy(3bsd), strlcat(3bsd)
+.TP
+.BR strlcpy (3bsd)
+.TQ
+.BR strlcat (3bsd)
+.IR Deprecated .
+These functions copy the input string into a destination string.
+If the destination buffer,
+limited by its size,
+isn't large enough to hold the copy,
+the resulting string is truncated
+(but it is guaranteed to be null-terminated).
+They return the length of the total string they tried to create.
+These functions force a SIGSEGV on Undefined Behavior.
+.IP
+.BR stpecpyx (3)
+is a better replacement for these functions.
+.\" ----- DESCRIPTION :: Deprecated functions :: strscpy(9) -----------/
+.TP
+.BR strscpy (9)
+.IR Deprecated .
+This function copies the input string into a destination string.
+If the destination buffer,
+limited by its size,
+isn't large enough to hold the copy,
+the resulting string is truncated
+(but it is guaranteed to be null-terminated).
+It returns the length of the destination string, or
+.B \-E2BIG
+on truncation.
+.IP
+.BR stpecpy (3)
+is a better replacement for this function.
+.RE
+.\" ----- DESCRIPTION :: Deprecated functions :: strcpy(3), strcat(3) -/
+.TP
+.BR strcpy (3)
+.TQ
+.BR strcat (3)
+.IR Deprecated .
+These functions copy the input string into a destination string.
+The programmer is responsible for allocating a buffer large enough.
+The return value is useless.
+.IP
+.BR stpcpy (3)
+is a better replacement for these functions.
+.IP
+A simple implementation of
+.BR strcpy (3)
+and
+.BR strcat (3)
+might be:
+.IP
+.in +4n
+.EX
+char *
+strcpy(char *restrict dst, const char *restrict src)
+{
+ stpcpy(dst, src);
+ return dst;
+}
+
+char *
+strcat(char *restrict dst, const char *restrict src)
+{
+ stpcpy(dst + strlen(dst), src);
+ return dst;
+}
+.EE
+.in
+.\" ----- DESCRIPTION :: Deprecated functions :: strncpy(3) -----------/
+.TP
+.BR strncpy (3)
+.IR Deprecated .
+.BR strncpy (3)
+is identical to
+.BR stpncpy (3)
+except for the useless return value.
+Due to the return value,
+with this function it's hard to correctly check for truncation.
+.IP
+.BR stpncpy (3)
+is a better replacement for this function.
+.IP
+A simple implementation of
+.BR strncpy (3)
+might be:
+.IP
+.in +4n
+.EX
+char *
+strncpy(char *restrict dst, const char *restrict src,
+ size_t sz)
+{
+ stpncpy(dst, src, sz);
+ return dst;
+}
+.EE
+.in
+.\" ----- DESCRIPTION :: Deprecated functions :: strncat(3) -----------/
+.TP
+.BR strncat (3)
+.IR Deprecated .
+Do not confuse this function with
+.BR strncpy (3);
+they are not related at all.
+.IP
+This function concatenates the input unterminated string
+contained in a null-padded fixed-width buffer,
+into a destination (null-terminated) string.
+The programmer is responsible for allocating a buffer large enough.
+The return value is useless.
+.IP
+.BR ustr2stp (3)
+is a better replacement for this function.
+.IP
+A simple implementation of
+.BR strncat (3)
+might be:
+.IP
+.in +4n
+.EX
+char *
+strncat(char *restrict dst, const char *restrict src,
+ size_t sz)
+{
+ ustr2stp(dst + strlen(dst), src, sz);
+ return dst;
+}
+.EE
+.in
+.\" ----- RETURN VALUE :: ---------------------------------------------/
.SH RETURN VALUE
-The
-.BR strcpy ()
-function returns a pointer to
-the destination string
-.IR dest .
+The following functions return
+a pointer to the terminating null byte in the destination string.
+.PD 0
+.IP \(bu 3
+.BR stpcpy (3)
+.IP \(bu
+.BR ustr2stp (3)
+.PD
+.PP
+The following functions return
+a pointer to the terminating null byte in the destination string,
+except when truncation occurs;
+if truncation occurs,
+they return a pointer to one past the end of the destination buffer.
+.IP \(bu 3
+.BR stpecpy (3),
+.BR stpecpyx (3)
+.PP
+The following function returns
+a pointer to one after the last character
+in the destination unterminated string;
+if truncation occurs,
+that pointer is equivalent to
+a pointer to one past the end of the destination buffer.
+.IP \(bu 3
+.BR stpncpy (3)
+.PP
+The following function returns
+a pointer to one after the last character
+in the destination unterminated string.
+.IP \(bu 3
+.BR mempcpy (3)
+.\" ----- RETURN VALUE :: Deprecated ----------------------------------/
+.SS Deprecated
+The following functions return
+the length of the total string that they tried to create
+(as if truncation didn't occur).
+.IP \(bu 3
+.BR strlcpy (3bsd),
+.BR strlcat (3bsd)
+.PP
+The following function returns
+the length of the destination string, or
+.B \-E2BIG
+on truncation.
+.IP \(bu 3
+.BR strscpy (9)
+.PP
+The following functions return the
+.I dst
+pointer,
+which is useless.
+.PD 0
+.IP \(bu 3
+.BR strcpy (3),
+.BR strcat (3)
+.IP \(bu
+.BR strncpy (3)
+.IP \(bu
+.BR strncat (3)
+.PD
+.\" ----- ATTRIBUTES :: -----------------------------------------------/
.SH ATTRIBUTES
For an explanation of the terms used in this section, see
.BR attributes (7).
@@ -54,73 +770,234 @@ .SH ATTRIBUTES
l l l.
Interface Attribute Value
T{
-.BR strcpy ()
+.BR stpcpy (),
+.BR stpecpy (),
+.BR stpecpyx (),
+.BR strlcpy (),
+.BR strlcat (),
+.BR strscpy (),
+.BR strcpy (),
+.BR strcat (),
+.BR stpncpy (),
+.BR ustr2stp (),
+.BR strncpy (),
+.BR strncat (),
+.BR mempcpy ()
T} Thread safety MT-Safe
.TE
.hy
.ad
.sp 1
+.\" ----- STANDARDS :: ------------------------------------------------/
.SH STANDARDS
-POSIX.1-2001, POSIX.1-2008, C89, C99, SVr4, 4.3BSD.
-.SH NOTES
-.SS strlcpy()
-Some systems (the BSDs, Solaris, and others) provide the following function:
+.TP
+.BR stpcpy (3)
+.\" This function was added to POSIX.1-2008.
+.\" Before that, it was not part of
+.\" the C or POSIX.1 standards, nor customary on UNIX systems.
+.\" It first appeared at least as early as 1986,
+.\" in the Lattice C AmigaDOS compiler,
+.\" then in the GNU fileutils and GNU textutils in 1989,
+.\" and in the GNU C library by 1992.
+.\" It is also present on the BSDs.
+.TQ
+.BR stpncpy (3)
+.\" This function was added to POSIX.1-2008.
+.\" Before that, it was a GNU extension.
+.\" It first appeared in glibc 1.07 in 1993.
+POSIX.1-2008.
+.TP
+.BR stpecpy "(3), \c"
+.BR stpecpyx (3)
+.TQ
+.BR ustr2stp (3)
+Not defined by any standard or library.
+.TP
+.BR mempcpy (3)
+This function is a GNU extension.
+.TP
+.BR strlcpy "(3bsd), \c"
+.BR strlcat (3bsd)
+These functions originated in OpenBSD and are present on some UNIX systems.
+.TP
+.BR strscpy (9)
+Linux kernel internal function.
+.TP
+.BR strcpy "(3), \c"
+.BR strcat (3)
+.TQ
+.BR strncpy (3)
+.TQ
+.BR strncat (3)
+POSIX.1-2001, POSIX.1-2008, C89, C99, SVr4, 4.3BSD.
+.\" ----- CAVEATS :: --------------------------------------------------/
+.SH CAVEATS
+Don't mix chain calls to truncating and non-truncating functions.
+It is conceptually wrong
+unless you know that the first part of a copy will always fit.
+The performance difference is probably negligible anyway,
+so the code will be clearer if it uses consistent semantics:
+either truncating or non-truncating.
+Calling a non-truncating function after a truncating one is necessarily wrong.
.PP
+Some of the functions described here are not provided by any library;
+you should write your own copy if you want to use them.
+See STANDARDS.
+.PP
+The deprecation status of these functions varies from system to system.
+This page declares as deprecated
+those functions that have a better replacement documented in this same page.
+.\" ----- EXAMPLES :: -------------------------------------------------/
+.SH EXAMPLES
+The following are examples of correct use of each of these functions.
+.\" ----- EXAMPLES :: stpcpy(3) ---------------------------------------/
+.TP
+.BR stpcpy (3)
.in +4n
.EX
-size_t strlcpy(char *dest, const char *src, size_t size);
+p = buf;
+p = stpcpy(p, "Hello ");
+p = stpcpy(p, "world");
+p = stpcpy(p, "!");
+len = p \- buf;
+puts(buf);
.EE
.in
-.PP
-.\" http://static.usenix.org/event/usenix99/full_papers/millert/millert_html/index.html
-.\" "strlcpy and strlcat - consistent, safe, string copy and concatenation"
-.\" 1999 USENIX Annual Technical Conference
-This function is similar to
-.BR strcpy (),
-but it copies at most
-.I size\-1
-bytes to
-.IR dest ,
-truncating the string as necessary.
-It always adds a terminating null byte.
-This function fixes some of the problems of
-.BR strcpy ()
-but the caller must still handle the possibility of data loss if
-.I size
-is too small.
-The return value of the function is the length of
-.IR src ,
-which allows truncation to be easily detected:
-if the return value is greater than or equal to
-.IR size ,
-truncation occurred.
-If loss of data matters, the caller
-.I must
-either check the arguments before the call,
-or test the function return value.
-.BR strlcpy ()
-is not present in glibc and is not standardized by POSIX,
-.\" https://lwn.net/Articles/506530/
-but is available on Linux via the
-.I libbsd
-library.
-.SH BUGS
-If the destination string of a
-.BR strcpy ()
-is not large enough, then anything might happen.
-Overflowing fixed-length string buffers is a favorite cracker technique
-for taking complete control of the machine.
-Any time a program reads or copies data into a buffer,
-the program first needs to check that there's enough space.
-This may be unnecessary if you can show that overflow is impossible,
-but be careful: programs can get changed over time,
-in ways that may make the impossible possible.
+.\" ----- EXAMPLES :: stpecpy(3), stpecpyx(3) -------------------------/
+.TP
+.BR stpecpy (3)
+.TQ
+.BR stpecpyx (3)
+.in +4n
+.EX
+past_end = buf + sizeof(buf);
+p = buf;
+p = stpecpy(p, past_end, "Hello ");
+p = stpecpy(p, past_end, "world");
+p = stpecpy(p, past_end, "!");
+if (p == past_end) {
+ p\-\-;
+ goto toolong;
+}
+len = p \- buf;
+puts(buf);
+.EE
+.in
+.\" ----- EXAMPLES :: stpncpy(3) --------------------------------------/
+.TP
+.BR stpncpy (3)
+.in +4n
+.EX
+past_end = buf + sizeof(buf);
+end = stpncpy(buf, "Hello world!", sizeof(buf));
+if (end == past_end)
+ goto toolong;
+len = end \- buf;
+for (size_t i = 0; i < sizeof(buf); i++)
+ putchar(buf[i]);
+.EE
+.in
+.\" ----- EXAMPLES :: ustr2stp(3) -------------------------------------/
+.TP
+.BR ustr2stp (3)
+.in +4n
+.EX
+p = buf;
+p = ustr2stp(p, "Hello ", 6);
+p = ustr2stp(p, "world", 42); // Padding null bytes ignored.
+p = ustr2stp(p, "!", 1);
+len = p \- buf;
+puts(buf);
+.EE
+.in
+.\" ----- EXAMPLES :: mempcpy(3) --------------------------------------/
+.TP
+.BR mempcpy (3)
+.in +4n
+.EX
+p = buf;
+p = mempcpy(p, "Hello ", 6);
+p = mempcpy(p, "world", 5);
+p = mempcpy(p, "!", 1);
+*p = \(aq\e0\(aq;
+len = p \- buf;
+puts(buf);
+.EE
+.in
+.\" ----- EXAMPLES :: Deprecated :: -----------------------------------/
+.SS Deprecated
+.\" ----- EXAMPLES :: Deprecated :: strlcpy(3bsd), strlcat(3bsd) ------/
+.TP
+.BR strlcpy (3bsd)
+.TQ
+.BR strlcat (3bsd)
+.in +4n
+.EX
+if (strlcpy(buf, "Hello ", sizeof(buf)) >= sizeof(buf))
+ goto toolong;
+if (strlcat(buf, "world", sizeof(buf)) >= sizeof(buf))
+ goto toolong;
+len = strlcat(buf, "!", sizeof(buf));
+if (len >= sizeof(buf))
+ goto toolong;
+puts(buf);
+.EE
+.in
+.\" ----- EXAMPLES :: Deprecated :: strscpy(9) ------------------------/
+.TP
+.BR strscpy (9)
+.in +4n
+.EX
+len = strscpy(buf, "Hello world!", sizeof(buf));
+if (len == \-E2BIG)
+ goto toolong;
+puts(buf);
+.EE
+.in
+.\" ----- EXAMPLES :: Deprecated :: strcpy(3), strcat(3) --------------/
+.TP
+.BR strcpy (3)
+.TQ
+.BR strcat (3)
+.in +4n
+.EX
+strcpy(buf, "Hello ");
+strcat(buf, "world");
+strcat(buf, "!");
+len = strlen(buf);
+puts(buf);
+.EE
+.in
+.\" ----- EXAMPLES :: Deprecated :: strncpy(3) ------------------------/
+.TP
+.BR strncpy (3)
+.in +4n
+.EX
+strncpy(buf, "Hello world!", sizeof(buf));
+if (buf[sizeof(buf) \- 1] != \(aq\e0\(aq)
+ goto toolong;
+len = strnlen(buf, sizeof(buf));
+for (size_t i = 0; i < sizeof(buf); i++)
+ putchar(buf[i]);
+.EE
+.in
+.\" ----- EXAMPLES :: Deprecated :: strncat(3) ------------------------/
+.TP
+.BR strncat (3)
+.in +4n
+.EX
+buf[0] = \(aq\e0\(aq; // There's no 'cpy' function to this 'cat'.
+strncat(buf, "Hello ", 6);
+strncat(buf, "world", 42); // Padding null bytes ignored.
+strncat(buf, "!", 1);
+len = strlen(buf);
+puts(buf);
+.EE
+.in
+.\" ----- SEE ALSO :: -------------------------------------------------/
.SH SEE ALSO
-.BR bcopy (3),
-.BR memccpy (3),
+.BR bzero (3),
.BR memcpy (3),
-.BR memmove (3),
-.BR stpcpy (3),
-.BR strdup (3),
-.BR string (3),
-.BR wcscpy (3)
+.BR memccpy (3),
+.BR mempcpy (3),
+.BR string (3)
--
2.38.1
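For reference, a minimal self-contained sketch of the truncating chain copy shown under EXAMPLES in the patch above. It assumes only the semantics given under RETURN VALUE (a pointer to the terminating null byte, or past_end on truncation). The loop-based body and the build_greeting() helper are illustrative assumptions, not the page's reference implementation, and a plain pointer stands in for the char past_end[0] notation so that the code stays within ISO C.

```c
#include <stddef.h>

/* Sketch only: copy src into [dst, past_end), always null-terminating.
 * Returns a pointer to the terminating null byte, or past_end if the
 * copy was truncated (or if an earlier call in the chain already was). */
char *
stpecpy(char *dst, char *past_end, const char *restrict src)
{
    size_t  room;

    if (dst == past_end)
        return past_end;        /* Earlier truncation: propagate it. */

    room = past_end - dst;
    for (size_t i = 0; i < room; i++) {
        dst[i] = src[i];
        if (src[i] == '\0')
            return dst + i;     /* Everything fit. */
    }

    past_end[-1] = '\0';        /* Truncated: terminate in the last byte. */
    return past_end;
}

/* Hypothetical helper: builds "Hello world!" in buf.
 * Returns the string length, or -1 if the result was truncated. */
ptrdiff_t
build_greeting(char *buf, size_t size)
{
    char  *p, *past_end;

    past_end = buf + size;
    p = buf;
    p = stpecpy(p, past_end, "Hello ");
    p = stpecpy(p, past_end, "world");
    p = stpecpy(p, past_end, "!");
    if (p == past_end)
        return -1;

    return p - buf;
}
```

Because a truncated call returns past_end and later calls pass it through unchanged, a single check after the last call in the chain is enough.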
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH 2/3] stpcpy.3, stpncpy.3, strcat.3, strncat.3, strncpy.3: Transform the old pages into links to strcpy(3)
2022-12-11 23:59 string_copy(7): New manual page documenting string copying functions Alejandro Colomar
` (3 preceding siblings ...)
2022-12-12 14:24 ` [PATCH 1/3] strcpy.3: Rewrite page to document all string-copying functions Alejandro Colomar
@ 2022-12-12 14:24 ` Alejandro Colomar
2022-12-12 14:24 ` [PATCH 3/3] stpecpy.3, stpecpyx.3, strlcat.3, strlcpy.3, strscpy.3: Add new " Alejandro Colomar
5 siblings, 0 replies; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-12 14:24 UTC (permalink / raw)
To: linux-man; +Cc: Alejandro Colomar
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
man3/stpcpy.3 | 115 +--------------------------------
man3/stpncpy.3 | 123 +----------------------------------
man3/strcat.3 | 161 +--------------------------------------------
man3/strncat.3 | 172 +------------------------------------------------
man3/strncpy.3 | 130 +------------------------------------
5 files changed, 5 insertions(+), 696 deletions(-)
diff --git a/man3/stpcpy.3 b/man3/stpcpy.3
index 5770790fc..ff7476a84 100644
--- a/man3/stpcpy.3
+++ b/man3/stpcpy.3
@@ -1,114 +1 @@
-.\" Copyright 1995 James R. Van Zandt <jrv@vanzandt.mv.com>
-.\"
-.\" SPDX-License-Identifier: Linux-man-pages-copyleft
-.\"
-.TH stpcpy 3 (date) "Linux man-pages (unreleased)"
-.SH NAME
-stpcpy \- copy a string returning a pointer to its end
-.SH LIBRARY
-Standard C library
-.RI ( libc ", " \-lc )
-.SH SYNOPSIS
-.nf
-.B #include <string.h>
-.PP
-.BI "char *stpcpy(char *restrict " dest ", const char *restrict " src );
-.fi
-.PP
-.RS -4
-Feature Test Macro Requirements for glibc (see
-.BR feature_test_macros (7)):
-.RE
-.PP
-.BR stpcpy ():
-.nf
- Since glibc 2.10:
- _POSIX_C_SOURCE >= 200809L
- Before glibc 2.10:
- _GNU_SOURCE
-.fi
-.SH DESCRIPTION
-The
-.BR stpcpy ()
-function copies the string pointed to by
-.I src
-(including the terminating null byte (\(aq\e0\(aq)) to the array pointed to by
-.IR dest .
-The strings may not overlap, and the destination string
-.I dest
-must be large enough to receive the copy.
-.SH RETURN VALUE
-.BR stpcpy ()
-returns a pointer to the
-.B end
-of the string
-.I dest
-(that is, the address of the terminating null byte)
-rather than the beginning.
-.SH ATTRIBUTES
-For an explanation of the terms used in this section, see
-.BR attributes (7).
-.ad l
-.nh
-.TS
-allbox;
-lbx lb lb
-l l l.
-Interface Attribute Value
-T{
-.BR stpcpy ()
-T} Thread safety MT-Safe
-.TE
-.hy
-.ad
-.sp 1
-.SH STANDARDS
-This function was added to POSIX.1-2008.
-Before that, it was not part of
-the C or POSIX.1 standards, nor customary on UNIX systems.
-It first appeared at least as early as 1986,
-in the Lattice C AmigaDOS compiler,
-then in the GNU fileutils and GNU textutils in 1989,
-and in the GNU C library by 1992.
-It is also present on the BSDs.
-.SH BUGS
-This function may overrun the buffer
-.IR dest .
-.SH EXAMPLES
-For example, this program uses
-.BR stpcpy ()
-to concatenate
-.B foo
-and
-.B bar
-to produce
-.BR foobar ,
-which it then prints.
-.PP
-.\" SRC BEGIN (stpcpy.c)
-.EX
-#define _GNU_SOURCE
-#include <stdio.h>
-#include <string.h>
-
-int
-main(void)
-{
- char buffer[20];
- char *to = buffer;
-
- to = stpcpy(to, "foo");
- to = stpcpy(to, "bar");
- printf("%s\en", buffer);
-}
-.EE
-.\" SRC END
-.SH SEE ALSO
-.BR bcopy (3),
-.BR memccpy (3),
-.BR memcpy (3),
-.BR memmove (3),
-.BR stpncpy (3),
-.BR strcpy (3),
-.BR string (3),
-.BR wcpcpy (3)
+.so man3/strcpy.3
diff --git a/man3/stpncpy.3 b/man3/stpncpy.3
index 0a62e3055..ff7476a84 100644
--- a/man3/stpncpy.3
+++ b/man3/stpncpy.3
@@ -1,122 +1 @@
-.\" Copyright (c) Bruno Haible <haible@clisp.cons.org>
-.\" Copyright (c) 2022 Alejandro Colomar <alx@kernel.org>
-.\"
-.\" SPDX-License-Identifier: GPL-2.0-or-later
-.\"
-.\" References consulted:
-.\" GNU glibc-2 source code and manual
-.\"
-.\" Corrected, aeb, 990824
-.TH stpncpy 3 (date) "Linux man-pages (unreleased)"
-.SH NAME
-stpncpy \- copy string into a fixed-length buffer and zero the rest of it
-.SH LIBRARY
-Standard C library
-.RI ( libc ", " \-lc )
-.SH SYNOPSIS
-.nf
-.B #include <string.h>
-.PP
-.BI "char *stpncpy(char " dest "[restrict ." n "], \
-const char " src "[restrict ." n ],
-.BI " size_t " n );
-.fi
-.PP
-.RS -4
-Feature Test Macro Requirements for glibc (see
-.BR feature_test_macros (7)):
-.RE
-.PP
-.BR stpncpy ():
-.nf
- Since glibc 2.10:
- _POSIX_C_SOURCE >= 200809L
- Before glibc 2.10:
- _GNU_SOURCE
-.fi
-.SH DESCRIPTION
-.IR Note :
-This is probably not the function you want to use.
-For string copying with truncation, see
-.BR strlcpy (3bsd).
-.PP
-The
-.BR stpncpy ()
-function copies at most
-.I n
-characters of
-.I src
-and fills the rest of the
-.I dest
-buffer with null bytes.
-.BR Warning :
-If there is no null character among the first
-.I n
-bytes of
-.IR src ,
-the string placed in
-.I dest
-will not be null-terminated.
-.PP
-A simple implementation of
-.BR strncpy ()
-might be:
-.PP
-.in +4n
-.EX
-char *
-stpncpy(char *dest, const char *src, size_t n)
-{
- char *p
-
- bzero(dest, n);
- p = memccpy(dest, src, \(aq\e0\(aq, n);
- if (p == NULL)
- return dest + n;
-
- return p - 1;
-}
-.EE
-.in
-.PP
-The use of
-.BR strncpy ()
-is to copy a C string to a fixed-length buffer
-while ensuring that unused bytes in the destination buffer are zeroed out
-(perhaps to prevent information leaks if the buffer is to be
-written to media or transmitted to another process via an
-interprocess communication technique).
-.SH RETURN VALUE
-.BR stpncpy ()
-returns a pointer to the terminating null byte
-in
-.IR dest ,
-or, if
-.I dest
-is not null-terminated,
-.IR dest + n
-(that is, a pointer to one-past-the-end of the array).
-.SH ATTRIBUTES
-For an explanation of the terms used in this section, see
-.BR attributes (7).
-.ad l
-.nh
-.TS
-allbox;
-lbx lb lb
-l l l.
-Interface Attribute Value
-T{
-.BR stpncpy ()
-T} Thread safety MT-Safe
-.TE
-.hy
-.ad
-.sp 1
-.SH STANDARDS
-This function was added to POSIX.1-2008.
-Before that, it was a GNU extension.
-It first appeared in glibc 1.07 in 1993.
-.SH SEE ALSO
-.BR strlcpy (3bsd)
-.BR wcpncpy (3)
+.so man3/strcpy.3
diff --git a/man3/strcat.3 b/man3/strcat.3
index 277e5b1e4..ff7476a84 100644
--- a/man3/strcat.3
+++ b/man3/strcat.3
@@ -1,160 +1 @@
-.\" Copyright 1993 David Metcalfe (david@prism.demon.co.uk)
-.\"
-.\" SPDX-License-Identifier: Linux-man-pages-copyleft
-.\"
-.\" References consulted:
-.\" Linux libc source code
-.\" Lewine's _POSIX Programmer's Guide_ (O'Reilly & Associates, 1991)
-.\" 386BSD man pages
-.\" Modified Sat Jul 24 18:11:47 1993 by Rik Faith (faith@cs.unc.edu)
-.\" 2007-06-15, Marc Boyer <marc.boyer@enseeiht.fr> + mtk
-.\" Improve discussion of strncat().
-.TH strcat 3 (date) "Linux man-pages (unreleased)"
-.SH NAME
-strcat \- concatenate two strings
-.SH LIBRARY
-Standard C library
-.RI ( libc ", " \-lc )
-.SH SYNOPSIS
-.nf
-.B #include <string.h>
-.PP
-.BI "char *strcat(char *restrict " dest ", const char *restrict " src );
-.fi
-.SH DESCRIPTION
-The
-.BR strcat ()
-function appends the
-.I src
-string to the
-.I dest
-string,
-overwriting the terminating null byte (\(aq\e0\(aq) at the end of
-.IR dest ,
-and then adds a terminating null byte.
-The strings may not overlap, and the
-.I dest
-string must have
-enough space for the result.
-If
-.I dest
-is not large enough, program behavior is unpredictable;
-.IR "buffer overruns are a favorite avenue for attacking secure programs" .
-.SH RETURN VALUE
-The
-.BR strcat ()
-function returns a pointer to the resulting string
-.IR dest .
-.SH ATTRIBUTES
-For an explanation of the terms used in this section, see
-.BR attributes (7).
-.ad l
-.nh
-.TS
-allbox;
-lbx lb lb
-l l l.
-Interface Attribute Value
-T{
-.BR strcat (),
-.BR strncat ()
-T} Thread safety MT-Safe
-.TE
-.hy
-.ad
-.sp 1
-.SH STANDARDS
-POSIX.1-2001, POSIX.1-2008, C89, C99, SVr4, 4.3BSD.
-.SH NOTES
-Some systems (the BSDs, Solaris, and others) provide the following function:
-.PP
-.in +4n
-.EX
-size_t strlcat(char *dest, const char *src, size_t size);
-.EE
-.in
-.PP
-This function appends the null-terminated string
-.I src
-to the string
-.IR dest ,
-copying at most
-.I size\-strlen(dest)\-1
-from
-.IR src ,
-and adds a terminating null byte to the result,
-.I unless
-.I size
-is less than
-.IR strlen(dest) .
-This function fixes the buffer overrun problem of
-.BR strcat (),
-but the caller must still handle the possibility of data loss if
-.I size
-is too small.
-The function returns the length of the string
-.BR strlcat ()
-tried to create; if the return value is greater than or equal to
-.IR size ,
-data loss occurred.
-If data loss matters, the caller
-.I must
-either check the arguments before the call, or test the function return value.
-.BR strlcat ()
-is not present in glibc and is not standardized by POSIX,
-.\" https://lwn.net/Articles/506530/
-but is available on Linux via the
-.I libbsd
-library.
-.\"
-.SH EXAMPLES
-Because
-.BR strcat ()
-must find the null byte that terminates the string
-.I dest
-using a search that starts at the beginning of the string,
-the execution time of this function
-scales according to the length of the string
-.IR dest .
-This can be demonstrated by running the program below.
-(If the goal is to concatenate many strings to one target,
-then manually copying the bytes from each source string
-while maintaining a pointer to the end of the target string
-will provide better performance.)
-.\"
-.SS Program source
-\&
-.\" SRC BEGIN (strcat.c)
-.EX
-#include <stdint.h>
-#include <stdio.h>
-#include <string.h>
-#include <time.h>
-
-int
-main(void)
-{
-#define LIM 4000000
- char p[LIM + 1]; /* +1 for terminating null byte */
- time_t base;
-
- base = time(NULL);
- p[0] = \(aq\e0\(aq;
-
- for (unsigned int j = 0; j < LIM; j++) {
- if ((j % 10000) == 0)
- printf("%u %jd\en", j, (intmax_t) (time(NULL) \- base));
- strcat(p, "a");
- }
-}
-.EE
-.\" SRC END
-.SH SEE ALSO
-.BR bcopy (3),
-.BR memccpy (3),
-.BR memcpy (3),
-.BR strcpy (3),
-.BR string (3),
-.BR strlcat (3bsd),
-.BR wcscat (3),
-.BR wcsncat (3)
+.so man3/strcpy.3
diff --git a/man3/strncat.3 b/man3/strncat.3
index 6e4bf6d78..ff7476a84 100644
--- a/man3/strncat.3
+++ b/man3/strncat.3
@@ -1,171 +1 @@
-.\" Copyright 2022 Alejandro Colomar <alx@kernel.org>
-.\"
-.\" SPDX-License-Identifier: Linux-man-pages-copyleft
-.\"
-.TH strncat 3 (date) "Linux man-pages (unreleased)"
-.SH NAME
-strncat \- concatenate an unterminated string into a string
-.SH LIBRARY
-Standard C library
-.RI ( libc ", " \-lc )
-.SH SYNOPSIS
-.nf
-.B #include <string.h>
-.PP
-.BI "char *strncat(char " dest "[restrict strlen(." dest ") + ." n " + 1],"
-.BI " const char " src "[restrict ." n ],
-.BI " size_t " n );
-.fi
-.SH DESCRIPTION
-.IR Note :
-This is probably not the function you want to use.
-For string concatenation with truncation, see
-.BR strlcat (3bsd).
-For copying or concatenating a string into a fixed-length buffer
-with zeroing of the rest, see
-.BR stpncpy (3).
-.PP
-.BR strncat ()
-appends at most
-.I n
-characters of
-.I src
-to the end of
-.IR dst .
-It always terminates with a null character the string placed in
-.IR dest .
-.PP
-An implementation of
-.BR strncat ()
-might be:
-.PP
-.in +4n
-.EX
-char *
-strncat(char *dest, const char *src, size_t n)
-{
- char *cat;
- size_t len;
-
- cat = dest + strlen(dest);
- len = strnlen(src, n);
- memcpy(cat, src, len);
- cat[len] = \(aq\e0\(aq;
-
- return dest;
-}
-.EE
-.in
-.SH RETURN VALUE
-.BR strncat ()
-returns a pointer to the resulting string
-.IR dest .
-.SH ATTRIBUTES
-For an explanation of the terms used in this section, see
-.BR attributes (7).
-.ad l
-.nh
-.TS
-allbox;
-lbx lb lb
-l l l.
-Interface Attribute Value
-T{
-.BR strncat ()
-T} Thread safety MT-Safe
-.TE
-.hy
-.ad
-.sp 1
-.SH STANDARDS
-POSIX.1-2001, POSIX.1-2008, C89, C99, SVr4, 4.3BSD.
-.SH NOTES
-.SS ustr2stpe()
-You may want to write your own function similar to
-.BR strncpy (),
-with the following improvements:
-.IP \(bu 3
-Copy, instead of concatenating.
-There's no equivalent of
-.BR strncat ()
-that copies instead of concatenating.
-.IP \(bu
-Allow chaining the function,
-by returning a suitable pointer.
-Copy chaining is faster than concatenating.
-.IP \(bu
-Don't check for null characters in the middle of the unterminated string.
-If the string is terminated, this function should not be used.
-If the string is unterminated, it is unnecessary.
-.IP \(bu
-A name that tells what it does:
-Copy from an
-.IR u nterminated
-.IR str ing
-to a
-.IR st ring,
-and return a
-.IR p ointer
-to its end.
-.PP
-.in +4n
-.EX
-/* This code is in the public domain.
- *
- * char *ustr2stp(char dst[restrict .n+1],
- * const char src[restrict .n],
- * size_t len);
- */
-char *
-ustr2stp(char *restrict dst, const char *restrict src, size_t len)
-{
- memcpy(dst, src, len);
- dst[len] = \(aq\e0\(aq;
-
- return dst + len;
-}
-.EE
-.in
-.SH CAVEATS
-This function doesn't know the size of the destination buffer,
-so it can overrun the buffer if the programmer wasn't careful enough.
-.SH BUGS
-.BR strncat (3)
-has a misleading name;
-it has no relationship with
-.BR strncpy (3).
-.SH EXAMPLES
-The following program creates a string
-from a concatenation of unterminated strings.
-.\" SRC BEGIN (strncpy.c)
-.EX
-#include <stdio.h>
-#include <stdlib.h>
-#include <string.h>
-
-#define nitems(arr) (sizeof((arr)) / sizeof((arr)[0]))
-
-int
-main(void)
-{
- char pre[4] = "pre.";
- char *post = ".post";
- char *src = "some_long_body.post";
- char dest[100];
-
- dest[0] = \(aq\e0\(aq;
- strncat(dest, pre, nitems(pre));
- strncat(dest, src, strlen(src) \- strlen(post));
-
- puts(dest); // "pre.some_long_body"
- exit(EXIT_SUCCESS);
-}
-.EE
-.\" SRC END
-.in
-.SH SEE ALSO
-.BR memccpy (3),
-.BR memcpy (3),
-.BR mempcpy (3),
-.BR strcpy (3),
-.BR string (3)
+.so man3/strcpy.3
diff --git a/man3/strncpy.3 b/man3/strncpy.3
index e2ffc683f..ff7476a84 100644
--- a/man3/strncpy.3
+++ b/man3/strncpy.3
@@ -1,129 +1 @@
-.\" Copyright (C) 1993 David Metcalfe <david@prism.demon.co.uk>
-.\" Copyright (C) 2022 Alejandro Colomar <alx@kernel.org>
-.\"
-.\" SPDX-License-Identifier: Linux-man-pages-copyleft
-.\"
-.\" References consulted:
-.\" Linux libc source code
-.\" Lewine's _POSIX Programmer's Guide_ (O'Reilly & Associates, 1991)
-.\" 386BSD man pages
-.\" Modified Sat Jul 24 18:06:49 1993 by Rik Faith (faith@cs.unc.edu)
-.\" Modified Fri Aug 25 23:17:51 1995 by Andries Brouwer (aeb@cwi.nl)
-.\" Modified Wed Dec 18 00:47:18 1996 by Andries Brouwer (aeb@cwi.nl)
-.\" 2007-06-15, Marc Boyer <marc.boyer@enseeiht.fr> + mtk
-.\" Improve discussion of strncpy().
-.\"
-.TH strncpy 3 (date) "Linux man-pages (unreleased)"
-.SH NAME
-strncpy \- copy a string into a fixed-length buffer and zero the rest of it
-.SH LIBRARY
-Standard C library
-.RI ( libc ", " \-lc )
-.SH SYNOPSIS
-.nf
-.B #include <string.h>
-.PP
-.BI "[[deprecated]] char *strncpy(char " dest "[restrict ." n ],
-.BI " const char " src "[restrict ." n "], \
-size_t " n );
-.fi
-.SH DESCRIPTION
-.BI Note: " This is not the function you want to use."
-For string copying with truncation, see
-.BR strlcpy (3bsd).
-For copying a string into a fixed-length buffer with zeroing of the rest,
-see
-.BR stpncpy (3).
-.PP
-.BR strncpy ()
-copies at most
-.I n
-bytes of
-.IR src ,
-and fills the rest of the
-.I dest
-buffer with null bytes.
-.BR Warning :
-If there is no null byte
-among the first
-.I n
-bytes of
-.IR src ,
-the string placed in
-.I dest
-will not be null-terminated.
-.PP
-A simple implementation of
-.BR strncpy ()
-might be:
-.PP
-.in +4n
-.EX
-char *
-strncpy(char *dest, const char *src, size_t n)
-{
- bzero(dest, n);
- memccpy(dest, src, \(aq\e0\(aq, n);
-
- return dest;
-}
-.EE
-.in
-.PP
-The use of
-.BR strncpy ()
-is to copy a C string to a fixed-length buffer
-while ensuring that unused bytes in the destination buffer are zeroed out
-(perhaps to prevent information leaks if the buffer is to be
-written to media or transmitted to another process via an
-interprocess communication technique).
-But
-.BR stpncpy (3)
-is better for this purpose,
-since it detects truncation.
-See BUGS below.
-.SH RETURN VALUE
-The
-.BR strncpy ()
-function returns a pointer to
-the destination buffer
-.IR dest .
-.SH ATTRIBUTES
-For an explanation of the terms used in this section, see
-.BR attributes (7).
-.ad l
-.nh
-.TS
-allbox;
-lbx lb lb
-l l l.
-Interface Attribute Value
-T{
-.BR strncpy ()
-T} Thread safety MT-Safe
-.TE
-.hy
-.ad
-.sp 1
-.SH STANDARDS
-POSIX.1-2001, POSIX.1-2008, C89, C99, SVr4, 4.3BSD.
-.SH BUGS
-.BR strncpy ()
-has a misleading name.
-It doesn't produce a (null-terminated) string;
-and it should never be used for producing a string.
-.PP
-It can't detect truncation.
-It's probably better to explicitly call
-.BR bzero (3)
-and
-.BR memccpy (3),
-or
-.BR stpncpy (3)
-since they allow detecting truncation.
-.SH SEE ALSO
-.BR bzero (3),
-.BR memccpy (3),
-.BR stpncpy (3),
-.BR string (3),
-.BR wcsncpy (3)
+.so man3/strcpy.3
--
2.38.1
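The strncat.3 page removed above suggested ustr2stp() for chaining copies out of fixed-width fields. A minimal sketch follows, assuming the behavior described in patch 1/3: padding null bytes in src are ignored and the result is always terminated. The memchr(3) plus memcpy(3) body is one possible implementation chosen for this illustration, and dst is assumed to have room for at least sz + 1 bytes.

```c
#include <stddef.h>
#include <string.h>

/* Sketch only: copy the unterminated (possibly null-padded) string held
 * in the sz-byte field src into dst, terminate it, and return a pointer
 * to the terminating null byte so that calls can be chained. */
char *
ustr2stp(char *restrict dst, const char *restrict src, size_t sz)
{
    const char  *nul;
    size_t      len;

    nul = memchr(src, '\0', sz);    /* First padding byte, if any. */
    len = (nul != NULL) ? (size_t) (nul - src) : sz;
    memcpy(dst, src, len);
    dst[len] = '\0';

    return dst + len;
}
```

With a null-padded char name[8] = "abc" and an unterminated char ext[4] holding the bytes wxyz, chaining two calls leaves "abcwxyz" in the destination and returns a pointer 7 bytes past its start.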
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH 3/3] stpecpy.3, stpecpyx.3, strlcat.3, strlcpy.3, strscpy.3: Add new links to strcpy(3)
2022-12-11 23:59 string_copy(7): New manual page documenting string copying functions Alejandro Colomar
` (4 preceding siblings ...)
2022-12-12 14:24 ` [PATCH 2/3] stpcpy.3, stpncpy.3, strcat.3, strncat.3, strncpy.3: Transform the old pages into " Alejandro Colomar
@ 2022-12-12 14:24 ` Alejandro Colomar
5 siblings, 0 replies; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-12 14:24 UTC (permalink / raw)
To: linux-man; +Cc: Alejandro Colomar
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
man3/stpecpy.3 | 1 +
man3/stpecpyx.3 | 1 +
man3/strlcat.3 | 1 +
man3/strlcpy.3 | 1 +
man3/strscpy.3 | 1 +
5 files changed, 5 insertions(+)
create mode 100644 man3/stpecpy.3
create mode 100644 man3/stpecpyx.3
create mode 100644 man3/strlcat.3
create mode 100644 man3/strlcpy.3
create mode 100644 man3/strscpy.3
diff --git a/man3/stpecpy.3 b/man3/stpecpy.3
new file mode 100644
index 000000000..ff7476a84
--- /dev/null
+++ b/man3/stpecpy.3
@@ -0,0 +1 @@
+.so man3/strcpy.3
diff --git a/man3/stpecpyx.3 b/man3/stpecpyx.3
new file mode 100644
index 000000000..ff7476a84
--- /dev/null
+++ b/man3/stpecpyx.3
@@ -0,0 +1 @@
+.so man3/strcpy.3
diff --git a/man3/strlcat.3 b/man3/strlcat.3
new file mode 100644
index 000000000..ff7476a84
--- /dev/null
+++ b/man3/strlcat.3
@@ -0,0 +1 @@
+.so man3/strcpy.3
diff --git a/man3/strlcpy.3 b/man3/strlcpy.3
new file mode 100644
index 000000000..ff7476a84
--- /dev/null
+++ b/man3/strlcpy.3
@@ -0,0 +1 @@
+.so man3/strcpy.3
diff --git a/man3/strscpy.3 b/man3/strscpy.3
new file mode 100644
index 000000000..ff7476a84
--- /dev/null
+++ b/man3/strscpy.3
@@ -0,0 +1 @@
+.so man3/strcpy.3
--
2.38.1
^ permalink raw reply related [flat|nested] 53+ messages in thread
* Re: [PATCH 1/3] strcpy.3: Rewrite page to document all string-copying functions
2022-12-12 14:24 ` [PATCH 1/3] strcpy.3: Rewrite page to document all string-copying functions Alejandro Colomar
@ 2022-12-12 17:33 ` Alejandro Colomar
2022-12-12 18:38 ` groff man(7) extensions (was: [PATCH 1/3] strcpy.3: Rewrite page to document all string-copying functions) G. Branden Robinson
2022-12-12 23:00 ` [PATCH v2 0/3] Rewrite strcpy(3) Alejandro Colomar
` (3 subsequent siblings)
4 siblings, 1 reply; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-12 17:33 UTC (permalink / raw)
To: G. Branden Robinson, groff; +Cc: linux-man
[-- Attachment #1.1: Type: text/plain, Size: 1510 bytes --]
Hi Branden,
On 12/12/22 15:24, Alejandro Colomar wrote:
> +.\" ----- RETURN VALUE :: Deprecated ----------------------------------/
> +.SS Deprecated
> +The following functions return
> +the length of the total string that they tried to create
> +(as if truncation didn't occur).
> +.IP \(bu 3
> +.BR strlcpy (3bsd),
> +.BR strlcat (3bsd)
> +.PP
> +The following function returns
> +the length of the destination string, or
> +.B \-E2BIG
> +on truncation.
> +.IP \(bu 3
> +.BR strscpy (9)
> +.PP
> +The following functions return the
> +.I dst
> +pointer,
> +which is useless.
> +.PD 0
> +.IP \(bu 3
> +.BR strcpy (3),
> +.BR strcat (3)
> +.IP \(bu
> +.BR strncpy (3)
> +.IP \(bu
> +.BR strncat (3)
> +.PD
I realized that the above doesn't produce exactly what I wanted. I wanted this:
The following functions return the dst pointer, which is useless.

• strcpy(3), strcat(3)
• strncpy(3)
• strncat(3)
But I got this:
The following functions return the dst pointer, which is useless.
• strcpy(3), strcat(3)
• strncpy(3)
• strncat(3)
I see various possible solutions, but which would you recommend?
I've thought of:
[
[...]
.PP
.PD 0
.IP \(bu 3
[...]
]
or
[
[...]
.IP \(bu 3
.PD 0
[...]
]
I was thinking about your future (I hope) .LS and .LE, and how they would also
fit in here.
Cheers,
Alex
--
<http://www.alejandro-colomar.es/>
* groff man(7) extensions (was: [PATCH 1/3] strcpy.3: Rewrite page to document all string-copying functions)
From: G. Branden Robinson @ 2022-12-12 18:38 UTC (permalink / raw)
To: Alejandro Colomar; +Cc: groff, linux-man
Hi Alex,
At 2022-12-12T18:33:52+0100, Alejandro Colomar wrote:
> On 12/12/22 15:24, Alejandro Colomar wrote:
> > +.\" ----- RETURN VALUE :: Deprecated ----------------------------------/
> > +.SS Deprecated
> > +The following functions return
> > +the length of the total string that they tried to create
> > +(as if truncation didn't occur).
> > +.IP \(bu 3
> > +.BR strlcpy (3bsd),
> > +.BR strlcat (3bsd)
> > +.PP
> > +The following function returns
> > +the length of the destination string, or
> > +.B \-E2BIG
> > +on truncation.
> > +.IP \(bu 3
> > +.BR strscpy (9)
> > +.PP
> > +The following functions return the
> > +.I dst
> > +pointer,
> > +which is useless.
> > +.PD 0
> > +.IP \(bu 3
> > +.BR strcpy (3),
> > +.BR strcat (3)
> > +.IP \(bu
> > +.BR strncpy (3)
> > +.IP \(bu
> > +.BR strncat (3)
> > +.PD
>
> I realized that the above doesn't produce exactly what I wanted. I
> wanted this:
>
> The following functions return the dst pointer, which is useless.
>
> • strcpy(3), strcat(3)
> • strncpy(3)
> • strncat(3)
>
> But I got this:
>
> The following functions return the dst pointer, which is useless.
> • strcpy(3), strcat(3)
> • strncpy(3)
> • strncat(3)
>
> I see various possible solutions, but which would you recommend?
>
> I've thought of:
>
> [
> [...]
> .PP
> .PD 0
> .IP \(bu 3
> [...]
> ]
>
> or
>
> [
> [...]
> .IP \(bu 3
> .PD 0
> [...]
> ]
>
> I was thinking about your future (I hope) .LS and .LE, and how they
> would also fit in here.
Either is fine; if it were me, after threatening another radical
innovation, I would probably go with the latter, using ".PD 0" _after_
the first `IP` macro. The hazard there is that if you re-order the
list, you might move the ".PD 0" with it accidentally. Your earlier
approach avoids that at the cost of a _seemingly_ useless `PP` call.
Paragraphing macros in man(7) are not enclosures; they are spot
marks.[1] This is an impedance mismatch with the brains of people who
grew up on HTML/XML.
Also, you don't need to keep restating the indentation amount ("3").
Horizontal and vertical spacing
The indentation argument accepted by .RS, .IP, .TP, and the
deprecated .HP is a number plus an optional scaling unit. If no
scaling unit is given, the man package assumes "n". An indentation
specified in a call to .IP, .TP, or the deprecated .HP persists
until (1) another of these macros is called with an explicit
indentation argument, or (2) .SH, .SS, or .P or its synonyms is
called; these clear the indentation entirely. [...]
(ms(7) works this way, too, though its macro repertoire differs a
bit.[2])
I haven't given much more thought to `LS` and `LE`. I haven't soured on
them; I simply have more urgent fish to fry. The possibility of having
`LS` accept an argument to set the paragraph indentation so that `IP` or
`TP` items can be rearranged freely within has occurred to me. So has
making the inter-paragraph distance itself an argument (possibly just a
Boolean). So has support for auto-enumerated lists. But then I wonder
if man(7) authors really need a macro that is as tricked-out as
mdoc(7)'s list macros, which take up about 5 of its 31 U.S. letter-sized
pages of documentation. That's heavy.
Here's a list of man(7) extensions to which I have given consideration.
KS/KE Keeps. Easy.[3] Harmlessly ignorable by other
implementations.
LS/LE List enclosure. Throws a semantic hint (e.g., for HTML
output) and eliminates final use case of `PD` macro.[4]
DC/TG Semantics at last. Sure to rouse anger in people who
decided long ago that man(7) can't do this.[5] Having
looked more closely at mdoc(7) since writing that, I
think `DC` should accept a _pair_ of arguments as its
second and third parameters for bracketing purposes.
But again, most man page authors would never need to
mess with `DC` at all.
`DS`/`DE` have been squatted on by groff man(7) for 13 years and have
precedent going back at least to DEC Ultrix, but apart from using them
as a sort of ersatz tbl(1) for people who don't want to use
tbl(1),[6] I haven't been able to come up with any use cases for them.
Regards,
Branden
[1] For the curious, all the paragraphing macros in groff man(7) call
the same common macro. (They all perform additional operations.)
.\" Break a paragraph. Restore defaults, except for indentation.
.de an-break-paragraph
. ft R
. ps \\n[PS]u
. vs \\n[VS]u
. sp \\n[PD]u
. ns
This internal macro name is subject to change.
[2] The new ms(7) manual for groff 1.23 appears to have stabilized.[7]
Here's a URL to a work area I use to proof-read groff documentation.
I invite you (and others) to check out ms.2022-12-07.pdf, or
whatever version is there at the time.
https://www.dropbox.com/sh/17ftu3z31couf07/AAC_9kq0ZA-Ra2ZhmZFWlLuva?dl=0
[3] I initially shied away from dealing with nested diversions, but I
think I know how to cope with them now. It seems that in a lot of
cases, "bubbling up" as illustrated in groff Git's tbl(1) page is
all that is required.
[4] https://lists.gnu.org/archive/html/groff/2022-05/msg00026.html
[5] https://lore.kernel.org/linux-man/20220724172947.qlunrfnje56yaasv@illithid/
[6] https://lore.kernel.org/linux-man/20220722222045.y7i3yc7d6agygien@illithid/
[7] By saying this, I increase my ability to find a flaw in it, or for
a reader to report one. We use all the QA tools at our disposal.
* [PATCH v2 0/3] Rewrite strcpy(3)
From: Alejandro Colomar @ 2022-12-12 23:00 UTC (permalink / raw)
To: linux-man; +Cc: Martin Sebor, Alejandro Colomar
I'm describing all string-copying functions together in a single manual
page, using consistent and clear language to help fix long-standing
misuses of those interfaces.
v2 has seen many changes; the two major ones are:
- Don't deprecate functions. A friendly explanation of why they are
inferior is probably more appealing.
- Use more precise syntax: mostly
s/unterminated string/character sequence/g [Martin].
See the formatted page below.
Alejandro Colomar (3):
strcpy.3: Rewrite page to document all string-copying functions
stpcpy.3, stpncpy.3, strcat.3, strncat.3, strncpy.3: Transform the old
pages into links to strcpy(3)
stpecpy.3, stpecpyx.3, strlcat.3, strlcpy.3, strscpy.3: Add new links
to strcpy(3)
man3/stpcpy.3 | 115 +-----
man3/stpecpy.3 | 1 +
man3/stpecpyx.3 | 1 +
man3/stpncpy.3 | 123 +-----
man3/strcat.3 | 161 +-------
man3/strcpy.3 | 1048 +++++++++++++++++++++++++++++++++++++++++++----
man3/strlcat.3 | 1 +
man3/strlcpy.3 | 1 +
man3/strncat.3 | 172 +-------
man3/strncpy.3 | 130 +-----
man3/strscpy.3 | 1 +
11 files changed, 970 insertions(+), 784 deletions(-)
create mode 100644 man3/stpecpy.3
create mode 100644 man3/stpecpyx.3
create mode 100644 man3/strlcat.3
create mode 100644 man3/strlcpy.3
create mode 100644 man3/strscpy.3
strcpy(3) Library Functions Manual strcpy(3)
NAME
stpcpy, strcpy, strcat, stpecpy, stpecpyx, strlcpy, strlcat, strscpy,
stpncpy, strncpy, ustr2stp, strncat, mempcpy - copy strings and charac‐
ter sequences
LIBRARY
stpcpy(3)
strcpy(3), strcat(3)
stpncpy(3)
strncpy(3)
strncat(3)
mempcpy(3)
Standard C library (libc, -lc)
stpecpy(3), stpecpyx(3)
Not provided by any library.
strlcpy(3), strlcat(3)
Utility functions from BSD systems (libbsd, -lbsd)
strscpy(3)
Not provided by any library. It is a Linux kernel internal
function.
SYNOPSIS
#include <string.h>
Strings
// Chain‐copy a string.
char *stpcpy(char *restrict dst, const char *restrict src);
// Copy/concatenate a string.
char *strcpy(char *restrict dst, const char *restrict src);
char *strcat(char *restrict dst, const char *restrict src);
// Chain‐copy a string with truncation.
char *stpecpy(char *dst, char past_end[0], const char *restrict src);
// Chain‐copy a string with truncation and SIGSEGV on UB.
char *stpecpyx(char *dst, char past_end[0], const char *restrict src);
// Copy/concatenate a string with truncation and SIGSEGV on UB.
size_t strlcpy(char dst[restrict .sz], const char *restrict src,
size_t sz);
size_t strlcat(char dst[restrict .sz], const char *restrict src,
size_t sz);
// Copy a string with truncation.
ssize_t strscpy(char dst[restrict .sz], const char src[restrict .sz],
size_t sz);
Null‐padded character sequences
// Zero a fixed‐width buffer, and
// copy a string with truncation into a character sequence.
char *stpncpy(char dst[restrict .sz], const char *restrict src,
size_t sz);
// Zero a fixed‐width buffer, and
// copy a string with truncation into a character sequence.
char *strncpy(char dst[restrict .sz], const char *restrict src,
size_t sz);
// Chain‐copy a null‐padded character sequence into a string.
char *ustr2stp(char *restrict dst, const char src[restrict .sz],
size_t sz);
// Concatenate a null‐padded character sequence into a string.
char *strncat(char *restrict dst, const char src[restrict .sz],
size_t sz);
Measured character sequences
// Chain‐copy a measured character sequence.
void *mempcpy(void *restrict dst, const void src[restrict .len],
size_t len);
Feature Test Macro Requirements for glibc (see feature_test_macros(7)):
stpcpy(3), stpncpy(3):
Since glibc 2.10:
_POSIX_C_SOURCE >= 200809L
Before glibc 2.10:
_GNU_SOURCE
mempcpy(3):
_GNU_SOURCE
DESCRIPTION
Terms (and abbreviations)
string (str)
is a sequence of zero or more non‐null characters followed by a
null byte.
character sequence (ustr)
is a sequence of zero or more non‐null characters. A program
should never use a character sequence where a string is required.
However, with appropriate care, a string can be used in
the place of a character sequence.
null‐padded character sequence
Character sequences can be contained in fixed‐width
buffers, which contain padding null bytes after the char‐
acter sequence, to fill the rest of the buffer without
affecting the character sequence; however, those padding
null bytes are not part of the character sequence.
measured character sequence
Character sequence delimited by its length.
length (len)
is the number of non‐null characters in a string or character
sequence. It is the return value of strlen(str) and of
strnlen(ustr, sz).
size (sz)
refers to the entire buffer where the string or character se‐
quence is contained.
end is the name of a pointer to the terminating null byte of a
string, or a pointer to one past the last character of a charac‐
ter sequence. This is the return value of functions that allow
chaining. It is equivalent to &str[len].
past_end
is the name of a pointer to one past the end of the buffer that
contains a string or character sequence. It is equivalent to
&str[sz]. It is used as a sentinel value, to be able to trun‐
cate strings or character sequences instead of overrunning the
containing buffer.
Copy, concatenate, and chain‐copy
Originally, there was a distinction between functions that copy and
those that concatenate. However, newer functions that copy while al‐
lowing chaining cover both use cases with a single API. They are also
algorithmically faster, since they don’t need to search for the end of
the existing string. However, their use is a bit more verbose.
For copy functions to be chained, they need to return a pointer to the end.
That’s a byproduct of the copy operation, so it has no performance
costs. Functions that return such a pointer, and thus can be chained,
have names of the form *stp*() or *memp*(), since it’s also common to
name the pointer just p.
Chain‐copying functions that truncate should accept a pointer to one
past the end of the destination buffer, and have names of the form
*stpe*(). This allows not having to recalculate the remaining size af‐
ter each call.
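As a rough sketch of the point above, the following shows three chained truncating copies followed by a single truncation check. stpecpy_sketch() is a stand-in written for this illustration (it uses memchr(3) and memcpy(3) rather than the memccpy(3)-based reference implementation given under Functions), and build_greeting() is a hypothetical caller; neither is provided by any library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative stand-in for stpecpy(3); not a library function.
 * Copies src into [dst, past_end), truncating if needed.
 * Returns a pointer to the terminating null byte, or past_end
 * if the string was truncated (or if dst was already past_end). */
static char *
stpecpy_sketch(char *dst, char *past_end, const char *src)
{
	size_t      room, len;
	const char  *nul;

	if (dst == past_end)
		return past_end;  /* An earlier call already truncated. */

	room = (size_t) (past_end - dst);
	nul = memchr(src, '\0', room);
	if (nul != NULL) {
		len = (size_t) (nul - src);
		memcpy(dst, src, len + 1);  /* Includes the null byte. */
		return dst + len;
	}

	/* Truncation: keep room - 1 characters and terminate. */
	memcpy(dst, src, room - 1);
	past_end[-1] = '\0';
	return past_end;
}

/* Hypothetical caller: builds "Hello <name>!" in buf.
 * Returns 0 on success, or -1 if the result was truncated. */
int
build_greeting(char *buf, size_t size, const char *name)
{
	char  *p, *past_end;

	assert(size > 0);

	past_end = buf + size;  /* Computed once, reused by every call. */
	p = buf;
	p = stpecpy_sketch(p, past_end, "Hello ");
	p = stpecpy_sketch(p, past_end, name);
	p = stpecpy_sketch(p, past_end, "!");

	return (p == past_end) ? -1 : 0;  /* One check for the chain. */
}
```

Because past_end never changes, no call needs to recompute the remaining size, and a truncated result is still a valid (shortened) string.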
Truncate or not?
The first thing to note is that programmers should be careful with
buffers, so they always have the correct size, and truncation is not
necessary.
In most cases, truncation is not desired, and it is simpler to just do
the copy. Simpler code is safer code. Programming against programming
mistakes by adding more code just adds more points where mistakes can
be made.
Nowadays, compilers can detect most programmer errors with features
like compiler warnings, static analyzers, and _FORTIFY_SOURCE (see
ftm(7)). Keeping the code simple helps these overflow‐detection fea‐
tures be more precise.
When validating user input, however, it makes sense to truncate. Re‐
member to check the return value of such function calls.
Functions that truncate:
• stpecpy(3) is the most efficient string copy function that performs
truncation. It requires checking for truncation only once, after all
chained calls.
• stpecpyx(3) is a variant of stpecpy(3) that consumes the entire
source string, to catch bugs in the program by forcing a segmenta‐
tion fault (as strlcpy(3bsd) and strlcat(3bsd) do).
• strlcpy(3bsd) and strlcat(3bsd) are designed to crash if the input
string is invalid (doesn’t contain a terminating null byte).
• strscpy(3) reports an error instead of crashing (similar to
stpecpy(3)).
• stpncpy(3) and strncpy(3) also truncate, but they don’t write
strings, but rather null‐padded character sequences.
Null‐padded character sequences
For historic reasons, some standard APIs, such as utmpx(5), use null‐
padded character sequences in fixed‐width buffers. To interface with
them, specialized functions need to be used.
To copy strings into them, use stpncpy(3).
To copy from a character sequence within a fixed‐width buffer into a
string, ignoring any trailing null bytes in the source fixed‐width
buffer, you should use ustr2stp(3) or strncat(3).
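A minimal sketch of such a round trip follows. It assumes a hypothetical struct record with an 8-byte field (in the spirit of the fixed-width fields of utmpx(5), but not taken from it), and it uses the ISO C functions strncpy(3) and strncat(3) so that no feature test macros are needed; stpncpy(3) and ustr2stp(3) would fit the same slots.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record with a fixed-width, null-padded field. */
struct record {
	char  name[8];  /* A character sequence; not necessarily a string. */
};

/* Store a string into the field, zeroing the unused tail.
 * Returns 0 on success, or -1 if the name did not fit. */
int
record_set_name(struct record *r, const char *name)
{
	strncpy(r->name, name, sizeof(r->name));
	return (strlen(name) > sizeof(r->name)) ? -1 : 0;
}

/* Read the field back into a string, ignoring the padding.
 * dst must have room for sizeof(r->name) + 1 bytes. */
void
record_get_name(char *dst, size_t dst_size, const struct record *r)
{
	assert(dst_size > sizeof(r->name));

	dst[0] = '\0';  /* strncat(3) needs a string to append to. */
	strncat(dst, r->name, sizeof(r->name));
}
```

A name of exactly 8 characters fills the field with no null byte at all, which is why the field must never be passed to functions that expect a string.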
Measured character sequences
The simplest character sequence copying function is mempcpy(3). It re‐
quires always knowing the length of your character sequences, for which
structures can be used. It makes the code much faster, since you al‐
ways know the length of your character sequences, and can do the mini‐
mal copies and length measurements. mempcpy(3) copies character se‐
quences, so you need to explicitly set the terminating null byte if you
need a string.
The following code can be used to chain‐copy from a measured character
sequence into a string:
p = mempcpy(p, foo->str, foo->len);
*p = '\0';
The following code can be used to chain‐copy from a measured character
sequence into a character sequence:
p = mempcpy(p, src->str, src->len);
In programs that make considerable use of strings or character se‐
quences, and need the best performance, using overlapping character se‐
quences can make a big difference. It allows holding subsequences of a
larger character sequence, while not duplicating memory nor using time
to do a copy.
However, this is delicate, since it requires using character sequences.
C library APIs use strings, so programs that use character sequences
will have to take care of differentiating strings from character se‐
quences.
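The idea of overlapping measured character sequences could be sketched as follows. struct ustr, ustr_sub(), and ustr_to_str() are hypothetical names made up for this illustration, and memcpy(3) is used in place of mempcpy(3) only to stay within ISO C.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical measured character sequence: a pointer and a length.
 * The bytes are not owned, and need not be null-terminated. */
struct ustr {
	const char  *str;
	size_t      len;
};

/* Return a subsequence that shares memory with s; nothing is copied. */
struct ustr
ustr_sub(struct ustr s, size_t start, size_t len)
{
	struct ustr  sub;

	assert(start <= s.len && len <= s.len - start);

	sub.str = s.str + start;
	sub.len = len;
	return sub;
}

/* Copy a measured character sequence into a string.
 * dst must have room for s.len + 1 bytes.
 * Returns a pointer to the terminating null byte. */
char *
ustr_to_str(char *dst, struct ustr s)
{
	memcpy(dst, s.str, s.len);
	dst[s.len] = '\0';  /* Explicit terminator: the result is a string. */
	return dst + s.len;
}
```

Only ustr_to_str() touches memory; the conversion to a string is deferred until a C library API actually requires one.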
String vs character sequence
Some functions only operate on strings. Those require that the input
src is a string, and guarantee an output string (even when truncation
occurs). Functions that concatenate also require that dst holds a
string before the call. List of functions:
• stpcpy(3)
• strcpy(3), strcat(3)
• stpecpy(3), stpecpyx(3)
• strlcpy(3bsd), strlcat(3bsd)
• strscpy(3)
Other functions require an input string, but create a character se‐
quence as output. These functions have confusing names, and have a
long history of misuse. List of functions:
• stpncpy(3)
• strncpy(3)
Other functions operate on an input character sequence, and create an
output string. Functions that concatenate also require that dst holds
a string before the call. strncat(3) has an even more misleading name
than the functions above. List of functions:
• ustr2stp(3)
• strncat(3)
And the last one operates on an input character sequence to create an
output character sequence. But because it asks for the length, and a
string is by nature composed of a character sequence of the same length
plus a terminating null byte, a string is also accepted as input.
Function:
• mempcpy(3)
Functions
stpcpy(3)
This function copies the input string into a destination string.
The programmer is responsible for allocating a buffer large
enough. It returns a pointer suitable for chaining.
An implementation of this function might be:
char *
stpcpy(char *restrict dst, const char *restrict src)
{
return mempcpy(dst, src, strlen(src));
}
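To illustrate why chaining avoids rescanning the destination, here is a small sketch that joins words in a loop. stpcpy_sketch() is a portable stand-in written for this example (so that it does not depend on feature test macros), and join_words() is a hypothetical helper; with strcat(3), each iteration would first have to walk the whole string built so far.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Portable stand-in for stpcpy(3), for illustration only.
 * Returns a pointer to the terminating null byte in dst. */
static char *
stpcpy_sketch(char *dst, const char *src)
{
	size_t  len = strlen(src);

	memcpy(dst, src, len + 1);
	return dst + len;
}

/* Hypothetical helper: join n words with single spaces.
 * The caller is responsible for dst being large enough.
 * Returns the length of the resulting string. */
size_t
join_words(char *dst, const char *const words[], size_t n)
{
	char  *p = dst;

	assert(dst != NULL);

	*p = '\0';
	for (size_t i = 0; i < n; i++) {
		if (i > 0)
			p = stpcpy_sketch(p, " ");
		p = stpcpy_sketch(p, words[i]);  /* Continues at the end. */
	}
	return (size_t) (p - dst);  /* Length comes for free. */
}
```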
strcpy(3)
strcat(3)
These functions copy the input string into a destination string.
The programmer is responsible for allocating a buffer large
enough. The return value is useless.
stpcpy(3) is a faster alternative to these functions.
An implementation of these functions might be:
char *
strcpy(char *restrict dst, const char *restrict src)
{
stpcpy(dst, src);
return dst;
}
char *
strcat(char *restrict dst, const char *restrict src)
{
stpcpy(dst + strlen(dst), src);
return dst;
}
stpecpy(3)
stpecpyx(3)
These functions copy the input string into a destination string.
If the destination buffer, limited by a pointer to one past the
end of it, isn’t large enough to hold the copy, the resulting
string is truncated (but it is guaranteed to be null‐termi‐
nated). They return a pointer suitable for chaining. Trunca‐
tion needs to be detected only once after the last chained call.
stpecpyx(3) has identical semantics to stpecpy(3), except that
it forces a SIGSEGV if the src pointer is not a string.
These functions are not provided by any library, but you can de‐
fine them with the following reference implementations:
/* This code is in the public domain. */
char *
stpecpy(char *dst, char past_end[0],
const char *restrict src)
{
char *p;
if (dst == past_end)
return past_end;
p = memccpy(dst, src, '\0', past_end - dst);
if (p != NULL)
return p - 1;
/* truncation detected */
past_end[-1] = '\0';
return past_end;
}
/* This code is in the public domain. */
char *
stpecpyx(char *dst, char past_end[0],
const char *restrict src)
{
if (src[strlen(src)] != '\0')
raise(SIGSEGV);
return stpecpy(dst, past_end, src);
}
strlcpy(3bsd)
strlcat(3bsd)
These functions copy the input string into a destination string.
If the destination buffer, limited by its size, isn’t large
enough to hold the copy, the resulting string is truncated (but
it is guaranteed to be null‐terminated). They return the length
of the total string they tried to create. These functions force
a SIGSEGV if the src pointer is not a string.
stpecpyx(3) is a faster alternative to these functions.
strscpy(3)
This function copies the input string into a destination string.
If the destination buffer, limited by its size, isn’t large
enough to hold the copy, the resulting string is truncated (but
it is guaranteed to be null‐terminated). It returns the length
of the destination string, or -E2BIG on truncation.
stpecpy(3) is a simpler and faster alternative to this function.
stpncpy(3)
This function copies the input string into a destination null‐
padded character sequence in a fixed‐width buffer. If the des‐
tination buffer, limited by its size, isn’t large enough to hold
the copy, the resulting character sequence is truncated. Since
it creates a character sequence, it doesn’t need to write a ter‐
minating null byte. It returns a pointer suitable for chaining,
but it’s not ideal for that. Truncation needs to be detected
only once after the last chained call.
If you’re going to use this function in chained calls, it would
be useful to develop a similar function that accepts a pointer
to one past the end of the buffer instead of a size.
An implementation of this function might be:
char *
stpncpy(char *restrict dst, const char *restrict src,
size_t sz)
{
char *p;
bzero(dst, sz);
p = memccpy(dst, src, '\0', sz);
if (p == NULL)
return dst + sz;
return p - 1;
}
ustr2stp(3)
This function copies the input character sequence contained in a
null‐padded fixed‐width buffer, into a destination string. The
programmer is responsible for allocating a buffer large enough.
It returns a pointer suitable for chaining.
A truncating version of this function doesn’t exist, since the
size of the original character sequence is always known, so it
wouldn’t be very useful.
This function is not provided by any library, but you can define
it with the following reference implementation:
/* This code is in the public domain. */
char *
ustr2stp(char *restrict dst, const char *restrict src,
size_t sz)
{
char *p, *end;
p = memccpy(dst, src, '\0', sz);
/* memccpy(3) returns a pointer to one past the copied null byte. */
end = (p != NULL) ? p - 1 : dst + sz;
*end = '\0';
return end;
}
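For chaining to work, the returned pointer has to address the terminating null byte itself, so that the next call overwrites it. The sketch below makes that explicit; ustr2stp_sketch() is an illustrative variant written with memchr(3) and memcpy(3), under the assumption that dst is large enough, and is not the reference implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative variant of ustr2stp(3); not a library function.
 * Copies the character sequence held in the null-padded buffer
 * src[sz] into dst as a string.
 * Returns a pointer to the terminating null byte in dst. */
char *
ustr2stp_sketch(char *dst, const char *src, size_t sz)
{
	const char  *nul;
	size_t      len;

	assert(dst != NULL && src != NULL);

	/* Length of the sequence: up to the first padding byte, or sz. */
	nul = memchr(src, '\0', sz);
	len = (nul != NULL) ? (size_t) (nul - src) : sz;

	memcpy(dst, src, len);
	dst[len] = '\0';
	return dst + len;
}
```

The same call handles a buffer that is padded with null bytes and one that is completely full, since sz bounds the search in both cases.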
strncpy(3)
This function is identical to stpncpy(3) except for the useless
return value. Due to the return value, with this function it’s
hard to correctly check for truncation.
stpncpy(3) is a simpler alternative to this function.
An implementation of this function might be:
char *
strncpy(char *restrict dst, const char *restrict src,
size_t sz)
{
stpncpy(dst, src, sz);
return dst;
}
strncat(3)
Do not confuse this function with strncpy(3); they are not re‐
lated at all.
This function concatenates the input character sequence contained
in a null‐padded fixed‐width buffer, into a destination
string. The programmer is responsible for allocating a buffer
large enough. The return value is useless.
ustr2stp(3) is a faster alternative to this function.
An implementation of this function might be:
char *
strncat(char *restrict dst, const char *restrict src,
size_t sz)
{
ustr2stp(dst + strlen(dst), src, sz);
return dst;
}
mempcpy(3)
This function copies the input character sequence, limited by
its length, into a destination character sequence. The program‐
mer is responsible for allocating a buffer large enough. It re‐
turns a pointer suitable for chaining.
An implementation of this function might be:
void *
mempcpy(void *restrict dst, const void *restrict src,
size_t len)
{
return memcpy(dst, src, len) + len;
}
RETURN VALUE
The following functions return a pointer to the terminating null byte
in the destination string.
• stpcpy(3)
• ustr2stp(3)
The following functions return a pointer to the terminating null byte
in the destination string, except when truncation occurs; if truncation
occurs, they return a pointer to one past the end of the destination
buffer (past_end).
• stpecpy(3), stpecpyx(3)
The following function returns a pointer to one after the last charac‐
ter in the destination character sequence; if truncation occurs, that
pointer is equivalent to a pointer to one past the end of the destina‐
tion buffer.
• stpncpy(3)
The following function returns a pointer to one after the last charac‐
ter in the destination character sequence.
• mempcpy(3)
The following functions return the length of the total string that they
tried to create (as if truncation didn’t occur).
• strlcpy(3bsd), strlcat(3bsd)
The following function returns the length of the destination string, or
-E2BIG on truncation.
• strscpy(3)
The following functions return the dst pointer, which is useless.
• strcpy(3), strcat(3)
• strncpy(3)
• strncat(3)
ATTRIBUTES
For an explanation of the terms used in this section, see attrib‐
utes(7).
┌────────────────────────────────────────────┬───────────────┬─────────┐
│Interface │ Attribute │ Value │
├────────────────────────────────────────────┼───────────────┼─────────┤
│stpcpy(), strcpy(), strcat(), stpecpy(), │ Thread safety │ MT‐Safe │
│stpecpyx(), strlcpy(), strlcat(), strscpy(), │ │ │
│stpncpy(), strncpy(), ustr2stp(), │ │ │
│strncat(), mempcpy() │ │ │
└────────────────────────────────────────────┴───────────────┴─────────┘
STANDARDS
strcpy(3), strcat(3)
strncpy(3)
strncat(3)
POSIX.1‐2001, POSIX.1‐2008, C89, C99, SVr4, 4.3BSD.
stpcpy(3)
stpncpy(3)
POSIX.1‐2008.
strlcpy(3bsd), strlcat(3bsd)
Functions originated in OpenBSD and present in some Unix sys‐
tems.
mempcpy(3)
This function is a GNU extension.
strscpy(3)
Linux kernel internal function.
stpecpy(3), stpecpyx(3)
ustr2stp(3)
Not defined by any standards nor libraries.
CAVEATS
Don’t mix chain calls to truncating and non‐truncating functions. It
is conceptually wrong unless you know that the first part of a copy
will always fit. Anyway, the performance difference will probably be
negligible, so it will probably be more clear if you use consistent se‐
mantics: either truncating or non‐truncating. Calling a non‐truncating
function after a truncating one is necessarily wrong.
Some of the functions described here are not provided by any library;
you should write your own copy if you want to use them. See STANDARDS.
EXAMPLES
The following are examples of correct use of each of these functions.
stpcpy(3)
p = buf;
p = stpcpy(p, "Hello ");
p = stpcpy(p, "world");
p = stpcpy(p, "!");
len = p - buf;
puts(buf);
strcpy(3)
strcat(3)
strcpy(buf, "Hello ");
strcat(buf, "world");
strcat(buf, "!");
len = strlen(buf);
puts(buf);
stpecpy(3)
stpecpyx(3)
past_end = buf + sizeof(buf);
p = buf;
p = stpecpy(p, past_end, "Hello ");
p = stpecpy(p, past_end, "world");
p = stpecpy(p, past_end, "!");
if (p == past_end) {
p--;
goto toolong;
}
len = p - buf;
puts(buf);
strlcpy(3bsd)
strlcat(3bsd)
if (strlcpy(buf, "Hello ", sizeof(buf)) >= sizeof(buf))
goto toolong;
if (strlcat(buf, "world", sizeof(buf)) >= sizeof(buf))
goto toolong;
len = strlcat(buf, "!", sizeof(buf));
if (len >= sizeof(buf))
goto toolong;
puts(buf);
strscpy(3)
len = strscpy(buf, "Hello world!", sizeof(buf));
if (len == -E2BIG)
goto toolong;
puts(buf);
stpncpy(3)
past_end = buf + sizeof(buf);
end = stpncpy(buf, "Hello world!", sizeof(buf));
if (end == past_end)
goto toolong;
len = end - buf;
for (size_t i = 0; i < sizeof(buf); i++)
putchar(buf[i]);
strncpy(3)
strncpy(buf, "Hello world!", sizeof(buf));
if (buf[sizeof(buf) - 1] != '\0')
goto toolong;
len = strnlen(buf, sizeof(buf));
for (size_t i = 0; i < sizeof(buf); i++)
putchar(buf[i]);
ustr2stp(3)
p = buf;
p = ustr2stp(p, "Hello ", 6);
p = ustr2stp(p, "world", 42); // Padding null bytes ignored.
p = ustr2stp(p, "!", 1);
len = p - buf;
puts(buf);
strncat(3)
buf[0] = '\0'; // There’s no ’cpy’ function to this ’cat’.
strncat(buf, "Hello ", 6);
strncat(buf, "world", 42); // Padding null bytes ignored.
strncat(buf, "!", 1);
len = strlen(buf);
puts(buf);
mempcpy(3)
p = buf;
p = mempcpy(p, "Hello ", 6);
p = mempcpy(p, "world", 5);
p = mempcpy(p, "!", 1);
*p = '\0';
len = p - buf;
puts(buf);
SEE ALSO
bzero(3), memcpy(3), memccpy(3), mempcpy(3), string(3)
Linux man‐pages (unreleased) (date) strcpy(3)
--
2.38.1
* [PATCH v2 1/3] strcpy.3: Rewrite page to document all string-copying functions
From: Alejandro Colomar @ 2022-12-12 23:00 UTC (permalink / raw)
To: linux-man; +Cc: Martin Sebor, Alejandro Colomar
This is an opportunity to use consistent language across the
documentation for all string-copying functions.
It is also easier to show the similarities and differences between all
of the functions, so that a reader can use this page to know which
function is needed for a given task.
Many functions that are inferior to another one, have been marked as
deprecated, notwithstanding the deprecation status in C libraries or
any standards. Alternatives have been given in the same page, with
reference implementations.
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
man3/strcpy.3 | 1048 ++++++++++++++++++++++++++++++++++++++++++++-----
1 file changed, 960 insertions(+), 88 deletions(-)
diff --git a/man3/strcpy.3 b/man3/strcpy.3
index 74c3180ae..7e216e3bf 100644
--- a/man3/strcpy.3
+++ b/man3/strcpy.3
@@ -1,48 +1,765 @@
-.\" Copyright (C) 1993 David Metcalfe (david@prism.demon.co.uk)
+.\" Copyright 2022 Alejandro Colomar <alx@kernel.org>
.\"
-.\" SPDX-License-Identifier: Linux-man-pages-copyleft
-.\"
-.\" References consulted:
-.\" Linux libc source code
-.\" Lewine's _POSIX Programmer's Guide_ (O'Reilly & Associates, 1991)
-.\" 386BSD man pages
-.\" Modified Sat Jul 24 18:06:49 1993 by Rik Faith (faith@cs.unc.edu)
-.\" Modified Fri Aug 25 23:17:51 1995 by Andries Brouwer (aeb@cwi.nl)
-.\" Modified Wed Dec 18 00:47:18 1996 by Andries Brouwer (aeb@cwi.nl)
-.\" 2007-06-15, Marc Boyer <marc.boyer@enseeiht.fr> + mtk
-.\" Improve discussion of strncpy().
+.\" SPDX-License-Identifier: BSD-3-Clause
.\"
.TH strcpy 3 (date) "Linux man-pages (unreleased)"
+.\" ----- NAME :: -----------------------------------------------------/
.SH NAME
-strcpy \- copy a string
+stpcpy,
+strcpy, strcat,
+stpecpy, stpecpyx,
+strlcpy, strlcat,
+strscpy,
+stpncpy,
+strncpy,
+ustr2stp,
+strncat,
+mempcpy
+\- copy strings and character sequences
+.\" ----- LIBRARY :: --------------------------------------------------/
.SH LIBRARY
+.TP
+.BR stpcpy (3)
+.TQ
+.BR strcpy "(3), \c"
+.BR strcat (3)
+.TQ
+.BR stpncpy (3)
+.TQ
+.BR strncpy (3)
+.TQ
+.BR strncat (3)
+.TQ
+.BR mempcpy (3)
Standard C library
.RI ( libc ", " \-lc )
+.TP
+.BR stpecpy "(3), \c"
+.BR stpecpyx (3)
+Not provided by any library.
+.TP
+.BR strlcpy "(3), \c"
+.BR strlcat (3)
+Utility functions from BSD systems
+.RI ( libbsd ", " \-lbsd )
+.TP
+.BR strscpy (3)
+Not provided by any library.
+It is a Linux kernel internal function.
+.\" ----- SYNOPSIS :: -------------------------------------------------/
.SH SYNOPSIS
.nf
.B #include <string.h>
+.fi
+.\" ----- SYNOPSIS :: (Null-terminated) strings -----------------------/
+.SS Strings
+.nf
+// Chain-copy a string.
+.BI "char *stpcpy(char *restrict " dst ", const char *restrict " src );
.PP
-.BI "char *strcpy(char *restrict " dest ", const char *restrict " src );
+// Copy/concatenate a string.
+.BI "char *strcpy(char *restrict " dst ", const char *restrict " src );
+.BI "char *strcat(char *restrict " dst ", const char *restrict " src );
+.PP
+// Chain-copy a string with truncation.
+.BI "char *stpecpy(char *" dst ", char " past_end "[0], \
+const char *restrict " src );
+.PP
+// Chain-copy a string with truncation and SIGSEGV on UB.
+.BI "char *stpecpyx(char *" dst ", char " past_end "[0], \
+const char *restrict " src );
+.PP
+// Copy/concatenate a string with truncation and SIGSEGV on UB.
+.BI "size_t strlcpy(char " dst "[restrict ." sz "], \
+const char *restrict " src ,
+.BI " size_t " sz );
+.BI "size_t strlcat(char " dst "[restrict ." sz "], \
+const char *restrict " src ,
+.BI " size_t " sz );
+.PP
+// Copy a string with truncation.
+.BI "ssize_t strscpy(char " dst "[restrict ." sz "], \
+const char " src "[restrict ." sz ],
+.BI " size_t " sz );
+.fi
+.\" ----- SYNOPSIS :: Null-padded character sequences --------/
+.SS Null-padded character sequences
+.nf
+// Zero a fixed-width buffer, and
+// copy a string with truncation into a character sequence.
+.BI "char *stpncpy(char " dst "[restrict ." sz "], \
+const char *restrict " src ,
+.BI " size_t " sz );
+.PP
+// Zero a fixed-width buffer, and
+// copy a string with truncation into a character sequence.
+.BI "char *strncpy(char " dst "[restrict ." sz "], \
+const char *restrict " src ,
+.BI " size_t " sz );
+.PP
+// Chain-copy a null-padded character sequence into a string.
+.BI "char *ustr2stp(char *restrict " dst ", \
+const char " src "[restrict ." sz ],
+.BI " size_t " sz );
+.PP
+// Concatenate a null-padded character sequence into a string.
+.BI "char *strncat(char *restrict " dst ", const char " src "[restrict ." sz ],
+.BI " size_t " sz );
+.fi
+.\" ----- SYNOPSIS :: Measured character sequences --------------------/
+.SS Measured character sequences
+.nf
+// Chain-copy a measured character sequence.
+.BI "void *mempcpy(void *restrict " dst ", \
+const void " src "[restrict ." len ],
+.BI " size_t " len );
+.fi
+.PP
+.RS -4
+Feature Test Macro Requirements for glibc (see
+.BR feature_test_macros (7)):
+.RE
+.PP
+.BR stpcpy (3),
+.BR stpncpy (3):
+.nf
+    Since glibc 2.10:
+        _POSIX_C_SOURCE >= 200809L
+    Before glibc 2.10:
+        _GNU_SOURCE
+.fi
+.PP
+.BR mempcpy (3):
+.nf
+    _GNU_SOURCE
.fi
.SH DESCRIPTION
-The
-.BR strcpy ()
-function copies the string pointed to by
-.IR src ,
-including the terminating null byte (\(aq\e0\(aq),
-to the buffer pointed to by
-.IR dest .
-The strings may not overlap, and the destination string
-.I dest
-must be large enough to receive the copy.
-.I Beware of buffer overruns!
-(See BUGS.)
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: -----------------/
+.SS Terms (and abbreviations)
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: string (str) ----/
+.TP
+.IR "string " ( str )
+is a sequence of zero or more non-null characters followed by a null byte.
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: null-padded character seq
+.TP
+.IR "character sequence " ( ustr )
+is a sequence of zero or more non-null characters.
+A program should never use a character sequence where a string is required.
+However, with appropriate care,
+a string can be used in the place of a character sequence.
+.RS
+.TP
+.I null-padded character sequence
+Character sequences can be contained in fixed-width buffers,
+which contain padding null bytes after the character sequence,
+to fill the rest of the buffer
+without affecting the character sequence;
+however, those padding null bytes are not part of the character sequence.
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: measured character sequence
+.TP
+.I measured character sequence
+Character sequence delimited by its length.
+.RE
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: length (len) ----/
+.TP
+.IR "length " ( len )
+is the number of non-null characters in a string or character sequence.
+It is the return value of
+.I strlen(str)
+and of
+.IR "strnlen(ustr, sz)" .
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: size (sz) -------/
+.TP
+.IR "size " ( sz )
+refers to the entire buffer
+where the string or character sequence is contained.
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: end -------------/
+.TP
+.I end
+is the name of a pointer to the terminating null byte of a string,
+or a pointer to one past the last character of a character sequence.
+This is the return value of functions that allow chaining.
+It is equivalent to
+.IR &str[len] .
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: past_end --------/
+.TP
+.I past_end
+is the name of a pointer to one past the end of the buffer
+that contains a string or character sequence.
+It is equivalent to
+.IR &str[sz] .
+It is used as a sentinel value,
+to be able to truncate strings or character sequences
+instead of overrunning the containing buffer.
+.\" ----- DESCRIPTION :: Copy, concatenate, and chain-copy ------------/
+.SS Copy, concatenate, and chain-copy
+Originally,
+there was a distinction between functions that copy and those that concatenate.
+However, newer functions that copy while allowing chaining
+cover both use cases with a single API.
+They are also algorithmically faster,
+since they don't need to search for the end of the existing string.
+However, their use is a bit more verbose.
+.PP
+To chain copy functions,
+they need to return a pointer to the
+.IR end .
+That's a byproduct of the copy operation,
+so it has no performance costs.
+Functions that return such a pointer,
+and thus can be chained,
+have names of the form
+.RB * stp *()
+or
+.RB * memp *(),
+since it's also common to name the pointer just
+.IR p .
+.PP
+Chain-copying functions that truncate
+should accept a pointer to one past the end of the destination buffer,
+and have names of the form
+.RB * stpe *().
+This allows not having to recalculate the remaining size after each call.
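A compilable sketch of the difference between concatenating and chain-copying, outside the patch itself. The names join_cat() and join_stp() are made up for this illustration, and the caller is assumed to pass a destination buffer that is large enough for the result.

```c
#define _POSIX_C_SOURCE 200809L  /* for stpcpy(3) */
#include <stddef.h>
#include <string.h>

/* Concatenating style: each strcat(3) rescans dst to find its end. */
size_t
join_cat(char *dst, const char *a, const char *b, const char *c)
{
    strcpy(dst, a);
    strcat(dst, b);
    strcat(dst, c);
    return strlen(dst);  /* One more scan, just to learn the length. */
}

/* Chain-copying style: each stpcpy(3) returns the new end of dst. */
size_t
join_stp(char *dst, const char *a, const char *b, const char *c)
{
    char *p;

    p = dst;
    p = stpcpy(p, a);
    p = stpcpy(p, b);
    p = stpcpy(p, c);
    return (size_t) (p - dst);  /* The length is a pointer difference. */
}
```

Both functions leave the same string in dst; they differ only in how often the already-copied part is scanned again.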
+.\" ----- DESCRIPTION :: Truncate or not? -----------------------------/
+.SS Truncate or not?
+The first thing to note is that programmers should be careful with buffers,
+so they always have the correct size,
+and truncation is not necessary.
+.PP
+In most cases,
+truncation is not desired,
+and it is simpler to just do the copy.
+Simpler code is safer code.
+Programming against programming mistakes by adding more code
+just adds more points where mistakes can be made.
+.PP
+Nowadays,
+compilers can detect most programmer errors with features like
+compiler warnings,
+static analyzers, and
+.BR \%_FORTIFY_SOURCE
+(see
+.BR ftm (7)).
+Keeping the code simple
+helps these overflow-detection features be more precise.
+.PP
+When validating user input,
+however,
+it makes sense to truncate.
+Remember to check the return value of such function calls.
+.PP
+Functions that truncate:
+.IP \(bu 3
+.BR stpecpy (3)
+is the most efficient string copy function that performs truncation.
+It requires checking for truncation only once, after all chained calls.
+.IP \(bu
+.BR stpecpyx (3)
+is a variant of
+.BR stpecpy (3)
+that consumes the entire source string,
+to catch bugs in the program
+by forcing a segmentation fault (as
+.BR strlcpy (3bsd)
+and
+.BR strlcat (3bsd)
+do).
+.IP \(bu
+.BR strlcpy (3bsd)
+and
+.BR strlcat (3bsd)
+are designed to crash if the input string is invalid
+(doesn't contain a terminating null byte).
+.IP \(bu
+.BR strscpy (3)
+reports an error instead of crashing (similar to
+.BR stpecpy (3)).
+.IP \(bu
+.BR stpncpy (3)
+and
+.BR strncpy (3)
+also truncate, but they write null-padded character sequences
+rather than strings.
+.\" ----- DESCRIPTION :: Null-padded character sequences --------------/
+.SS Null-padded character sequences
+For historic reasons,
+some standard APIs,
+such as
+.BR utmpx (5),
+use null-padded character sequences in fixed-width buffers.
+To interface with them,
+specialized functions need to be used.
+.PP
+To copy strings into them, use
+.BR stpncpy (3).
+.PP
+To copy from a null-padded character sequence in a fixed-width buffer into a string,
+ignoring any trailing null bytes in the source fixed-width buffer,
+you should use
+.BR ustr2stp (3)
+or
+.BR strncat (3).
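A sketch of the round trip through a fixed-width field, using only functions that libc provides. The record type struct rec, its 8-byte name field (loosely modelled on the fixed-width fields of utmpx(5)), and the function names are assumptions made for this illustration.

```c
#define _POSIX_C_SOURCE 200809L  /* for stpncpy(3) */
#include <stddef.h>
#include <string.h>

/* Hypothetical record with a null-padded, fixed-width field. */
struct rec {
    char name[8];  /* Character sequence; not necessarily a string. */
};

/* Store a string into the field, padding the rest with null bytes.
   Returns the length of the stored character sequence. */
size_t
rec_set_name(struct rec *r, const char *name)
{
    char *end;

    end = stpncpy(r->name, name, sizeof(r->name));
    return (size_t) (end - r->name);
}

/* Read the field back into a string.
   dst must have room for sizeof(r->name) + 1 bytes. */
void
rec_get_name(char *dst, const struct rec *r)
{
    dst[0] = '\0';  /* strncat(3) appends to an existing string. */
    strncat(dst, r->name, sizeof(r->name));
}
```

A name that fills the field completely is stored without any null byte, which is why the reader must bound the copy by the field size instead of relying on a terminator.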
+.\" ----- DESCRIPTION :: Measured character sequences -----------------/
+.SS Measured character sequences
+The simplest character sequence copying function is
+.BR mempcpy (3).
+It requires always knowing the length of your character sequences,
+for which structures can be used.
+It makes the code much faster,
+since you always know the length of your character sequences,
+and can do the minimal copies and length measurements.
+.BR mempcpy (3)
+copies character sequences,
+so you need to explicitly set the terminating null byte if you need a string.
+.PP
+The following code can be used to
+chain-copy from a measured character sequence into a string:
+.PP
+.in +4n
+.EX
+p = mempcpy(p, foo\->str, foo\->len);
+*p = \(aq\e0\(aq;
+.EE
+.in
+.PP
+The following code can be used to
+chain-copy from a measured character sequence into another character sequence:
+.PP
+.in +4n
+.EX
+p = mempcpy(p, src\->str, src\->len);
+.EE
+.in
+.PP
+In programs that make considerable use of strings or character sequences,
+and need the best performance,
+using overlapping character sequences can make a big difference.
+It allows holding subsequences of a larger character sequence,
+while not duplicating memory
+nor using time to do a copy.
+.PP
+However, this is delicate,
+since it requires using character sequences.
+C library APIs use strings,
+so programs that use character sequences
+will have to take care of differentiating strings from character sequences.
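A sketch of measured and overlapping character sequences, outside the patch itself. The struct mseq type and all function names are invented for this illustration. mseq_pcpy() is a local stand-in for mempcpy(3), used so that the sketch does not depend on _GNU_SOURCE.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical measured character sequence: pointer plus length. */
struct mseq {
    const char *str;
    size_t      len;
};

/* Local stand-in for mempcpy(3): copy, and return the new end. */
static char *
mseq_pcpy(char *dst, struct mseq s)
{
    memcpy(dst, s.str, s.len);
    return dst + s.len;
}

/* A subsequence shares memory with its parent: nothing is copied.
   The caller must ensure that off + len <= s.len. */
struct mseq
mseq_sub(struct mseq s, size_t off, size_t len)
{
    struct mseq sub = { s.str + off, len };

    return sub;
}

/* Chain-copy two sequences into dst and terminate the result.
   dst must hold a.len + b.len + 1 bytes.  Returns the length. */
size_t
mseq_join(char *dst, struct mseq a, struct mseq b)
{
    char *p;

    p = dst;
    p = mseq_pcpy(p, a);
    p = mseq_pcpy(p, b);
    *p = '\0';  /* Needed only because a string is wanted. */
    return (size_t) (p - dst);
}
```

No length is ever measured here: every copy knows its size up front, and the only null byte written is the one that turns the final result into a string.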
+.\" ----- DESCRIPTION :: String vs character sequence -----------------/
+.SS String vs character sequence
+Some functions only operate on strings.
+Those require that the input
+.I src
+is a string,
+and guarantee an output string
+(even when truncation occurs).
+Functions that concatenate
+also require that
+.I dst
+holds a string before the call.
+List of functions:
+.IP \(bu 3
+.PD 0
+.BR stpcpy (3)
+.IP \(bu
+.BR strcpy "(3), \c"
+.BR strcat (3)
+.IP \(bu
+.BR stpecpy "(3), \c"
+.BR stpecpyx (3)
+.IP \(bu
+.BR strlcpy "(3bsd), \c"
+.BR strlcat (3bsd)
+.IP \(bu
+.BR strscpy (3)
+.PD
+.PP
+Other functions require an input string,
+but create a character sequence as output.
+These functions have confusing names,
+and have a long history of misuse.
+List of functions:
+.IP \(bu 3
+.PD 0
+.BR stpncpy (3)
+.IP \(bu
+.BR strncpy (3)
+.PD
+.PP
+Other functions operate on an input character sequence,
+and create an output string.
+Functions that concatenate
+also require that
+.I dst
+holds a string before the call.
+.BR strncat (3)
+has an even more misleading name than the functions above.
+List of functions:
+.IP \(bu 3
+.PD 0
+.BR ustr2stp (3)
+.IP \(bu
+.BR strncat (3)
+.PD
+.PP
+The last function
+operates on an input character sequence
+to create an output character sequence.
+But because it asks for the length,
+and a string is by nature composed of a character sequence of the same length
+plus a terminating null byte,
+a string is also accepted as input.
+Function:
+.IP \(bu 3
+.BR mempcpy (3)
+.\" ----- DESCRIPTION :: Functions :: ---------------------------------/
+.SS Functions
+.\" ----- DESCRIPTION :: Functions :: stpcpy(3) -----------------------/
+.TP
+.BR stpcpy (3)
+This function copies the input string into a destination string.
+The programmer is responsible for allocating a buffer large enough.
+It returns a pointer suitable for chaining.
+.IP
+An implementation of this function might be:
+.IP
+.in +4n
+.EX
+char *
+stpcpy(char *restrict dst, const char *restrict src)
+{
+    char *end;
+
+    end = mempcpy(dst, src, strlen(src));
+    *end = \(aq\e0\(aq;
+    return end;
+}
+.EE
+.in
+.\" ----- DESCRIPTION :: Functions :: strcpy(3), strcat(3) ------------/
+.TP
+.BR strcpy (3)
+.TQ
+.BR strcat (3)
+These functions copy the input string into a destination string.
+The programmer is responsible for allocating a buffer large enough.
+The return value is useless.
+.IP
+.BR stpcpy (3)
+is a faster alternative to these functions.
+.IP
+An implementation of these functions might be:
+.IP
+.in +4n
+.EX
+char *
+strcpy(char *restrict dst, const char *restrict src)
+{
+    stpcpy(dst, src);
+    return dst;
+}
+
+char *
+strcat(char *restrict dst, const char *restrict src)
+{
+    stpcpy(dst + strlen(dst), src);
+    return dst;
+}
+.EE
+.in
+.\" ----- DESCRIPTION :: Functions :: stpecpy(3), stpecpyx(3) ---------/
+.TP
+.BR stpecpy (3)
+.TQ
+.BR stpecpyx (3)
+These functions copy the input string into a destination string.
+If the destination buffer,
+limited by a pointer to one past the end of it,
+isn't large enough to hold the copy,
+the resulting string is truncated
+(but it is guaranteed to be null-terminated).
+They return a pointer suitable for chaining.
+Truncation needs to be detected only once after the last chained call.
+.BR stpecpyx (3)
+has identical semantics to
+.BR stpecpy (3),
+except that it forces a SIGSEGV if the
+.I src
+pointer is not a string.
+.IP
+These functions are not provided by any library,
+but you can define them with the following reference implementations:
+.IP
+.in +4n
+.EX
+/* This code is in the public domain. */
+char *
+stpecpy(char *dst, char past_end[0],
+        const char *restrict src)
+{
+    char *p;
+
+    if (dst == past_end)
+        return past_end;
+
+    p = memccpy(dst, src, \(aq\e0\(aq, past_end \- dst);
+    if (p != NULL)
+        return p \- 1;
+
+    /* truncation detected */
+    past_end[\-1] = \(aq\e0\(aq;
+    return past_end;
+}
+
+/* This code is in the public domain. */
+char *
+stpecpyx(char *dst, char past_end[0],
+         const char *restrict src)
+{
+    if (src[strlen(src)] != \(aq\e0\(aq)
+        raise(SIGSEGV);
+
+    return stpecpy(dst, past_end, src);
+}
+.EE
+.in
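A compilable sketch of chained truncating copies with a single truncation check. The stpecpy() definition repeats the reference implementation from the page, with past_end declared as a plain pointer to avoid the zero-length array parameter. The helper path_join() is hypothetical and exists only for this illustration.

```c
#define _XOPEN_SOURCE 700  /* for memccpy(3) */
#include <stddef.h>
#include <string.h>

/* Reference implementation, repeated so the sketch builds alone. */
char *
stpecpy(char *dst, char *past_end, const char *restrict src)
{
    char *p;

    if (dst == past_end)
        return past_end;

    p = memccpy(dst, src, '\0', past_end - dst);
    if (p != NULL)
        return p - 1;

    /* truncation detected */
    past_end[-1] = '\0';
    return past_end;
}

/* Hypothetical helper: build "dir/file" in buf.
   Returns 0 on success, or -1 if the result was truncated. */
int
path_join(char *buf, size_t size, const char *dir, const char *file)
{
    char *p, *past_end;

    past_end = buf + size;
    p = buf;
    p = stpecpy(p, past_end, dir);
    p = stpecpy(p, past_end, "/");
    p = stpecpy(p, past_end, file);
    if (p == past_end)  /* One check covers all three calls. */
        return -1;

    return 0;
}
```

Once a call truncates, every later call in the chain sees dst == past_end and returns immediately, so the sentinel value survives to the final check.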
+.\" ----- DESCRIPTION :: Functions :: strlcpy(3bsd), strlcat(3bsd) ----/
+.TP
+.BR strlcpy (3bsd)
+.TQ
+.BR strlcat (3bsd)
+These functions copy the input string into a destination string.
+If the destination buffer,
+limited by its size,
+isn't large enough to hold the copy,
+the resulting string is truncated
+(but it is guaranteed to be null-terminated).
+They return the length of the total string they tried to create.
+These functions force a SIGSEGV if the
+.I src
+pointer is not a string.
+.IP
+.BR stpecpyx (3)
+is a faster alternative to these functions.
+.\" ----- DESCRIPTION :: Functions :: strscpy(3) ----------------------/
+.TP
+.BR strscpy (3)
+This function copies the input string into a destination string.
+If the destination buffer,
+limited by its size,
+isn't large enough to hold the copy,
+the resulting string is truncated
+(but it is guaranteed to be null-terminated).
+It returns the length of the destination string, or
+.B \-E2BIG
+on truncation.
+.IP
+.BR stpecpy (3)
+is a simpler and faster alternative to this function.
+.\" ----- DESCRIPTION :: Functions :: stpncpy(3) ----------------------/
+.TP
+.BR stpncpy (3)
+This function copies the input string into
+a destination null-padded character sequence in a fixed-width buffer.
+If the destination buffer,
+limited by its size,
+isn't large enough to hold the copy,
+the resulting character sequence is truncated.
+Since it creates a character sequence,
+it doesn't need to write a terminating null byte.
+It returns a pointer suitable for chaining,
+but it's not ideal for that.
+Truncation needs to be detected only once after the last chained call.
+.IP
+If you're going to use this function in chained calls,
+it would be useful to develop a similar function
+that accepts a pointer to one past the end of the buffer instead of a size.
+.IP
+An implementation of this function might be:
+.IP
+.in +4n
+.EX
+char *
+stpncpy(char *restrict dst, const char *restrict src,
+        size_t sz)
+{
+    char *p;
+
+    bzero(dst, sz);
+    p = memccpy(dst, src, \(aq\e0\(aq, sz);
+    if (p == NULL)
+        return dst + sz;
+
+    return p \- 1;
+}
+.EE
+.in
+.\" ----- DESCRIPTION :: Functions :: ustr2stp(3) ---------------------/
+.TP
+.BR ustr2stp (3)
+This function copies the input character sequence
+contained in a null-padded fixed-width buffer,
+into a destination string.
+The programmer is responsible for allocating a buffer large enough.
+It returns a pointer suitable for chaining.
+.IP
+A truncating version of this function doesn't exist,
+since the size of the original character sequence is always known,
+so it wouldn't be very useful.
+.IP
+This function is not provided by any library,
+but you can define it with the following reference implementation:
+.IP
+.in +4n
+.EX
+/* This code is in the public domain. */
+char *
+ustr2stp(char *restrict dst, const char *restrict src,
+         size_t sz)
+{
+    char *end;
+
+    end = mempcpy(dst, src, strnlen(src, sz));
+    *end = \(aq\e0\(aq;
+
+    return end;
+}
+.EE
+.in
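A sketch of chaining ustr2stp() over several fixed-width fields. It assumes a definition of ustr2stp() written with strnlen(3) and memcpy(3), which is one possible way to get the documented behavior. The struct login_rec type and login_fmt() are invented for this illustration.

```c
#define _POSIX_C_SOURCE 200809L  /* for strnlen(3) */
#include <stddef.h>
#include <string.h>

/* One possible definition; an assumption of this sketch. */
char *
ustr2stp(char *restrict dst, const char *restrict src, size_t sz)
{
    size_t len;

    len = strnlen(src, sz);  /* Ignore padding null bytes, if any. */
    memcpy(dst, src, len);
    dst[len] = '\0';
    return dst + len;
}

/* Hypothetical record with two null-padded, fixed-width fields. */
struct login_rec {
    char user[4];
    char line[8];
};

/* Format "user@line" into dst, which must hold
   sizeof(user) + 1 + sizeof(line) + 1 bytes.  Returns the length. */
size_t
login_fmt(char *dst, const struct login_rec *r)
{
    char *p;

    p = dst;
    p = ustr2stp(p, r->user, sizeof(r->user));
    p = ustr2stp(p, "@", 1);
    p = ustr2stp(p, r->line, sizeof(r->line));
    return (size_t) (p - dst);
}
```

The destination size is known at compile time from the field sizes, which is why a truncating variant adds little here.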
+.\" ----- DESCRIPTION :: Functions :: strncpy(3) ----------------------/
+.TP
+.BR strncpy (3)
+This function is identical to
+.BR stpncpy (3)
+except for the useless return value.
+Due to the return value,
+with this function it's hard to correctly check for truncation.
+.IP
+.BR stpncpy (3)
+is a simpler alternative to this function.
+.IP
+An implementation of this function might be:
+.IP
+.in +4n
+.EX
+char *
+strncpy(char *restrict dst, const char *restrict src,
+        size_t sz)
+{
+    stpncpy(dst, src, sz);
+    return dst;
+}
+.EE
+.in
+.\" ----- DESCRIPTION :: Functions :: strncat(3) ----------------------/
+.TP
+.BR strncat (3)
+Do not confuse this function with
+.BR strncpy (3);
+they are not related at all.
+.IP
+This function concatenates the input character sequence
+contained in a null-padded fixed-width buffer,
+into a destination string.
+The programmer is responsible for allocating a buffer large enough.
+The return value is useless.
+.IP
+.BR ustr2stp (3)
+is a faster alternative to this function.
+.IP
+An implementation of this function might be:
+.IP
+.in +4n
+.EX
+char *
+strncat(char *restrict dst, const char *restrict src,
+        size_t sz)
+{
+    ustr2stp(dst + strlen(dst), src, sz);
+    return dst;
+}
+.EE
+.in
+.\" ----- DESCRIPTION :: Functions :: mempcpy(3) ----------------------/
+.TP
+.BR mempcpy (3)
+This function copies the input character sequence,
+limited by its length,
+into a destination character sequence.
+The programmer is responsible for allocating a buffer large enough.
+It returns a pointer suitable for chaining.
+.IP
+An implementation of this function might be:
+.IP
+.in +4n
+.EX
+void *
+mempcpy(void *restrict dst, const void *restrict src,
+        size_t len)
+{
+    return (char *) memcpy(dst, src, len) + len;
+}
+.EE
+.in
+.\" ----- RETURN VALUE :: ---------------------------------------------/
.SH RETURN VALUE
-The
-.BR strcpy ()
-function returns a pointer to
-the destination string
-.IR dest .
+The following functions return
+a pointer to the terminating null byte in the destination string.
+.IP \(bu 3
+.PD 0
+.BR stpcpy (3)
+.IP \(bu
+.BR ustr2stp (3)
+.PD
+.PP
+The following functions return
+a pointer to the terminating null byte in the destination string,
+except when truncation occurs;
+if truncation occurs,
+they return a pointer to one past the end of the destination buffer
+.RI ( past_end ).
+.IP \(bu 3
+.BR stpecpy (3),
+.BR stpecpyx (3)
+.PP
+The following function returns
+a pointer to one after the last character
+in the destination character sequence;
+if truncation occurs,
+that pointer is equivalent to
+a pointer to one past the end of the destination buffer.
+.IP \(bu 3
+.BR stpncpy (3)
+.PP
+The following function returns
+a pointer to one after the last character
+in the destination character sequence.
+.IP \(bu 3
+.BR mempcpy (3)
+.PP
+The following functions return
+the length of the total string that they tried to create
+(as if truncation didn't occur).
+.IP \(bu 3
+.BR strlcpy (3bsd),
+.BR strlcat (3bsd)
+.PP
+The following function returns
+the length of the destination string, or
+.B \-E2BIG
+on truncation.
+.IP \(bu 3
+.BR strscpy (3)
+.PP
+The following functions return the
+.I dst
+pointer,
+which is useless.
+.IP \(bu 3
+.PD 0
+.BR strcpy (3),
+.BR strcat (3)
+.IP \(bu
+.BR strncpy (3)
+.IP \(bu
+.BR strncat (3)
+.PD
+.\" ----- ATTRIBUTES :: -----------------------------------------------/
.SH ATTRIBUTES
For an explanation of the terms used in this section, see
.BR attributes (7).
@@ -54,73 +771,228 @@ .SH ATTRIBUTES
l l l.
Interface Attribute Value
T{
-.BR strcpy ()
+.BR stpcpy (),
+.BR strcpy (),
+.BR strcat (),
+.BR stpecpy (),
+.BR stpecpyx (),
+.BR strlcpy (),
+.BR strlcat (),
+.BR strscpy (),
+.BR stpncpy (),
+.BR strncpy (),
+.BR ustr2stp (),
+.BR strncat (),
+.BR mempcpy ()
T} Thread safety MT-Safe
.TE
.hy
.ad
.sp 1
+.\" ----- STANDARDS :: ------------------------------------------------/
.SH STANDARDS
-POSIX.1-2001, POSIX.1-2008, C89, C99, SVr4, 4.3BSD.
-.SH NOTES
-.SS strlcpy()
-Some systems (the BSDs, Solaris, and others) provide the following function:
+.TP
+.BR strcpy "(3), \c"
+.BR strcat (3)
+.TQ
+.BR strncpy (3)
+.TQ
+.BR strncat (3)
+POSIX.1-2001, POSIX.1-2008, C89, C99, SVr4, 4.3BSD.
+.TP
+.BR stpcpy (3)
+.\" This function was added to POSIX.1-2008.
+.\" Before that, it was not part of
+.\" the C or POSIX.1 standards, nor customary on UNIX systems.
+.\" It first appeared at least as early as 1986,
+.\" in the Lattice C AmigaDOS compiler,
+.\" then in the GNU fileutils and GNU textutils in 1989,
+.\" and in the GNU C library by 1992.
+.\" It is also present on the BSDs.
+.TQ
+.BR stpncpy (3)
+.\" This function was added to POSIX.1-2008.
+.\" Before that, it was a GNU extension.
+.\" It first appeared in glibc 1.07 in 1993.
+POSIX.1-2008.
+.TP
+.BR strlcpy "(3bsd), \c"
+.BR strlcat (3bsd)
+Functions originated in OpenBSD and present in some Unix systems.
+.TP
+.BR mempcpy (3)
+This function is a GNU extension.
+.TP
+.BR strscpy (3)
+Linux kernel internal function.
+.TP
+.BR stpecpy "(3), \c"
+.BR stpecpyx (3)
+.TQ
+.BR ustr2stp (3)
+Not defined by any standard, nor provided by any library.
+.\" ----- CAVEATS :: --------------------------------------------------/
+.SH CAVEATS
+Don't mix chain calls to truncating and non-truncating functions.
+It is conceptually wrong
+unless you know that the first part of a copy will always fit.
+In any case, the performance difference will probably be negligible,
+so the code will be clearer if you use consistent semantics:
+either truncating or non-truncating.
+Calling a non-truncating function after a truncating one is necessarily wrong.
.PP
+Some of the functions described here are not provided by any library;
+you should write your own copy if you want to use them.
+See STANDARDS.
+.\" ----- EXAMPLES :: -------------------------------------------------/
+.SH EXAMPLES
+The following are examples of correct use of each of these functions.
+.\" ----- EXAMPLES :: stpcpy(3) ---------------------------------------/
+.TP
+.BR stpcpy (3)
.in +4n
.EX
-size_t strlcpy(char *dest, const char *src, size_t size);
+p = buf;
+p = stpcpy(p, "Hello ");
+p = stpcpy(p, "world");
+p = stpcpy(p, "!");
+len = p \- buf;
+puts(buf);
.EE
.in
-.PP
-.\" http://static.usenix.org/event/usenix99/full_papers/millert/millert_html/index.html
-.\" "strlcpy and strlcat - consistent, safe, string copy and concatenation"
-.\" 1999 USENIX Annual Technical Conference
-This function is similar to
-.BR strcpy (),
-but it copies at most
-.I size\-1
-bytes to
-.IR dest ,
-truncating the string as necessary.
-It always adds a terminating null byte.
-This function fixes some of the problems of
-.BR strcpy ()
-but the caller must still handle the possibility of data loss if
-.I size
-is too small.
-The return value of the function is the length of
-.IR src ,
-which allows truncation to be easily detected:
-if the return value is greater than or equal to
-.IR size ,
-truncation occurred.
-If loss of data matters, the caller
-.I must
-either check the arguments before the call,
-or test the function return value.
-.BR strlcpy ()
-is not present in glibc and is not standardized by POSIX,
-.\" https://lwn.net/Articles/506530/
-but is available on Linux via the
-.I libbsd
-library.
-.SH BUGS
-If the destination string of a
-.BR strcpy ()
-is not large enough, then anything might happen.
-Overflowing fixed-length string buffers is a favorite cracker technique
-for taking complete control of the machine.
-Any time a program reads or copies data into a buffer,
-the program first needs to check that there's enough space.
-This may be unnecessary if you can show that overflow is impossible,
-but be careful: programs can get changed over time,
-in ways that may make the impossible possible.
+.\" ----- EXAMPLES :: strcpy(3), strcat(3) ----------------------------/
+.TP
+.BR strcpy (3)
+.TQ
+.BR strcat (3)
+.in +4n
+.EX
+strcpy(buf, "Hello ");
+strcat(buf, "world");
+strcat(buf, "!");
+len = strlen(buf);
+puts(buf);
+.EE
+.in
+.\" ----- EXAMPLES :: stpecpy(3), stpecpyx(3) -------------------------/
+.TP
+.BR stpecpy (3)
+.TQ
+.BR stpecpyx (3)
+.in +4n
+.EX
+past_end = buf + sizeof(buf);
+p = buf;
+p = stpecpy(p, past_end, "Hello ");
+p = stpecpy(p, past_end, "world");
+p = stpecpy(p, past_end, "!");
+if (p == past_end) {
+    p\-\-;
+    goto toolong;
+}
+len = p \- buf;
+puts(buf);
+.EE
+.in
+.\" ----- EXAMPLES :: strlcpy(3bsd), strlcat(3bsd) --------------------/
+.TP
+.BR strlcpy (3bsd)
+.TQ
+.BR strlcat (3bsd)
+.in +4n
+.EX
+if (strlcpy(buf, "Hello ", sizeof(buf)) >= sizeof(buf))
+    goto toolong;
+if (strlcat(buf, "world", sizeof(buf)) >= sizeof(buf))
+    goto toolong;
+len = strlcat(buf, "!", sizeof(buf));
+if (len >= sizeof(buf))
+    goto toolong;
+puts(buf);
+.EE
+.in
+.\" ----- EXAMPLES :: strscpy(3) --------------------------------------/
+.TP
+.BR strscpy (3)
+.in +4n
+.EX
+len = strscpy(buf, "Hello world!", sizeof(buf));
+if (len == \-E2BIG)
+    goto toolong;
+puts(buf);
+.EE
+.in
+.\" ----- EXAMPLES :: stpncpy(3) --------------------------------------/
+.TP
+.BR stpncpy (3)
+.in +4n
+.EX
+past_end = buf + sizeof(buf);
+end = stpncpy(buf, "Hello world!", sizeof(buf));
+if (end == past_end)
+    goto toolong;
+len = end \- buf;
+for (size_t i = 0; i < sizeof(buf); i++)
+    putchar(buf[i]);
+.EE
+.in
+.\" ----- EXAMPLES :: strncpy(3) --------------------------------------/
+.TP
+.BR strncpy (3)
+.in +4n
+.EX
+strncpy(buf, "Hello world!", sizeof(buf));
+if (buf[sizeof(buf) \- 1] != \(aq\e0\(aq)
+    goto toolong;
+len = strnlen(buf, sizeof(buf));
+for (size_t i = 0; i < sizeof(buf); i++)
+    putchar(buf[i]);
+.EE
+.in
+.\" ----- EXAMPLES :: ustr2stp(3) -------------------------------------/
+.TP
+.BR ustr2stp (3)
+.in +4n
+.EX
+p = buf;
+p = ustr2stp(p, "Hello ", 6);
+p = ustr2stp(p, "world", 42); // Padding null bytes ignored.
+p = ustr2stp(p, "!", 1);
+len = p \- buf;
+puts(buf);
+.EE
+.in
+.\" ----- EXAMPLES :: strncat(3) --------------------------------------/
+.TP
+.BR strncat (3)
+.in +4n
+.EX
+buf[0] = \(aq\e0\(aq; // There's no 'cpy' function to this 'cat'.
+strncat(buf, "Hello ", 6);
+strncat(buf, "world", 42); // Padding null bytes ignored.
+strncat(buf, "!", 1);
+len = strlen(buf);
+puts(buf);
+.EE
+.in
+.\" ----- EXAMPLES :: mempcpy(3) --------------------------------------/
+.TP
+.BR mempcpy (3)
+.in +4n
+.EX
+p = buf;
+p = mempcpy(p, "Hello ", 6);
+p = mempcpy(p, "world", 5);
+p = mempcpy(p, "!", 1);
+*p = \(aq\e0\(aq;
+len = p \- buf;
+puts(buf);
+.EE
+.in
+.\" ----- SEE ALSO :: -------------------------------------------------/
.SH SEE ALSO
-.BR bcopy (3),
-.BR memccpy (3),
+.BR bzero (3),
.BR memcpy (3),
-.BR memmove (3),
-.BR stpcpy (3),
-.BR strdup (3),
-.BR string (3),
-.BR wcscpy (3)
+.BR memccpy (3),
+.BR mempcpy (3),
+.BR string (3)
--
2.38.1
* [PATCH v2 2/3] stpcpy.3, stpncpy.3, strcat.3, strncat.3, strncpy.3: Transform the old pages into links to strcpy(3)
2022-12-12 14:24 ` [PATCH 1/3] strcpy.3: Rewrite page to document all string-copying functions Alejandro Colomar
` (2 preceding siblings ...)
2022-12-12 23:00 ` [PATCH v2 1/3] " Alejandro Colomar
@ 2022-12-12 23:00 ` Alejandro Colomar
2022-12-12 23:00 ` [PATCH v2 3/3] stpecpy.3, stpecpyx.3, strlcat.3, strlcpy.3, strscpy.3: Add new " Alejandro Colomar
4 siblings, 0 replies; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-12 23:00 UTC (permalink / raw)
To: linux-man; +Cc: Martin Sebor, Alejandro Colomar
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
man3/stpcpy.3 | 115 +--------------------------------
man3/stpncpy.3 | 123 +----------------------------------
man3/strcat.3 | 161 +--------------------------------------------
man3/strncat.3 | 172 +------------------------------------------------
man3/strncpy.3 | 130 +------------------------------------
5 files changed, 5 insertions(+), 696 deletions(-)
diff --git a/man3/stpcpy.3 b/man3/stpcpy.3
index 5770790fc..ff7476a84 100644
--- a/man3/stpcpy.3
+++ b/man3/stpcpy.3
@@ -1,114 +1 @@
-.\" Copyright 1995 James R. Van Zandt <jrv@vanzandt.mv.com>
-.\"
-.\" SPDX-License-Identifier: Linux-man-pages-copyleft
-.\"
-.TH stpcpy 3 (date) "Linux man-pages (unreleased)"
-.SH NAME
-stpcpy \- copy a string returning a pointer to its end
-.SH LIBRARY
-Standard C library
-.RI ( libc ", " \-lc )
-.SH SYNOPSIS
-.nf
-.B #include <string.h>
-.PP
-.BI "char *stpcpy(char *restrict " dest ", const char *restrict " src );
-.fi
-.PP
-.RS -4
-Feature Test Macro Requirements for glibc (see
-.BR feature_test_macros (7)):
-.RE
-.PP
-.BR stpcpy ():
-.nf
- Since glibc 2.10:
- _POSIX_C_SOURCE >= 200809L
- Before glibc 2.10:
- _GNU_SOURCE
-.fi
-.SH DESCRIPTION
-The
-.BR stpcpy ()
-function copies the string pointed to by
-.I src
-(including the terminating null byte (\(aq\e0\(aq)) to the array pointed to by
-.IR dest .
-The strings may not overlap, and the destination string
-.I dest
-must be large enough to receive the copy.
-.SH RETURN VALUE
-.BR stpcpy ()
-returns a pointer to the
-.B end
-of the string
-.I dest
-(that is, the address of the terminating null byte)
-rather than the beginning.
-.SH ATTRIBUTES
-For an explanation of the terms used in this section, see
-.BR attributes (7).
-.ad l
-.nh
-.TS
-allbox;
-lbx lb lb
-l l l.
-Interface Attribute Value
-T{
-.BR stpcpy ()
-T} Thread safety MT-Safe
-.TE
-.hy
-.ad
-.sp 1
-.SH STANDARDS
-This function was added to POSIX.1-2008.
-Before that, it was not part of
-the C or POSIX.1 standards, nor customary on UNIX systems.
-It first appeared at least as early as 1986,
-in the Lattice C AmigaDOS compiler,
-then in the GNU fileutils and GNU textutils in 1989,
-and in the GNU C library by 1992.
-It is also present on the BSDs.
-.SH BUGS
-This function may overrun the buffer
-.IR dest .
-.SH EXAMPLES
-For example, this program uses
-.BR stpcpy ()
-to concatenate
-.B foo
-and
-.B bar
-to produce
-.BR foobar ,
-which it then prints.
-.PP
-.\" SRC BEGIN (stpcpy.c)
-.EX
-#define _GNU_SOURCE
-#include <stdio.h>
-#include <string.h>
-
-int
-main(void)
-{
- char buffer[20];
- char *to = buffer;
-
- to = stpcpy(to, "foo");
- to = stpcpy(to, "bar");
- printf("%s\en", buffer);
-}
-.EE
-.\" SRC END
-.SH SEE ALSO
-.BR bcopy (3),
-.BR memccpy (3),
-.BR memcpy (3),
-.BR memmove (3),
-.BR stpncpy (3),
-.BR strcpy (3),
-.BR string (3),
-.BR wcpcpy (3)
+.so man3/strcpy.3
diff --git a/man3/stpncpy.3 b/man3/stpncpy.3
index 0a62e3055..ff7476a84 100644
--- a/man3/stpncpy.3
+++ b/man3/stpncpy.3
@@ -1,122 +1 @@
-.\" Copyright (c) Bruno Haible <haible@clisp.cons.org>
-.\" Copyright (c) 2022 Alejandro Colomar <alx@kernel.org>
-.\"
-.\" SPDX-License-Identifier: GPL-2.0-or-later
-.\"
-.\" References consulted:
-.\" GNU glibc-2 source code and manual
-.\"
-.\" Corrected, aeb, 990824
-.TH stpncpy 3 (date) "Linux man-pages (unreleased)"
-.SH NAME
-stpncpy \- copy string into a fixed-length buffer and zero the rest of it
-.SH LIBRARY
-Standard C library
-.RI ( libc ", " \-lc )
-.SH SYNOPSIS
-.nf
-.B #include <string.h>
-.PP
-.BI "char *stpncpy(char " dest "[restrict ." n "], \
-const char " src "[restrict ." n ],
-.BI " size_t " n );
-.fi
-.PP
-.RS -4
-Feature Test Macro Requirements for glibc (see
-.BR feature_test_macros (7)):
-.RE
-.PP
-.BR stpncpy ():
-.nf
- Since glibc 2.10:
- _POSIX_C_SOURCE >= 200809L
- Before glibc 2.10:
- _GNU_SOURCE
-.fi
-.SH DESCRIPTION
-.IR Note :
-This is probably not the function you want to use.
-For string copying with truncation, see
-.BR strlcpy (3bsd).
-.PP
-The
-.BR stpncpy ()
-function copies at most
-.I n
-characters of
-.I src
-and fills the rest of the
-.I dest
-buffer with null bytes.
-.BR Warning :
-If there is no null character among the first
-.I n
-bytes of
-.IR src ,
-the string placed in
-.I dest
-will not be null-terminated.
-.PP
-A simple implementation of
-.BR strncpy ()
-might be:
-.PP
-.in +4n
-.EX
-char *
-stpncpy(char *dest, const char *src, size_t n)
-{
- char *p
-
- bzero(dest, n);
- p = memccpy(dest, src, \(aq\e0\(aq, n);
- if (p == NULL)
- return dest + n;
-
- return p - 1;
-}
-.EE
-.in
-.PP
-The use of
-.BR strncpy ()
-is to copy a C string to a fixed-length buffer
-while ensuring that unused bytes in the destination buffer are zeroed out
-(perhaps to prevent information leaks if the buffer is to be
-written to media or transmitted to another process via an
-interprocess communication technique).
-.SH RETURN VALUE
-.BR stpncpy ()
-returns a pointer to the terminating null byte
-in
-.IR dest ,
-or, if
-.I dest
-is not null-terminated,
-.IR dest + n
-(that is, a pointer to one-past-the-end of the array).
-.SH ATTRIBUTES
-For an explanation of the terms used in this section, see
-.BR attributes (7).
-.ad l
-.nh
-.TS
-allbox;
-lbx lb lb
-l l l.
-Interface Attribute Value
-T{
-.BR stpncpy ()
-T} Thread safety MT-Safe
-.TE
-.hy
-.ad
-.sp 1
-.SH STANDARDS
-This function was added to POSIX.1-2008.
-Before that, it was a GNU extension.
-It first appeared in glibc 1.07 in 1993.
-.SH SEE ALSO
-.BR strlcpy (3bsd)
-.BR wcpncpy (3)
+.so man3/strcpy.3
diff --git a/man3/strcat.3 b/man3/strcat.3
index 277e5b1e4..ff7476a84 100644
--- a/man3/strcat.3
+++ b/man3/strcat.3
@@ -1,160 +1 @@
-.\" Copyright 1993 David Metcalfe (david@prism.demon.co.uk)
-.\"
-.\" SPDX-License-Identifier: Linux-man-pages-copyleft
-.\"
-.\" References consulted:
-.\" Linux libc source code
-.\" Lewine's _POSIX Programmer's Guide_ (O'Reilly & Associates, 1991)
-.\" 386BSD man pages
-.\" Modified Sat Jul 24 18:11:47 1993 by Rik Faith (faith@cs.unc.edu)
-.\" 2007-06-15, Marc Boyer <marc.boyer@enseeiht.fr> + mtk
-.\" Improve discussion of strncat().
-.TH strcat 3 (date) "Linux man-pages (unreleased)"
-.SH NAME
-strcat \- concatenate two strings
-.SH LIBRARY
-Standard C library
-.RI ( libc ", " \-lc )
-.SH SYNOPSIS
-.nf
-.B #include <string.h>
-.PP
-.BI "char *strcat(char *restrict " dest ", const char *restrict " src );
-.fi
-.SH DESCRIPTION
-The
-.BR strcat ()
-function appends the
-.I src
-string to the
-.I dest
-string,
-overwriting the terminating null byte (\(aq\e0\(aq) at the end of
-.IR dest ,
-and then adds a terminating null byte.
-The strings may not overlap, and the
-.I dest
-string must have
-enough space for the result.
-If
-.I dest
-is not large enough, program behavior is unpredictable;
-.IR "buffer overruns are a favorite avenue for attacking secure programs" .
-.SH RETURN VALUE
-The
-.BR strcat ()
-function returns a pointer to the resulting string
-.IR dest .
-.SH ATTRIBUTES
-For an explanation of the terms used in this section, see
-.BR attributes (7).
-.ad l
-.nh
-.TS
-allbox;
-lbx lb lb
-l l l.
-Interface Attribute Value
-T{
-.BR strcat (),
-.BR strncat ()
-T} Thread safety MT-Safe
-.TE
-.hy
-.ad
-.sp 1
-.SH STANDARDS
-POSIX.1-2001, POSIX.1-2008, C89, C99, SVr4, 4.3BSD.
-.SH NOTES
-Some systems (the BSDs, Solaris, and others) provide the following function:
-.PP
-.in +4n
-.EX
-size_t strlcat(char *dest, const char *src, size_t size);
-.EE
-.in
-.PP
-This function appends the null-terminated string
-.I src
-to the string
-.IR dest ,
-copying at most
-.I size\-strlen(dest)\-1
-from
-.IR src ,
-and adds a terminating null byte to the result,
-.I unless
-.I size
-is less than
-.IR strlen(dest) .
-This function fixes the buffer overrun problem of
-.BR strcat (),
-but the caller must still handle the possibility of data loss if
-.I size
-is too small.
-The function returns the length of the string
-.BR strlcat ()
-tried to create; if the return value is greater than or equal to
-.IR size ,
-data loss occurred.
-If data loss matters, the caller
-.I must
-either check the arguments before the call, or test the function return value.
-.BR strlcat ()
-is not present in glibc and is not standardized by POSIX,
-.\" https://lwn.net/Articles/506530/
-but is available on Linux via the
-.I libbsd
-library.
-.\"
-.SH EXAMPLES
-Because
-.BR strcat ()
-must find the null byte that terminates the string
-.I dest
-using a search that starts at the beginning of the string,
-the execution time of this function
-scales according to the length of the string
-.IR dest .
-This can be demonstrated by running the program below.
-(If the goal is to concatenate many strings to one target,
-then manually copying the bytes from each source string
-while maintaining a pointer to the end of the target string
-will provide better performance.)
-.\"
-.SS Program source
-\&
-.\" SRC BEGIN (strcat.c)
-.EX
-#include <stdint.h>
-#include <stdio.h>
-#include <string.h>
-#include <time.h>
-
-int
-main(void)
-{
-#define LIM 4000000
- char p[LIM + 1]; /* +1 for terminating null byte */
- time_t base;
-
- base = time(NULL);
- p[0] = \(aq\e0\(aq;
-
- for (unsigned int j = 0; j < LIM; j++) {
- if ((j % 10000) == 0)
- printf("%u %jd\en", j, (intmax_t) (time(NULL) \- base));
- strcat(p, "a");
- }
-}
-.EE
-.\" SRC END
-.SH SEE ALSO
-.BR bcopy (3),
-.BR memccpy (3),
-.BR memcpy (3),
-.BR strcpy (3),
-.BR string (3),
-.BR strlcat (3bsd),
-.BR wcscat (3),
-.BR wcsncat (3)
+.so man3/strcpy.3
diff --git a/man3/strncat.3 b/man3/strncat.3
index 6e4bf6d78..ff7476a84 100644
--- a/man3/strncat.3
+++ b/man3/strncat.3
@@ -1,171 +1 @@
-.\" Copyright 2022 Alejandro Colomar <alx@kernel.org>
-.\"
-.\" SPDX-License-Identifier: Linux-man-pages-copyleft
-.\"
-.TH strncat 3 (date) "Linux man-pages (unreleased)"
-.SH NAME
-strncat \- concatenate an unterminated string into a string
-.SH LIBRARY
-Standard C library
-.RI ( libc ", " \-lc )
-.SH SYNOPSIS
-.nf
-.B #include <string.h>
-.PP
-.BI "char *strncat(char " dest "[restrict strlen(." dest ") + ." n " + 1],"
-.BI " const char " src "[restrict ." n ],
-.BI " size_t " n );
-.fi
-.SH DESCRIPTION
-.IR Note :
-This is probably not the function you want to use.
-For string concatenation with truncation, see
-.BR strlcat (3bsd).
-For copying or concatenating a string into a fixed-length buffer
-with zeroing of the rest, see
-.BR stpncpy (3).
-.PP
-.BR strncat ()
-appends at most
-.I n
-characters of
-.I src
-to the end of
-.IR dst .
-It always terminates with a null character the string placed in
-.IR dest .
-.PP
-An implementation of
-.BR strncat ()
-might be:
-.PP
-.in +4n
-.EX
-char *
-strncat(char *dest, const char *src, size_t n)
-{
- char *cat;
- size_t len;
-
- cat = dest + strlen(dest);
- len = strnlen(src, n);
- memcpy(cat, src, len);
- cat[len] = \(aq\e0\(aq;
-
- return dest;
-}
-.EE
-.in
-.SH RETURN VALUE
-.BR strncat ()
-returns a pointer to the resulting string
-.IR dest .
-.SH ATTRIBUTES
-For an explanation of the terms used in this section, see
-.BR attributes (7).
-.ad l
-.nh
-.TS
-allbox;
-lbx lb lb
-l l l.
-Interface Attribute Value
-T{
-.BR strncat ()
-T} Thread safety MT-Safe
-.TE
-.hy
-.ad
-.sp 1
-.SH STANDARDS
-POSIX.1-2001, POSIX.1-2008, C89, C99, SVr4, 4.3BSD.
-.SH NOTES
-.SS ustr2stpe()
-You may want to write your own function similar to
-.BR strncpy (),
-with the following improvements:
-.IP \(bu 3
-Copy, instead of concatenating.
-There's no equivalent of
-.BR strncat ()
-that copies instead of concatenating.
-.IP \(bu
-Allow chaining the function,
-by returning a suitable pointer.
-Copy chaining is faster than concatenating.
-.IP \(bu
-Don't check for null characters in the middle of the unterminated string.
-If the string is terminated, this function should not be used.
-If the string is unterminated, it is unnecessary.
-.IP \(bu
-A name that tells what it does:
-Copy from an
-.IR u nterminated
-.IR str ing
-to a
-.IR st ring,
-and return a
-.IR p ointer
-to its end.
-.PP
-.in +4n
-.EX
-/* This code is in the public domain.
- *
- * char *ustr2stp(char dst[restrict .n+1],
- * const char src[restrict .n],
- * size_t len);
- */
-char *
-ustr2stp(char *restrict dst, const char *restrict src, size_t len)
-{
- memcpy(dst, src, len);
- dst[len] = \(aq\e0\(aq;
-
- return dst + len;
-}
-.EE
-.in
-.SH CAVEATS
-This function doesn't know the size of the destination buffer,
-so it can overrun the buffer if the programmer wasn't careful enough.
-.SH BUGS
-.BR strncat (3)
-has a misleading name;
-it has no relationship with
-.BR strncpy (3).
-.SH EXAMPLES
-The following program creates a string
-from a concatenation of unterminated strings.
-.\" SRC BEGIN (strncpy.c)
-.EX
-#include <stdio.h>
-#include <stdlib.h>
-#include <string.h>
-
-#define nitems(arr) (sizeof((arr)) / sizeof((arr)[0]))
-
-int
-main(void)
-{
- char pre[4] = "pre.";
- char *post = ".post";
- char *src = "some_long_body.post";
- char dest[100];
-
- dest[0] = \(aq\e0\(aq;
- strncat(dest, pre, nitems(pre));
- strncat(dest, src, strlen(src) \- strlen(post));
-
- puts(dest); // "pre.some_long_body"
- exit(EXIT_SUCCESS);
-}
-.EE
-.\" SRC END
-.in
-.SH SEE ALSO
-.BR memccpy (3),
-.BR memcpy (3),
-.BR mempcpy (3),
-.BR strcpy (3),
-.BR string (3)
+.so man3/strcpy.3
diff --git a/man3/strncpy.3 b/man3/strncpy.3
index e2ffc683f..ff7476a84 100644
--- a/man3/strncpy.3
+++ b/man3/strncpy.3
@@ -1,129 +1 @@
-.\" Copyright (C) 1993 David Metcalfe <david@prism.demon.co.uk>
-.\" Copyright (C) 2022 Alejandro Colomar <alx@kernel.org>
-.\"
-.\" SPDX-License-Identifier: Linux-man-pages-copyleft
-.\"
-.\" References consulted:
-.\" Linux libc source code
-.\" Lewine's _POSIX Programmer's Guide_ (O'Reilly & Associates, 1991)
-.\" 386BSD man pages
-.\" Modified Sat Jul 24 18:06:49 1993 by Rik Faith (faith@cs.unc.edu)
-.\" Modified Fri Aug 25 23:17:51 1995 by Andries Brouwer (aeb@cwi.nl)
-.\" Modified Wed Dec 18 00:47:18 1996 by Andries Brouwer (aeb@cwi.nl)
-.\" 2007-06-15, Marc Boyer <marc.boyer@enseeiht.fr> + mtk
-.\" Improve discussion of strncpy().
-.\"
-.TH strncpy 3 (date) "Linux man-pages (unreleased)"
-.SH NAME
-strncpy \- copy a string into a fixed-length buffer and zero the rest of it
-.SH LIBRARY
-Standard C library
-.RI ( libc ", " \-lc )
-.SH SYNOPSIS
-.nf
-.B #include <string.h>
-.PP
-.BI "[[deprecated]] char *strncpy(char " dest "[restrict ." n ],
-.BI " const char " src "[restrict ." n "], \
-size_t " n );
-.fi
-.SH DESCRIPTION
-.BI Note: " This is not the function you want to use."
-For string copying with truncation, see
-.BR strlcpy (3bsd).
-For copying a string into a fixed-length buffer with zeroing of the rest,
-see
-.BR stpncpy (3).
-.PP
-.BR strncpy ()
-copies at most
-.I n
-bytes of
-.IR src ,
-and fills the rest of the
-.I dest
-buffer with null bytes.
-.BR Warning :
-If there is no null byte
-among the first
-.I n
-bytes of
-.IR src ,
-the string placed in
-.I dest
-will not be null-terminated.
-.PP
-A simple implementation of
-.BR strncpy ()
-might be:
-.PP
-.in +4n
-.EX
-char *
-strncpy(char *dest, const char *src, size_t n)
-{
- bzero(dest, n);
- memccpy(dest, src, \(aq\e0\(aq, n);
-
- return dest;
-}
-.EE
-.in
-.PP
-The use of
-.BR strncpy ()
-is to copy a C string to a fixed-length buffer
-while ensuring that unused bytes in the destination buffer are zeroed out
-(perhaps to prevent information leaks if the buffer is to be
-written to media or transmitted to another process via an
-interprocess communication technique).
-But
-.BR stpncpy (3)
-is better for this purpose,
-since it detects truncation.
-See BUGS below.
-.SH RETURN VALUE
-The
-.BR strncpy ()
-function returns a pointer to
-the destination buffer
-.IR dest .
-.SH ATTRIBUTES
-For an explanation of the terms used in this section, see
-.BR attributes (7).
-.ad l
-.nh
-.TS
-allbox;
-lbx lb lb
-l l l.
-Interface Attribute Value
-T{
-.BR strncpy ()
-T} Thread safety MT-Safe
-.TE
-.hy
-.ad
-.sp 1
-.SH STANDARDS
-POSIX.1-2001, POSIX.1-2008, C89, C99, SVr4, 4.3BSD.
-.SH BUGS
-.BR strncpy ()
-has a misleading name.
-It doesn't produce a (null-terminated) string;
-and it should never be used for producing a string.
-.PP
-It can't detect truncation.
-It's probably better to explicitly call
-.BR bzero (3)
-and
-.BR memccpy (3),
-or
-.BR stpncpy (3)
-since they allow detecting truncation.
-.SH SEE ALSO
-.BR bzero (3),
-.BR memccpy (3),
-.BR stpncpy (3),
-.BR string (3),
-.BR wcsncpy (3)
+.so man3/strcpy.3
--
2.38.1
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v2 3/3] stpecpy.3, stpecpyx.3, strlcat.3, strlcpy.3, strscpy.3: Add new links to strcpy(3)
2022-12-12 14:24 ` [PATCH 1/3] strcpy.3: Rewrite page to document all string-copying functions Alejandro Colomar
` (3 preceding siblings ...)
2022-12-12 23:00 ` [PATCH v2 2/3] stpcpy.3, stpncpy.3, strcat.3, strncat.3, strncpy.3: Transform the old pages into links to strcpy(3) Alejandro Colomar
@ 2022-12-12 23:00 ` Alejandro Colomar
4 siblings, 0 replies; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-12 23:00 UTC (permalink / raw)
To: linux-man; +Cc: Martin Sebor, Alejandro Colomar
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
man3/stpecpy.3 | 1 +
man3/stpecpyx.3 | 1 +
man3/strlcat.3 | 1 +
man3/strlcpy.3 | 1 +
man3/strscpy.3 | 1 +
5 files changed, 5 insertions(+)
create mode 100644 man3/stpecpy.3
create mode 100644 man3/stpecpyx.3
create mode 100644 man3/strlcat.3
create mode 100644 man3/strlcpy.3
create mode 100644 man3/strscpy.3
diff --git a/man3/stpecpy.3 b/man3/stpecpy.3
new file mode 100644
index 000000000..ff7476a84
--- /dev/null
+++ b/man3/stpecpy.3
@@ -0,0 +1 @@
+.so man3/strcpy.3
diff --git a/man3/stpecpyx.3 b/man3/stpecpyx.3
new file mode 100644
index 000000000..ff7476a84
--- /dev/null
+++ b/man3/stpecpyx.3
@@ -0,0 +1 @@
+.so man3/strcpy.3
diff --git a/man3/strlcat.3 b/man3/strlcat.3
new file mode 100644
index 000000000..ff7476a84
--- /dev/null
+++ b/man3/strlcat.3
@@ -0,0 +1 @@
+.so man3/strcpy.3
diff --git a/man3/strlcpy.3 b/man3/strlcpy.3
new file mode 100644
index 000000000..ff7476a84
--- /dev/null
+++ b/man3/strlcpy.3
@@ -0,0 +1 @@
+.so man3/strcpy.3
diff --git a/man3/strscpy.3 b/man3/strscpy.3
new file mode 100644
index 000000000..ff7476a84
--- /dev/null
+++ b/man3/strscpy.3
@@ -0,0 +1 @@
+.so man3/strcpy.3
--
2.38.1
* a Q quotation macro for man(7) (was: groff man(7) extensions)
2022-12-12 18:38 ` groff man(7) extensions (was: [PATCH 1/3] strcpy.3: Rewrite page to document all string-copying functions) G. Branden Robinson
@ 2022-12-13 15:45 ` G. Branden Robinson
0 siblings, 0 replies; 53+ messages in thread
From: G. Branden Robinson @ 2022-12-13 15:45 UTC (permalink / raw)
To: Alejandro Colomar; +Cc: groff, linux-man
[-- Attachment #1.1: Type: text/plain, Size: 2256 bytes --]
[self-reply]
At 2022-12-12T12:38:42-0600, G. Branden Robinson wrote:
> Here's a list of man(7) extensions to which I have given consideration.
>
> KS/KE Keeps. Easy.[3] Harmlessly ignorable by other
> implementations.
> LS/LE List enclosure. Throws a semantic hint (e.g., for HTML
> output) and eliminates final use case of `PD` macro.[4]
> DC/TG Semantics at last. Sure to rouse anger in people who
> decided long ago that man(7) can't do this.[5] Having
> looked more closely at mdoc(7) since writing that, I
> think `DC` should accept a _pair_ of arguments as its
> second and third parameters for bracketing purposes.
> But again, most man page authors would never need to
> mess with `DC` at all.
There was one more.
Q Quotation macro. It's madness that one doesn't already
exist. Its absence, the imperfect portability of
special character identifiers for various types of
quotation mark, and the bad ergonomics of introducing
*roff strings just to serve this one purpose have made
quotation such a pain point in man(7) writing that
authors have tended not to bother with it and instead abuse
font style changes, putting things that should
simply be quoted into stentorian italics or screaming
bold instead, when these faces are already heavily
burdened by other uses.
I experimentally implemented `Q` at one point but ran into a corner case
I wasn't happy with. Looking back over it now I see that I got it
entangled with an extension to `SY`/`YS` to support arguments to help
the formatter compute tab stops. I'm attaching "clone.man" so you can
have a look.
I've also pondered having private strings (i.e., not for use directly by
man pages) for opening and closing quotation marks that localization
packages can set. This might save Helge Kreutzmann and collaborators
some tedium.
Even with that wrinkle, a `Q` macro would be dead simple.
Here's an an-ext.tmac portable version.
.\" Define opening and closing quotation marks as appropriate to your
.\" language and/or output device.
.ds oq \(lq
.ds cq \(rq
.
.\" Quote first argument with second argument immediately following.
.de Q
\*(oq\\$1\*(cq\\$2
..
Regards,
Branden
[-- Attachment #1.2: clone.man --]
[-- Type: application/x-troff-man, Size: 1751 bytes --]
* Re: [PATCH v2 0/3] Rewrite strcpy(3)
2022-12-12 23:00 ` [PATCH v2 0/3] Rewrite strcpy(3) Alejandro Colomar
@ 2022-12-13 20:56 ` Jakub Wilk
2022-12-13 20:57 ` Alejandro Colomar
2022-12-13 22:05 ` Alejandro Colomar
2022-12-14 0:03 ` [PATCH v3 0/1] Rewritten page for string-copying functions Alejandro Colomar
2022-12-14 0:03 ` [PATCH v3 " Alejandro Colomar
2 siblings, 2 replies; 53+ messages in thread
From: Jakub Wilk @ 2022-12-13 20:56 UTC (permalink / raw)
To: Alejandro Colomar; +Cc: linux-man, Martin Sebor
The sheer size of this page makes it almost unusable for me.
Please don't merge it.
* Alejandro Colomar <alx.manpages@gmail.com>, 2022-12-13 00:00:
> stpecpy(3), stpecpyx(3)
> Not provided by any library.
Then they don't belong in the man-pages project.
> strscpy(3)
> Not provided by any library. It is a Linux kernel internal
> function.
Ditto.
--
Jakub Wilk
* Re: [PATCH v2 0/3] Rewrite strcpy(3)
2022-12-13 20:56 ` Jakub Wilk
@ 2022-12-13 20:57 ` Alejandro Colomar
2022-12-13 22:05 ` Alejandro Colomar
1 sibling, 0 replies; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-13 20:57 UTC (permalink / raw)
To: Jakub Wilk; +Cc: linux-man, Martin Sebor
[-- Attachment #1.1: Type: text/plain, Size: 709 bytes --]
Hi Jakub,
On 12/13/22 21:56, Jakub Wilk wrote:
> The sheer size of this page makes it almost unusable for me.
> Please don't merge it.
Plan B is a separate string_copy(7) page, keeping the other pages minimal.
Would that suit you?
Thanks,
Alex
>
> * Alejandro Colomar <alx.manpages@gmail.com>, 2022-12-13 00:00:
>> stpecpy(3), stpecpyx(3)
>> Not provided by any library.
>
> Then they don't belong in the man-pages project.
>
>> strscpy(3)
>> Not provided by any library. It is a Linux kernel internal
>> function.
>
> Ditto.
>
--
<http://www.alejandro-colomar.es/>
* Re: [PATCH v2 0/3] Rewrite strcpy(3)
2022-12-13 20:56 ` Jakub Wilk
2022-12-13 20:57 ` Alejandro Colomar
@ 2022-12-13 22:05 ` Alejandro Colomar
2022-12-13 22:46 ` Alejandro Colomar
1 sibling, 1 reply; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-13 22:05 UTC (permalink / raw)
To: Jakub Wilk; +Cc: linux-man, Martin Sebor
[-- Attachment #1.1: Type: text/plain, Size: 1817 bytes --]
Hi Jakub,
On 12/13/22 21:56, Jakub Wilk wrote:
> The sheer size of this page makes it almost unusable for me.
> Please don't merge it.
>
> * Alejandro Colomar <alx.manpages@gmail.com>, 2022-12-13 00:00:
>> stpecpy(3), stpecpyx(3)
>> Not provided by any library.
>
> Then they don't belong in the man-pages project.
>
>> strscpy(3)
>> Not provided by any library. It is a Linux kernel internal
>> function.
>
> Ditto.
And strictly speaking, I shouldn't document strlcpy(3bsd) and strlcat(3bsd)
either because they're not provided by our libc; libbsd already has manual pages
for them, anyway.
Regarding this, the intention of the page is not to coldly document the behavior
of functions in terms of the byte operations they perform. That's what has been
done until now, and the result is what we know: many string copy functions are
dreaded (e.g., strncpy(3)), because most programmers don't use them correctly.
This new page, instead, shows all string copying functions, including those
developed by other systems as alternatives to the standard ones. They did it
for a reason: the standard functions don't cover all use cases, and there's a
need to roll your own. But rolling your own is bad. It's better if someone
explains what alternative string copy functions exist, when they are more
appropriate than libc ones, and when they are not. Even the old pages
documented strlcpy(3) a little bit!
For a first release, I suggest putting the new page in string_copy(7). I'll still
rewrite strcpy(3) and all the others to be minimal, _and_ to be reductions of
string_copy(7), for fast lookup.
Cheers,
Alex
--
<http://www.alejandro-colomar.es/>
* Re: [PATCH v2 0/3] Rewrite strcpy(3)
2022-12-13 22:05 ` Alejandro Colomar
@ 2022-12-13 22:46 ` Alejandro Colomar
0 siblings, 0 replies; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-13 22:46 UTC (permalink / raw)
To: Jakub Wilk; +Cc: linux-man, Martin Sebor
[-- Attachment #1.1: Type: text/plain, Size: 3445 bytes --]
On 12/13/22 23:05, Alejandro Colomar wrote:
> Hi Jakub,
>
> On 12/13/22 21:56, Jakub Wilk wrote:
>> The sheer size of this page makes it almost unusable for me.
Moreover, I'd like to ask, what's your use case for these (string copy) pages?
And how am I impeding it?
- stpcpy(3)
- strcpy(3)
- strcat(3)
- stpncpy(3)
- strncpy(3)
- strncat(3)
Except for the last one, they are so simple in terms of the byte operation that
they perform, or the return value, that the pages are useless. Once you know
what they do, you don't forget (and I bet you know what they do). And even
strncat(3) is simple, when you understand it.
The return value is simple: 'r' functions return dst. 'p' functions return a
pointer past the last non-null character written.
The operation of 'cat' functions is simple: find the end with strlen(dst), and append a string there.
The operation of 'st.cpy' functions is even simpler: read a string, and copy it
at dst.
The operation of st.ncpy(3) is slightly less intuitive (probably due to
misdesign, the name doesn't match what they do): read a string, and copy it with
truncation into a null-padded character sequence in a fixed-width array.
strncat(3) is the most misdesigned of all: it reads a character sequence from a
null-padded fixed-width array, and creates a string out of it.
That covers it all. If I were to put those paragraphs in a separate page for
each function, what good would they do?
So, the pages are not very informative for those who already know. And for
those who don't know, I very much prefer that they read the entire page.
Cheers,
Alex
>> Please don't merge it.
>>
>> * Alejandro Colomar <alx.manpages@gmail.com>, 2022-12-13 00:00:
>>> stpecpy(3), stpecpyx(3)
>>> Not provided by any library.
>>
>> Then they don't belong in the man-pages project.
>>
>>> strscpy(3)
>>> Not provided by any library. It is a Linux kernel internal
>>> function.
>>
>> Ditto.
>
> And strictly speaking, I shouldn't document strlcpy(3bsd) and strlcat(3bsd)
> either because they're not provided by our libc; libbsd already has manual pages
> for them, anyway.
>
> Regarding this, the intention of the page is not to coldly document the behavior
> of functions in terms of the byte operations they perform. That's what has been
> done until now, and the result is what we know: many string copy functions are
> dreaded (e.g., strncpy(3)), because most programmers don't use them correctly.
>
> This new page instead, shows all string copying functions, including those
> developed by other systems as alternatives to the standard ones. They did it
> for a reason: the standard functions don't cover all use cases, and there's a
> need to roll your own. But rolling your own is bad. It's better if someone
> explains what alternative string copy functions exist, when they are more
> appropriate than libc ones, and when they are not. Even the old pages
> documented strlcpy(3) a little bit!
>
> I suggest for a first release using the new page string_copy(7). I'll rewrite
> anyway strcpy(3) and all others to be minimal, _and_ be reductions of
> string_copy(7), for fast lookup.
>
> Cheers,
>
> Alex
>
>
>
--
<http://www.alejandro-colomar.es/>
* [PATCH v3 0/1] Rewritten page for string-copying functions
2022-12-12 23:00 ` [PATCH v2 0/3] Rewrite strcpy(3) Alejandro Colomar
2022-12-13 20:56 ` Jakub Wilk
@ 2022-12-14 0:03 ` Alejandro Colomar
2022-12-14 0:14 ` Alejandro Colomar
` (2 more replies)
2022-12-14 0:03 ` [PATCH v3 " Alejandro Colomar
2 siblings, 3 replies; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-14 0:03 UTC (permalink / raw)
To: linux-man, Martin Sebor, G. Branden Robinson, Douglas McIlroy,
Jakub Wilk
Cc: Alejandro Colomar
Hi!
I've written a new manual page documenting the string-copying functions
so that the purpose of each of them is clear. It may differ
from the original design of the functions, since my guess for several of
them is simply that they were misdesigned. However, after investigating
the operation that they perform on bytes, I've come up with a story that
can make sense of functions that were once believed to be broken by
many. In fact, my conclusion after writing the page is that only one
function is really useless:
- strncpy(3): stpncpy(3) is _always_ better.
The others depend on the program. If you don't care at all about
performance and Shlemiel is a friend of yours, then rcpy and [rn]cat
are your friends. If you don't like Shlemiel, and don't mind slightly
more complex code, you'll go for 'p' functions.
And so on. I won't spoil the page more.
Basically, I want to end the situation where a function like
strncpy(3) is dreaded by some because it looks broken (I thought so
myself for a long time), while others who don't really know it misuse
it for purposes it doesn't suit, which is even worse. Or where
programmers think that strncpy(3) and strncat(3) have any relationship
at all (they don't).
Below goes the formatted page. Please review independently of it being
in strcpy(3) or string_copy(7), and address that as a separate issue
(but of course feel free to cover it, and any other issues).
Cheers,
Alex
Alejandro Colomar (1):
strcpy.3: Rewrite page to document all string-copying functions
man3/strcpy.3 | 1058 +++++++++++++++++++++++++++++++++++++++++++++----
1 file changed, 970 insertions(+), 88 deletions(-)
strcpy(3) Library Functions Manual strcpy(3)
NAME
stpcpy, strcpy, strcat, stpecpy, stpecpyx, strlcpy, strlcat, strscpy,
stpncpy, strncpy, ustr2stp, strncat, mempcpy - copy strings and charac‐
ter sequences
LIBRARY
stpcpy(3)
strcpy(3), strcat(3)
stpncpy(3)
strncpy(3)
strncat(3)
mempcpy(3)
Standard C library (libc, -lc)
stpecpy(3), stpecpyx(3)
Not provided by any library.
strlcpy(3), strlcat(3)
Utility functions from BSD systems (libbsd, -lbsd)
strscpy(3)
Not provided by any library. It is a Linux kernel internal
function.
SYNOPSIS
#include <string.h>
Strings
// Chain‐copy a string.
char *stpcpy(char *restrict dst, const char *restrict src);
// Copy/concatenate a string.
char *strcpy(char *restrict dst, const char *restrict src);
char *strcat(char *restrict dst, const char *restrict src);
// Chain‐copy a string with truncation.
char *stpecpy(char *dst, char past_end[0], const char *restrict src);
// Chain‐copy a string with truncation and SIGSEGV on UB.
char *stpecpyx(char *dst, char past_end[0], const char *restrict src);
// Copy/concatenate a string with truncation and SIGSEGV on UB.
size_t strlcpy(char dst[restrict .sz], const char *restrict src,
size_t sz);
size_t strlcat(char dst[restrict .sz], const char *restrict src,
size_t sz);
// Copy a string with truncation.
ssize_t strscpy(char dst[restrict .sz], const char src[restrict .sz],
size_t sz);
Null‐padded character sequences
// Zero a fixed‐width buffer, and
// copy a string with truncation into a character sequence.
char *stpncpy(char dst[restrict .sz], const char *restrict src,
size_t sz);
// Zero a fixed‐width buffer, and
// copy a string with truncation into a character sequence.
char *strncpy(char dst[restrict .sz], const char *restrict src,
size_t sz);
// Chain‐copy a null‐padded character sequence into a string.
char *ustr2stp(char *restrict dst, const char src[restrict .sz],
size_t sz);
// Concatenate a null‐padded character sequence into a string.
char *strncat(char *restrict dst, const char src[restrict .sz],
size_t sz);
Measured character sequences
// Chain‐copy a measured character sequence.
void *mempcpy(void *restrict dst, const void src[restrict .len],
size_t len);
Feature Test Macro Requirements for glibc (see feature_test_macros(7)):
stpcpy(3), stpncpy(3):
Since glibc 2.10:
_POSIX_C_SOURCE >= 200809L
Before glibc 2.10:
_GNU_SOURCE
mempcpy(3):
_GNU_SOURCE
DESCRIPTION
Terms (and abbreviations)
string (str)
is a sequence of zero or more non‐null characters followed by a
null byte.
character sequence (ustr)
is a sequence of zero or more non‐null characters. A program
should never use a character sequence where a string is required.
However, with appropriate care, a string can be used in
the place of a character sequence.
null‐padded character sequence
Character sequences can be contained in fixed‐width
buffers, which contain padding null bytes after the char‐
acter sequence, to fill the rest of the buffer without
affecting the character sequence; however, those padding
null bytes are not part of the character sequence.
measured character sequence
Character sequence delimited by its length.
length (len)
is the number of non‐null characters in a string or character
sequence. It is the return value of strlen(str) and of
strnlen(ustr, sz).
size (sz)
refers to the entire buffer where the string or character se‐
quence is contained.
end is the name of a pointer to the terminating null byte of a
string, or a pointer to one past the last character of a charac‐
ter sequence. This is the return value of functions that allow
chaining. It is equivalent to &str[len].
past_end
is the name of a pointer to one past the end of the buffer that
contains a string or character sequence. It is equivalent to
&str[sz]. It is used as a sentinel value, to be able to trun‐
cate strings or character sequences instead of overrunning the
containing buffer.
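The relationship between these terms can be sketched in code. The helper names str_end() and buf_past_end() below are hypothetical, chosen only to mirror the terms above; they are not part of any library.

```c
#include <stddef.h>
#include <string.h>

/* end: pointer to the terminating null byte, that is, &str[len]. */
static char *
str_end(char *str)
{
    return str + strlen(str);
}

/* past_end: pointer to one past the last byte of the buffer, that is, &buf[sz]. */
static char *
buf_past_end(char *buf, size_t sz)
{
    return buf + sz;
}
```

For a buffer declared as char buf[16] = "hello", len is 5, sz is 16, end is &buf[5], and past_end is &buf[16]. The difference past_end - end (11 here) counts the bytes from the terminating null byte to the end of the buffer, including the byte that holds the null.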
Copy, concatenate, and chain‐copy
Originally, there was a distinction between functions that copy and
those that concatenate. However, newer functions that copy while al‐
lowing chaining cover both use cases with a single API. They are also
algorithmically faster, since they don’t need to search for the end of
the existing string. However, functions that concatenate have a much
simpler use, so if performance is not important, it can make sense to
use them for improving readability.
To chain copy functions, they need to return a pointer to the end.
That’s a byproduct of the copy operation, so it has no performance
costs. Functions that return such a pointer, and thus can be chained,
have names of the form *stp*() or *memp*(), since it’s also common to
name the pointer just p.
Chain‐copying functions that truncate should accept a pointer to one
past the end of the destination buffer, and have names of the form
*stpe*(). This allows not having to recalculate the remaining size af‐
ter each call.
Truncate or not?
The first thing to note is that programmers should be careful with
buffers, so they always have the correct size, and truncation is not
necessary.
In most cases, truncation is not desired, and it is simpler to just do
the copy. Simpler code is safer code. Programming against programming
mistakes by adding more code just adds more points where mistakes can
be made.
Nowadays, compilers can detect most programmer errors with features
like compiler warnings, static analyzers, and _FORTIFY_SOURCE (see
ftm(7)). Keeping the code simple helps these overflow‐detection fea‐
tures be more precise.
When validating user input, however, it makes sense to truncate. Re‐
member to check the return value of such function calls.
Functions that truncate:
• stpecpy(3) is the most efficient string copy function that performs
truncation. Truncation only needs to be checked once, after all
chained calls.
• stpecpyx(3) is a variant of stpecpy(3) that consumes the entire
source string, to catch bugs in the program by forcing a segmenta‐
tion fault (as strlcpy(3bsd) and strlcat(3bsd) do).
• strlcpy(3bsd) and strlcat(3bsd) are designed to crash if the input
string is invalid (doesn’t contain a terminating null byte).
• strscpy(3) reports an error instead of crashing (similar to
stpecpy(3)).
• stpncpy(3) and strncpy(3) also truncate, but they don’t write
strings, but rather null‐padded character sequences.
Null‐padded character sequences
For historic reasons, some standard APIs, such as utmpx(5), use null‐
padded character sequences in fixed‐width buffers. To interface with
them, specialized functions need to be used.
To copy strings into them, use stpncpy(3).
To copy from an unterminated string within a fixed‐width buffer into a
string, ignoring any trailing null bytes in the source fixed‐width
buffer, you should use ustr2stp(3) or strncat(3).
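A minimal sketch of that round trip follows. It assumes a hypothetical
struct rec with an 8-byte name field (in the style of the fixed-width
fields of utmpx(5)), and it uses strncpy(3) and strncat(3) only because
they are ISO C; stpncpy(3) and ustr2stp(3) would be used the same way.

```c
#include <string.h>

/* Hypothetical record with a fixed-width, null-padded field. */
struct rec {
    char  name[8];      /* Character sequence; may lack a '\0'. */
};

/* Store a string into the field, truncating and null-padding it. */
void
rec_set_name(struct rec *r, const char *name)
{
    strncpy(r->name, name, sizeof(r->name));
}

/* Read the field back as a string.
   dst must have room for sizeof(r->name) + 1 bytes. */
void
rec_get_name(char *dst, const struct rec *r)
{
    dst[0] = '\0';      /* strncat(3) needs an existing string in dst. */
    strncat(dst, r->name, sizeof(r->name));
}
```

Note that the field itself is never treated as a string: it is only
written with a size-limited copy and only read with a size-limited
concatenation.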
Measured character sequences
The simplest character sequence copying function is mempcpy(3). It re‐
quires always knowing the length of your character sequences, for which
structures can be used. It makes the code much faster, since you al‐
ways know the length of your character sequences, and can do the mini‐
mal copies and length measurements. mempcpy(3) copies character se‐
quences, so you need to explicitly set the terminating null byte if you
need a string.
The following code can be used to chain‐copy from a measured character
sequence into a string:
p = mempcpy(p, foo->ustr, foo->len);
*p = '\0';
The following code can be used to chain‐copy from a measured character
sequence into an unterminated string:
p = mempcpy(p, bar->ustr, bar->len);
In programs that make considerable use of strings or character se‐
quences, and need the best performance, using overlapping character se‐
quences can make a big difference. It allows holding subsequences of a
larger character sequence, while neither duplicating memory nor spending
time on a copy.
However, this is delicate, since it requires using character sequences.
C library APIs use strings, so programs that use character sequences
will have to take care of differentiating strings from character se‐
quences.
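One possible shape for such a structure is sketched below. The type
struct useq and the functions useq_sub() and useq_copy() are
hypothetical names chosen for illustration; the copy helper follows the
same contract as mempcpy(3) but is written with memcpy(3) so that it
needs no feature test macro.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical measured character sequence: pointer plus length.
   It does not own the bytes and has no terminating null byte. */
struct useq {
    const char  *ustr;
    size_t      len;
};

/* Return the subsequence [start, start + len) of s without copying.
   Out-of-range requests are clamped to the bounds of s. */
struct useq
useq_sub(struct useq s, size_t start, size_t len)
{
    struct useq  sub;

    if (start > s.len)
        start = s.len;
    if (len > s.len - start)
        len = s.len - start;
    sub.ustr = s.ustr + start;      /* Overlaps s; nothing is duplicated. */
    sub.len = len;
    return sub;
}

/* Chain-copy a measured character sequence; returns one past the copy. */
char *
useq_copy(char *dst, struct useq s)
{
    memcpy(dst, s.ustr, s.len);
    return dst + s.len;
}
```

A caller that needs a string at the end of the chain still has to write
the terminating null byte itself, as in the mempcpy(3) snippet above.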
String vs character sequence
Some functions only operate on strings. Those require that the input
src is a string, and guarantee an output string (even when truncation
occurs). Functions that concatenate also require that dst holds a
string before the call. List of functions:
• stpcpy(3)
• strcpy(3), strcat(3)
• stpecpy(3), stpecpyx(3)
• strlcpy(3bsd), strlcat(3bsd)
• strscpy(3)
Other functions require an input string, but create a character se‐
quence as output. These functions have confusing names, and have a
long history of misuse. List of functions:
• stpncpy(3)
• strncpy(3)
Other functions operate on an input character sequence, and create an
output string. Functions that concatenate also require that dst holds
a string before the call. strncat(3) has an even more misleading name
than the functions above. List of functions:
• ustr2stp(3)
• strncat(3)
The last function operates on an input character sequence to create an
output character sequence. But because it asks for the length, and a
string is by nature composed of a character sequence of the same length
plus a terminating null byte, a string is also accepted as input.
Function:
• mempcpy(3)
Functions
stpcpy(3)
This function copies the input string into a destination string.
The programmer is responsible for allocating a buffer large
enough. It returns a pointer suitable for chaining.
An implementation of this function might be:
char *
stpcpy(char *restrict dst, const char *restrict src)
{
char *end;
end = mempcpy(dst, src, strlen(src));
*end = '\0'; /* mempcpy(3) does not copy the terminating null byte */
return end;
}
strcpy(3)
strcat(3)
These functions copy the input string into a destination string.
The programmer is responsible for allocating a buffer large
enough. The return value is useless.
stpcpy(3) is a faster alternative to these functions.
An implementation of these functions might be:
char *
strcpy(char *restrict dst, const char *restrict src)
{
stpcpy(dst, src);
return dst;
}
char *
strcat(char *restrict dst, const char *restrict src)
{
stpcpy(dst + strlen(dst), src);
return dst;
}
stpecpy(3)
stpecpyx(3)
These functions copy the input string into a destination string.
If the destination buffer, limited by a pointer to one past the
end of it, isn’t large enough to hold the copy, the resulting
string is truncated (but it is guaranteed to be null‐termi‐
nated). They return a pointer suitable for chaining. Trunca‐
tion needs to be detected only once after the last chained call.
stpecpyx(3) has identical semantics to stpecpy(3), except that
it forces a SIGSEGV if the src pointer is not a string.
These functions are not provided by any library, but you can de‐
fine them with the following reference implementations:
/* This code is in the public domain. */
char *
stpecpy(char *dst, char past_end[0],
const char *restrict src)
{
char *p;
if (dst == past_end)
return past_end;
p = memccpy(dst, src, '\0', past_end - dst);
if (p != NULL)
return p - 1;
/* truncation detected */
past_end[-1] = '\0';
return past_end;
}
/* This code is in the public domain. */
char *
stpecpyx(char *dst, char past_end[0],
const char *restrict src)
{
if (src[strlen(src)] != '\0')
raise(SIGSEGV);
return stpecpy(dst, past_end, src);
}
strlcpy(3bsd)
strlcat(3bsd)
These functions copy the input string into a destination string.
If the destination buffer, limited by its size, isn’t large
enough to hold the copy, the resulting string is truncated (but
it is guaranteed to be null‐terminated). They return the length
of the total string they tried to create. These functions force
a SIGSEGV if the src pointer is not a string.
stpecpyx(3) is a faster alternative to these functions.
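For readers without libbsd at hand, the behavior described above can be
approximated as follows. This is only a sketch of the documented
semantics under the hypothetical name strlcpy_sketch(); it is not the
OpenBSD or libbsd source.

```c
#include <stddef.h>
#include <string.h>

/* Sketch of the strlcpy(3bsd) semantics described in this page. */
size_t
strlcpy_sketch(char *dst, const char *src, size_t sz)
{
    size_t  len, n;

    len = strlen(src);      /* Reads all of src, even if it won't fit;
                               this is what faults on a non-string. */
    if (sz != 0) {
        n = (len < sz) ? len : sz - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;             /* A value >= sz means truncation occurred. */
}
```

The unconditional strlen(3) over src is also why these functions are
slower than stpecpy(3) when the source is much longer than the
destination.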
strscpy(3)
This function copies the input string into a destination string.
If the destination buffer, limited by its size, isn’t large
enough to hold the copy, the resulting string is truncated (but
it is guaranteed to be null‐terminated). It returns the length
of the destination string, or -E2BIG on truncation.
stpecpy(3) is a simpler and faster alternative to this function.
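Since strscpy(3) is not available in user space, a rough equivalent of
the semantics described above might look like the following. The name
strscpy_sketch() is hypothetical and the code is not taken from the
Linux kernel; it assumes that a zero-sized buffer is also reported as
-E2BIG.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Sketch of the strscpy(3) semantics described in this page. */
ssize_t
strscpy_sketch(char *dst, const char *src, size_t sz)
{
    const char  *nul;
    size_t      len;

    if (sz == 0)
        return -E2BIG;              /* No room even for the '\0'. */

    nul = memchr(src, '\0', sz);    /* Never reads past sz bytes of src. */
    if (nul == NULL) {
        memcpy(dst, src, sz - 1);
        dst[sz - 1] = '\0';
        return -E2BIG;              /* Truncated, but null-terminated. */
    }
    len = nul - src;
    memcpy(dst, src, len + 1);
    return (ssize_t) len;
}
```

Unlike the strlcpy(3bsd) family, this reads at most sz bytes of src, so
an unterminated source is reported as truncation rather than causing a
crash.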
stpncpy(3)
This function copies the input string into a destination null‐
padded character sequence in a fixed‐width buffer. If the des‐
tination buffer, limited by its size, isn’t large enough to hold
the copy, the resulting character sequence is truncated. Since
it creates a character sequence, it doesn’t need to write a ter‐
minating null byte. It returns a pointer suitable for chaining,
but it’s not ideal for that. Truncation needs to be detected
only once after the last chained call.
If you’re going to use this function in chained calls, it would
be useful to develop a similar function that accepts a pointer
to one past the end of the buffer instead of a size.
An implementation of this function might be:
char *
stpncpy(char *restrict dst, const char *restrict src,
size_t sz)
{
char *p;
bzero(dst, sz);
p = memccpy(dst, src, '\0', sz);
if (p == NULL)
return dst + sz;
return p - 1;
}
ustr2stp(3)
This function copies the input character sequence contained in a
null‐padded fixed‐width buffer, into a destination string. The
programmer is responsible for allocating a buffer large enough.
It returns a pointer suitable for chaining.
A truncating version of this function doesn’t exist, since the
size of the original character sequence is always known, so it
wouldn’t be very useful.
This function is not provided by any library, but you can define
it with the following reference implementation:
/* This code is in the public domain. */
char *
ustr2stp(char *restrict dst, const char *restrict src,
size_t sz)
{
char *p;
p = memccpy(dst, src, '\0', sz);
if (p != NULL)
return p - 1; /* memccpy(3) returns one past the copied null byte */
dst[sz] = '\0';
return dst + sz;
}
strncpy(3)
This function is identical to stpncpy(3) except for the useless
return value. Due to the return value, with this function it’s
hard to correctly check for truncation.
stpncpy(3) is a simpler alternative to this function.
An implementation of this function might be:
char *
strncpy(char *restrict dst, const char *restrict src,
size_t sz)
{
stpncpy(dst, src, sz);
return dst;
}
strncat(3)
Do not confuse this function with strncpy(3); they are not re‐
lated at all.
This function concatenates the input character sequence con‐
tained in a null‐padded fixed‐width buffer, into a destination
string. The programmer is responsible for allocating a buffer
large enough. The return value is useless.
ustr2stp(3) is a faster alternative to this function.
An implementation of this function might be:
char *
strncat(char *restrict dst, const char *restrict src,
size_t sz)
{
ustr2stp(dst + strlen(dst), src, sz);
return dst;
}
mempcpy(3)
This function copies the input character sequence, limited by
its length, into a destination character sequence. The program‐
mer is responsible for allocating a buffer large enough. It re‐
turns a pointer suitable for chaining.
An implementation of this function might be:
void *
mempcpy(void *restrict dst, const void *restrict src,
size_t len)
{
return (char *) memcpy(dst, src, len) + len;
}
RETURN VALUE
The following functions return a pointer to the terminating null byte
in the destination string.
• stpcpy(3)
• ustr2stp(3)
The following functions return a pointer to the terminating null byte
in the destination string, except when truncation occurs; if truncation
occurs, they return a pointer to one past the end of the destination
buffer (past_end).
• stpecpy(3), stpecpyx(3)
The following function returns a pointer to one after the last charac‐
ter in the destination character sequence; if truncation occurs, that
pointer is equivalent to a pointer to one past the end of the destina‐
tion buffer.
• stpncpy(3)
The following function returns a pointer to one after the last charac‐
ter in the destination character sequence.
• mempcpy(3)
The following functions return the length of the total string that they
tried to create (as if truncation didn’t occur).
• strlcpy(3bsd), strlcat(3bsd)
The following function returns the length of the destination string, or
-E2BIG on truncation.
• strscpy(3)
The following functions return the dst pointer, which is useless.
• strcpy(3), strcat(3)
• strncpy(3)
• strncat(3)
ATTRIBUTES
For an explanation of the terms used in this section, see attrib‐
utes(7).
┌────────────────────────────────────────────┬───────────────┬─────────┐
│Interface │ Attribute │ Value │
├────────────────────────────────────────────┼───────────────┼─────────┤
│stpcpy(), strcpy(), strcat(), stpecpy(), │ Thread safety │ MT‐Safe │
│stpecpyx(), strlcpy(), strlcat(), strscpy(),│ │ │
│stpncpy(), strncpy(), ustr2stp(), │ │ │
│strncat(), mempcpy() │ │ │
└────────────────────────────────────────────┴───────────────┴─────────┘
STANDARDS
strcpy(3), strcat(3)
strncpy(3)
strncat(3)
POSIX.1‐2001, POSIX.1‐2008, C89, C99, SVr4, 4.3BSD.
stpcpy(3)
stpncpy(3)
POSIX.1‐2008.
strlcpy(3bsd), strlcat(3bsd)
Functions originated in OpenBSD and present in some Unix sys‐
tems.
mempcpy(3)
This function is a GNU extension.
strscpy(3)
Linux kernel internal function.
stpecpy(3), stpecpyx(3)
ustr2stp(3)
Not defined by any standard, nor provided by any library.
CAVEATS
Don’t mix chain calls to truncating and non‐truncating functions. It
is conceptually wrong unless you know that the first part of a copy
will always fit. Anyway, the performance difference will probably be
negligible, so the code will probably be clearer if you use consistent
semantics: either truncating or non‐truncating. Calling a non‐truncating
function after a truncating one is necessarily wrong.
Some of the functions described here are not provided by any library;
you should write your own copy if you want to use them. See STANDARDS.
BUGS
All concatenation (*cat()) functions share the same performance prob‐
lem: Shlemiel the painter ⟨https://www.joelonsoftware.com/2001/12/11/
back-to-basics/⟩.
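To make the cost concrete, the sketch below builds the same string both
ways and counts how many bytes the concatenating version has to rescan.
The function names are hypothetical, and the chained version uses
memcpy(3) plus pointer arithmetic in place of stpcpy(3) only to stay
within ISO C.

```c
#include <stddef.h>
#include <string.h>

/* Append word n times with strcat(3).  Returns how many bytes of dst
   were rescanned in total to find its end before each append. */
size_t
build_with_strcat(char *dst, const char *word, size_t n)
{
    size_t  rescanned = 0;

    dst[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        rescanned += strlen(dst);   /* Same scan strcat(3) repeats inside. */
        strcat(dst, word);
    }
    return rescanned;
}

/* Build the same string by chaining.  The end pointer is carried along,
   so nothing is rescanned.  Returns the length of the result. */
size_t
build_with_chain(char *dst, const char *word, size_t n)
{
    size_t  wlen = strlen(word);
    char    *p = dst;

    for (size_t i = 0; i < n; i++) {
        memcpy(p, word, wlen);
        p += wlen;
    }
    *p = '\0';
    return p - dst;
}
```

The rescanned total grows quadratically with n, while the chained
version does work proportional to the final length.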
EXAMPLES
The following are examples of correct use of each of these functions.
stpcpy(3)
p = buf;
p = stpcpy(p, "Hello ");
p = stpcpy(p, "world");
p = stpcpy(p, "!");
len = p - buf;
puts(buf);
strcpy(3)
strcat(3)
strcpy(buf, "Hello ");
strcat(buf, "world");
strcat(buf, "!");
len = strlen(buf);
puts(buf);
stpecpy(3)
stpecpyx(3)
past_end = buf + sizeof(buf);
p = buf;
p = stpecpy(p, past_end, "Hello ");
p = stpecpy(p, past_end, "world");
p = stpecpy(p, past_end, "!");
if (p == past_end) {
p--;
goto toolong;
}
len = p - buf;
puts(buf);
strlcpy(3bsd)
strlcat(3bsd)
if (strlcpy(buf, "Hello ", sizeof(buf)) >= sizeof(buf))
goto toolong;
if (strlcat(buf, "world", sizeof(buf)) >= sizeof(buf))
goto toolong;
len = strlcat(buf, "!", sizeof(buf));
if (len >= sizeof(buf))
goto toolong;
puts(buf);
strscpy(3)
len = strscpy(buf, "Hello world!", sizeof(buf));
if (len == -E2BIG)
goto toolong;
puts(buf);
stpncpy(3)
past_end = buf + sizeof(buf);
end = stpncpy(buf, "Hello world!", sizeof(buf));
if (end == past_end)
goto toolong;
len = end - buf;
for (size_t i = 0; i < sizeof(buf); i++)
putchar(buf[i]);
strncpy(3)
strncpy(buf, "Hello world!", sizeof(buf));
if (buf[sizeof(buf) - 1] != '\0')
goto toolong;
len = strnlen(buf, sizeof(buf));
for (size_t i = 0; i < sizeof(buf); i++)
putchar(buf[i]);
ustr2stp(3)
p = buf;
p = ustr2stp(p, "Hello ", 6);
p = ustr2stp(p, "world", 42); // Padding null bytes ignored.
p = ustr2stp(p, "!", 1);
len = p - buf;
puts(buf);
strncat(3)
buf[0] = '\0'; // There’s no ’cpy’ function to this ’cat’.
strncat(buf, "Hello ", 6);
strncat(buf, "world", 42); // Padding null bytes ignored.
strncat(buf, "!", 1);
len = strlen(buf);
puts(buf);
mempcpy(3)
p = buf;
p = mempcpy(p, "Hello ", 6);
p = mempcpy(p, "world", 5);
p = mempcpy(p, "!", 1);
*p = '\0';
len = p - buf;
puts(buf);
SEE ALSO
bzero(3), memcpy(3), memccpy(3), mempcpy(3), string(3)
Linux man‐pages (unreleased) (date) strcpy(3)
--
2.38.1
* [PATCH v3 1/1] strcpy.3: Rewrite page to document all string-copying functions
2022-12-12 23:00 ` [PATCH v2 0/3] Rewrite strcpy(3) Alejandro Colomar
2022-12-13 20:56 ` Jakub Wilk
2022-12-14 0:03 ` [PATCH v3 0/1] Rewritten page for string-copying functions Alejandro Colomar
@ 2022-12-14 0:03 ` Alejandro Colomar
2022-12-14 16:22 ` Douglas McIlroy
2 siblings, 1 reply; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-14 0:03 UTC (permalink / raw)
To: linux-man
Cc: Alejandro Colomar, Martin Sebor, G. Branden Robinson,
Douglas McIlroy, Jakub Wilk
This is an opportunity to use consistent language across the
documentation for all string-copying functions.
It is also easier to show the similarities and differences between all
of the functions, so that a reader can use this page to know which
function is needed for a given task.
Many functions that are inferior to another one have been marked as
deprecated, notwithstanding the deprecation status in C libraries or
any standards. Alternatives have been given in the same page, with
reference implementations.
Cc: Martin Sebor <msebor@redhat.com>
Cc: "G. Branden Robinson" <g.branden.robinson@gmail.com>
Cc: Douglas McIlroy <douglas.mcilroy@dartmouth.edu>
Cc: Jakub Wilk <jwilk@jwilk.net>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
man3/strcpy.3 | 1058 +++++++++++++++++++++++++++++++++++++++++++++----
1 file changed, 970 insertions(+), 88 deletions(-)
diff --git a/man3/strcpy.3 b/man3/strcpy.3
index 74c3180ae..e04a7b149 100644
--- a/man3/strcpy.3
+++ b/man3/strcpy.3
@@ -1,48 +1,767 @@
-.\" Copyright (C) 1993 David Metcalfe (david@prism.demon.co.uk)
+.\" Copyright 2022 Alejandro Colomar <alx@kernel.org>
.\"
-.\" SPDX-License-Identifier: Linux-man-pages-copyleft
-.\"
-.\" References consulted:
-.\" Linux libc source code
-.\" Lewine's _POSIX Programmer's Guide_ (O'Reilly & Associates, 1991)
-.\" 386BSD man pages
-.\" Modified Sat Jul 24 18:06:49 1993 by Rik Faith (faith@cs.unc.edu)
-.\" Modified Fri Aug 25 23:17:51 1995 by Andries Brouwer (aeb@cwi.nl)
-.\" Modified Wed Dec 18 00:47:18 1996 by Andries Brouwer (aeb@cwi.nl)
-.\" 2007-06-15, Marc Boyer <marc.boyer@enseeiht.fr> + mtk
-.\" Improve discussion of strncpy().
+.\" SPDX-License-Identifier: BSD-3-Clause
.\"
.TH strcpy 3 (date) "Linux man-pages (unreleased)"
+.\" ----- NAME :: -----------------------------------------------------/
.SH NAME
-strcpy \- copy a string
+stpcpy,
+strcpy, strcat,
+stpecpy, stpecpyx,
+strlcpy, strlcat,
+strscpy,
+stpncpy,
+strncpy,
+ustr2stp,
+strncat,
+mempcpy
+\- copy strings and character sequences
+.\" ----- LIBRARY :: --------------------------------------------------/
.SH LIBRARY
+.TP
+.BR stpcpy (3)
+.TQ
+.BR strcpy "(3), \c"
+.BR strcat (3)
+.TQ
+.BR stpncpy (3)
+.TQ
+.BR strncpy (3)
+.TQ
+.BR strncat (3)
+.TQ
+.BR mempcpy (3)
Standard C library
.RI ( libc ", " \-lc )
+.TP
+.BR stpecpy "(3), \c"
+.BR stpecpyx (3)
+Not provided by any library.
+.TP
+.BR strlcpy "(3), \c"
+.BR strlcat (3)
+Utility functions from BSD systems
+.RI ( libbsd ", " \-lbsd )
+.TP
+.BR strscpy (3)
+Not provided by any library.
+It is a Linux kernel internal function.
+.\" ----- SYNOPSIS :: -------------------------------------------------/
.SH SYNOPSIS
.nf
.B #include <string.h>
+.fi
+.\" ----- SYNOPSIS :: (Null-terminated) strings -----------------------/
+.SS Strings
+.nf
+// Chain-copy a string.
+.BI "char *stpcpy(char *restrict " dst ", const char *restrict " src );
.PP
-.BI "char *strcpy(char *restrict " dest ", const char *restrict " src );
+// Copy/concatenate a string.
+.BI "char *strcpy(char *restrict " dst ", const char *restrict " src );
+.BI "char *strcat(char *restrict " dst ", const char *restrict " src );
+.PP
+// Chain-copy a string with truncation.
+.BI "char *stpecpy(char *" dst ", char " past_end "[0], \
+const char *restrict " src );
+.PP
+// Chain-copy a string with truncation and SIGSEGV on UB.
+.BI "char *stpecpyx(char *" dst ", char " past_end "[0], \
+const char *restrict " src );
+.PP
+// Copy/concatenate a string with truncation and SIGSEGV on UB.
+.BI "size_t strlcpy(char " dst "[restrict ." sz "], \
+const char *restrict " src ,
+.BI " size_t " sz );
+.BI "size_t strlcat(char " dst "[restrict ." sz "], \
+const char *restrict " src ,
+.BI " size_t " sz );
+.PP
+// Copy a string with truncation.
+.BI "ssize_t strscpy(char " dst "[restrict ." sz "], \
+const char " src "[restrict ." sz ],
+.BI " size_t " sz );
+.fi
+.\" ----- SYNOPSIS :: Null-padded character sequences --------/
+.SS Null-padded character sequences
+.nf
+// Zero a fixed-width buffer, and
+// copy a string with truncation into a character sequence.
+.BI "char *stpncpy(char " dst "[restrict ." sz "], \
+const char *restrict " src ,
+.BI " size_t " sz );
+.PP
+// Zero a fixed-width buffer, and
+// copy a string with truncation into a character sequence.
+.BI "char *strncpy(char " dst "[restrict ." sz "], \
+const char *restrict " src ,
+.BI " size_t " sz );
+.PP
+// Chain-copy a null-padded character sequence into a string.
+.BI "char *ustr2stp(char *restrict " dst ", \
+const char " src "[restrict ." sz ],
+.BI " size_t " sz );
+.PP
+// Concatenate a null-padded character sequence into a string.
+.BI "char *strncat(char *restrict " dst ", const char " src "[restrict ." sz ],
+.BI " size_t " sz );
+.fi
+.\" ----- SYNOPSIS :: Measured character sequences --------------------/
+.SS Measured character sequences
+.nf
+// Chain-copy a measured character sequence.
+.BI "void *mempcpy(void *restrict " dst ", \
+const void " src "[restrict ." len ],
+.BI " size_t " len );
+.fi
+.PP
+.RS -4
+Feature Test Macro Requirements for glibc (see
+.BR feature_test_macros (7)):
+.RE
+.PP
+.BR stpcpy (3),
+.BR stpncpy (3):
+.nf
+ Since glibc 2.10:
+ _POSIX_C_SOURCE >= 200809L
+ Before glibc 2.10:
+ _GNU_SOURCE
+.fi
+.PP
+.BR mempcpy (3):
+.nf
+ _GNU_SOURCE
.fi
.SH DESCRIPTION
-The
-.BR strcpy ()
-function copies the string pointed to by
-.IR src ,
-including the terminating null byte (\(aq\e0\(aq),
-to the buffer pointed to by
-.IR dest .
-The strings may not overlap, and the destination string
-.I dest
-must be large enough to receive the copy.
-.I Beware of buffer overruns!
-(See BUGS.)
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: -----------------/
+.SS Terms (and abbreviations)
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: string (str) ----/
+.TP
+.IR "string " ( str )
+is a sequence of zero or more non-null characters followed by a null byte.
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: null-padded character seq
+.TP
+.IR "character sequence " ( ustr )
+is a sequence of zero or more non-null characters.
+A program should never use a character sequence where a string is required.
+However, with appropriate care,
+a string can be used in the place of a character sequence.
+.RS
+.TP
+.I null-padded character sequence
+Character sequences can be contained in fixed-width buffers,
+which contain padding null bytes after the character sequence,
+to fill the rest of the buffer
+without affecting the character sequence;
+however, those padding null bytes are not part of the character sequence.
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: measured character sequence
+.TP
+.I measured character sequence
+Character sequence delimited by its length.
+.RE
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: length (len) ----/
+.TP
+.IR "length " ( len )
+is the number of non-null characters in a string or character sequence.
+It is the return value of
+.I strlen(str)
+and of
+.IR "strnlen(ustr, sz)" .
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: size (sz) -------/
+.TP
+.IR "size " ( sz )
+refers to the entire buffer
+where the string or character sequence is contained.
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: end -------------/
+.TP
+.I end
+is the name of a pointer to the terminating null byte of a string,
+or a pointer to one past the last character of a character sequence.
+This is the return value of functions that allow chaining.
+It is equivalent to
+.IR &str[len] .
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: past_end --------/
+.TP
+.I past_end
+is the name of a pointer to one past the end of the buffer
+that contains a string or character sequence.
+It is equivalent to
+.IR &str[sz] .
+It is used as a sentinel value,
+to be able to truncate strings or character sequences
+instead of overrunning the containing buffer.
+.\" ----- DESCRIPTION :: Copy, concatenate, and chain-copy ------------/
+.SS Copy, concatenate, and chain-copy
+Originally,
+there was a distinction between functions that copy and those that concatenate.
+However, newer functions that copy while allowing chaining
+cover both use cases with a single API.
+They are also algorithmically faster,
+since they don't need to search for the end of the existing string.
+However, functions that concatenate are simpler to use,
+so if performance is not important,
+it can make sense to use them to improve readability.
+.PP
+For copy functions to be chained,
+they need to return a pointer to the
+.IR end .
+That's a byproduct of the copy operation,
+so it has no performance costs.
+Functions that return such a pointer,
+and thus can be chained,
+have names of the form
+.RB * stp *()
+or
+.RB * memp *(),
+since it's also common to name the pointer just
+.IR p .
+.PP
+Chain-copying functions that truncate
+should accept a pointer to one past the end of the destination buffer,
+and have names of the form
+.RB * stpe *().
+This allows not having to recalculate the remaining size after each call.
+.\" ----- DESCRIPTION :: Truncate or not? -----------------------------/
+.SS Truncate or not?
+The first thing to note is that programmers should be careful with buffers,
+so they always have the correct size,
+and truncation is not necessary.
+.PP
+In most cases,
+truncation is not desired,
+and it is simpler to just do the copy.
+Simpler code is safer code.
+Programming against programming mistakes by adding more code
+just adds more points where mistakes can be made.
+.PP
+Nowadays,
+compilers can detect most programmer errors with features like
+compiler warnings,
+static analyzers, and
+.BR \%_FORTIFY_SOURCE
+(see
+.BR ftm (7)).
+Keeping the code simple
+helps these overflow-detection features be more precise.
+.PP
+When validating user input,
+however,
+it makes sense to truncate.
+Remember to check the return value of such function calls.
+.PP
+Functions that truncate:
+.IP \(bu 3
+.BR stpecpy (3)
+is the most efficient string copy function that performs truncation.
+Truncation only needs to be checked once, after all chained calls.
+.IP \(bu
+.BR stpecpyx (3)
+is a variant of
+.BR stpecpy (3)
+that consumes the entire source string,
+to catch bugs in the program
+by forcing a segmentation fault (as
+.BR strlcpy (3bsd)
+and
+.BR strlcat (3bsd)
+do).
+.IP \(bu
+.BR strlcpy (3bsd)
+and
+.BR strlcat (3bsd)
+are designed to crash if the input string is invalid
+(doesn't contain a terminating null byte).
+.IP \(bu
+.BR strscpy (3)
+reports an error instead of crashing (similar to
+.BR stpecpy (3)).
+.IP \(bu
+.BR stpncpy (3)
+and
+.BR strncpy (3)
+also truncate, but they don't write strings,
+but rather null-padded character sequences.
+.\" ----- DESCRIPTION :: Null-padded character sequences --------------/
+.SS Null-padded character sequences
+For historic reasons,
+some standard APIs,
+such as
+.BR utmpx (5),
+use null-padded character sequences in fixed-width buffers.
+To interface with them,
+specialized functions need to be used.
+.PP
+To copy strings into them, use
+.BR stpncpy (3).
+.PP
+To copy from an unterminated string within a fixed-width buffer into a string,
+ignoring any trailing null bytes in the source fixed-width buffer,
+you should use
+.BR ustr2stp (3)
+or
+.BR strncat (3).
+.\" ----- DESCRIPTION :: Measured character sequences -----------------/
+.SS Measured character sequences
+The simplest character sequence copying function is
+.BR mempcpy (3).
+It requires always knowing the length of your character sequences,
+for which structures can be used.
+It makes the code much faster,
+since you always know the length of your character sequences,
+and can do the minimal copies and length measurements.
+.BR mempcpy (3)
+copies character sequences,
+so you need to explicitly set the terminating null byte if you need a string.
+.PP
+The following code can be used to
+chain-copy from a measured character sequence into a string:
+.PP
+.in +4n
+.EX
+p = mempcpy(p, foo\->ustr, foo\->len);
+*p = \(aq\e0\(aq;
+.EE
+.in
+.PP
+The following code can be used to
+chain-copy from a measured character sequence into an unterminated string:
+.PP
+.in +4n
+.EX
+p = mempcpy(p, bar\->ustr, bar\->len);
+.EE
+.in
+.PP
+In programs that make considerable use of strings or character sequences,
+and need the best performance,
+using overlapping character sequences can make a big difference.
+It allows holding subsequences of a larger character sequence,
+while neither duplicating memory
+nor spending time on a copy.
+.PP
+However, this is delicate,
+since it requires using character sequences.
+C library APIs use strings,
+so programs that use character sequences
+will have to take care of differentiating strings from character sequences.
+.\" ----- DESCRIPTION :: String vs character sequence -----------------/
+.SS String vs character sequence
+Some functions only operate on strings.
+Those require that the input
+.I src
+is a string,
+and guarantee an output string
+(even when truncation occurs).
+Functions that concatenate
+also require that
+.I dst
+holds a string before the call.
+List of functions:
+.IP \(bu 3
+.PD 0
+.BR stpcpy (3)
+.IP \(bu
+.BR strcpy "(3), \c"
+.BR strcat (3)
+.IP \(bu
+.BR stpecpy "(3), \c"
+.BR stpecpyx (3)
+.IP \(bu
+.BR strlcpy "(3bsd), \c"
+.BR strlcat (3bsd)
+.IP \(bu
+.BR strscpy (3)
+.PD
+.PP
+Other functions require an input string,
+but create a character sequence as output.
+These functions have confusing names,
+and have a long history of misuse.
+List of functions:
+.IP \(bu 3
+.PD 0
+.BR stpncpy (3)
+.IP \(bu
+.BR strncpy (3)
+.PD
+.PP
+Other functions operate on an input character sequence,
+and create an output string.
+Functions that concatenate
+also require that
+.I dst
+holds a string before the call.
+.BR strncat (3)
+has an even more misleading name than the functions above.
+List of functions:
+.IP \(bu 3
+.PD 0
+.BR ustr2stp (3)
+.IP \(bu
+.BR strncat (3)
+.PD
+.PP
+The last function
+operates on an input character sequence
+to create an output character sequence.
+But because it asks for the length,
+and a string is by nature composed of a character sequence of the same length
+plus a terminating null byte,
+a string is also accepted as input.
+Function:
+.IP \(bu 3
+.BR mempcpy (3)
+.\" ----- DESCRIPTION :: Functions :: ---------------------------------/
+.SS Functions
+.\" ----- DESCRIPTION :: Functions :: stpcpy(3) -----------------------/
+.TP
+.BR stpcpy (3)
+This function copies the input string into a destination string.
+The programmer is responsible for allocating a buffer large enough.
+It returns a pointer suitable for chaining.
+.IP
+An implementation of this function might be:
+.IP
+.in +4n
+.EX
+char *
+stpcpy(char *restrict dst, const char *restrict src)
+{
+    char  *end;
+
+    end = mempcpy(dst, src, strlen(src));
+    *end = \(aq\e0\(aq;  /* mempcpy(3) does not copy the null byte */
+    return end;
+}
+.EE
+.in
+.\" ----- DESCRIPTION :: Functions :: strcpy(3), strcat(3) ------------/
+.TP
+.BR strcpy (3)
+.TQ
+.BR strcat (3)
+These functions copy the input string into a destination string.
+The programmer is responsible for allocating a buffer large enough.
+The return value is useless.
+.IP
+.BR stpcpy (3)
+is a faster alternative to these functions.
+.IP
+An implementation of these functions might be:
+.IP
+.in +4n
+.EX
+char *
+strcpy(char *restrict dst, const char *restrict src)
+{
+ stpcpy(dst, src);
+ return dst;
+}
+
+char *
+strcat(char *restrict dst, const char *restrict src)
+{
+ stpcpy(dst + strlen(dst), src);
+ return dst;
+}
+.EE
+.in
+.\" ----- DESCRIPTION :: Functions :: stpecpy(3), stpecpyx(3) ---------/
+.TP
+.BR stpecpy (3)
+.TQ
+.BR stpecpyx (3)
+These functions copy the input string into a destination string.
+If the destination buffer,
+limited by a pointer to one past the end of it,
+isn't large enough to hold the copy,
+the resulting string is truncated
+(but it is guaranteed to be null-terminated).
+They return a pointer suitable for chaining.
+Truncation needs to be detected only once after the last chained call.
+.BR stpecpyx (3)
+has identical semantics to
+.BR stpecpy (3),
+except that it forces a SIGSEGV if the
+.I src
+pointer is not a string.
+.IP
+These functions are not provided by any library,
+but you can define them with the following reference implementations:
+.IP
+.in +4n
+.EX
+/* This code is in the public domain. */
+char *
+stpecpy(char *dst, char past_end[0],
+ const char *restrict src)
+{
+ char *p;
+
+ if (dst == past_end)
+ return past_end;
+
+ p = memccpy(dst, src, \(aq\e0\(aq, past_end \- dst);
+ if (p != NULL)
+ return p \- 1;
+
+ /* truncation detected */
+ past_end[\-1] = \(aq\e0\(aq;
+ return past_end;
+}
+
+/* This code is in the public domain. */
+char *
+stpecpyx(char *dst, char past_end[0],
+ const char *restrict src)
+{
+ if (src[strlen(src)] != \(aq\e0\(aq)
+ raise(SIGSEGV);
+
+ return stpecpy(dst, past_end, src);
+}
+.EE
+.in
+.\" ----- DESCRIPTION :: Functions :: strlcpy(3bsd), strlcat(3bsd) ----/
+.TP
+.BR strlcpy (3bsd)
+.TQ
+.BR strlcat (3bsd)
+These functions copy the input string into a destination string.
+If the destination buffer,
+limited by its size,
+isn't large enough to hold the copy,
+the resulting string is truncated
+(but it is guaranteed to be null-terminated).
+They return the length of the total string they tried to create.
+These functions force a SIGSEGV if the
+.I src
+pointer is not a string.
+.IP
+.BR stpecpyx (3)
+is a faster alternative to these functions.
+.\" ----- DESCRIPTION :: Functions :: strscpy(3) ----------------------/
+.TP
+.BR strscpy (3)
+This function copies the input string into a destination string.
+If the destination buffer,
+limited by its size,
+isn't large enough to hold the copy,
+the resulting string is truncated
+(but it is guaranteed to be null-terminated).
+It returns the length of the destination string, or
+.B \-E2BIG
+on truncation.
+.IP
+.BR stpecpy (3)
+is a simpler and faster alternative to this function.
+.RE
+.\" ----- DESCRIPTION :: Functions :: stpncpy(3) ----------------------/
+.TP
+.BR stpncpy (3)
+This function copies the input string into
+a destination null-padded character sequence in a fixed-width buffer.
+If the destination buffer,
+limited by its size,
+isn't large enough to hold the copy,
+the resulting character sequence is truncated.
+Since it creates a character sequence,
+it doesn't need to write a terminating null byte.
+It returns a pointer suitable for chaining,
+but it's not ideal for that.
+Truncation needs to be detected only once after the last chained call.
+.IP
+If you're going to use this function in chained calls,
+it would be useful to develop a similar function
+that accepts a pointer to one past the end of the buffer instead of a size.
+.IP
+An implementation of this function might be:
+.IP
+.in +4n
+.EX
+char *
+stpncpy(char *restrict dst, const char *restrict src,
+ size_t sz)
+{
+ char *p;
+
+ bzero(dst, sz);
+ p = memccpy(dst, src, \(aq\e0\(aq, sz);
+ if (p == NULL)
+ return dst + sz;
+
+ return p \- 1;
+}
+.EE
+.in
+.\" ----- DESCRIPTION :: Functions :: ustr2stp(3) ---------------------/
+.TP
+.BR ustr2stp (3)
+This function copies the input character sequence
+contained in a null-padded fixed-width buffer,
+into a destination string.
+The programmer is responsible for allocating a buffer large enough.
+It returns a pointer suitable for chaining.
+.IP
+A truncating version of this function doesn't exist,
+since the size of the original character sequence is always known,
+so it wouldn't be very useful.
+.IP
+This function is not provided by any library,
+but you can define it with the following reference implementation:
+.IP
+.in +4n
+.EX
+/* This code is in the public domain. */
+char *
+ustr2stp(char *restrict dst, const char *restrict src,
+ size_t sz)
+{
+ char *p;
+
+ p = memccpy(dst, src, \(aq\e0\(aq, sz);
+ p = (p != NULL) ? p \- 1 : dst + sz;
+ *p = \(aq\e0\(aq;
+ return p;
+}
+.EE
+.in
+.\" ----- DESCRIPTION :: Functions :: strncpy(3) ----------------------/
+.TP
+.BR strncpy (3)
+This function is identical to
+.BR stpncpy (3)
+except for the useless return value.
+Due to the return value,
+with this function it's hard to correctly check for truncation.
+.IP
+.BR stpncpy (3)
+is a simpler alternative to this function.
+.IP
+An implementation of this function might be:
+.IP
+.in +4n
+.EX
+char *
+strncpy(char *restrict dst, const char *restrict src,
+ size_t sz)
+{
+ stpncpy(dst, src, sz);
+ return dst;
+}
+.EE
+.in
+.\" ----- DESCRIPTION :: Functions :: strncat(3) ----------------------/
+.TP
+.BR strncat (3)
+Do not confuse this function with
+.BR strncpy (3);
+they are not related at all.
+.IP
+This function concatenates the input character sequence
+contained in a null-padded fixed-width buffer,
+into a destination string.
+The programmer is responsible for allocating a buffer large enough.
+The return value is useless.
+.IP
+.BR ustr2stp (3)
+is a faster alternative to this function.
+.IP
+An implementation of this function might be:
+.IP
+.in +4n
+.EX
+char *
+strncat(char *restrict dst, const char *restrict src,
+ size_t sz)
+{
+ ustr2stp(dst + strlen(dst), src, sz);
+ return dst;
+}
+.EE
+.in
+.\" ----- DESCRIPTION :: Functions :: mempcpy(3) ----------------------/
+.TP
+.BR mempcpy (3)
+This function copies the input character sequence,
+limited by its length,
+into a destination character sequence.
+The programmer is responsible for allocating a buffer large enough.
+It returns a pointer suitable for chaining.
+.IP
+An implementation of this function might be:
+.IP
+.in +4n
+.EX
+void *
+mempcpy(void *restrict dst, const void *restrict src,
+ size_t len)
+{
+ return memcpy(dst, src, len) + len;
+}
+.EE
+.in
+.\" ----- RETURN VALUE :: ---------------------------------------------/
.SH RETURN VALUE
-The
-.BR strcpy ()
-function returns a pointer to
-the destination string
-.IR dest .
+The following functions return
+a pointer to the terminating null byte in the destination string.
+.IP \(bu 3
+.PD 0
+.BR stpcpy (3)
+.IP \(bu
+.BR ustr2stp (3)
+.PD
+.PP
+The following functions return
+a pointer to the terminating null byte in the destination string,
+except when truncation occurs;
+if truncation occurs,
+they return a pointer to one past the end of the destination buffer
+.RI ( past_end ).
+.IP \(bu 3
+.BR stpecpy (3),
+.BR stpecpyx (3)
+.PP
+The following function returns
+a pointer to one after the last character
+in the destination character sequence;
+if truncation occurs,
+that pointer is equivalent to
+a pointer to one past the end of the destination buffer.
+.IP \(bu 3
+.BR stpncpy (3)
+.PP
+The following function returns
+a pointer to one after the last character
+in the destination character sequence.
+.IP \(bu 3
+.BR mempcpy (3)
+.PP
+The following functions return
+the length of the total string that they tried to create
+(as if truncation didn't occur).
+.IP \(bu 3
+.BR strlcpy (3bsd),
+.BR strlcat (3bsd)
+.PP
+The following function returns
+the length of the destination string, or
+.B \-E2BIG
+on truncation.
+.IP \(bu 3
+.BR strscpy (3)
+.PP
+The following functions return the
+.I dst
+pointer,
+which is useless.
+.IP \(bu 3
+.PD 0
+.BR strcpy (3),
+.BR strcat (3)
+.IP \(bu
+.BR strncpy (3)
+.IP \(bu
+.BR strncat (3)
+.PD
+.\" ----- ATTRIBUTES :: -----------------------------------------------/
.SH ATTRIBUTES
For an explanation of the terms used in this section, see
.BR attributes (7).
@@ -54,73 +773,236 @@ .SH ATTRIBUTES
l l l.
Interface Attribute Value
T{
-.BR strcpy ()
+.BR stpcpy (),
+.BR strcpy (),
+.BR strcat (),
+.BR stpecpy (),
+.BR stpecpyx (),
+.BR strlcpy (),
+.BR strlcat (),
+.BR strscpy (),
+.BR stpncpy (),
+.BR strncpy (),
+.BR ustr2stp (),
+.BR strncat (),
+.BR mempcpy ()
T} Thread safety MT-Safe
.TE
.hy
.ad
.sp 1
+.\" ----- STANDARDS :: ------------------------------------------------/
.SH STANDARDS
-POSIX.1-2001, POSIX.1-2008, C89, C99, SVr4, 4.3BSD.
-.SH NOTES
-.SS strlcpy()
-Some systems (the BSDs, Solaris, and others) provide the following function:
+.TP
+.BR strcpy "(3), \c"
+.BR strcat (3)
+.TQ
+.BR strncpy (3)
+.TQ
+.BR strncat (3)
+POSIX.1‐2001, POSIX.1‐2008, C89, C99, SVr4, 4.3BSD.
+.TP
+.BR stpcpy (3)
+.\" This function was added to POSIX.1-2008.
+.\" Before that, it was not part of
+.\" the C or POSIX.1 standards, nor customary on UNIX systems.
+.\" It first appeared at least as early as 1986,
+.\" in the Lattice C AmigaDOS compiler,
+.\" then in the GNU fileutils and GNU textutils in 1989,
+.\" and in the GNU C library by 1992.
+.\" It is also present on the BSDs.
+.TQ
+.BR stpncpy (3)
+.\" This function was added to POSIX.1-2008.
+.\" Before that, it was a GNU extension.
+.\" It first appeared in glibc 1.07 in 1993.
+POSIX.1-2008.
+.TP
+.BR strlcpy "(3bsd), \c"
+.BR strlcat (3bsd)
+Functions originated in OpenBSD and present in some Unix systems.
+.TP
+.BR mempcpy (3)
+This function is a GNU extension.
+.TP
+.BR strscpy (3)
+Linux kernel internal function.
+.TP
+.BR stpecpy "(3), \c"
+.BR stpecpyx (3)
+.TQ
+.BR ustr2stp (3)
+Not defined by any standards nor libraries.
+.\" ----- CAVEATS :: --------------------------------------------------/
+.SH CAVEATS
+Don't mix chain calls to truncating and non-truncating functions.
+It is conceptually wrong
+unless you know that the first part of a copy will always fit.
+Anyway, the performance difference will probably be negligible,
+so it will probably be more clear if you use consistent semantics:
+either truncating or non-truncating.
+Calling a non-truncating function after a truncating one is necessarily wrong.
.PP
+Some of the functions described here are not provided by any library;
+you should write your own copy if you want to use them.
+See STANDARDS.
+.\" ----- BUGS :: -----------------------------------------------------/
+.SH BUGS
+All concatenation
+.RB (* cat ())
+functions share the same performance problem:
+.UR https://www.joelonsoftware.com/\:2001/12/11/\:back\-to\-basics/
+Shlemiel the painter
+.UE .
+.\" ----- EXAMPLES :: -------------------------------------------------/
+.SH EXAMPLES
+The following are examples of correct use of each of these functions.
+.\" ----- EXAMPLES :: stpcpy(3) ---------------------------------------/
+.TP
+.BR stpcpy (3)
.in +4n
.EX
-size_t strlcpy(char *dest, const char *src, size_t size);
+p = buf;
+p = stpcpy(p, "Hello ");
+p = stpcpy(p, "world");
+p = stpcpy(p, "!");
+len = p \- buf;
+puts(buf);
.EE
.in
-.PP
-.\" http://static.usenix.org/event/usenix99/full_papers/millert/millert_html/index.html
-.\" "strlcpy and strlcat - consistent, safe, string copy and concatenation"
-.\" 1999 USENIX Annual Technical Conference
-This function is similar to
-.BR strcpy (),
-but it copies at most
-.I size\-1
-bytes to
-.IR dest ,
-truncating the string as necessary.
-It always adds a terminating null byte.
-This function fixes some of the problems of
-.BR strcpy ()
-but the caller must still handle the possibility of data loss if
-.I size
-is too small.
-The return value of the function is the length of
-.IR src ,
-which allows truncation to be easily detected:
-if the return value is greater than or equal to
-.IR size ,
-truncation occurred.
-If loss of data matters, the caller
-.I must
-either check the arguments before the call,
-or test the function return value.
-.BR strlcpy ()
-is not present in glibc and is not standardized by POSIX,
-.\" https://lwn.net/Articles/506530/
-but is available on Linux via the
-.I libbsd
-library.
-.SH BUGS
-If the destination string of a
-.BR strcpy ()
-is not large enough, then anything might happen.
-Overflowing fixed-length string buffers is a favorite cracker technique
-for taking complete control of the machine.
-Any time a program reads or copies data into a buffer,
-the program first needs to check that there's enough space.
-This may be unnecessary if you can show that overflow is impossible,
-but be careful: programs can get changed over time,
-in ways that may make the impossible possible.
+.\" ----- EXAMPLES :: strcpy(3), strcat(3) ----------------------------/
+.TP
+.BR strcpy (3)
+.TQ
+.BR strcat (3)
+.in +4n
+.EX
+strcpy(buf, "Hello ");
+strcat(buf, "world");
+strcat(buf, "!");
+len = strlen(buf);
+puts(buf);
+.EE
+.in
+.\" ----- EXAMPLES :: stpecpy(3), stpecpyx(3) -------------------------/
+.TP
+.BR stpecpy (3)
+.TQ
+.BR stpecpyx (3)
+.in +4n
+.EX
+past_end = buf + sizeof(buf);
+p = buf;
+p = stpecpy(p, past_end, "Hello ");
+p = stpecpy(p, past_end, "world");
+p = stpecpy(p, past_end, "!");
+if (p == past_end) {
+ p\-\-;
+ goto toolong;
+}
+len = p \- buf;
+puts(buf);
+.EE
+.in
+.\" ----- EXAMPLES :: strlcpy(3bsd), strlcat(3bsd) --------------------/
+.TP
+.BR strlcpy (3bsd)
+.TQ
+.BR strlcat (3bsd)
+.in +4n
+.EX
+if (strlcpy(buf, "Hello ", sizeof(buf)) >= sizeof(buf))
+ goto toolong;
+if (strlcat(buf, "world", sizeof(buf)) >= sizeof(buf))
+ goto toolong;
+len = strlcat(buf, "!", sizeof(buf));
+if (len >= sizeof(buf))
+ goto toolong;
+puts(buf);
+.EE
+.in
+.\" ----- EXAMPLES :: strscpy(3) --------------------------------------/
+.TP
+.BR strscpy (3)
+.in +4n
+.EX
+len = strscpy(buf, "Hello world!", sizeof(buf));
+if (len == \-E2BIG)
+ goto toolong;
+puts(buf);
+.EE
+.in
+.\" ----- EXAMPLES :: stpncpy(3) --------------------------------------/
+.TP
+.BR stpncpy (3)
+.in +4n
+.EX
+past_end = buf + sizeof(buf);
+end = stpncpy(buf, "Hello world!", sizeof(buf));
+if (end == past_end)
+ goto toolong;
+len = end \- buf;
+for (size_t i = 0; i < sizeof(buf); i++)
+ putchar(buf[i]);
+.EE
+.in
+.\" ----- EXAMPLES :: strncpy(3) --------------------------------------/
+.TP
+.BR strncpy (3)
+.in +4n
+.EX
+strncpy(buf, "Hello world!", sizeof(buf));
+if (buf[sizeof(buf) \- 1] != \(aq\e0\(aq)
+ goto toolong;
+len = strnlen(buf, sizeof(buf));
+for (size_t i = 0; i < sizeof(buf); i++)
+ putchar(buf[i]);
+.EE
+.in
+.\" ----- EXAMPLES :: ustr2stp(3) -------------------------------------/
+.TP
+.BR ustr2stp (3)
+.in +4n
+.EX
+p = buf;
+p = ustr2stp(p, "Hello ", 6);
+p = ustr2stp(p, "world", 42); // Padding null bytes ignored.
+p = ustr2stp(p, "!", 1);
+len = p \- buf;
+puts(buf);
+.EE
+.in
+.\" ----- EXAMPLES :: strncat(3) --------------------------------------/
+.TP
+.BR strncat (3)
+.in +4n
+.EX
+buf[0] = \(aq\e0\(aq; // There's no 'cpy' function to this 'cat'.
+strncat(buf, "Hello ", 6);
+strncat(buf, "world", 42); // Padding null bytes ignored.
+strncat(buf, "!", 1);
+len = strlen(buf);
+puts(buf);
+.EE
+.in
+.\" ----- EXAMPLES :: mempcpy(3) --------------------------------------/
+.TP
+.BR mempcpy (3)
+.in +4n
+.EX
+p = buf;
+p = mempcpy(p, "Hello ", 6);
+p = mempcpy(p, "world", 5);
+p = mempcpy(p, "!", 1);
+*p = \(aq\e0\(aq;
+len = p \- buf;
+puts(buf);
+.EE
+.in
+.\" ----- SEE ALSO :: -------------------------------------------------/
.SH SEE ALSO
-.BR bcopy (3),
-.BR memccpy (3),
+.BR bzero (3),
.BR memcpy (3),
-.BR memmove (3),
-.BR stpcpy (3),
-.BR strdup (3),
-.BR string (3),
-.BR wcscpy (3)
+.BR memccpy (3),
+.BR mempcpy (3),
+.BR string (3)
--
2.38.1
* Re: [PATCH v3 0/1] Rewritten page for string-copying functions
2022-12-14 0:03 ` [PATCH v3 0/1] Rewritten page for string-copying functions Alejandro Colomar
@ 2022-12-14 0:14 ` Alejandro Colomar
2022-12-14 0:16 ` Alejandro Colomar
2022-12-14 16:17 ` [PATCH v4 " Alejandro Colomar
2022-12-14 16:17 ` [PATCH v4 1/1] strcpy.3: Rewrite page to document all string-copying functions Alejandro Colomar
2 siblings, 1 reply; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-14 0:14 UTC (permalink / raw)
To: linux-man, Martin Sebor, G. Branden Robinson, Douglas McIlroy,
Jakub Wilk
Cc: Alejandro Colomar
On 12/14/22 01:03, Alejandro Colomar wrote:
>
> Hi!
>
> I've written a new manual page for documenting string-copying functions
> so that it's clear what's the purpose of each of them. It may differ
> from the original design of the functions, since my guess for several of
> them is simply that they were misdesigned. However, after investigating
> the operation that they perform on bytes, I've come up with a story that
> can make sense of functions that were once believed to be broken by
> many. In fact, my conclusion after writing the page is that only one
> function is really useless:
>
> - strncpy(3): stpncpy(3) is _always_ better.
>
> The others depend on the program. If you don't care at all about
> performance and Shlemiel is a friend of yours, then rcpy and [rn]cat
> are your friends. If you don't like Shlemiel, and don't mind slightly
> more complex code, you'll go for 'p' functions.
>
> And so on. I won't spoil the page more.
>
> Basically, I want to put an end to the situation where a function like
> strncpy(3) is dreaded by some because it looks broken (I thought so
> myself for a long time), while others who don't really know it misuse it
> for things it was never meant for, which is even worse. Or where
> programmers think that strncpy(3) and strncat(3) are related in any way
> (they aren't).
>
> Below goes the formatted page. Please review independently of it being
> in strcpy(3) or string_copy(7), and address that as a separate issue
> (but of course feel free to cover it, and any other issues).
>
>
> Cheers,
>
> Alex
>
>
> Alejandro Colomar (1):
> strcpy.3: Rewrite page to document all string-copying functions
>
> man3/strcpy.3 | 1058 +++++++++++++++++++++++++++++++++++++++++++++----
> 1 file changed, 970 insertions(+), 88 deletions(-)
>
>
> strcpy(3) Library Functions Manual strcpy(3)
>
> NAME
> stpcpy, strcpy, strcat, stpecpy, stpecpyx, strlcpy, strlcat, strscpy,
> stpncpy, strncpy, ustr2stp, strncat, mempcpy - copy strings and charac‐
> ter sequences
>
> LIBRARY
> stpcpy(3)
> strcpy(3), strcat(3)
> stpncpy(3)
> strncpy(3)
> strncat(3)
> mempcpy(3)
> Standard C library (libc, -lc)
>
> stpecpy(3), stpecpyx(3)
> Not provided by any library.
>
> strlcpy(3), strlcat(3)
> Utility functions from BSD systems (libbsd, -lbsd)
>
> strscpy(3)
> Not provided by any library. It is a Linux kernel internal
> function.
>
> SYNOPSIS
> #include <string.h>
>
> Strings
> // Chain‐copy a string.
> char *stpcpy(char *restrict dst, const char *restrict src);
>
> // Copy/concatenate a string.
> char *strcpy(char *restrict dst, const char *restrict src);
> char *strcat(char *restrict dst, const char *restrict src);
>
> // Chain‐copy a string with truncation.
> char *stpecpy(char *dst, char past_end[0], const char *restrict src);
>
> // Chain‐copy a string with truncation and SIGSEGV on UB.
> char *stpecpyx(char *dst, char past_end[0], const char *restrict src);
>
> // Copy/concatenate a string with truncation and SIGSEGV on UB.
> size_t strlcpy(char dst[restrict .sz], const char *restrict src,
> size_t sz);
> size_t strlcat(char dst[restrict .sz], const char *restrict src,
> size_t sz);
>
> // Copy a string with truncation.
> ssize_t strscpy(char dst[restrict .sz], const char src[restrict .sz],
> size_t sz);
>
> Null‐padded character sequences
> // Zero a fixed‐width buffer, and
> // copy a string with truncation into a character sequence.
> char *stpncpy(char dst[restrict .sz], const char *restrict src,
> size_t sz);
>
> // Zero a fixed‐width buffer, and
> // copy a string with truncation into a character sequence.
> char *strncpy(char dest[restrict .sz], const char *restrict src,
> size_t sz);
>
> // Chain‐copy a null‐padded character sequence into a string.
> char *ustr2stp(char *restrict dst, const char src[restrict .sz],
> size_t sz);
>
> // Concatenate a null‐padded character sequence into a string.
> char *strncat(char *restrict dst, const char src[restrict .sz],
> size_t sz);
>
> Measured character sequences
> // Chain‐copy a measured character sequence.
> void *mempcpy(void *restrict dst, const void src[restrict .len],
> size_t len);
>
> Feature Test Macro Requirements for glibc (see feature_test_macros(7)):
>
> stpcpy(3), stpncpy(3):
> Since glibc 2.10:
> _POSIX_C_SOURCE >= 200809L
> Before glibc 2.10:
> _GNU_SOURCE
>
> mempcpy(3):
> _GNU_SOURCE
>
> DESCRIPTION
> Terms (and abbreviations)
> string (str)
> is a sequence of zero or more non‐null characters followed by a
> null byte.
>
> character sequence (ustr)
> is a sequence of zero or more non‐null characters. A program
> should never use a character sequence where a string is re‐
> quired. However, with appropriate care, a string can be used in
> the place of a character sequence.
>
> null‐padded character sequence
> Character sequences can be contained in fixed‐width
> buffers, which contain padding null bytes after the char‐
> acter sequence, to fill the rest of the buffer without
> affecting the character sequence; however, those padding
> null bytes are not part of the character sequence.
>
> measured character sequence
> Character sequence delimited by its length.
>
> length (len)
> is the number of non‐null characters in a string or character
> sequence. It is the return value of strlen(str) and of
> strnlen(ustr, sz).
>
> size (sz)
> refers to the entire buffer where the string or character se‐
> quence is contained.
>
> end is the name of a pointer to the terminating null byte of a
> string, or a pointer to one past the last character of a charac‐
> ter sequence. This is the return value of functions that allow
> chaining. It is equivalent to &str[len].
>
> past_end
> is the name of a pointer to one past the end of the buffer that
> contains a string or character sequence. It is equivalent to
> &str[sz]. It is used as a sentinel value, to be able to trun‐
> cate strings or character sequences instead of overrunning the
> containing buffer.
>
> Copy, concatenate, and chain‐copy
> Originally, there was a distinction between functions that copy and
> those that concatenate. However, newer functions that copy while al‐
> lowing chaining cover both use cases with a single API. They are also
> algorithmically faster, since they don’t need to search for the end of
> the existing string. However, functions that concatenate have a much
> simpler use, so if performance is not important, it can make sense to
> use them for improving readability.
>
> To chain copy functions, they need to return a pointer to the end.
> That’s a byproduct of the copy operation, so it has no performance
> costs. Functions that return such a pointer, and thus can be chained,
> have names of the form *stp*() or *memp*(), since it’s also common to
> name the pointer just p.
>
> Chain‐copying functions that truncate should accept a pointer to one
> past the end of the destination buffer, and have names of the form
> *stpe*(). This allows not having to recalculate the remaining size af‐
> ter each call.
>
> Truncate or not?
> The first thing to note is that programmers should be careful with
> buffers, so they always have the correct size, and truncation is not
> necessary.
>
> In most cases, truncation is not desired, and it is simpler to just do
> the copy. Simpler code is safer code. Programming against programming
> mistakes by adding more code just adds more points where mistakes can
> be made.
>
> Nowadays, compilers can detect most programmer errors with features
> like compiler warnings, static analyzers, and _FORTIFY_SOURCE (see
> ftm(7)). Keeping the code simple helps these overflow‐detection fea‐
> tures be more precise.
>
> When validating user input, however, it makes sense to truncate. Re‐
> member to check the return value of such function calls.
>
> Functions that truncate:
>
> • stpecpy(3) is the most efficient string copy function that performs
> truncation. It only requires checking for truncation once, after all
> chained calls.
>
> • stpecpyx(3) is a variant of stpecpy(3) that consumes the entire
> source string, to catch bugs in the program by forcing a segmenta‐
> tion fault (as strlcpy(3bsd) and strlcat(3bsd) do).
>
> • strlcpy(3bsd) and strlcat(3bsd) are designed to crash if the input
> string is invalid (doesn’t contain a terminating null byte).
>
> • strscpy(3) reports an error instead of crashing (similar to
> stpecpy(3)).
>
> • stpncpy(3) and strncpy(3) also truncate, but they don’t write
> strings, but rather null‐padded character sequences.
>
> Null‐padded character sequences
> For historic reasons, some standard APIs, such as utmpx(5), use null‐
> padded character sequences in fixed‐width buffers. To interface with
> them, specialized functions need to be used.
>
> To copy strings into them, use stpncpy(3).
>
> To copy from an unterminated string within a fixed‐width buffer into a
> string, ignoring any trailing null bytes in the source fixed‐width
> buffer, you should use ustr2stp(3) or strncat(3).
>
> Measured character sequences
> The simplest character sequence copying function is mempcpy(3). It re‐
> quires always knowing the length of your character sequences, for which
> structures can be used. It makes the code much faster, since you al‐
> ways know the length of your character sequences, and can do the mini‐
> mal copies and length measurements. mempcpy(3) copies character se‐
> quences, so you need to explicitly set the terminating null byte if you
> need a string.
>
> The following code can be used to chain‐copy from a measured character
> sequence into a string:
>
> p = mempcpy(p, foo->ustr, foo->len);
> *p = '\0';
>
> The following code can be used to chain‐copy from a measured character
> sequence into an unterminated string:
>
> p = mempcpy(p, bar->ustr, bar->len);
>
> In programs that make considerable use of strings or character se‐
> quences, and need the best performance, using overlapping character se‐
> quences can make a big difference. It allows holding subsequences of a
> larger character sequence, while not duplicating memory nor using time
> to do a copy.
>
> However, this is delicate, since it requires using character sequences.
> C library APIs use strings, so programs that use character sequences
> will have to take care of differentiating strings from character se‐
> quences.
>
> String vs character sequence
> Some functions only operate on strings. Those require that the input
> src is a string, and guarantee an output string (even when truncation
> occurs). Functions that concatenate also require that dst holds a
> string before the call. List of functions:
>
> • stpcpy(3)
> • strcpy(3), strcat(3)
> • stpecpy(3), stpecpyx(3)
> • strlcpy(3bsd), strlcat(3bsd)
> • strscpy(3)
>
> Other functions require an input string, but create a character se‐
> quence as output. These functions have confusing names, and have a
> long history of misuse. List of functions:
>
> • stpncpy(3)
> • strncpy(3)
>
> Other functions operate on an input character sequence, and create an
> output string. Functions that concatenate also require that dst holds
> a string before the call. strncat(3) has an even more misleading name
> than the functions above. List of functions:
>
> • ustr2stp(3)
> • strncat(3)
>
> The last one operates on an input character sequence to create an
> output character sequence. But because it asks for the length, and a
> string is by nature composed of a character sequence of the same length
> plus a terminating null byte, a string is also accepted as input.
> Function:
>
> • mempcpy(3)
>
> Functions
> stpcpy(3)
> This function copies the input string into a destination string.
> The programmer is responsible for allocating a buffer large
> enough. It returns a pointer suitable for chaining.
>
> An implementation of this function might be:
>
> char *
> stpcpy(char *restrict dst, const char *restrict src)
> {
> return mempcpy(dst, src, strlen(src));
Oops. It should have been:
char *p;
p = mempcpy(dst, src, strlen(src));
*p = '\0';
return p;
> }
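For review purposes, here is a minimal sketch of the corrected idea in portable ISO C. It uses memcpy(3) rather than mempcpy(3) so that it builds without _GNU_SOURCE. The names my_stpcpy() and hello() are made up for this example and are not part of the page.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for stpcpy(3): copy src, including its
 * terminating null byte, and return a pointer to that null byte. */
char *
my_stpcpy(char *restrict dst, const char *restrict src)
{
	size_t  len = strlen(src);

	memcpy(dst, src, len + 1);  /* len characters plus the null byte */
	return dst + len;
}

/* Chained use: each call continues where the previous one ended.
 * buf must have room for at least 13 bytes.  Returns the length. */
size_t
hello(char *buf)
{
	char  *p = buf;

	p = my_stpcpy(p, "Hello ");
	p = my_stpcpy(p, "world");
	p = my_stpcpy(p, "!");

	return (size_t) (p - buf);
}
```

The length falls out of the final pointer, so no strlen(3) over the result is needed.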
>
> strcpy(3)
> strcat(3)
> These functions copy the input string into a destination string.
> The programmer is responsible for allocating a buffer large
> enough. The return value is useless.
>
> stpcpy(3) is a faster alternative to these functions.
>
> An implementation of these functions might be:
>
> char *
> strcpy(char *restrict dst, const char *restrict src)
> {
> stpcpy(dst, src);
> return dst;
> }
>
> char *
> strcat(char *restrict dst, const char *restrict src)
> {
> stpcpy(dst + strlen(dst), src);
> return dst;
> }
>
> stpecpy(3)
> stpecpyx(3)
> These functions copy the input string into a destination string.
> If the destination buffer, limited by a pointer to one past the
> end of it, isn’t large enough to hold the copy, the resulting
> string is truncated (but it is guaranteed to be null‐termi‐
> nated). They return a pointer suitable for chaining. Trunca‐
> tion needs to be detected only once after the last chained call.
> stpecpyx(3) has identical semantics to stpecpy(3), except that
> it forces a SIGSEGV if the src pointer is not a string.
>
> These functions are not provided by any library, but you can de‐
> fine them with the following reference implementations:
>
> /* This code is in the public domain. */
> char *
> stpecpy(char *dst, char past_end[0],
> const char *restrict src)
> {
> char *p;
>
> if (dst == past_end)
> return past_end;
>
> p = memccpy(dst, src, '\0', past_end - dst);
> if (p != NULL)
> return p - 1;
>
> /* truncation detected */
> past_end[-1] = '\0';
> return past_end;
> }
>
> /* This code is in the public domain. */
> char *
> stpecpyx(char *dst, char past_end[0],
> const char *restrict src)
> {
> if (src[strlen(src)] != '\0')
> raise(SIGSEGV);
>
> return stpecpy(dst, past_end, src);
> }
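To make the "check once after the chain" property concrete, here is a sketch that should behave like the stpecpy() reference above, but is written with memchr(3) and memcpy(3) only. The names stpecpy_sketch() and join_path() are hypothetical, and past_end is declared as a plain pointer because zero-length array parameters are not ISO C.

```c
#include <stddef.h>
#include <string.h>

/* Sketch of stpecpy().  Assumes dst <= past_end and that both point
 * into (or one past the end of) the same buffer. */
char *
stpecpy_sketch(char *dst, char *past_end, const char *restrict src)
{
	size_t      room, len;
	const char  *nul;

	if (dst == past_end)
		return past_end;  /* an earlier call already truncated */

	room = (size_t) (past_end - dst);
	nul = memchr(src, '\0', room);
	if (nul != NULL) {
		len = (size_t) (nul - src);
		memcpy(dst, src, len + 1);
		return dst + len;
	}

	/* src doesn't fit: copy what fits and terminate. */
	memcpy(dst, src, room - 1);
	past_end[-1] = '\0';
	return past_end;
}

/* Build "dir/file" in buf[sz]; return 0, or -1 if it was truncated. */
int
join_path(char *buf, size_t sz, const char *dir, const char *file)
{
	char  *p = buf;
	char  *past_end = buf + sz;

	p = stpecpy_sketch(p, past_end, dir);
	p = stpecpy_sketch(p, past_end, "/");
	p = stpecpy_sketch(p, past_end, file);

	return (p == past_end) ? -1 : 0;  /* one check after the chain */
}
```

Once a call truncates, every later call in the chain returns past_end immediately, which is why the intermediate results don't need to be checked.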
>
> strlcpy(3bsd)
> strlcat(3bsd)
> These functions copy the input string into a destination string.
> If the destination buffer, limited by its size, isn’t large
> enough to hold the copy, the resulting string is truncated (but
> it is guaranteed to be null‐terminated). They return the length
> of the total string they tried to create. These functions force
> a SIGSEGV if the src pointer is not a string.
>
> stpecpyx(3) is a faster alternative to these functions.
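The return convention may be easier to see in code. The following is only a sketch with the same return value as strlcpy(3bsd), not the libbsd implementation; strlcpy_sketch() and copy_checked() are names invented here. Because the return value is the length of src, the function always has to walk the whole source string, which is where the crash on a non-string input comes from.

```c
#include <stddef.h>
#include <string.h>

/* Sketch with the return convention of strlcpy(3bsd). */
size_t
strlcpy_sketch(char *restrict dst, const char *restrict src, size_t sz)
{
	size_t  len = strlen(src);  /* reads all of src, even if sz is small */

	if (sz != 0) {
		size_t  n = (len < sz) ? len : sz - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return len;
}

/* Return 0 on success, or -1 if src was truncated. */
int
copy_checked(char *dst, size_t sz, const char *src)
{
	return (strlcpy_sketch(dst, src, sz) >= sz) ? -1 : 0;
}
```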
>
> strscpy(3)
> This function copies the input string into a destination string.
> If the destination buffer, limited by its size, isn’t large
> enough to hold the copy, the resulting string is truncated (but
> it is guaranteed to be null‐terminated). It returns the length
> of the destination string, or -E2BIG on truncation.
>
> stpecpy(3) is a simpler and faster alternative to this function.
>
> stpncpy(3)
> This function copies the input string into a destination null‐
> padded character sequence in a fixed‐width buffer. If the des‐
> tination buffer, limited by its size, isn’t large enough to hold
> the copy, the resulting character sequence is truncated. Since
> it creates a character sequence, it doesn’t need to write a ter‐
> minating null byte. It returns a pointer suitable for chaining,
> but it’s not ideal for that. Truncation needs to be detected
> only once after the last chained call.
>
> If you’re going to use this function in chained calls, it would
> be useful to develop a similar function that accepts a pointer
> to one past the end of the buffer instead of a size.
>
> An implementation of this function might be:
>
> char *
> stpncpy(char *restrict dst, const char *restrict src,
> size_t sz)
> {
> char *p;
>
> bzero(dst, sz);
> p = memccpy(dst, src, '\0', sz);
> if (p == NULL)
> return dst + sz;
>
> return p - 1;
> }
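As an illustration of the intended use, here is a sketch that fills a fixed-width field in the style of utmpx(5). The struct record type and the function names are hypothetical, and stpncpy_sketch() is an ISO C approximation of stpncpy(3), not the libc code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record with a fixed-width, null-padded field. */
struct record {
	char  name[8];  /* character sequence, not necessarily a string */
};

/* Sketch equivalent to stpncpy(3), using only ISO C functions. */
char *
stpncpy_sketch(char *restrict dst, const char *restrict src, size_t sz)
{
	const char  *nul = memchr(src, '\0', sz);
	size_t      len = (nul != NULL) ? (size_t) (nul - src) : sz;

	memcpy(dst, src, len);
	memset(dst + len, '\0', sz - len);  /* null padding, if any */
	return dst + len;
}

/* Return 0 on success, or -1 if name has sizeof(r->name) or more
 * characters (an exact fit can't be told apart from truncation here). */
int
record_set_name(struct record *r, const char *name)
{
	char  *end = stpncpy_sketch(r->name, name, sizeof(r->name));

	return (end == r->name + sizeof(r->name)) ? -1 : 0;
}
```

Note that r->name must not be passed to functions that expect a string, since a full field has no terminating null byte.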
>
> ustr2stp(3)
> This function copies the input character sequence contained in a
> null‐padded fixed‐width buffer, into a destination string. The
> programmer is responsible for allocating a buffer large enough.
> It returns a pointer suitable for chaining.
>
> A truncating version of this function doesn’t exist, since the
> size of the original character sequence is always known, so it
> wouldn’t be very useful.
>
> This function is not provided by any library, but you can define
> it with the following reference implementation:
>
> /* This code is in the public domain. */
> char *
> ustr2stp(char *restrict dst, const char *restrict src,
> size_t sz)
> {
> char *p;
>
> p = memccpy(dst, src, '\0', sz);
> p = (p != NULL) ? p - 1 : dst + sz;
> *p = '\0';
> return p;
> }
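Here is a sketch of the same operation based on memchr(3). It sidesteps the question of where memccpy(3) leaves its return pointer (one past the copied null byte) and uses no GNU extensions. The name ustr2stp_sketch() is made up for this example.

```c
#include <stddef.h>
#include <string.h>

/* Sketch of ustr2stp(): copy the character sequence held in the
 * null-padded buffer src[sz] into dst as a string, and return a
 * pointer to the terminating null byte.
 * dst must have room for sz + 1 bytes. */
char *
ustr2stp_sketch(char *restrict dst, const char *restrict src, size_t sz)
{
	const char  *nul = memchr(src, '\0', sz);
	size_t      len = (nul != NULL) ? (size_t) (nul - src) : sz;

	memcpy(dst, src, len);
	dst[len] = '\0';
	return dst + len;
}
```

The padding null bytes in src are never copied, so the returned pointer can be passed straight to the next chained call.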
>
> strncpy(3)
> This function is identical to stpncpy(3) except for the useless
> return value. Due to the return value, with this function it’s
> hard to correctly check for truncation.
>
> stpncpy(3) is a simpler alternative to this function.
>
> An implementation of this function might be:
>
> char *
> strncpy(char *restrict dst, const char *restrict src,
> size_t sz)
> {
> stpncpy(dst, src, sz);
> return dst;
> }
>
> strncat(3)
> Do not confuse this function with strncpy(3); they are not re‐
> lated at all.
>
> This function concatenates the input character sequence con‐
> tained in a null‐padded fixed‐width buffer, into a destination
> string. The programmer is responsible for allocating a buffer
> large enough. The return value is useless.
>
> ustr2stp(3) is a faster alternative to this function.
>
> An implementation of this function might be:
>
> char *
> strncat(char *restrict dst, const char *restrict src,
> size_t sz)
> {
> ustr2stp(dst + strlen(dst), src, sz);
> return dst;
> }
>
> mempcpy(3)
> This function copies the input character sequence, limited by
> its length, into a destination character sequence. The program‐
> mer is responsible for allocating a buffer large enough. It re‐
> turns a pointer suitable for chaining.
>
> An implementation of this function might be:
>
> void *
> mempcpy(void *restrict dst, const void *restrict src,
> size_t len)
> {
> return (char *) memcpy(dst, src, len) + len;
> }
>
> RETURN VALUE
> The following functions return a pointer to the terminating null byte
> in the destination string.
>
> • stpcpy(3)
> • ustr2stp(3)
>
> The following functions return a pointer to the terminating null byte
> in the destination string, except when truncation occurs; if truncation
> occurs, they return a pointer to one past the end of the destination
> buffer (past_end).
>
> • stpecpy(3), stpecpyx(3)
>
> The following function returns a pointer to one after the last charac‐
> ter in the destination character sequence; if truncation occurs, that
> pointer is equivalent to a pointer to one past the end of the destina‐
> tion buffer.
>
> • stpncpy(3)
>
> The following function returns a pointer to one after the last charac‐
> ter in the destination character sequence.
>
> • mempcpy(3)
>
> The following functions return the length of the total string that they
> tried to create (as if truncation didn’t occur).
>
> • strlcpy(3bsd), strlcat(3bsd)
>
> The following function returns the length of the destination string, or
> -E2BIG on truncation.
>
> • strscpy(3)
>
> The following functions return the dst pointer, which is useless.
>
> • strcpy(3), strcat(3)
> • strncpy(3)
> • strncat(3)
>
> ATTRIBUTES
> For an explanation of the terms used in this section, see attrib‐
> utes(7).
> ┌────────────────────────────────────────────┬───────────────┬─────────┐
> │Interface                                   │ Attribute     │ Value   │
> ├────────────────────────────────────────────┼───────────────┼─────────┤
> │stpcpy(), strcpy(), strcat(), stpecpy(),    │ Thread safety │ MT‐Safe │
> │stpecpyx(), strlcpy(), strlcat(), strscpy(),│               │         │
> │stpncpy(), strncpy(), ustr2stp(),           │               │         │
> │strncat(), mempcpy()                        │               │         │
> └────────────────────────────────────────────┴───────────────┴─────────┘
>
> STANDARDS
> strcpy(3), strcat(3)
> strncpy(3)
> strncat(3)
> POSIX.1‐2001, POSIX.1‐2008, C89, C99, SVr4, 4.3BSD.
>
> stpcpy(3)
> stpncpy(3)
> POSIX.1‐2008.
>
> strlcpy(3bsd), strlcat(3bsd)
> Functions originated in OpenBSD and present in some Unix sys‐
> tems.
>
> mempcpy(3)
> This function is a GNU extension.
>
> strscpy(3)
> Linux kernel internal function.
>
> stpecpy(3), stpecpyx(3)
> ustr2stp(3)
> Not defined by any standards nor libraries.
>
> CAVEATS
> Don’t mix chain calls to truncating and non‐truncating functions. It
> is conceptually wrong unless you know that the first part of a copy
> will always fit. Anyway, the performance difference will probably be
> negligible, so it will probably be more clear if you use consistent se‐
> mantics: either truncating or non‐truncating. Calling a non‐truncating
> function after a truncating one is necessarily wrong.
>
> Some of the functions described here are not provided by any library;
> you should write your own copy if you want to use them. See STANDARDS.
>
> BUGS
> All concatenation (*cat()) functions share the same performance prob‐
> lem: Shlemiel the painter ⟨https://www.joelonsoftware.com/2001/12/11/
> back-to-basics/⟩.
>
> EXAMPLES
> The following are examples of correct use of each of these functions.
>
> stpcpy(3)
> p = buf;
> p = stpcpy(p, "Hello ");
> p = stpcpy(p, "world");
> p = stpcpy(p, "!");
> len = p - buf;
> puts(buf);
>
> strcpy(3)
> strcat(3)
> strcpy(buf, "Hello ");
> strcat(buf, "world");
> strcat(buf, "!");
> len = strlen(buf);
> puts(buf);
>
> stpecpy(3)
> stpecpyx(3)
> past_end = buf + sizeof(buf);
> p = buf;
> p = stpecpy(p, past_end, "Hello ");
> p = stpecpy(p, past_end, "world");
> p = stpecpy(p, past_end, "!");
> if (p == past_end) {
> p--;
> goto toolong;
> }
> len = p - buf;
> puts(buf);
>
> strlcpy(3bsd)
> strlcat(3bsd)
> if (strlcpy(buf, "Hello ", sizeof(buf)) >= sizeof(buf))
> goto toolong;
> if (strlcat(buf, "world", sizeof(buf)) >= sizeof(buf))
> goto toolong;
> len = strlcat(buf, "!", sizeof(buf));
> if (len >= sizeof(buf))
> goto toolong;
> puts(buf);
>
> strscpy(3)
> len = strscpy(buf, "Hello world!", sizeof(buf));
> if (len == -E2BIG)
> goto toolong;
> puts(buf);
>
> stpncpy(3)
> past_end = buf + sizeof(buf);
> end = stpncpy(buf, "Hello world!", sizeof(buf));
> if (end == past_end)
> goto toolong;
> len = end - buf;
> for (size_t i = 0; i < sizeof(buf); i++)
> putchar(buf[i]);
>
> strncpy(3)
> strncpy(buf, "Hello world!", sizeof(buf));
> if (buf[sizeof(buf) - 1] != '\0')
> goto toolong;
> len = strnlen(buf, sizeof(buf));
> for (size_t i = 0; i < sizeof(buf); i++)
> putchar(buf[i]);
>
> ustr2stp(3)
> p = buf;
> p = ustr2stp(p, "Hello ", 6);
> p = ustr2stp(p, "world", 42); // Padding null bytes ignored.
> p = ustr2stp(p, "!", 1);
> len = p - buf;
> puts(buf);
>
> strncat(3)
> buf[0] = '\0'; // There’s no ’cpy’ function to this ’cat’.
> strncat(buf, "Hello ", 6);
> strncat(buf, "world", 42); // Padding null bytes ignored.
> strncat(buf, "!", 1);
> len = strlen(buf);
> puts(buf);
>
> mempcpy(3)
> p = buf;
> p = mempcpy(p, "Hello ", 6);
> p = mempcpy(p, "world", 5);
> p = mempcpy(p, "!", 1);
> *p = '\0';
> len = p - buf;
> puts(buf);
>
> SEE ALSO
> bzero(3), memcpy(3), memccpy(3), mempcpy(3), string(3)
>
> Linux man‐pages (unreleased) (date) strcpy(3)
>
--
<http://www.alejandro-colomar.es/>
* Re: [PATCH v3 0/1] Rewritten page for string-copying functions
2022-12-14 0:14 ` Alejandro Colomar
@ 2022-12-14 0:16 ` Alejandro Colomar
0 siblings, 0 replies; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-14 0:16 UTC (permalink / raw)
To: linux-man, Martin Sebor, G. Branden Robinson, Douglas McIlroy,
Jakub Wilk
Cc: Alejandro Colomar
[-- Attachment #1.1: Type: text/plain, Size: 641 bytes --]
On 12/14/22 01:14, Alejandro Colomar wrote:
>> An implementation of this function might be:
>>
>> char *
>> stpcpy(char *restrict dst, const char *restrict src)
>> {
>> return mempcpy(dst, src, strlen(src));
>
> Oops. It should have been:
>
> char *p;
>
> p = mempcpy(dst, src, strlen(src));
> p = '\0';
*p = '\0'; //:)
> return p;
>
>> }
>>
--
<http://www.alejandro-colomar.es/>
* [PATCH v4 0/1] Rewritten page for string-copying functions
2022-12-14 0:03 ` [PATCH v3 0/1] Rewritten page for string-copying functions Alejandro Colomar
2022-12-14 0:14 ` Alejandro Colomar
@ 2022-12-14 16:17 ` Alejandro Colomar
2022-12-15 0:26 ` [PATCH v5 0/5] Rewrite pages about " Alejandro Colomar
` (5 more replies)
2022-12-14 16:17 ` [PATCH v4 1/1] strcpy.3: Rewrite page to document all string-copying functions Alejandro Colomar
2 siblings, 6 replies; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-14 16:17 UTC (permalink / raw)
To: linux-man, Martin Sebor, G. Branden Robinson, Douglas McIlroy,
Jakub Wilk
Cc: Alejandro Colomar
This version brings several improvements: new functions (I wasn't happy
with raw mempcpy(3) and its type unsafety), a fix for an off-by-one
error, and improved descriptions.
Here goes the new version of the formatted page.
Cheers,
Alex
Alejandro Colomar (1):
strcpy.3: Rewrite page to document all string-copying functions
man3/strcpy.3 | 1164 +++++++++++++++++++++++++++++++++++++++++++++----
1 file changed, 1076 insertions(+), 88 deletions(-)
strcpy(3) Library Functions Manual strcpy(3)
NAME
stpcpy, strcpy, strcat, stpecpy, stpecpyx, strlcpy, strlcat, strscpy,
stpncpy, strncpy, zustr2ustp, zustr2stp, strncat, ustpcpy, ustr2stp -
copy strings and character sequences
LIBRARY
stpcpy(3)
strcpy(3), strcat(3)
stpncpy(3)
strncpy(3)
strncat(3)
Standard C library (libc, -lc)
stpecpy(3), stpecpyx(3)
zustr2ustp(3), zustr2stp(3)
ustpcpy(3), ustr2stp(3)
Not provided by any library.
strlcpy(3), strlcat(3)
Utility functions from BSD systems (libbsd, -lbsd)
strscpy(3)
Not provided by any library. It is a Linux kernel internal
function.
SYNOPSIS
#include <string.h>
Strings
// Chain‐copy a string.
char *stpcpy(char *restrict dst, const char *restrict src);
// Copy/concatenate a string.
char *strcpy(char *restrict dst, const char *restrict src);
char *strcat(char *restrict dst, const char *restrict src);
// Chain‐copy a string with truncation.
char *stpecpy(char *dst, char past_end[0], const char *restrict src);
// Chain‐copy a string with truncation and SIGSEGV on UB.
char *stpecpyx(char *dst, char past_end[0], const char *restrict src);
// Copy/concatenate a string with truncation and SIGSEGV on UB.
size_t strlcpy(char dst[restrict .sz], const char *restrict src,
size_t sz);
size_t strlcat(char dst[restrict .sz], const char *restrict src,
size_t sz);
// Copy a string with truncation.
ssize_t strscpy(char dst[restrict .sz], const char src[restrict .sz],
size_t sz);
Null‐padded character sequences
// Zero a fixed‐width buffer, and
// copy a string into a character sequence with truncation.
char *stpncpy(char dst[restrict .sz], const char *restrict src,
size_t sz);
// Zero a fixed‐width buffer, and
// copy a string into a character sequence with truncation.
char *strncpy(char dst[restrict .sz], const char *restrict src,
size_t sz);
// Chain‐copy a null‐padded character sequence into a character sequence.
char *zustr2ustp(char *restrict dst, const char src[restrict .sz],
size_t sz);
// Chain‐copy a null‐padded character sequence into a string.
char *zustr2stp(char *restrict dst, const char src[restrict .sz],
size_t sz);
// Concatenate a null‐padded character sequence into a string.
char *strncat(char *restrict dst, const char src[restrict .sz],
size_t sz);
Measured character sequences
// Chain‐copy a measured character sequence.
char *ustpcpy(char *restrict dst, const char src[restrict .len],
size_t len);
// Chain‐copy a measured character sequence into a string.
char *ustr2stp(char *restrict dst, const char src[restrict .len],
size_t len);
Feature Test Macro Requirements for glibc (see feature_test_macros(7)):
stpcpy(3), stpncpy(3):
Since glibc 2.10:
_POSIX_C_SOURCE >= 200809L
Before glibc 2.10:
_GNU_SOURCE
DESCRIPTION
Terms (and abbreviations)
string (str)
is a sequence of zero or more non‐null characters followed by a
null byte.
character sequence
is a sequence of zero or more non‐null characters. A program
should never use a character sequence where a string is required.
However, with appropriate care, a string can be used in
the place of a character sequence.
null‐padded character sequence (zustr)
Character sequences can be contained in fixed‐width
buffers, which contain padding null bytes after the char‐
acter sequence, to fill the rest of the buffer without
affecting the character sequence; however, those padding
null bytes are not part of the character sequence.
measured character sequence (ustr)
Character sequence delimited by its length. It may be a
slice of a larger character sequence, or even of a
string.
length (len)
is the number of non‐null characters in a string or character
sequence. It is the return value of strlen(str) and of
strnlen(ustr, sz).
size (sz)
refers to the entire buffer where the string or character se‐
quence is contained.
end is the name of a pointer to the terminating null byte of a
string, or a pointer to one past the last character of a charac‐
ter sequence. This is the return value of functions that allow
chaining. It is equivalent to &str[len].
past_end
is the name of a pointer to one past the end of the buffer that
contains a string or character sequence. It is equivalent to
&str[sz]. It is used as a sentinel value, to be able to trun‐
cate strings or character sequences instead of overrunning the
containing buffer.
Copy, concatenate, and chain‐copy
Originally, there was a distinction between functions that copy and
those that concatenate. However, newer functions that copy while al‐
lowing chaining cover both use cases with a single API. They are also
algorithmically faster, since they don’t need to search for the end of
the existing string. However, functions that concatenate have a much
simpler use, so if performance is not important, it can make sense to
use them for improving readability.
To chain copy functions, they need to return a pointer to the end.
That’s a byproduct of the copy operation, so it has no performance
costs. Functions that return such a pointer, and thus can be chained,
have names of the form *stp*(), since it’s also common to name the
pointer just p.
Chain‐copying functions that truncate should accept a pointer to one
past the end of the destination buffer, and have names of the form
*stpe*(). This allows not having to recalculate the remaining size af‐
ter each call.
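As a minimal sketch of the chaining described above, the following uses the reference stpecpy() given later in this page (with past_end declared as a plain pointer, for portability) together with join3(), a hypothetical helper that exists only for this illustration. The same sentinel is passed to every call, and truncation is checked once, after the whole chain.

```c
#define _XOPEN_SOURCE 700  /* for memccpy(3) */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reference implementation from this page (not provided by libc). */
char *
stpecpy(char *dst, char *past_end, const char *restrict src)
{
    char *p;

    if (dst == past_end)
        return past_end;

    p = memccpy(dst, src, '\0', past_end - dst);
    if (p != NULL)
        return p - 1;

    /* Truncation detected: terminate the string at the last byte. */
    past_end[-1] = '\0';
    return past_end;
}

/*
 * Hypothetical helper: write the concatenation of a, b, and c
 * into buf[0..sz).  Returns the length of the resulting string,
 * or -1 if it had to be truncated.
 */
ptrdiff_t
join3(char *buf, size_t sz, const char *a, const char *b, const char *c)
{
    char *p, *past_end;

    assert(sz > 0);
    past_end = buf + sz;

    p = buf;
    p = stpecpy(p, past_end, a);  /* same sentinel on every call */
    p = stpecpy(p, past_end, b);
    p = stpecpy(p, past_end, c);
    if (p == past_end)            /* single check after the chain */
        return -1;

    return p - buf;
}
```

With a size‐based interface, each call would instead need the remaining size, sz - (p - buf), to be recomputed by the caller.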
Truncate or not?
The first thing to note is that programmers should be careful with
buffers, so they always have the correct size, and truncation is not
necessary.
In most cases, truncation is not desired, and it is simpler to just do
the copy. Simpler code is safer code. Programming against programming
mistakes by adding more code just adds more points where mistakes can
be made.
Nowadays, compilers can detect most programmer errors with features
like compiler warnings, static analyzers, and _FORTIFY_SOURCE (see
ftm(7)). Keeping the code simple helps these overflow‐detection fea‐
tures be more precise.
When validating user input, however, it makes sense to truncate. Re‐
member to check the return value of such function calls.
Functions that truncate:
• stpecpy(3) is the most efficient string copy function that performs
truncation. It requires checking for truncation only once, after all
chained calls.
• stpecpyx(3) is a variant of stpecpy(3) that consumes the entire
source string, to catch bugs in the program by forcing a segmenta‐
tion fault (as strlcpy(3bsd) and strlcat(3bsd) do).
• strlcpy(3bsd) and strlcat(3bsd) are designed to crash if the input
string is invalid (doesn’t contain a terminating null byte).
• strscpy(3) reports an error instead of crashing (similar to
stpecpy(3)).
• stpncpy(3) and strncpy(3) also truncate, but they don’t write
strings, but rather null‐padded character sequences.
Null‐padded character sequences
For historic reasons, some standard APIs, such as utmpx(5), use null‐
padded character sequences in fixed‐width buffers. To interface with
them, specialized functions need to be used.
To copy strings into them, use stpncpy(3).
To copy from an unterminated string within a fixed‐width buffer into a
string, ignoring any trailing null bytes in the source fixed‐width
buffer, you should use zustr2stp(3) or strncat(3).
To copy from an unterminated string within a fixed‐width buffer into a
character sequence, ignoring any trailing null bytes in the source
fixed‐width buffer, you should use zustr2ustp(3).
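The round trip through a fixed‐width field might look like the sketch below. struct record and the record_*() helpers are hypothetical names made up for this illustration (they stand in for something like a utmpx(5) field), and zustr2stp() is a flattened variant of the reference implementation given later in this page.

```c
#define _XOPEN_SOURCE 700  /* for stpncpy(3) and strnlen(3) */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record with a fixed-width, null-padded field. */
struct record {
    char name[8];  /* not necessarily null-terminated */
};

/* Flattened variant of this page's zustr2stp() (not in libc). */
char *
zustr2stp(char *restrict dst, const char *restrict src, size_t sz)
{
    size_t len = strnlen(src, sz);  /* padding null bytes are ignored */

    memcpy(dst, src, len);
    dst[len] = '\0';
    return dst + len;
}

/* Store a string in the field.  Returns nonzero if it was truncated. */
int
record_set_name(struct record *r, const char *name)
{
    stpncpy(r->name, name, sizeof(r->name));  /* zero-fills the rest */
    return strlen(name) > sizeof(r->name);
}

/*
 * Read the field back as a string.  dst must have room for
 * sizeof(r->name) + 1 bytes.  Returns the length of the string.
 */
size_t
record_get_name(char *dst, const struct record *r)
{
    assert(dst != NULL);
    return zustr2stp(dst, r->name, sizeof(r->name)) - dst;
}
```

Note that the destination string needs one byte more than the field, since a full field carries no terminating null byte.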
Measured character sequences
The simplest character sequence copying function is mempcpy(3). It re‐
quires always knowing the length of your character sequences, for which
structures can be used. It makes the code much faster, since you al‐
ways know the length of your character sequences, and can do the mini‐
mal copies and length measurements. mempcpy(3) copies character se‐
quences, so you need to explicitly set the terminating null byte if you
need a string.
However, for keeping type safety, it’s good to add a wrapper that uses
char * instead of void *: ustpcpy(3).
In programs that make considerable use of strings or character se‐
quences, and need the best performance, using overlapping character se‐
quences can make a big difference. It allows holding subsequences of a
larger character sequence, while not duplicating memory nor using time
to do a copy.
However, this is delicate, since it requires using character sequences.
C library APIs use strings, so programs that use character sequences
will have to take care of differentiating strings from character se‐
quences.
To copy a measured character sequence, use ustpcpy(3).
To copy a measured character sequence into a string, use ustr2stp(3).
Because these functions ask for the length, and a string is by nature
composed of a character sequence of the same length plus a terminating
null byte, a string is also accepted as input.
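A small sketch of working with slices: swap_kv() is a hypothetical helper, written only for this illustration, that treats the key and the value of a "key=value" string as measured character sequences inside the original string, without copying them to temporaries first. ustr2stp() here is a flattened variant of the reference implementation given later in this page.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Flattened variant of this page's ustr2stp() (not in libc). */
char *
ustr2stp(char *restrict dst, const char *restrict src, size_t len)
{
    memcpy(dst, src, len);
    dst[len] = '\0';
    return dst + len;
}

/*
 * Hypothetical helper: given "key=value", write "value:key" into dst.
 * dst must not overlap kv and must hold strlen(kv) + 1 bytes.
 * Returns the length of the result, or 0 if kv contains no '='.
 */
size_t
swap_kv(char *dst, const char *kv)
{
    const char *eq;
    char *p;

    assert(dst != NULL && kv != NULL);

    eq = strchr(kv, '=');
    if (eq == NULL)
        return 0;

    p = dst;
    p = ustr2stp(p, eq + 1, strlen(eq + 1));  /* value: a whole string */
    p = ustr2stp(p, ":", 1);
    p = ustr2stp(p, kv, eq - kv);             /* key: a slice of kv */

    return p - dst;
}
```

Each call writes a terminating null byte that the next call overwrites; if that matters, ustpcpy(3) can be used for all but the last call.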
String vs character sequence
Some functions only operate on strings. Those require that the input
src is a string, and guarantee an output string (even when truncation
occurs). Functions that concatenate also require that dst holds a
string before the call. List of functions:
• stpcpy(3)
• strcpy(3), strcat(3)
• stpecpy(3), stpecpyx(3)
• strlcpy(3bsd), strlcat(3bsd)
• strscpy(3)
Other functions require an input string, but create a character se‐
quence as output. These functions have confusing names, and have a
long history of misuse. List of functions:
• stpncpy(3)
• strncpy(3)
Other functions operate on an input character sequence, and create an
output string. Functions that concatenate also require that dst holds
a string before the call. strncat(3) has an even more misleading name
than the functions above. List of functions:
• zustr2stp(3)
• strncat(3)
• ustr2stp(3)
Other functions operate on an input character sequence to create an
output character sequence. List of functions:
• ustpcpy(3)
• zustr2ustp(3)
Functions
stpcpy(3)
This function copies the input string into a destination string.
The programmer is responsible for allocating a buffer large
enough. It returns a pointer suitable for chaining.
An implementation of this function might be:
    char *
    stpcpy(char *restrict dst, const char *restrict src)
    {
        char *end;

        end = mempcpy(dst, src, strlen(src));
        *end = '\0';

        return end;
    }
strcpy(3)
strcat(3)
These functions copy the input string into a destination string.
The programmer is responsible for allocating a buffer large
enough. The return value is useless.
stpcpy(3) is a faster alternative to these functions.
An implementation of these functions might be:
    char *
    strcpy(char *restrict dst, const char *restrict src)
    {
        stpcpy(dst, src);
        return dst;
    }

    char *
    strcat(char *restrict dst, const char *restrict src)
    {
        stpcpy(dst + strlen(dst), src);
        return dst;
    }
stpecpy(3)
stpecpyx(3)
These functions copy the input string into a destination string.
If the destination buffer, limited by a pointer to one past the
end of it, isn’t large enough to hold the copy, the resulting
string is truncated (but it is guaranteed to be null‐termi‐
nated). They return a pointer suitable for chaining. Trunca‐
tion needs to be detected only once after the last chained call.
stpecpyx(3) has identical semantics to stpecpy(3), except that
it forces a SIGSEGV if the src pointer is not a string.
These functions are not provided by any library, but you can de‐
fine them with the following reference implementations:
    /* This code is in the public domain. */
    char *
    stpecpy(char *dst, char past_end[0],
            const char *restrict src)
    {
        char *p;

        if (dst == past_end)
            return past_end;

        p = memccpy(dst, src, '\0', past_end - dst);
        if (p != NULL)
            return p - 1;

        /* truncation detected */
        past_end[-1] = '\0';
        return past_end;
    }

    /* This code is in the public domain. */
    char *
    stpecpyx(char *dst, char past_end[0],
             const char *restrict src)
    {
        if (src[strlen(src)] != '\0')
            raise(SIGSEGV);

        return stpecpy(dst, past_end, src);
    }
strlcpy(3bsd)
strlcat(3bsd)
These functions copy the input string into a destination string.
If the destination buffer, limited by its size, isn’t large
enough to hold the copy, the resulting string is truncated (but
it is guaranteed to be null‐terminated). They return the length
of the total string they tried to create. These functions force
a SIGSEGV if the src pointer is not a string.
stpecpyx(3) is a faster alternative to these functions.
strscpy(3)
This function copies the input string into a destination string.
If the destination buffer, limited by its size, isn’t large
enough to hold the copy, the resulting string is truncated (but
it is guaranteed to be null‐terminated). It returns the length
of the destination string, or -E2BIG on truncation.
stpecpy(3) is a simpler and faster alternative to this function.
stpncpy(3)
This function copies the input string into a destination null‐
padded character sequence in a fixed‐width buffer. If the des‐
tination buffer, limited by its size, isn’t large enough to hold
the copy, the resulting character sequence is truncated. Since
it creates a character sequence, it doesn’t need to write a ter‐
minating null byte. It returns a pointer suitable for chaining,
but it’s not ideal for that. It’s impossible to distinguish
truncation after the call, from a character sequence that just
fits the destination buffer; truncation should be detected from
the length of the original string.
If you’re going to use this function in chained calls, it would
be useful to develop a similar function that accepts a pointer
to one past the end of the buffer instead of a size.
An implementation of this function might be:
    char *
    stpncpy(char *restrict dst, const char *restrict src,
            size_t sz)
    {
        char *p;

        bzero(dst, sz);
        p = memccpy(dst, src, '\0', sz);
        if (p == NULL)
            return dst + sz;

        return p - 1;
    }
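The ambiguity of the return value can be seen in the following sketch, which calls the libc stpncpy(3). field_set() is a hypothetical wrapper written for this illustration: it reports truncation by comparing the source length with the field size, because the returned pointer is dst + sz both when the input fits exactly and when it is truncated.

```c
#define _XOPEN_SOURCE 700  /* for stpncpy(3) */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical wrapper: fill the fixed-width field dst[0..sz) from
 * the string src.  Stores the length of the resulting character
 * sequence in *len, and returns true if src had to be truncated.
 */
bool
field_set(char *dst, size_t sz, const char *src, size_t *len)
{
    char *end;

    assert(sz > 0);

    end = stpncpy(dst, src, sz);
    *len = end - dst;  /* equals sz for an exact fit AND for truncation */

    /* Only the length of the original string tells them apart. */
    return strlen(src) > sz;
}
```

For a 5‐byte field, both "abcde" and "abcdefg" leave *len at 5; only the boolean result differs.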
strncpy(3)
This function is identical to stpncpy(3) except for the useless
return value.
stpncpy(3) is a simpler alternative to this function.
An implementation of this function might be:
    char *
    strncpy(char *restrict dst, const char *restrict src,
            size_t sz)
    {
        stpncpy(dst, src, sz);
        return dst;
    }
zustr2ustp(3)
This function copies the input character sequence contained in a
null‐padded fixed‐width buffer, into a destination character
sequence. The programmer is responsible for allocating a buffer
large enough. It returns a pointer suitable for chaining.
A truncating version of this function doesn’t exist, since the
size of the original character sequence is always known, so it
wouldn’t be very useful.
This function is not provided by any library, but you can define
it with the following reference implementation:
    /* This code is in the public domain. */
    char *
    zustr2ustp(char *restrict dst, const char *restrict src,
               size_t sz)
    {
        return ustpcpy(dst, src, strnlen(src, sz));
    }
zustr2stp(3)
This function copies the input character sequence contained in a
null‐padded fixed‐width buffer, into a destination string. The
programmer is responsible for allocating a buffer large enough.
It returns a pointer suitable for chaining.
A truncating version of this function doesn’t exist, since the
size of the original character sequence is always known, so it
wouldn’t be very useful.
This function is not provided by any library, but you can define
it with the following reference implementation:
    /* This code is in the public domain. */
    char *
    zustr2stp(char *restrict dst, const char *restrict src,
              size_t sz)
    {
        char *end;

        end = zustr2ustp(dst, src, sz);
        *end = '\0';

        return end;
    }
strncat(3)
Do not confuse this function with strncpy(3); they are not re‐
lated at all.
This function concatenates the input character sequence
contained in a null‐padded fixed‐width buffer, into a destination
string. The programmer is responsible for allocating a buffer
large enough. The return value is useless.
zustr2stp(3) is a faster alternative to this function.
An implementation of this function might be:
    char *
    strncat(char *restrict dst, const char *restrict src,
            size_t sz)
    {
        zustr2stp(dst + strlen(dst), src, sz);
        return dst;
    }
ustpcpy(3)
This function copies the input character sequence, limited by
its length, into a destination character sequence. The program‐
mer is responsible for allocating a buffer large enough. It re‐
turns a pointer suitable for chaining.
An implementation of this function might be:
    /* This code is in the public domain. */
    char *
    ustpcpy(char *restrict dst, const char *restrict src,
            size_t len)
    {
        return mempcpy(dst, src, len);
    }
ustr2stp(3)
This function copies the input character sequence, limited by
its length, into a destination string. The programmer is re‐
sponsible for allocating a buffer large enough. It returns a
pointer suitable for chaining.
An implementation of this function might be:
    /* This code is in the public domain. */
    char *
    ustr2stp(char *restrict dst, const char *restrict src,
             size_t len)
    {
        char *end;

        end = ustpcpy(dst, src, len);
        *end = '\0';

        return end;
    }
RETURN VALUE
The following functions return a pointer to the terminating null byte
in the destination string.
• stpcpy(3)
• ustr2stp(3)
• zustr2stp(3)
The following functions return a pointer to the terminating null byte
in the destination string, except when truncation occurs; if truncation
occurs, they return a pointer to one past the end of the destination
buffer (past_end).
• stpecpy(3), stpecpyx(3)
The following function returns a pointer to one after the last charac‐
ter in the destination character sequence; if truncation occurs, that
pointer is equivalent to a pointer to one past the end of the destina‐
tion buffer.
• stpncpy(3)
The following function returns a pointer to one after the last charac‐
ter in the destination character sequence.
• zustr2ustp(3)
• ustpcpy(3)
The following functions return the length of the total string that they
tried to create (as if truncation didn’t occur).
• strlcpy(3bsd), strlcat(3bsd)
The following function returns the length of the destination string, or
-E2BIG on truncation.
• strscpy(3)
The following functions return the dst pointer, which is useless.
• strcpy(3), strcat(3)
• strncpy(3)
• strncat(3)
ATTRIBUTES
For an explanation of the terms used in this section, see attrib‐
utes(7).
┌────────────────────────────────────────────┬───────────────┬─────────┐
│Interface                                   │ Attribute     │ Value   │
├────────────────────────────────────────────┼───────────────┼─────────┤
│stpcpy(), strcpy(), strcat(), stpecpy(),    │ Thread safety │ MT‐Safe │
│stpecpyx(), strlcpy(), strlcat(), strscpy(),│               │         │
│stpncpy(), strncpy(), zustr2ustp(),         │               │         │
│zustr2stp(), strncat(), ustr2stp(),         │               │         │
│ustpcpy()                                   │               │         │
└────────────────────────────────────────────┴───────────────┴─────────┘
STANDARDS
strcpy(3), strcat(3)
strncpy(3)
strncat(3)
POSIX.1‐2001, POSIX.1‐2008, C89, C99, SVr4, 4.3BSD.
stpcpy(3)
stpncpy(3)
POSIX.1‐2008.
strlcpy(3bsd), strlcat(3bsd)
Functions originated in OpenBSD and present in some Unix sys‐
tems.
strscpy(3)
Linux kernel internal function.
stpecpy(3), stpecpyx(3)
zustr2ustp(3)
zustr2stp(3)
ustr2stp(3), ustpcpy(3)
Not defined by any standards nor libraries.
CAVEATS
Don’t mix chain calls to truncating and non‐truncating functions. It
is conceptually wrong unless you know that the first part of a copy
will always fit. Anyway, the performance difference will probably be
negligible, so it will probably be more clear if you use consistent se‐
mantics: either truncating or non‐truncating. Calling a non‐truncating
function after a truncating one is necessarily wrong.
Some of the functions described here are not provided by any library;
you should write your own copy if you want to use them. See STANDARDS.
BUGS
All concatenation (*cat()) functions share the same performance prob‐
lem: Shlemiel the painter ⟨https://www.joelonsoftware.com/2001/12/11/
back-to-basics/⟩.
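To make the cost concrete, here is a rough sketch that builds the same string twice. Both helpers are hypothetical and exist only for this illustration: build_with_strcat() counts how many bytes of the destination have to be walked again before each append, and build_with_stpcpy() does the same job by chaining, with nothing to walk again.

```c
#define _XOPEN_SOURCE 700  /* for stpcpy(3) */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: append n copies of word to dst with strcat(3).
 * Returns the number of bytes of dst that had to be re-scanned to
 * find its end.  dst must hold n * strlen(word) + 1 bytes.
 */
size_t
build_with_strcat(char *dst, const char *word, size_t n)
{
    size_t scanned = 0;

    assert(dst != NULL);
    dst[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        /* strcat() walks the existing string to find its end;
           this strlen() call measures (and duplicates) that walk. */
        scanned += strlen(dst);
        strcat(dst, word);
    }

    return scanned;
}

/*
 * Hypothetical helper: same result with stpcpy(3).  The end pointer
 * is carried from call to call, so nothing is re-scanned.
 * Returns the length of the resulting string.
 */
size_t
build_with_stpcpy(char *dst, const char *word, size_t n)
{
    char *p = dst;

    assert(dst != NULL);
    *p = '\0';
    for (size_t i = 0; i < n; i++)
        p = stpcpy(p, word);

    return p - dst;
}
```

For n copies of a word of length w, the strcat(3) version walks w * n * (n - 1) / 2 bytes in total, which grows quadratically with n.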
EXAMPLES
The following are examples of correct use of each of these functions.
stpcpy(3)
    p = buf;
    p = stpcpy(p, "Hello ");
    p = stpcpy(p, "world");
    p = stpcpy(p, "!");
    len = p - buf;
    puts(buf);

strcpy(3)
strcat(3)
    strcpy(buf, "Hello ");
    strcat(buf, "world");
    strcat(buf, "!");
    len = strlen(buf);
    puts(buf);

stpecpy(3)
stpecpyx(3)
    past_end = buf + sizeof(buf);
    p = buf;
    p = stpecpy(p, past_end, "Hello ");
    p = stpecpy(p, past_end, "world");
    p = stpecpy(p, past_end, "!");
    if (p == past_end) {
        p--;
        goto toolong;
    }
    len = p - buf;
    puts(buf);

strlcpy(3bsd)
strlcat(3bsd)
    if (strlcpy(buf, "Hello ", sizeof(buf)) >= sizeof(buf))
        goto toolong;
    if (strlcat(buf, "world", sizeof(buf)) >= sizeof(buf))
        goto toolong;
    len = strlcat(buf, "!", sizeof(buf));
    if (len >= sizeof(buf))
        goto toolong;
    puts(buf);

strscpy(3)
    len = strscpy(buf, "Hello world!", sizeof(buf));
    if (len == -E2BIG)
        goto toolong;
    puts(buf);

stpncpy(3)
    end = stpncpy(buf, "Hello world!", sizeof(buf));
    if (sizeof(buf) < strlen("Hello world!"))
        goto toolong;
    len = end - buf;
    for (size_t i = 0; i < sizeof(buf); i++)
        putchar(buf[i]);

strncpy(3)
    strncpy(buf, "Hello world!", sizeof(buf));
    if (sizeof(buf) < strlen("Hello world!"))
        goto toolong;
    len = strnlen(buf, sizeof(buf));
    for (size_t i = 0; i < sizeof(buf); i++)
        putchar(buf[i]);

zustr2ustp(3)
    p = buf;
    p = zustr2ustp(p, "Hello ", 6);
    p = zustr2ustp(p, "world", 42);  // Padding null bytes ignored.
    p = zustr2ustp(p, "!", 1);
    len = p - buf;
    printf("%.*s\n", (int) len, buf);

zustr2stp(3)
    p = buf;
    p = zustr2stp(p, "Hello ", 6);
    p = zustr2stp(p, "world", 42);  // Padding null bytes ignored.
    p = zustr2stp(p, "!", 1);
    len = p - buf;
    puts(buf);

strncat(3)
    buf[0] = '\0';  // There’s no ’cpy’ function to this ’cat’.
    strncat(buf, "Hello ", 6);
    strncat(buf, "world", 42);  // Padding null bytes ignored.
    strncat(buf, "!", 1);
    len = strlen(buf);
    puts(buf);

ustpcpy(3)
    p = buf;
    p = ustpcpy(p, "Hello ", 6);
    p = ustpcpy(p, "world", 5);
    p = ustpcpy(p, "!", 1);
    len = p - buf;
    printf("%.*s\n", (int) len, buf);

ustr2stp(3)
    p = buf;
    p = ustr2stp(p, "Hello ", 6);
    p = ustr2stp(p, "world", 5);
    p = ustr2stp(p, "!", 1);
    len = p - buf;
    puts(buf);
SEE ALSO
bzero(3), memcpy(3), memccpy(3), mempcpy(3), string(3)
Linux man‐pages (unreleased) (date) strcpy(3)
--
2.38.1
* [PATCH v4 1/1] strcpy.3: Rewrite page to document all string-copying functions
2022-12-14 0:03 ` [PATCH v3 0/1] Rewritten page for string-copying functions Alejandro Colomar
2022-12-14 0:14 ` Alejandro Colomar
2022-12-14 16:17 ` [PATCH v4 " Alejandro Colomar
@ 2022-12-14 16:17 ` Alejandro Colomar
2 siblings, 0 replies; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-14 16:17 UTC (permalink / raw)
To: linux-man, Martin Sebor, G. Branden Robinson, Douglas McIlroy,
Jakub Wilk
Cc: Alejandro Colomar
This is an opportunity to use consistent language across the
documentation for all string-copying functions.
It is also easier to show the similarities and differences between all
of the functions, so that a reader can use this page to know which
function is needed for a given task.
Many functions that are inferior to another one have been marked as
deprecated, notwithstanding the deprecation status in C libraries or
any standards. Alternatives have been given in the same page, with
reference implementations.
Cc: Martin Sebor <msebor@redhat.com>
Cc: "G. Branden Robinson" <g.branden.robinson@gmail.com>
Cc: Douglas McIlroy <douglas.mcilroy@dartmouth.edu>
Cc: Jakub Wilk <jwilk@jwilk.net>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
man3/strcpy.3 | 1164 +++++++++++++++++++++++++++++++++++++++++++++----
1 file changed, 1076 insertions(+), 88 deletions(-)
diff --git a/man3/strcpy.3 b/man3/strcpy.3
index 74c3180ae..3b97da822 100644
--- a/man3/strcpy.3
+++ b/man3/strcpy.3
@@ -1,48 +1,845 @@
-.\" Copyright (C) 1993 David Metcalfe (david@prism.demon.co.uk)
+.\" Copyright 2022 Alejandro Colomar <alx@kernel.org>
.\"
-.\" SPDX-License-Identifier: Linux-man-pages-copyleft
-.\"
-.\" References consulted:
-.\" Linux libc source code
-.\" Lewine's _POSIX Programmer's Guide_ (O'Reilly & Associates, 1991)
-.\" 386BSD man pages
-.\" Modified Sat Jul 24 18:06:49 1993 by Rik Faith (faith@cs.unc.edu)
-.\" Modified Fri Aug 25 23:17:51 1995 by Andries Brouwer (aeb@cwi.nl)
-.\" Modified Wed Dec 18 00:47:18 1996 by Andries Brouwer (aeb@cwi.nl)
-.\" 2007-06-15, Marc Boyer <marc.boyer@enseeiht.fr> + mtk
-.\" Improve discussion of strncpy().
+.\" SPDX-License-Identifier: BSD-3-Clause
.\"
.TH strcpy 3 (date) "Linux man-pages (unreleased)"
+.\" ----- NAME :: -----------------------------------------------------/
.SH NAME
-strcpy \- copy a string
+stpcpy,
+strcpy, strcat,
+stpecpy, stpecpyx,
+strlcpy, strlcat,
+strscpy,
+stpncpy,
+strncpy,
+zustr2ustp, zustr2stp,
+strncat,
+ustpcpy, ustr2stp
+\- copy strings and character sequences
+.\" ----- LIBRARY :: --------------------------------------------------/
.SH LIBRARY
+.TP
+.BR stpcpy (3)
+.TQ
+.BR strcpy "(3), \c"
+.BR strcat (3)
+.TQ
+.BR stpncpy (3)
+.TQ
+.BR strncpy (3)
+.TQ
+.BR strncat (3)
Standard C library
.RI ( libc ", " \-lc )
+.TP
+.BR stpecpy "(3), \c"
+.BR stpecpyx (3)
+.TQ
+.BR zustr2ustp "(3), \c"
+.BR zustr2stp (3)
+.TQ
+.BR ustpcpy "(3), \c"
+.BR ustr2stp (3)
+Not provided by any library.
+.TP
+.BR strlcpy "(3), \c"
+.BR strlcat (3)
+Utility functions from BSD systems
+.RI ( libbsd ", " \-lbsd )
+.TP
+.BR strscpy (3)
+Not provided by any library.
+It is a Linux kernel internal function.
+.\" ----- SYNOPSIS :: -------------------------------------------------/
.SH SYNOPSIS
.nf
.B #include <string.h>
+.fi
+.\" ----- SYNOPSIS :: (Null-terminated) strings -----------------------/
+.SS Strings
+.nf
+// Chain-copy a string.
+.BI "char *stpcpy(char *restrict " dst ", const char *restrict " src );
.PP
-.BI "char *strcpy(char *restrict " dest ", const char *restrict " src );
+// Copy/concatenate a string.
+.BI "char *strcpy(char *restrict " dst ", const char *restrict " src );
+.BI "char *strcat(char *restrict " dst ", const char *restrict " src );
+.PP
+// Chain-copy a string with truncation.
+.BI "char *stpecpy(char *" dst ", char " past_end "[0], \
+const char *restrict " src );
+.PP
+// Chain-copy a string with truncation and SIGSEGV on UB.
+.BI "char *stpecpyx(char *" dst ", char " past_end "[0], \
+const char *restrict " src );
+.PP
+// Copy/concatenate a string with truncation and SIGSEGV on UB.
+.BI "size_t strlcpy(char " dst "[restrict ." sz "], \
+const char *restrict " src ,
+.BI " size_t " sz );
+.BI "size_t strlcat(char " dst "[restrict ." sz "], \
+const char *restrict " src ,
+.BI " size_t " sz );
+.PP
+// Copy a string with truncation.
+.BI "ssize_t strscpy(char " dst "[restrict ." sz "], \
+const char " src "[restrict ." sz ],
+.BI " size_t " sz );
+.fi
+.\" ----- SYNOPSIS :: Null-padded character sequences --------/
+.SS Null-padded character sequences
+.nf
+// Zero a fixed-width buffer, and
+// copy a string into a character sequence with truncation.
+.BI "char *stpncpy(char " dst "[restrict ." sz "], \
+const char *restrict " src ,
+.BI " size_t " sz );
+.PP
+// Zero a fixed-width buffer, and
+// copy a string into a character sequence with truncation.
+.BI "char *strncpy(char " dst "[restrict ." sz "], \
+const char *restrict " src ,
+.BI " size_t " sz );
+.PP
+// Chain-copy a null-padded character sequence into a character sequence.
+.BI "char *zustr2ustp(char *restrict " dst ", \
+const char " src "[restrict ." sz ],
+.BI " size_t " sz );
+.PP
+// Chain-copy a null-padded character sequence into a string.
+.BI "char *zustr2stp(char *restrict " dst ", \
+const char " src "[restrict ." sz ],
+.BI " size_t " sz );
+.PP
+// Concatenate a null-padded character sequence into a string.
+.BI "char *strncat(char *restrict " dst ", const char " src "[restrict ." sz ],
+.BI " size_t " sz );
+.fi
+.\" ----- SYNOPSIS :: Measured character sequences --------------------/
+.SS Measured character sequences
+.nf
+// Chain-copy a measured character sequence.
+.BI "char *ustpcpy(char *restrict " dst ", \
+const char " src "[restrict ." len ],
+.BI " size_t " len );
+.PP
+// Chain-copy a measured character sequence into a string.
+.BI "char *ustr2stp(char *restrict " dst ", \
+const char " src "[restrict ." len ],
+.BI " size_t " len );
+.fi
+.PP
+.RS -4
+Feature Test Macro Requirements for glibc (see
+.BR feature_test_macros (7)):
+.RE
+.PP
+.BR stpcpy (3),
+.BR stpncpy (3):
+.nf
+ Since glibc 2.10:
+ _POSIX_C_SOURCE >= 200809L
+ Before glibc 2.10:
+ _GNU_SOURCE
.fi
.SH DESCRIPTION
-The
-.BR strcpy ()
-function copies the string pointed to by
-.IR src ,
-including the terminating null byte (\(aq\e0\(aq),
-to the buffer pointed to by
-.IR dest .
-The strings may not overlap, and the destination string
-.I dest
-must be large enough to receive the copy.
-.I Beware of buffer overruns!
-(See BUGS.)
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: -----------------/
+.SS Terms (and abbreviations)
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: string (str) ----/
+.TP
+.IR "string " ( str )
+is a sequence of zero or more non-null characters followed by a null byte.
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: null-padded character seq
+.TP
+.I character sequence
+is a sequence of zero or more non-null characters.
+A program should never use a character sequence where a string is required.
+However, with appropriate care,
+a string can be used in the place of a character sequence.
+.RS
+.TP
+.IR "null-padded character sequence " ( zustr )
+Character sequences can be contained in fixed-width buffers,
+which contain padding null bytes after the character sequence,
+to fill the rest of the buffer
+without affecting the character sequence;
+however, those padding null bytes are not part of the character sequence.
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: measured character sequence
+.TP
+.IR "measured character sequence " ( ustr )
+Character sequence delimited by its length.
+It may be a slice of a larger character sequence,
+or even of a string.
+.RE
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: length (len) ----/
+.TP
+.IR "length " ( len )
+is the number of non-null characters in a string or character sequence.
+It is the return value of
+.I strlen(str)
+and of
+.IR "strnlen(ustr, sz)" .
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: size (sz) -------/
+.TP
+.IR "size " ( sz )
+refers to the entire buffer
+where the string or character sequence is contained.
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: end -------------/
+.TP
+.I end
+is the name of a pointer to the terminating null byte of a string,
+or a pointer to one past the last character of a character sequence.
+This is the return value of functions that allow chaining.
+It is equivalent to
+.IR &str[len] .
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: past_end --------/
+.TP
+.I past_end
+is the name of a pointer to one past the end of the buffer
+that contains a string or character sequence.
+It is equivalent to
+.IR &str[sz] .
+It is used as a sentinel value,
+to be able to truncate strings or character sequences
+instead of overrunning the containing buffer.
+.\" ----- DESCRIPTION :: Copy, concatenate, and chain-copy ------------/
+.SS Copy, concatenate, and chain-copy
+Originally,
+there was a distinction between functions that copy and those that concatenate.
+However, newer functions that copy while allowing chaining
+cover both use cases with a single API.
+They are also algorithmically faster,
+since they don't need to search for the end of the existing string.
+However, functions that concatenate have a much simpler use,
+so if performance is not important,
+it can make sense to use them for improving readability.
+.PP
+To chain copy functions,
+they need to return a pointer to the
+.IR end .
+That's a byproduct of the copy operation,
+so it has no performance costs.
+Functions that return such a pointer,
+and thus can be chained,
+have names of the form
+.RB * stp *(),
+since it's also common to name the pointer just
+.IR p .
+.PP
+Chain-copying functions that truncate
+should accept a pointer to one past the end of the destination buffer,
+and have names of the form
+.RB * stpe *().
+This allows not having to recalculate the remaining size after each call.
+.\" ----- DESCRIPTION :: Truncate or not? -----------------------------/
+.SS Truncate or not?
+The first thing to note is that programmers should be careful with buffers,
+so they always have the correct size,
+and truncation is not necessary.
+.PP
+In most cases,
+truncation is not desired,
+and it is simpler to just do the copy.
+Simpler code is safer code.
+Programming against programming mistakes by adding more code
+just adds more points where mistakes can be made.
+.PP
+Nowadays,
+compilers can detect most programmer errors with features like
+compiler warnings,
+static analyzers, and
+.BR \%_FORTIFY_SOURCE
+(see
+.BR ftm (7)).
+Keeping the code simple
+helps these overflow-detection features be more precise.
+.PP
+When validating user input,
+however,
+it makes sense to truncate.
+Remember to check the return value of such function calls.
+.PP
+Functions that truncate:
+.IP \(bu 3
+.BR stpecpy (3)
+is the most efficient string copy function that performs truncation.
+It requires checking for truncation only once, after all chained calls.
+.IP \(bu
+.BR stpecpyx (3)
+is a variant of
+.BR stpecpy (3)
+that consumes the entire source string,
+to catch bugs in the program
+by forcing a segmentation fault (as
+.BR strlcpy (3bsd)
+and
+.BR strlcat (3bsd)
+do).
+.IP \(bu
+.BR strlcpy (3bsd)
+and
+.BR strlcat (3bsd)
+are designed to crash if the input string is invalid
+(doesn't contain a terminating null byte).
+.IP \(bu
+.BR strscpy (3)
+reports an error instead of crashing (similar to
+.BR stpecpy (3)).
+.IP \(bu
+.BR stpncpy (3)
+and
+.BR strncpy (3)
+also truncate, but they write null-padded character sequences
+rather than strings.
+.\" ----- DESCRIPTION :: Null-padded character sequences --------------/
+.SS Null-padded character sequences
+For historic reasons,
+some standard APIs,
+such as
+.BR utmpx (5),
+use null-padded character sequences in fixed-width buffers.
+To interface with them,
+specialized functions need to be used.
+.PP
+To copy strings into them, use
+.BR stpncpy (3).
+.PP
+To copy from an unterminated string within a fixed-width buffer into a string,
+ignoring any trailing null bytes in the source fixed-width buffer,
+you should use
+.BR zustr2stp (3)
+or
+.BR strncat (3).
+.PP
+To copy from an unterminated string within a fixed-width buffer
+into a character sequence,
+ignoring any trailing null bytes in the source fixed-width buffer,
+you should use
+.BR zustr2ustp (3).
+.\" ----- DESCRIPTION :: Measured character sequences -----------------/
+.SS Measured character sequences
+The simplest character sequence copying function is
+.BR mempcpy (3).
+It requires always knowing the length of your character sequences,
+for which structures can be used.
+It makes the code much faster,
+since you always know the length of your character sequences,
+and can do the minimal copies and length measurements.
+.BR mempcpy (3)
+copies character sequences,
+so you need to explicitly set the terminating null byte if you need a string.
+.PP
+However,
+for keeping type safety,
+it's good to add a wrapper that uses
+.I char\~*
+instead of
+.IR void\~* :
+.BR ustpcpy (3).
+.PP
+In programs that make considerable use of strings or character sequences,
+and need the best performance,
+using overlapping character sequences can make a big difference.
+It allows holding subsequences of a larger character sequence,
+while not duplicating memory
+nor using time to do a copy.
+.PP
+However, this is delicate,
+since it requires using character sequences.
+C library APIs use strings,
+so programs that use character sequences
+will have to take care of differentiating strings from character sequences.
+.PP
+To copy a measured character sequence, use
+.BR ustpcpy (3).
+.PP
+To copy a measured character sequence into a string, use
+.BR ustr2stp (3).
+.PP
+Because these functions ask for the length,
+and a string is by nature composed of a character sequence of the same length
+plus a terminating null byte,
+a string is also accepted as input.
+.\" ----- DESCRIPTION :: String vs character sequence -----------------/
+.SS String vs character sequence
+Some functions only operate on strings.
+Those require that the input
+.I src
+is a string,
+and guarantee an output string
+(even when truncation occurs).
+Functions that concatenate
+also require that
+.I dst
+holds a string before the call.
+List of functions:
+.IP \(bu 3
+.PD 0
+.BR stpcpy (3)
+.IP \(bu
+.BR strcpy "(3), \c"
+.BR strcat (3)
+.IP \(bu
+.BR stpecpy "(3), \c"
+.BR stpecpyx (3)
+.IP \(bu
+.BR strlcpy "(3bsd), \c"
+.BR strlcat (3bsd)
+.IP \(bu
+.BR strscpy (3)
+.PD
+.PP
+Other functions require an input string,
+but create a character sequence as output.
+These functions have confusing names,
+and have a long history of misuse.
+List of functions:
+.IP \(bu 3
+.PD 0
+.BR stpncpy (3)
+.IP \(bu
+.BR strncpy (3)
+.PD
+.PP
+Other functions operate on an input character sequence,
+and create an output string.
+Functions that concatenate
+also require that
+.I dst
+holds a string before the call.
+.BR strncat (3)
+has an even more misleading name than the functions above.
+List of functions:
+.IP \(bu 3
+.PD 0
+.BR zustr2stp (3)
+.IP \(bu
+.BR strncat (3)
+.IP \(bu
+.BR ustr2stp (3)
+.PD
+.PP
+Other functions operate on an input character sequence
+to create an output character sequence.
+List of functions:
+.IP \(bu 3
+.BR ustpcpy (3)
+.IP \(bu
+.BR zustr2ustp (3)
+.\" ----- DESCRIPTION :: Functions :: ---------------------------------/
+.SS Functions
+.\" ----- DESCRIPTION :: Functions :: stpcpy(3) -----------------------/
+.TP
+.BR stpcpy (3)
+This function copies the input string into a destination string.
+The programmer is responsible for allocating a buffer large enough.
+It returns a pointer suitable for chaining.
+.IP
+An implementation of this function might be:
+.IP
+.in +4n
+.EX
+char *
+stpcpy(char *restrict dst, const char *restrict src)
+{
+ char *end;
+
+ end = mempcpy(dst, src, strlen(src));
+ *end = \(aq\e0\(aq;
+
+ return end;
+}
+.EE
+.in
+.\" ----- DESCRIPTION :: Functions :: strcpy(3), strcat(3) ------------/
+.TP
+.BR strcpy (3)
+.TQ
+.BR strcat (3)
+These functions copy the input string into a destination string.
+The programmer is responsible for allocating a buffer large enough.
+The return value is useless.
+.IP
+.BR stpcpy (3)
+is a faster alternative to these functions.
+.IP
+An implementation of these functions might be:
+.IP
+.in +4n
+.EX
+char *
+strcpy(char *restrict dst, const char *restrict src)
+{
+ stpcpy(dst, src);
+ return dst;
+}
+
+char *
+strcat(char *restrict dst, const char *restrict src)
+{
+ stpcpy(dst + strlen(dst), src);
+ return dst;
+}
+.EE
+.in
+.\" ----- DESCRIPTION :: Functions :: stpecpy(3), stpecpyx(3) ---------/
+.TP
+.BR stpecpy (3)
+.TQ
+.BR stpecpyx (3)
+These functions copy the input string into a destination string.
+If the destination buffer,
+limited by a pointer to one past the end of it,
+isn't large enough to hold the copy,
+the resulting string is truncated
+(but it is guaranteed to be null-terminated).
+They return a pointer suitable for chaining.
+Truncation needs to be detected only once after the last chained call.
+.BR stpecpyx (3)
+has identical semantics to
+.BR stpecpy (3),
+except that it forces a SIGSEGV if the
+.I src
+pointer is not a string.
+.IP
+These functions are not provided by any library,
+but you can define them with the following reference implementations:
+.IP
+.in +4n
+.EX
+/* This code is in the public domain. */
+char *
+stpecpy(char *dst, char past_end[0],
+ const char *restrict src)
+{
+ char *p;
+
+ if (dst == past_end)
+ return past_end;
+
+ p = memccpy(dst, src, \(aq\e0\(aq, past_end \- dst);
+ if (p != NULL)
+ return p \- 1;
+
+ /* truncation detected */
+ past_end[\-1] = \(aq\e0\(aq;
+ return past_end;
+}
+
+/* This code is in the public domain. */
+char *
+stpecpyx(char *dst, char past_end[0],
+ const char *restrict src)
+{
+ if (src[strlen(src)] != \(aq\e0\(aq)
+ raise(SIGSEGV);
+
+ return stpecpy(dst, past_end, src);
+}
+.EE
+.in
+.\" ----- DESCRIPTION :: Functions :: strlcpy(3bsd), strlcat(3bsd) ----/
+.TP
+.BR strlcpy (3bsd)
+.TQ
+.BR strlcat (3bsd)
+These functions copy the input string into a destination string.
+If the destination buffer,
+limited by its size,
+isn't large enough to hold the copy,
+the resulting string is truncated
+(but it is guaranteed to be null-terminated).
+They return the length of the total string they tried to create.
+These functions force a SIGSEGV if the
+.I src
+pointer is not a string.
+.IP
+.BR stpecpyx (3)
+is a faster alternative to these functions.
+.\" ----- DESCRIPTION :: Functions :: strscpy(3) ----------------------/
+.TP
+.BR strscpy (3)
+This function copies the input string into a destination string.
+If the destination buffer,
+limited by its size,
+isn't large enough to hold the copy,
+the resulting string is truncated
+(but it is guaranteed to be null-terminated).
+It returns the length of the destination string, or
+.B \-E2BIG
+on truncation.
+.IP
+.BR stpecpy (3)
+is a simpler and faster alternative to this function.
+.\"
+.\" ----- DESCRIPTION :: Functions :: stpncpy(3) ----------------------/
+.TP
+.BR stpncpy (3)
+This function copies the input string into
+a destination null-padded character sequence in a fixed-width buffer.
+If the destination buffer,
+limited by its size,
+isn't large enough to hold the copy,
+the resulting character sequence is truncated.
+Since it creates a character sequence,
+it doesn't need to write a terminating null byte.
+It returns a pointer suitable for chaining,
+but it's not ideal for that.
+It's impossible to distinguish truncation after the call,
+from a character sequence that just fits the destination buffer;
+truncation should be detected from the length of the original string.
+.IP
+If you're going to use this function in chained calls,
+it would be useful to develop a similar function
+that accepts a pointer to one past the end of the buffer instead of a size.
+.IP
+An implementation of this function might be:
+.IP
+.in +4n
+.EX
+char *
+stpncpy(char *restrict dst, const char *restrict src,
+ size_t sz)
+{
+ char *p;
+
+ bzero(dst, sz);
+ p = memccpy(dst, src, \(aq\e0\(aq, sz);
+ if (p == NULL)
+ return dst + sz;
+
+ return p \- 1;
+}
+.EE
+.in
+.\" ----- DESCRIPTION :: Functions :: strncpy(3) ----------------------/
+.TP
+.BR strncpy (3)
+This function is identical to
+.BR stpncpy (3)
+except for the useless return value.
+.IP
+.BR stpncpy (3)
+is a simpler alternative to this function.
+.IP
+An implementation of this function might be:
+.IP
+.in +4n
+.EX
+char *
+strncpy(char *restrict dst, const char *restrict src,
+ size_t sz)
+{
+ stpncpy(dst, src, sz);
+ return dst;
+}
+.EE
+.in
+.\" ----- DESCRIPTION :: Functions :: zustr2ustp(3) --------------------/
+.TP
+.BR zustr2ustp (3)
+This function copies the input character sequence
+contained in a null-padded fixed-width buffer,
+into a destination character sequence.
+The programmer is responsible for allocating a buffer large enough.
+It returns a pointer suitable for chaining.
+.IP
+A truncating version of this function doesn't exist,
+since the size of the original character sequence is always known,
+so it wouldn't be very useful.
+.IP
+This function is not provided by any library,
+but you can define it with the following reference implementation:
+.IP
+.in +4n
+.EX
+/* This code is in the public domain. */
+char *
+zustr2ustp(char *restrict dst, const char *restrict src,
+ size_t sz)
+{
+ return ustpcpy(dst, src, strnlen(src, sz));
+}
+.EE
+.in
+.\" ----- DESCRIPTION :: Functions :: zustr2stp(3) --------------------/
+.TP
+.BR zustr2stp (3)
+This function copies the input character sequence
+contained in a null-padded fixed-width buffer,
+into a destination string.
+The programmer is responsible for allocating a buffer large enough.
+It returns a pointer suitable for chaining.
+.IP
+A truncating version of this function doesn't exist,
+since the size of the original character sequence is always known,
+so it wouldn't be very useful.
+.IP
+This function is not provided by any library,
+but you can define it with the following reference implementation:
+.IP
+.in +4n
+.EX
+/* This code is in the public domain. */
+char *
+zustr2stp(char *restrict dst, const char *restrict src,
+ size_t sz)
+{
+ char *end;
+
+ end = zustr2ustp(dst, src, sz);
+ *end = \(aq\e0\(aq;
+
+ return end;
+}
+.EE
+.in
+.\" ----- DESCRIPTION :: Functions :: strncat(3) ----------------------/
+.TP
+.BR strncat (3)
+Do not confuse this function with
+.BR strncpy (3);
+they are not related at all.
+.IP
+This function concatenates the input character sequence
+contained in a null-padded fixed-width buffer,
+into a destination string.
+The programmer is responsible for allocating a buffer large enough.
+The return value is useless.
+.IP
+.BR zustr2stp (3)
+is a faster alternative to this function.
+.IP
+An implementation of this function might be:
+.IP
+.in +4n
+.EX
+char *
+strncat(char *restrict dst, const char *restrict src,
+ size_t sz)
+{
+ zustr2stp(dst + strlen(dst), src, sz);
+ return dst;
+}
+.EE
+.in
+.\" ----- DESCRIPTION :: Functions :: ustpcpy(3) ----------------------/
+.TP
+.BR ustpcpy (3)
+This function copies the input character sequence,
+limited by its length,
+into a destination character sequence.
+The programmer is responsible for allocating a buffer large enough.
+It returns a pointer suitable for chaining.
+.IP
+An implementation of this function might be:
+.IP
+.in +4n
+.EX
+/* This code is in the public domain. */
+char *
+ustpcpy(char *restrict dst, const char *restrict src,
+ size_t len)
+{
+ return mempcpy(dst, src, len);
+}
+.EE
+.in
+.\" ----- DESCRIPTION :: Functions :: ustr2stp(3) ---------------------/
+.TP
+.BR ustr2stp (3)
+This function copies the input character sequence,
+limited by its length,
+into a destination string.
+The programmer is responsible for allocating a buffer large enough.
+It returns a pointer suitable for chaining.
+.IP
+An implementation of this function might be:
+.IP
+.in +4n
+.EX
+/* This code is in the public domain. */
+char *
+ustr2stp(char *restrict dst, const char *restrict src,
+ size_t len)
+{
+ char *end;
+
+ end = ustpcpy(dst, src, len);
+ *end = \(aq\e0\(aq;
+
+ return end;
+}
+.EE
+.in
+.\" ----- RETURN VALUE :: ---------------------------------------------/
.SH RETURN VALUE
-The
-.BR strcpy ()
-function returns a pointer to
-the destination string
-.IR dest .
+The following functions return
+a pointer to the terminating null byte in the destination string.
+.IP \(bu 3
+.PD 0
+.BR stpcpy (3)
+.IP \(bu
+.BR ustr2stp (3)
+.IP \(bu
+.BR zustr2stp (3)
+.PD
+.PP
+The following functions return
+a pointer to the terminating null byte in the destination string,
+except when truncation occurs;
+if truncation occurs,
+they return a pointer to one past the end of the destination buffer
+.RI ( past_end ).
+.IP \(bu 3
+.BR stpecpy (3),
+.BR stpecpyx (3)
+.PP
+The following function returns
+a pointer to one after the last character
+in the destination character sequence;
+if truncation occurs,
+that pointer is equivalent to
+a pointer to one past the end of the destination buffer.
+.IP \(bu 3
+.BR stpncpy (3)
+.PP
+The following functions return
+a pointer to one after the last character
+in the destination character sequence.
+.IP \(bu 3
+.BR zustr2ustp (3)
+.IP \(bu
+.BR ustpcpy (3)
+.PP
+The following functions return
+the length of the total string that they tried to create
+(as if truncation didn't occur).
+.IP \(bu 3
+.BR strlcpy (3bsd),
+.BR strlcat (3bsd)
+.PP
+The following function returns
+the length of the destination string, or
+.B \-E2BIG
+on truncation.
+.IP \(bu 3
+.BR strscpy (3)
+.PP
+The following functions return the
+.I dst
+pointer,
+which is useless.
+.IP \(bu 3
+.PD 0
+.BR strcpy (3),
+.BR strcat (3)
+.IP \(bu
+.BR strncpy (3)
+.IP \(bu
+.BR strncat (3)
+.PD
+.\" ----- ATTRIBUTES :: -----------------------------------------------/
.SH ATTRIBUTES
For an explanation of the terms used in this section, see
.BR attributes (7).
@@ -54,73 +851,264 @@ .SH ATTRIBUTES
l l l.
Interface Attribute Value
T{
-.BR strcpy ()
+.BR stpcpy (),
+.BR strcpy (),
+.BR strcat (),
+.BR stpecpy (),
+.BR stpecpyx (),
+.BR strlcpy (),
+.BR strlcat (),
+.BR strscpy (),
+.BR stpncpy (),
+.BR strncpy (),
+.BR zustr2ustp (),
+.BR zustr2stp (),
+.BR strncat (),
+.BR ustr2stp (),
+.BR ustpcpy ()
T} Thread safety MT-Safe
.TE
.hy
.ad
.sp 1
+.\" ----- STANDARDS :: ------------------------------------------------/
.SH STANDARDS
-POSIX.1-2001, POSIX.1-2008, C89, C99, SVr4, 4.3BSD.
-.SH NOTES
-.SS strlcpy()
-Some systems (the BSDs, Solaris, and others) provide the following function:
+.TP
+.BR strcpy "(3), \c"
+.BR strcat (3)
+.TQ
+.BR strncpy (3)
+.TQ
+.BR strncat (3)
+POSIX.1-2001, POSIX.1-2008, C89, C99, SVr4, 4.3BSD.
+.TP
+.BR stpcpy (3)
+.\" This function was added to POSIX.1-2008.
+.\" Before that, it was not part of
+.\" the C or POSIX.1 standards, nor customary on UNIX systems.
+.\" It first appeared at least as early as 1986,
+.\" in the Lattice C AmigaDOS compiler,
+.\" then in the GNU fileutils and GNU textutils in 1989,
+.\" and in the GNU C library by 1992.
+.\" It is also present on the BSDs.
+.TQ
+.BR stpncpy (3)
+.\" This function was added to POSIX.1-2008.
+.\" Before that, it was a GNU extension.
+.\" It first appeared in glibc 1.07 in 1993.
+POSIX.1-2008.
+.TP
+.BR strlcpy "(3bsd), \c"
+.BR strlcat (3bsd)
+These functions originated in OpenBSD and are present in some Unix systems.
+.TP
+.BR strscpy (3)
+Linux kernel internal function.
+.TP
+.BR stpecpy "(3), \c"
+.BR stpecpyx (3)
+.TQ
+.BR zustr2ustp (3)
+.TQ
+.BR zustr2stp (3)
+.TQ
+.BR ustr2stp "(3), \c"
+.BR ustpcpy (3)
+Not defined by any standard or library.
+.\" ----- CAVEATS :: --------------------------------------------------/
+.SH CAVEATS
+Don't mix chain calls to truncating and non-truncating functions.
+It is conceptually wrong
+unless you know that the first part of a copy will always fit.
+Anyway, the performance difference will probably be negligible,
+so it will probably be clearer if you use consistent semantics:
+either truncating or non-truncating.
+Calling a non-truncating function after a truncating one is necessarily wrong.
.PP
+Some of the functions described here are not provided by any library;
+you should write your own copy if you want to use them.
+See STANDARDS.
+.\" ----- BUGS :: -----------------------------------------------------/
+.SH BUGS
+All concatenation
+.RB (* cat ())
+functions share the same performance problem:
+.UR https://www.joelonsoftware.com/\:2001/12/11/\:back\-to\-basics/
+Shlemiel the painter
+.UE .
+.\" ----- EXAMPLES :: -------------------------------------------------/
+.SH EXAMPLES
+The following are examples of correct use of each of these functions.
+.\" ----- EXAMPLES :: stpcpy(3) ---------------------------------------/
+.TP
+.BR stpcpy (3)
.in +4n
.EX
-size_t strlcpy(char *dest, const char *src, size_t size);
+p = buf;
+p = stpcpy(p, "Hello ");
+p = stpcpy(p, "world");
+p = stpcpy(p, "!");
+len = p \- buf;
+puts(buf);
.EE
.in
-.PP
-.\" http://static.usenix.org/event/usenix99/full_papers/millert/millert_html/index.html
-.\" "strlcpy and strlcat - consistent, safe, string copy and concatenation"
-.\" 1999 USENIX Annual Technical Conference
-This function is similar to
-.BR strcpy (),
-but it copies at most
-.I size\-1
-bytes to
-.IR dest ,
-truncating the string as necessary.
-It always adds a terminating null byte.
-This function fixes some of the problems of
-.BR strcpy ()
-but the caller must still handle the possibility of data loss if
-.I size
-is too small.
-The return value of the function is the length of
-.IR src ,
-which allows truncation to be easily detected:
-if the return value is greater than or equal to
-.IR size ,
-truncation occurred.
-If loss of data matters, the caller
-.I must
-either check the arguments before the call,
-or test the function return value.
-.BR strlcpy ()
-is not present in glibc and is not standardized by POSIX,
-.\" https://lwn.net/Articles/506530/
-but is available on Linux via the
-.I libbsd
-library.
-.SH BUGS
-If the destination string of a
-.BR strcpy ()
-is not large enough, then anything might happen.
-Overflowing fixed-length string buffers is a favorite cracker technique
-for taking complete control of the machine.
-Any time a program reads or copies data into a buffer,
-the program first needs to check that there's enough space.
-This may be unnecessary if you can show that overflow is impossible,
-but be careful: programs can get changed over time,
-in ways that may make the impossible possible.
+.\" ----- EXAMPLES :: strcpy(3), strcat(3) ----------------------------/
+.TP
+.BR strcpy (3)
+.TQ
+.BR strcat (3)
+.in +4n
+.EX
+strcpy(buf, "Hello ");
+strcat(buf, "world");
+strcat(buf, "!");
+len = strlen(buf);
+puts(buf);
+.EE
+.in
+.\" ----- EXAMPLES :: stpecpy(3), stpecpyx(3) -------------------------/
+.TP
+.BR stpecpy (3)
+.TQ
+.BR stpecpyx (3)
+.in +4n
+.EX
+past_end = buf + sizeof(buf);
+p = buf;
+p = stpecpy(p, past_end, "Hello ");
+p = stpecpy(p, past_end, "world");
+p = stpecpy(p, past_end, "!");
+if (p == past_end) {
+ p\-\-;
+ goto toolong;
+}
+len = p \- buf;
+puts(buf);
+.EE
+.in
+.\" ----- EXAMPLES :: strlcpy(3bsd), strlcat(3bsd) --------------------/
+.TP
+.BR strlcpy (3bsd)
+.TQ
+.BR strlcat (3bsd)
+.in +4n
+.EX
+if (strlcpy(buf, "Hello ", sizeof(buf)) >= sizeof(buf))
+ goto toolong;
+if (strlcat(buf, "world", sizeof(buf)) >= sizeof(buf))
+ goto toolong;
+len = strlcat(buf, "!", sizeof(buf));
+if (len >= sizeof(buf))
+ goto toolong;
+puts(buf);
+.EE
+.in
+.\" ----- EXAMPLES :: strscpy(3) --------------------------------------/
+.TP
+.BR strscpy (3)
+.in +4n
+.EX
+len = strscpy(buf, "Hello world!", sizeof(buf));
+if (len == \-E2BIG)
+ goto toolong;
+puts(buf);
+.EE
+.in
+.\" ----- EXAMPLES :: stpncpy(3) --------------------------------------/
+.TP
+.BR stpncpy (3)
+.in +4n
+.EX
+end = stpncpy(buf, "Hello world!", sizeof(buf));
+if (sizeof(buf) < strlen("Hello world!"))
+ goto toolong;
+len = end \- buf;
+for (size_t i = 0; i < sizeof(buf); i++)
+ putchar(buf[i]);
+.EE
+.in
+.\" ----- EXAMPLES :: strncpy(3) --------------------------------------/
+.TP
+.BR strncpy (3)
+.in +4n
+.EX
+strncpy(buf, "Hello world!", sizeof(buf));
+if (sizeof(buf) < strlen("Hello world!"))
+ goto toolong;
+len = strnlen(buf, sizeof(buf));
+for (size_t i = 0; i < sizeof(buf); i++)
+ putchar(buf[i]);
+.EE
+.in
+.\" ----- EXAMPLES :: zustr2ustp(3) -----------------------------------/
+.TP
+.BR zustr2ustp (3)
+.in +4n
+.EX
+p = buf;
+p = zustr2ustp(p, "Hello ", 6);
+p = zustr2ustp(p, "world", 42); // Padding null bytes ignored.
+p = zustr2ustp(p, "!", 1);
+len = p \- buf;
+printf("%.*s\en", (int) len, buf);
+.EE
+.in
+.\" ----- EXAMPLES :: zustr2stp(3) ------------------------------------/
+.TP
+.BR zustr2stp (3)
+.in +4n
+.EX
+p = buf;
+p = zustr2stp(p, "Hello ", 6);
+p = zustr2stp(p, "world", 42); // Padding null bytes ignored.
+p = zustr2stp(p, "!", 1);
+len = p \- buf;
+puts(buf);
+.EE
+.in
+.\" ----- EXAMPLES :: strncat(3) --------------------------------------/
+.TP
+.BR strncat (3)
+.in +4n
+.EX
+buf[0] = \(aq\e0\(aq; // There's no 'cpy' function to this 'cat'.
+strncat(buf, "Hello ", 6);
+strncat(buf, "world", 42); // Padding null bytes ignored.
+strncat(buf, "!", 1);
+len = strlen(buf);
+puts(buf);
+.EE
+.in
+.\" ----- EXAMPLES :: ustpcpy(3) --------------------------------------/
+.TP
+.BR ustpcpy (3)
+.in +4n
+.EX
+p = buf;
+p = ustpcpy(p, "Hello ", 6);
+p = ustpcpy(p, "world", 5);
+p = ustpcpy(p, "!", 1);
+len = p \- buf;
+printf("%.*s\en", (int) len, buf);
+.EE
+.in
+.\" ----- EXAMPLES :: ustr2stp(3) -------------------------------------/
+.TP
+.BR ustr2stp (3)
+.in +4n
+.EX
+p = buf;
+p = ustr2stp(p, "Hello ", 6);
+p = ustr2stp(p, "world", 5);
+p = ustr2stp(p, "!", 1);
+len = p \- buf;
+puts(buf);
+.EE
+.in
+.\" ----- SEE ALSO :: -------------------------------------------------/
.SH SEE ALSO
-.BR bcopy (3),
-.BR memccpy (3),
+.BR bzero (3),
.BR memcpy (3),
-.BR memmove (3),
-.BR stpcpy (3),
-.BR strdup (3),
-.BR string (3),
-.BR wcscpy (3)
+.BR memccpy (3),
+.BR mempcpy (3),
+.BR string (3)
--
2.38.1
* Re: [PATCH v3 1/1] strcpy.3: Rewrite page to document all string-copying functions
2022-12-14 0:03 ` [PATCH v3 " Alejandro Colomar
@ 2022-12-14 16:22 ` Douglas McIlroy
2022-12-14 16:36 ` Alejandro Colomar
0 siblings, 1 reply; 53+ messages in thread
From: Douglas McIlroy @ 2022-12-14 16:22 UTC (permalink / raw)
To: Alejandro Colomar
Cc: linux-man, Alejandro Colomar, Martin Sebor, G. Branden Robinson,
Jakub Wilk
> a sequence of zero or more non-null characters followed by a null byte
Varying terminology (character vs byte) is poor style in technical writing.
> concatenate
We began fighting this pomposity before v7. There has only been
backsliding since.
"Catenate" is crisper, means the same thing, and concurs with the "cat" command.
I invite you to join the battle for simplicity.
> chain copy
This term is never overtly defined. The definition might be inferred
from, "To chain copy functions, they need to return a pointer to the
end", but the problematic grammar of the sentence diverts attention
from its content.
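For readers following the thread, here is a minimal sketch of what "chain
copy" appears to mean in the quoted page: each call returns a pointer to the
terminating null byte it just wrote, and that pointer is passed as the
destination of the next call. It uses only stpcpy(3) (POSIX.1-2008). join3()
is a hypothetical helper name, not something from the patch, and the caller
is assumed to supply a large enough buffer.

```c
#define _XOPEN_SOURCE 700   /* declares stpcpy(3) under strict -std=c99 */
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): write a, b, and c back to
 * back into buf by chaining stpcpy(3).  The caller guarantees that buf
 * is large enough.  Returns the length of the resulting string.
 */
size_t
join3(char *buf, const char *a, const char *b, const char *c)
{
    char *p = buf;

    p = stpcpy(p, a);   /* p now points at the terminating null byte */
    p = stpcpy(p, b);   /* so the next copy starts exactly there */
    p = stpcpy(p, c);   /* no strlen(3) rescan, unlike strcat(3) */

    return (size_t) (p - buf);
}
```

Calling join3(buf, "Hello ", "world", "!") with a 32-byte buf leaves
"Hello world!" in buf and returns 12.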
> strscpy
Doesn't it muddy the waters to include a non-library function in man3?
Doug
On Tue, Dec 13, 2022 at 7:03 PM Alejandro Colomar
<alx.manpages@gmail.com> wrote:
>
> This is an opportunity to use consistent language across the
> documentation for all string-copying functions.
>
> It is also easier to show the similarities and differences between all
> of the functions, so that a reader can use this page to know which
> function is needed for a given task.
>
> Many functions that are inferior to another one have been marked as
> deprecated, notwithstanding the deprecation status in C libraries or
> any standards. Alternatives have been given in the same page, with
> reference implementations.
>
> Cc: Martin Sebor <msebor@redhat.com>
> Cc: "G. Branden Robinson" <g.branden.robinson@gmail.com>
> Cc: Douglas McIlroy <douglas.mcilroy@dartmouth.edu>
> Cc: Jakub Wilk <jwilk@jwilk.net>
> Signed-off-by: Alejandro Colomar <alx@kernel.org>
> ---
> man3/strcpy.3 | 1058 +++++++++++++++++++++++++++++++++++++++++++++----
> 1 file changed, 970 insertions(+), 88 deletions(-)
>
> diff --git a/man3/strcpy.3 b/man3/strcpy.3
> index 74c3180ae..e04a7b149 100644
> --- a/man3/strcpy.3
> +++ b/man3/strcpy.3
> @@ -1,48 +1,767 @@
> -.\" Copyright (C) 1993 David Metcalfe (david@prism.demon.co.uk)
> +.\" Copyright 2022 Alejandro Colomar <alx@kernel.org>
> .\"
> -.\" SPDX-License-Identifier: Linux-man-pages-copyleft
> -.\"
> -.\" References consulted:
> -.\" Linux libc source code
> -.\" Lewine's _POSIX Programmer's Guide_ (O'Reilly & Associates, 1991)
> -.\" 386BSD man pages
> -.\" Modified Sat Jul 24 18:06:49 1993 by Rik Faith (faith@cs.unc.edu)
> -.\" Modified Fri Aug 25 23:17:51 1995 by Andries Brouwer (aeb@cwi.nl)
> -.\" Modified Wed Dec 18 00:47:18 1996 by Andries Brouwer (aeb@cwi.nl)
> -.\" 2007-06-15, Marc Boyer <marc.boyer@enseeiht.fr> + mtk
> -.\" Improve discussion of strncpy().
> +.\" SPDX-License-Identifier: BSD-3-Clause
> .\"
> .TH strcpy 3 (date) "Linux man-pages (unreleased)"
> +.\" ----- NAME :: -----------------------------------------------------/
> .SH NAME
> -strcpy \- copy a string
> +stpcpy,
> +strcpy, strcat,
> +stpecpy, stpecpyx,
> +strlcpy, strlcat,
> +strscpy,
> +stpncpy,
> +strncpy,
> +ustr2stp,
> +strncat,
> +mempcpy
> +\- copy strings and character sequences
> +.\" ----- LIBRARY :: --------------------------------------------------/
> .SH LIBRARY
> +.TP
> +.BR stpcpy (3)
> +.TQ
> +.BR strcpy "(3), \c"
> +.BR strcat (3)
> +.TQ
> +.BR stpncpy (3)
> +.TQ
> +.BR strncpy (3)
> +.TQ
> +.BR strncat (3)
> +.TQ
> +.BR mempcpy (3)
> Standard C library
> .RI ( libc ", " \-lc )
> +.TP
> +.BR stpecpy "(3), \c"
> +.BR stpecpyx (3)
> +Not provided by any library.
> +.TP
> +.BR strlcpy "(3), \c"
> +.BR strlcat (3)
> +Utility functions from BSD systems
> +.RI ( libbsd ", " \-lbsd )
> +.TP
> +.BR strscpy (3)
> +Not provided by any library.
> +It is a Linux kernel internal function.
> +.\" ----- SYNOPSIS :: -------------------------------------------------/
> .SH SYNOPSIS
> .nf
> .B #include <string.h>
> +.fi
> +.\" ----- SYNOPSIS :: (Null-terminated) strings -----------------------/
> +.SS Strings
> +.nf
> +// Chain-copy a string.
> +.BI "char *stpcpy(char *restrict " dst ", const char *restrict " src );
> .PP
> -.BI "char *strcpy(char *restrict " dest ", const char *restrict " src );
> +// Copy/concatenate a string.
> +.BI "char *strcpy(char *restrict " dst ", const char *restrict " src );
> +.BI "char *strcat(char *restrict " dst ", const char *restrict " src );
> +.PP
> +// Chain-copy a string with truncation.
> +.BI "char *stpecpy(char *" dst ", char " past_end "[0], \
> +const char *restrict " src );
> +.PP
> +// Chain-copy a string with truncation and SIGSEGV on UB.
> +.BI "char *stpecpyx(char *" dst ", char " past_end "[0], \
> +const char *restrict " src );
> +.PP
> +// Copy/concatenate a string with truncation and SIGSEGV on UB.
> +.BI "size_t strlcpy(char " dst "[restrict ." sz "], \
> +const char *restrict " src ,
> +.BI " size_t " sz );
> +.BI "size_t strlcat(char " dst "[restrict ." sz "], \
> +const char *restrict " src ,
> +.BI " size_t " sz );
> +.PP
> +// Copy a string with truncation.
> +.BI "ssize_t strscpy(char " dst "[restrict ." sz "], \
> +const char " src "[restrict ." sz ],
> +.BI " size_t " sz );
> +.fi
> +.\" ----- SYNOPSIS :: Null-padded character sequences --------/
> +.SS Null-padded character sequences
> +.nf
> +// Zero a fixed-width buffer, and
> +// copy a string with truncation into a character sequence.
> +.BI "char *stpncpy(char " dst "[restrict ." sz "], \
> +const char *restrict " src ,
> +.BI " size_t " sz );
> +.PP
> +// Zero a fixed-width buffer, and
> +// copy a string with truncation into a character sequence.
> +.BI "char *strncpy(char " dest "[restrict ." sz "], \
> +const char *restrict " src ,
> +.BI " size_t " sz );
> +.PP
> +// Chain-copy a null-padded character sequence into a string.
> +.BI "char *ustr2stp(char *restrict " dst ", \
> +const char " src "[restrict ." sz ],
> +.BI " size_t " sz );
> +.PP
> +// Concatenate a null-padded character sequence into a string.
> +.BI "char *strncat(char *restrict " dst ", const char " src "[restrict ." sz ],
> +.BI " size_t " sz );
> +.fi
> +.\" ----- SYNOPSIS :: Measured character sequences --------------------/
> +.SS Measured character sequences
> +.nf
> +// Chain-copy a measured character sequence.
> +.BI "void *mempcpy(void *restrict " dst ", \
> +const void " src "[restrict ." len ],
> +.BI " size_t " len );
> +.fi
> +.PP
> +.RS -4
> +Feature Test Macro Requirements for glibc (see
> +.BR feature_test_macros (7)):
> +.RE
> +.PP
> +.BR stpcpy (3),
> +.BR stpncpy (3):
> +.nf
> + Since glibc 2.10:
> + _POSIX_C_SOURCE >= 200809L
> + Before glibc 2.10:
> + _GNU_SOURCE
> +.fi
> +.PP
> +.BR mempcpy (3):
> +.nf
> + _GNU_SOURCE
> .fi
> .SH DESCRIPTION
> -The
> -.BR strcpy ()
> -function copies the string pointed to by
> -.IR src ,
> -including the terminating null byte (\(aq\e0\(aq),
> -to the buffer pointed to by
> -.IR dest .
> -The strings may not overlap, and the destination string
> -.I dest
> -must be large enough to receive the copy.
> -.I Beware of buffer overruns!
> -(See BUGS.)
> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: -----------------/
> +.SS Terms (and abbreviations)
> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: string (str) ----/
> +.TP
> +.IR "string " ( str )
> +is a sequence of zero or more non-null characters followed by a null byte.
> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: null-padded character seq
> +.TP
> +.IR "character sequence " ( ustr )
> +is a sequence of zero or more non-null characters.
> +A program should never use a character sequence where a string is required.
> +However, with appropriate care,
> +a string can be used in the place of a character sequence.
> +.RS
> +.TP
> +.I null-padded character sequence
> +Character sequences can be contained in fixed-width buffers,
> +which contain padding null bytes after the character sequence,
> +to fill the rest of the buffer
> +without affecting the character sequence;
> +however, those padding null bytes are not part of the character sequence.
> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: measured character sequence
> +.TP
> +.I measured character sequence
> +Character sequence delimited by its length.
> +.RE
> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: length (len) ----/
> +.TP
> +.IR "length " ( len )
> +is the number of non-null characters in a string or character sequence.
> +It is the return value of
> +.I strlen(str)
> +and of
> +.IR "strnlen(ustr, sz)" .
> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: size (sz) -------/
> +.TP
> +.IR "size " ( sz )
> +refers to the entire buffer
> +where the string or character sequence is contained.
> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: end -------------/
> +.TP
> +.I end
> +is the name of a pointer to the terminating null byte of a string,
> +or a pointer to one past the last character of a character sequence.
> +This is the return value of functions that allow chaining.
> +It is equivalent to
> +.IR &str[len] .
> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: past_end --------/
> +.TP
> +.I past_end
> +is the name of a pointer to one past the end of the buffer
> +that contains a string or character sequence.
> +It is equivalent to
> +.IR &str[sz] .
> +It is used as a sentinel value,
> +to be able to truncate strings or character sequences
> +instead of overrunning the containing buffer.
> +.\" ----- DESCRIPTION :: Copy, concatenate, and chain-copy ------------/
> +.SS Copy, concatenate, and chain-copy
> +Originally,
> +there was a distinction between functions that copy and those that concatenate.
> +However, newer functions that copy while allowing chaining
> +cover both use cases with a single API.
> +They are also algorithmically faster,
> +since they don't need to search for the end of the existing string.
> +However, functions that concatenate have a much simpler use,
> +so if performance is not important,
> +it can make sense to use them for improving readability.
> +.PP
> +To chain copy functions,
> +they need to return a pointer to the
> +.IR end .
> +That's a byproduct of the copy operation,
> +so it has no performance costs.
> +Functions that return such a pointer,
> +and thus can be chained,
> +have names of the form
> +.RB * stp *()
> +or
> +.RB * memp *(),
> +since it's also common to name the pointer just
> +.IR p .
> +.PP
> +Chain-copying functions that truncate
> +should accept a pointer to one past the end of the destination buffer,
> +and have names of the form
> +.RB * stpe *().
> +This allows not having to recalculate the remaining size after each call.
> +.\" ----- DESCRIPTION :: Truncate or not? -----------------------------/
> +.SS Truncate or not?
> +The first thing to note is that programmers should be careful with buffers,
> +so they always have the correct size,
> +and truncation is not necessary.
> +.PP
> +In most cases,
> +truncation is not desired,
> +and it is simpler to just do the copy.
> +Simpler code is safer code.
> +Programming against programming mistakes by adding more code
> +just adds more points where mistakes can be made.
> +.PP
> +Nowadays,
> +compilers can detect most programmer errors with features like
> +compiler warnings,
> +static analyzers, and
> +.BR \%_FORTIFY_SOURCE
> +(see
> +.BR ftm (7)).
> +Keeping the code simple
> +helps these overflow-detection features be more precise.
> +.PP
> +When validating user input,
> +however,
> +it makes sense to truncate.
> +Remember to check the return value of such function calls.
> +.PP
> +Functions that truncate:
> +.IP \(bu 3
> +.BR stpecpy (3)
> +is the most efficient string copy function that performs truncation.
> +It requires checking for truncation only once, after all chained calls.
> +.IP \(bu
> +.BR stpecpyx (3)
> +is a variant of
> +.BR stpecpy (3)
> +that consumes the entire source string,
> +to catch bugs in the program
> +by forcing a segmentation fault (as
> +.BR strlcpy (3bsd)
> +and
> +.BR strlcat (3bsd)
> +do).
> +.IP \(bu
> +.BR strlcpy (3bsd)
> +and
> +.BR strlcat (3bsd)
> +are designed to crash if the input string is invalid
> +(doesn't contain a terminating null byte).
> +.IP \(bu
> +.BR strscpy (3)
> +reports an error instead of crashing (similar to
> +.BR stpecpy (3)).
> +.IP \(bu
> +.BR stpncpy (3)
> +and
> +.BR strncpy (3)
> +also truncate, but they don't write strings,
> +but rather null-padded character sequences.
> +.\" ----- DESCRIPTION :: Null-padded character sequences --------------/
> +.SS Null-padded character sequences
> +For historic reasons,
> +some standard APIs,
> +such as
> +.BR utmpx (5),
> +use null-padded character sequences in fixed-width buffers.
> +To interface with them,
> +specialized functions need to be used.
> +.PP
> +To copy strings into them, use
> +.BR stpncpy (3).
> +.PP
> +To copy a character sequence from a fixed-width buffer into a string,
> +ignoring any trailing null bytes in the source fixed-width buffer,
> +you should use
> +.BR ustr2stp (3)
> +or
> +.BR strncat (3).
> +.\" ----- DESCRIPTION :: Measured character sequences -----------------/
> +.SS Measured character sequences
> +The simplest character sequence copying function is
> +.BR mempcpy (3).
> +It requires always knowing the length of your character sequences,
> +for which structures can be used.
> +It makes the code much faster,
> +since you always know the length of your character sequences,
> +and can do the minimal copies and length measurements.
> +.BR mempcpy (3)
> +copies character sequences,
> +so you need to explicitly set the terminating null byte if you need a string.
> +.PP
> +The following code can be used to
> +chain-copy from a measured character sequence into a string:
> +.PP
> +.in +4n
> +.EX
> +p = mempcpy(p, foo\->ustr, foo\->len);
> +*p = \(aq\e0\(aq;
> +.EE
> +.in
> +.PP
> +The following code can be used to
> +chain-copy from a measured character sequence into a character sequence:
> +.PP
> +.in +4n
> +.EX
> +p = mempcpy(p, bar\->ustr, bar\->len);
> +.EE
> +.in
> +.PP
> +In programs that make considerable use of strings or character sequences,
> +and need the best performance,
> +using overlapping character sequences can make a big difference.
> +It allows holding subsequences of a larger character sequence,
> +while not duplicating memory
> +nor using time to do a copy.
> +.PP
> +However, this is delicate,
> +since it requires using character sequences.
> +C library APIs use strings,
> +so programs that use character sequences
> +will have to take care of differentiating strings from character sequences.
> +.\" ----- DESCRIPTION :: String vs character sequence -----------------/
> +.SS String vs character sequence
> +Some functions only operate on strings.
> +Those require that the input
> +.I src
> +is a string,
> +and guarantee an output string
> +(even when truncation occurs).
> +Functions that concatenate
> +also require that
> +.I dst
> +holds a string before the call.
> +List of functions:
> +.IP \(bu 3
> +.PD 0
> +.BR stpcpy (3)
> +.IP \(bu
> +.BR strcpy "(3), \c"
> +.BR strcat (3)
> +.IP \(bu
> +.BR stpecpy "(3), \c"
> +.BR stpecpyx (3)
> +.IP \(bu
> +.BR strlcpy "(3bsd), \c"
> +.BR strlcat (3bsd)
> +.IP \(bu
> +.BR strscpy (3)
> +.PD
> +.PP
> +Other functions require an input string,
> +but create a character sequence as output.
> +These functions have confusing names,
> +and have a long history of misuse.
> +List of functions:
> +.IP \(bu 3
> +.PD 0
> +.BR stpncpy (3)
> +.IP \(bu
> +.BR strncpy (3)
> +.PD
> +.PP
> +Other functions operate on an input character sequence,
> +and create an output string.
> +Functions that concatenate
> +also require that
> +.I dst
> +holds a string before the call.
> +.BR strncat (3)
> +has an even more misleading name than the functions above.
> +List of functions:
> +.IP \(bu 3
> +.PD 0
> +.BR ustr2stp (3)
> +.IP \(bu
> +.BR strncat (3)
> +.PD
> +.PP
> +The last function
> +operates on an input character sequence
> +to create an output character sequence.
> +But because it asks for the length,
> +and a string is by nature composed of a character sequence of the same length
> +plus a terminating null byte,
> +a string is also accepted as input.
> +Function:
> +.IP \(bu 3
> +.BR mempcpy (3)
> +.\" ----- DESCRIPTION :: Functions :: ---------------------------------/
> +.SS Functions
> +.\" ----- DESCRIPTION :: Functions :: stpcpy(3) -----------------------/
> +.TP
> +.BR stpcpy (3)
> +This function copies the input string into a destination string.
> +The programmer is responsible for allocating a buffer large enough.
> +It returns a pointer suitable for chaining.
> +.IP
> +An implementation of this function might be:
> +.IP
> +.in +4n
> +.EX
> +char *
> +stpcpy(char *restrict dst, const char *restrict src)
> +{
> + return (char *) memcpy(dst, src, strlen(src) + 1) + strlen(src);
> +}
> +.EE
> +.in
> +.\" ----- DESCRIPTION :: Functions :: strcpy(3), strcat(3) ------------/
> +.TP
> +.BR strcpy (3)
> +.TQ
> +.BR strcat (3)
> +These functions copy the input string into a destination string.
> +The programmer is responsible for allocating a buffer large enough.
> +The return value is useless.
> +.IP
> +.BR stpcpy (3)
> +is a faster alternative to these functions.
> +.IP
> +An implementation of these functions might be:
> +.IP
> +.in +4n
> +.EX
> +char *
> +strcpy(char *restrict dst, const char *restrict src)
> +{
> + stpcpy(dst, src);
> + return dst;
> +}
> +
> +char *
> +strcat(char *restrict dst, const char *restrict src)
> +{
> + stpcpy(dst + strlen(dst), src);
> + return dst;
> +}
> +.EE
> +.in
> +.\" ----- DESCRIPTION :: Functions :: stpecpy(3), stpecpyx(3) ---------/
> +.TP
> +.BR stpecpy (3)
> +.TQ
> +.BR stpecpyx (3)
> +These functions copy the input string into a destination string.
> +If the destination buffer,
> +limited by a pointer to one past the end of it,
> +isn't large enough to hold the copy,
> +the resulting string is truncated
> +(but it is guaranteed to be null-terminated).
> +They return a pointer suitable for chaining.
> +Truncation needs to be detected only once after the last chained call.
> +.BR stpecpyx (3)
> +has identical semantics to
> +.BR stpecpy (3),
> +except that it forces a SIGSEGV if the
> +.I src
> +pointer is not a string.
> +.IP
> +These functions are not provided by any library,
> +but you can define them with the following reference implementations:
> +.IP
> +.in +4n
> +.EX
> +/* This code is in the public domain. */
> +char *
> +stpecpy(char *dst, char past_end[0],
> + const char *restrict src)
> +{
> + char *p;
> +
> + if (dst == past_end)
> + return past_end;
> +
> + p = memccpy(dst, src, \(aq\e0\(aq, past_end \- dst);
> + if (p != NULL)
> + return p \- 1;
> +
> + /* truncation detected */
> + past_end[\-1] = \(aq\e0\(aq;
> + return past_end;
> +}
> +
> +/* This code is in the public domain. */
> +char *
> +stpecpyx(char *dst, char past_end[0],
> + const char *restrict src)
> +{
> + if (src[strlen(src)] != \(aq\e0\(aq)
> + raise(SIGSEGV);
> +
> + return stpecpy(dst, past_end, src);
> +}
> +.EE
> +.in
> +.\" ----- DESCRIPTION :: Functions :: strlcpy(3bsd), strlcat(3bsd) ----/
> +.TP
> +.BR strlcpy (3bsd)
> +.TQ
> +.BR strlcat (3bsd)
> +These functions copy the input string into a destination string.
> +If the destination buffer,
> +limited by its size,
> +isn't large enough to hold the copy,
> +the resulting string is truncated
> +(but it is guaranteed to be null-terminated).
> +They return the length of the total string they tried to create.
> +These functions force a SIGSEGV if the
> +.I src
> +pointer is not a string.
> +.IP
> +.BR stpecpyx (3)
> +is a faster alternative to these functions.
> +.\" ----- DESCRIPTION :: Functions :: strscpy(3) ----------------------/
> +.TP
> +.BR strscpy (3)
> +This function copies the input string into a destination string.
> +If the destination buffer,
> +limited by its size,
> +isn't large enough to hold the copy,
> +the resulting string is truncated
> +(but it is guaranteed to be null-terminated).
> +It returns the length of the destination string, or
> +.B \-E2BIG
> +on truncation.
> +.IP
> +.BR stpecpy (3)
> +is a simpler and faster alternative to this function.
> +.RE
> +.\" ----- DESCRIPTION :: Functions :: stpncpy(3) ----------------------/
> +.TP
> +.BR stpncpy (3)
> +This function copies the input string into
> +a destination null-padded character sequence in a fixed-width buffer.
> +If the destination buffer,
> +limited by its size,
> +isn't large enough to hold the copy,
> +the resulting character sequence is truncated.
> +Since it creates a character sequence,
> +it doesn't need to write a terminating null byte.
> +It returns a pointer suitable for chaining,
> +but it's not ideal for that.
> +Truncation needs to be detected only once after the last chained call.
> +.IP
> +If you're going to use this function in chained calls,
> +it would be useful to develop a similar function
> +that accepts a pointer to one past the end of the buffer instead of a size.
> +.IP
> +An implementation of this function might be:
> +.IP
> +.in +4n
> +.EX
> +char *
> +stpncpy(char *restrict dst, const char *restrict src,
> + size_t sz)
> +{
> + char *p;
> +
> + bzero(dst, sz);
> + p = memccpy(dst, src, \(aq\e0\(aq, sz);
> + if (p == NULL)
> + return dst + sz;
> +
> + return p \- 1;
> +}
> +.EE
> +.in
> +.\" ----- DESCRIPTION :: Functions :: ustr2stp(3) ---------------------/
> +.TP
> +.BR ustr2stp (3)
> +This function copies the input character sequence
> +contained in a null-padded fixed-width buffer,
> +into a destination string.
> +The programmer is responsible for allocating a buffer large enough.
> +It returns a pointer suitable for chaining.
> +.IP
> +A truncating version of this function doesn't exist,
> +since the size of the original character sequence is always known,
> +so it wouldn't be very useful.
> +.IP
> +This function is not provided by any library,
> +but you can define it with the following reference implementation:
> +.IP
> +.in +4n
> +.EX
> +/* This code is in the public domain. */
> +char *
> +ustr2stp(char *restrict dst, const char *restrict src,
> + size_t sz)
> +{
> + char *end;
> +
> + end = mempcpy(dst, src, strnlen(src, sz));
> + *end = \(aq\e0\(aq;
> +
> + return end;
> +}
> +.EE
> +.in
> +.\" ----- DESCRIPTION :: Functions :: strncpy(3) ----------------------/
> +.TP
> +.BR strncpy (3)
> +This function is identical to
> +.BR stpncpy (3)
> +except for the useless return value.
> +Due to the return value,
> +with this function it's hard to correctly check for truncation.
> +.IP
> +.BR stpncpy (3)
> +is a simpler alternative to this function.
> +.IP
> +An implementation of this function might be:
> +.IP
> +.in +4n
> +.EX
> +char *
> +strncpy(char *restrict dst, const char *restrict src,
> + size_t sz)
> +{
> + stpncpy(dst, src, sz);
> + return dst;
> +}
> +.EE
> +.in
> +.\" ----- DESCRIPTION :: Functions :: strncat(3) ----------------------/
> +.TP
> +.BR strncat (3)
> +Do not confuse this function with
> +.BR strncpy (3);
> +they are not related at all.
> +.IP
> +This function concatenates the input character sequence
> +contained in a null-padded fixed-width buffer,
> +into a destination string.
> +The programmer is responsible for allocating a buffer large enough.
> +The return value is useless.
> +.IP
> +.BR ustr2stp (3)
> +is a faster alternative to this function.
> +.IP
> +An implementation of this function might be:
> +.IP
> +.in +4n
> +.EX
> +char *
> +strncat(char *restrict dst, const char *restrict src,
> + size_t sz)
> +{
> + ustr2stp(dst + strlen(dst), src, sz);
> + return dst;
> +}
> +.EE
> +.in
> +.\" ----- DESCRIPTION :: Functions :: mempcpy(3) ----------------------/
> +.TP
> +.BR mempcpy (3)
> +This function copies the input character sequence,
> +limited by its length,
> +into a destination character sequence.
> +The programmer is responsible for allocating a buffer large enough.
> +It returns a pointer suitable for chaining.
> +.IP
> +An implementation of this function might be:
> +.IP
> +.in +4n
> +.EX
> +void *
> +mempcpy(void *restrict dst, const void *restrict src,
> + size_t len)
> +{
> + return memcpy(dst, src, len) + len;
> +}
> +.EE
> +.in
> +.\" ----- RETURN VALUE :: ---------------------------------------------/
> .SH RETURN VALUE
> -The
> -.BR strcpy ()
> -function returns a pointer to
> -the destination string
> -.IR dest .
> +The following functions return
> +a pointer to the terminating null byte in the destination string.
> +.IP \(bu 3
> +.PD 0
> +.BR stpcpy (3)
> +.IP \(bu
> +.BR ustr2stp (3)
> +.PD
> +.PP
> +The following functions return
> +a pointer to the terminating null byte in the destination string,
> +except when truncation occurs;
> +if truncation occurs,
> +they return a pointer to one past the end of the destination buffer
> +.RI ( past_end ).
> +.IP \(bu 3
> +.BR stpecpy (3),
> +.BR stpecpyx (3)
> +.PP
> +The following function returns
> +a pointer to one after the last character
> +in the destination character sequence;
> +if truncation occurs,
> +that pointer is equivalent to
> +a pointer to one past the end of the destination buffer.
> +.IP \(bu 3
> +.BR stpncpy (3)
> +.PP
> +The following function returns
> +a pointer to one after the last character
> +in the destination character sequence.
> +.IP \(bu 3
> +.BR mempcpy (3)
> +.PP
> +The following functions return
> +the length of the total string that they tried to create
> +(as if truncation didn't occur).
> +.IP \(bu 3
> +.BR strlcpy (3bsd),
> +.BR strlcat (3bsd)
> +.PP
> +The following function returns
> +the length of the destination string, or
> +.B \-E2BIG
> +on truncation.
> +.IP \(bu 3
> +.BR strscpy (3)
> +.PP
> +The following functions return the
> +.I dst
> +pointer,
> +which is useless.
> +.IP \(bu 3
> +.PD 0
> +.BR strcpy (3),
> +.BR strcat (3)
> +.IP \(bu
> +.BR strncpy (3)
> +.IP \(bu
> +.BR strncat (3)
> +.PD
> +.\" ----- ATTRIBUTES :: -----------------------------------------------/
> .SH ATTRIBUTES
> For an explanation of the terms used in this section, see
> .BR attributes (7).
> @@ -54,73 +773,236 @@ .SH ATTRIBUTES
> l l l.
> Interface Attribute Value
> T{
> -.BR strcpy ()
> +.BR stpcpy (),
> +.BR strcpy (),
> +.BR strcat (),
> +.BR stpecpy (),
> +.BR stpecpyx (),
> +.BR strlcpy (),
> +.BR strlcat (),
> +.BR strscpy (),
> +.BR stpncpy (),
> +.BR strncpy (),
> +.BR ustr2stp (),
> +.BR strncat (),
> +.BR mempcpy ()
> T} Thread safety MT-Safe
> .TE
> .hy
> .ad
> .sp 1
> +.\" ----- STANDARDS :: ------------------------------------------------/
> .SH STANDARDS
> -POSIX.1-2001, POSIX.1-2008, C89, C99, SVr4, 4.3BSD.
> -.SH NOTES
> -.SS strlcpy()
> -Some systems (the BSDs, Solaris, and others) provide the following function:
> +.TP
> +.BR strcpy "(3), \c"
> +.BR strcat (3)
> +.TQ
> +.BR strncpy (3)
> +.TQ
> +.BR strncat (3)
> +POSIX.1‐2001, POSIX.1‐2008, C89, C99, SVr4, 4.3BSD.
> +.TP
> +.BR stpcpy (3)
> +.\" This function was added to POSIX.1-2008.
> +.\" Before that, it was not part of
> +.\" the C or POSIX.1 standards, nor customary on UNIX systems.
> +.\" It first appeared at least as early as 1986,
> +.\" in the Lattice C AmigaDOS compiler,
> +.\" then in the GNU fileutils and GNU textutils in 1989,
> +.\" and in the GNU C library by 1992.
> +.\" It is also present on the BSDs.
> +.TQ
> +.BR stpncpy (3)
> +.\" This function was added to POSIX.1-2008.
> +.\" Before that, it was a GNU extension.
> +.\" It first appeared in glibc 1.07 in 1993.
> +POSIX.1-2008.
> +.TP
> +.BR strlcpy "(3bsd), \c"
> +.BR strlcat (3bsd)
> +Functions originated in OpenBSD and present in some Unix systems.
> +.TP
> +.BR mempcpy (3)
> +This function is a GNU extension.
> +.TP
> +.BR strscpy (3)
> +Linux kernel internal function.
> +.TP
> +.BR stpecpy "(3), \c"
> +.BR stpecpyx (3)
> +.TQ
> +.BR ustr2stp (3)
> +Not defined by any standards nor libraries.
> +.\" ----- CAVEATS :: --------------------------------------------------/
> +.SH CAVEATS
> +Don't mix chain calls to truncating and non-truncating functions.
> +It is conceptually wrong
> +unless you know that the first part of a copy will always fit.
> +Anyway, the performance difference will probably be negligible,
> +so it will probably be more clear if you use consistent semantics:
> +either truncating or non-truncating.
> +Calling a non-truncating function after a truncating one is necessarily wrong.
> .PP
> +Some of the functions described here are not provided by any library;
> +you should write your own copy if you want to use them.
> +See STANDARDS.
> +.\" ----- BUGS :: -----------------------------------------------------/
> +.SH BUGS
> +All concatenation
> +.RB (* cat ())
> +functions share the same performance problem:
> +.UR https://www.joelonsoftware.com/\:2001/12/11/\:back\-to\-basics/
> +Shlemiel the painter
> +.UE .
> +.\" ----- EXAMPLES :: -------------------------------------------------/
> +.SH EXAMPLES
> +The following are examples of correct use of each of these functions.
> +.\" ----- EXAMPLES :: stpcpy(3) ---------------------------------------/
> +.TP
> +.BR stpcpy (3)
> .in +4n
> .EX
> -size_t strlcpy(char *dest, const char *src, size_t size);
> +p = buf;
> +p = stpcpy(p, "Hello ");
> +p = stpcpy(p, "world");
> +p = stpcpy(p, "!");
> +len = p \- buf;
> +puts(buf);
> .EE
> .in
> -.PP
> -.\" http://static.usenix.org/event/usenix99/full_papers/millert/millert_html/index.html
> -.\" "strlcpy and strlcat - consistent, safe, string copy and concatenation"
> -.\" 1999 USENIX Annual Technical Conference
> -This function is similar to
> -.BR strcpy (),
> -but it copies at most
> -.I size\-1
> -bytes to
> -.IR dest ,
> -truncating the string as necessary.
> -It always adds a terminating null byte.
> -This function fixes some of the problems of
> -.BR strcpy ()
> -but the caller must still handle the possibility of data loss if
> -.I size
> -is too small.
> -The return value of the function is the length of
> -.IR src ,
> -which allows truncation to be easily detected:
> -if the return value is greater than or equal to
> -.IR size ,
> -truncation occurred.
> -If loss of data matters, the caller
> -.I must
> -either check the arguments before the call,
> -or test the function return value.
> -.BR strlcpy ()
> -is not present in glibc and is not standardized by POSIX,
> -.\" https://lwn.net/Articles/506530/
> -but is available on Linux via the
> -.I libbsd
> -library.
> -.SH BUGS
> -If the destination string of a
> -.BR strcpy ()
> -is not large enough, then anything might happen.
> -Overflowing fixed-length string buffers is a favorite cracker technique
> -for taking complete control of the machine.
> -Any time a program reads or copies data into a buffer,
> -the program first needs to check that there's enough space.
> -This may be unnecessary if you can show that overflow is impossible,
> -but be careful: programs can get changed over time,
> -in ways that may make the impossible possible.
> +.\" ----- EXAMPLES :: strcpy(3), strcat(3) ----------------------------/
> +.TP
> +.BR strcpy (3)
> +.TQ
> +.BR strcat (3)
> +.in +4n
> +.EX
> +strcpy(buf, "Hello ");
> +strcat(buf, "world");
> +strcat(buf, "!");
> +len = strlen(buf);
> +puts(buf);
> +.EE
> +.in
> +.\" ----- EXAMPLES :: stpecpy(3), stpecpyx(3) -------------------------/
> +.TP
> +.BR stpecpy (3)
> +.TQ
> +.BR stpecpyx (3)
> +.in +4n
> +.EX
> +past_end = buf + sizeof(buf);
> +p = buf;
> +p = stpecpy(p, past_end, "Hello ");
> +p = stpecpy(p, past_end, "world");
> +p = stpecpy(p, past_end, "!");
> +if (p == past_end) {
> + p\-\-;
> + goto toolong;
> +}
> +len = p \- buf;
> +puts(buf);
> +.EE
> +.in
> +.\" ----- EXAMPLES :: strlcpy(3bsd), strlcat(3bsd) --------------------/
> +.TP
> +.BR strlcpy (3bsd)
> +.TQ
> +.BR strlcat (3bsd)
> +.in +4n
> +.EX
> +if (strlcpy(buf, "Hello ", sizeof(buf)) >= sizeof(buf))
> + goto toolong;
> +if (strlcat(buf, "world", sizeof(buf)) >= sizeof(buf))
> + goto toolong;
> +len = strlcat(buf, "!", sizeof(buf));
> +if (len >= sizeof(buf))
> + goto toolong;
> +puts(buf);
> +.EE
> +.in
> +.\" ----- EXAMPLES :: strscpy(3) --------------------------------------/
> +.TP
> +.BR strscpy (3)
> +.in +4n
> +.EX
> +len = strscpy(buf, "Hello world!", sizeof(buf));
> +if (len == \-E2BIG)
> + goto toolong;
> +puts(buf);
> +.EE
> +.in
> +.\" ----- EXAMPLES :: stpncpy(3) --------------------------------------/
> +.TP
> +.BR stpncpy (3)
> +.in +4n
> +.EX
> +past_end = buf + sizeof(buf);
> +end = stpncpy(buf, "Hello world!", sizeof(buf));
> +if (end == past_end)
> + goto toolong;
> +len = end \- buf;
> +for (size_t i = 0; i < sizeof(buf); i++)
> + putchar(buf[i]);
> +.EE
> +.in
> +.\" ----- EXAMPLES :: strncpy(3) --------------------------------------/
> +.TP
> +.BR strncpy (3)
> +.in +4n
> +.EX
> +strncpy(buf, "Hello world!", sizeof(buf));
> +if (buf[sizeof(buf) \- 1] != \(aq\e0\(aq)
> + goto toolong;
> +len = strnlen(buf, sizeof(buf));
> +for (size_t i = 0; i < sizeof(buf); i++)
> + putchar(buf[i]);
> +.EE
> +.in
> +.\" ----- EXAMPLES :: ustr2stp(3) -------------------------------------/
> +.TP
> +.BR ustr2stp (3)
> +.in +4n
> +.EX
> +p = buf;
> +p = ustr2stp(p, "Hello ", 6);
> +p = ustr2stp(p, "world", 42); // Padding null bytes ignored.
> +p = ustr2stp(p, "!", 1);
> +len = p \- buf;
> +puts(buf);
> +.EE
> +.in
> +.\" ----- EXAMPLES :: strncat(3) --------------------------------------/
> +.TP
> +.BR strncat (3)
> +.in +4n
> +.EX
> +buf[0] = \(aq\e0\(aq; // There's no 'cpy' function to this 'cat'.
> +strncat(buf, "Hello ", 6);
> +strncat(buf, "world", 42); // Padding null bytes ignored.
> +strncat(buf, "!", 1);
> +len = strlen(buf);
> +puts(buf);
> +.EE
> +.in
> +.\" ----- EXAMPLES :: mempcpy(3) --------------------------------------/
> +.TP
> +.BR mempcpy (3)
> +.in +4n
> +.EX
> +p = buf;
> +p = mempcpy(p, "Hello ", 6);
> +p = mempcpy(p, "world", 5);
> +p = mempcpy(p, "!", 1);
> +*p = \(aq\e0\(aq;
> +len = p \- buf;
> +puts(buf);
> +.EE
> +.in
> +.\" ----- SEE ALSO :: -------------------------------------------------/
> .SH SEE ALSO
> -.BR bcopy (3),
> -.BR memccpy (3),
> +.BR bzero (3),
> .BR memcpy (3),
> -.BR memmove (3),
> -.BR stpcpy (3),
> -.BR strdup (3),
> -.BR string (3),
> -.BR wcscpy (3)
> +.BR memccpy (3),
> +.BR mempcpy (3),
> +.BR string (3)
> --
> 2.38.1
>
* Re: [PATCH v3 1/1] strcpy.3: Rewrite page to document all string-copying functions
2022-12-14 16:22 ` Douglas McIlroy
@ 2022-12-14 16:36 ` Alejandro Colomar
2022-12-14 17:11 ` Alejandro Colomar
0 siblings, 1 reply; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-14 16:36 UTC (permalink / raw)
To: Douglas McIlroy
Cc: linux-man, Alejandro Colomar, Martin Sebor, G. Branden Robinson,
Jakub Wilk
[-- Attachment #1.1: Type: text/plain, Size: 2017 bytes --]
Hi Doug!
Thanks for having a look at it!
On 12/14/22 17:22, Douglas McIlroy wrote:
>> a sequence of zero or more non-null characters followed by a null byte
>
> Varying terminology (character vs byte) is poor style in technical writing.
I thought of using null character, but it was longer, and I prefer shorter terms.
About using byte for everything... it feels a bit wrong especially when I'm
trying to make a clear distinction between strings and character sequences that
don't have a terminating NUL.
And, since the two are distinct things that should not be mixed (as far as
strings are concerned), it doesn't feel so bad using different terms for them.
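To make the distinction concrete, here is a minimal sketch (not taken from the patch; the helper names field_len() and field_to_string() are made up for illustration). It treats a fixed-width buffer as a null-padded character sequence, measures it without reading past the buffer, and only then builds a string by adding the terminating null byte:

```c
#include <stddef.h>
#include <string.h>

/* Length of the character sequence stored in a null-padded,
 * fixed-width buffer.  Never reads more than sz bytes, so it is
 * safe even when the buffer holds no null byte at all. */
static size_t
field_len(const char *field, size_t sz)
{
	const char *nul = memchr(field, '\0', sz);

	return (nul != NULL) ? (size_t)(nul - field) : sz;
}

/* Build a string from the character sequence: copy the non-null
 * characters and append the terminating null byte.
 * dst must have room for sz + 1 bytes.
 * Returns a pointer to the terminating null byte (the "end"). */
static char *
field_to_string(char *dst, const char *field, size_t sz)
{
	size_t len = field_len(field, sz);

	memcpy(dst, field, len);
	dst[len] = '\0';
	return dst + len;
}
```

A buffer such as char name[4] = {'r', 'o', 'o', 't'} is a valid character sequence of length 4, but passing it to strlen(3) would read past the buffer, which is why the two kinds of object deserve different names.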
>
>> concatenate
>
> We began fighting this pomposity before v7. There has only been
> backsliding since.
> "Catenate" is crisper, means the same thing, and concurs with the "cat" command.
> I invite you to join the battle for simplicity.
Heh, I didn't know the word existed. In Spanish we only have "concatenar".
I'll happily join this battle for simplicity :)
>
>> chain copy
>
> This term is never overtly defined. The definition might be inferred
> from, "To chain copy
> functions, they need to return a pointer to the end", but the
> problematic grammar of the
> sentence diverts attention from its content.
Okay, I'll try to improve the wording in that paragraph; indeed that subsection
intended to define the "chain copy" term.
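As a rough illustration of the term (a sketch only; chain_copy() is a hypothetical stand-in with the same contract as stpcpy(3), written in plain ISO C so that it builds without feature-test macros): each call returns a pointer to the terminating null byte it has just written, and that pointer becomes the dst of the next call, so nothing ever has to search for the end of the string again.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper with the contract of stpcpy(3): copy the
 * string src to dst and return a pointer to the terminating null
 * byte written in dst. */
static char *
chain_copy(char *dst, const char *src)
{
	size_t len = strlen(src);

	memcpy(dst, src, len + 1);  /* include the null byte */
	return dst + len;
}

/* Build "Hello world!" by chaining three copies.
 * buf must have room for at least 13 bytes.
 * Returns the length of the resulting string. */
static size_t
build_greeting(char *buf)
{
	char *p = buf;

	p = chain_copy(p, "Hello ");
	p = chain_copy(p, "world");
	p = chain_copy(p, "!");

	return (size_t)(p - buf);   /* length comes for free */
}
```

The equivalent strcpy(3) plus strcat(3) sequence gives the same result, but each strcat(3) call has to walk the destination to find its end first.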
>
>> strscpy
>
> Doesn't it muddy the waters to include a non-library function in man3?
Initially I wanted to discuss it because it always comes up in discussions about
better string-copying functions.
But since I don't provide an implementation for it (since it's hard to get
right) (as opposed to the other functions that are not in libraries, for which I
show trivial implementations), and don't see it very useful, I can remove it.
Fewer lines wasted on it.
Maybe I'll keep a small mention of it.
>
> Doug
Cheers,
Alex
* Re: [PATCH v3 1/1] strcpy.3: Rewrite page to document all string-copying functions
2022-12-14 16:36 ` Alejandro Colomar
@ 2022-12-14 17:11 ` Alejandro Colomar
2022-12-14 17:19 ` Alejandro Colomar
0 siblings, 1 reply; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-14 17:11 UTC (permalink / raw)
To: Douglas McIlroy
Cc: linux-man, Alejandro Colomar, Martin Sebor, G. Branden Robinson,
Jakub Wilk
[-- Attachment #1.1: Type: text/plain, Size: 2409 bytes --]
Hi Doug,
On 12/14/22 17:36, Alejandro Colomar wrote:
> On 12/14/22 17:22, Douglas McIlroy wrote:
>>> chain copy
>>
>> This term is never overtly defined. The definition might be inferred
>> from, "To chain copy
>> functions, they need to return a pointer to the end", but the
>> problematic grammar of the
>> sentence diverts attention from its content.
>
> Okay, I'll try to improve the wording in that paragraph; indeed that subsection
> intended to define the "chain copy" term.
>
>>
I'll hold off on sending v5 to see if there is more feedback from others, but here's
what I have for documenting the chain term:
@@ -202,15 +192,36 @@ .SS Terms (and abbreviations)
It is used as a sentinel value,
to be able to truncate strings or character sequences
instead of overrunning the containing buffer.
-.\" ----- DESCRIPTION :: Copy, concatenate, and chain-copy ------------/
-.SS Copy, concatenate, and chain-copy
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: copy ------------/
+.TP
+.I copy
+This term is used when
+the writing starts at the first element pointed to by
+.IR dst .
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: catenate --------/
+.TP
+.I catenate
+This term is used when
+a function first finds the terminating null byte in
+.IR dst ,
+and then starts writing at that position.
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: chain -----------/
+.TP
+.I chain
+This term is used when
+it's the programmer who provides a pointer to the
+.IR end ,
+and the function starts writing at that location.
+.IR dst .
+.\" ----- DESCRIPTION :: Copy, catenate, and chain-copy ---------------/
+.SS Copy, catenate, and chain-copy
Originally,
-there was a distinction between functions that copy and those that concatenate.
+there was a distinction between functions that copy and those that catenate.
However, newer functions that copy while allowing chaining
cover both use cases with a single API.
They are also algorithmically faster,
since they don't need to search for the end of the existing string.
-However, functions that concatenate have a much simpler use,
+However, functions that catenate have a much simpler use,
so if performance is not important,
it can make sense to use them for improving readability.
.PP
Cheers,
Alex
--
<http://www.alejandro-colomar.es/>
* Re: [PATCH v3 1/1] strcpy.3: Rewrite page to document all string-copying functions
2022-12-14 17:11 ` Alejandro Colomar
@ 2022-12-14 17:19 ` Alejandro Colomar
0 siblings, 0 replies; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-14 17:19 UTC (permalink / raw)
To: Douglas McIlroy
Cc: linux-man, Alejandro Colomar, Martin Sebor, G. Branden Robinson,
Jakub Wilk
[-- Attachment #1.1: Type: text/plain, Size: 3030 bytes --]
On 12/14/22 18:11, Alejandro Colomar wrote:
> Hi Doug,
>
> On 12/14/22 17:36, Alejandro Colomar wrote:
>> On 12/14/22 17:22, Douglas McIlroy wrote:
>>>> chain copy
>>>
>>> This term is never overtly defined. The definition might be inferred
>>> from, "To chain copy
>>> functions, they need to return a pointer to the end", but the
>>> problematic grammar of the
>>> sentence diverts attention from its content.
>>
>> Okay, I'll try to improve the wording in that paragraph; indeed that
>> subsection intended to define the "chain copy" term.
>>
>>>
>
> I'll hold off on sending v5 to see if there is more feedback from others, but here's
> what I have for documenting the chain term:
>
>
> @@ -202,15 +192,36 @@ .SS Terms (and abbreviations)
> It is used as a sentinel value,
> to be able to truncate strings or character sequences
> instead of overrunning the containing buffer.
> -.\" ----- DESCRIPTION :: Copy, concatenate, and chain-copy ------------/
> -.SS Copy, concatenate, and chain-copy
> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: copy ------------/
> +.TP
> +.I copy
> +This term is used when
> +the writing starts at the first element pointed to by
> +.IR dst .
> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: catenate --------/
> +.TP
> +.I catenate
> +This term is used when
> +a function first finds the terminating null byte in
> +.IR dst ,
> +and then starts writing at that position.
> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: chain -----------/
> +.TP
> +.I chain
> +This term is used when
> +it's the programmer who provides a pointer to the
> +.IR end ,
> +and the function starts writing at that location.
> +.IR dst .
@@ -213,6 +213,10 @@ .SS Terms (and abbreviations)
.IR end ,
and the function starts writing at that location.
.IR dst .
+The function returns a pointer to the new
+.I end
+after the call,
+so that the programmer can use it to chain such calls.
.\" ----- DESCRIPTION :: Copy, catenate, and chain-copy ---------------/
.SS Copy, catenate, and chain-copy
Originally,
And this is for completeness. :)
> +.\" ----- DESCRIPTION :: Copy, catenate, and chain-copy ---------------/
> +.SS Copy, catenate, and chain-copy
> Originally,
> -there was a distinction between functions that copy and those that concatenate.
> +there was a distinction between functions that copy and those that catenate.
> However, newer functions that copy while allowing chaining
> cover both use cases with a single API.
> They are also algorithmically faster,
> since they don't need to search for the end of the existing string.
> -However, functions that concatenate have a much simpler use,
> +However, functions that catenate have a much simpler use,
> so if performance is not important,
> it can make sense to use them for improving readability.
> .PP
>
>
>
> Cheers,
>
> Alex
>
--
<http://www.alejandro-colomar.es/>
* [PATCH v5 0/5] Rewrite pages about string-copying functions
2022-12-14 16:17 ` [PATCH v4 " Alejandro Colomar
@ 2022-12-15 0:26 ` Alejandro Colomar
2022-12-19 21:02 ` [PATCH v6 0/5] Rewrite documentation for " Alejandro Colomar
` (5 more replies)
2022-12-15 0:26 ` [PATCH v5 1/5] string_copy.7: Add page to document all string-copying functions Alejandro Colomar
` (4 subsequent siblings)
5 siblings, 6 replies; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-15 0:26 UTC (permalink / raw)
To: linux-man, Martin Sebor, G. Branden Robinson, Douglas McIlroy,
Jakub Wilk, Serge Hallyn, Iker Pedrosa, Andrew Pinski
Cc: Alejandro Colomar
Hi,
After a long investigation, I'm rewriting all manual pages about string-
copying functions (and also non-string ones). When there was no term
for a thing, I used an invented one, and documented it, in an attempt to
form precedent; for example, for non-terminated strings (which is an
oxymoron, since strings are necessarily terminated) I used "character
sequence" (suggested by Martin, to improve my own "unterminated string").
v5 addresses a few suggestions by Doug, and also a few concerns by
Jakub. I'm no longer replacing the current pages, but rather adding a new
one, and also rewriting the old ones to be consistent with it, but I
kept them as a quick reference, for those who need it. They also have
complete example programs each.
This time, I'll send the formatted pages as replies to the corresponding
diffs, since there are several.
Cheers,
Alex
Alejandro Colomar (5):
string_copy.7: Add page to document all string-copying functions
stpecpy.3, stpecpyx.3, ustpcpy.3, ustr2stp.3, zustr2stp.3,
zustr2ustp.3: Add new links to string_copy(7)
stpcpy.3, strcpy.3, strcat.3: Document in a single page
stpncpy.3, strncpy.3: Document in a single page
strncat.3: Rewrite to be consistent with string_copy.7.
man3/stpcpy.3 | 13 -
man3/stpecpy.3 | 1 +
man3/stpecpyx.3 | 1 +
man3/stpncpy.3 | 163 +++++----
man3/strcat.3 | 161 +--------
man3/strcpy.3 | 226 +++++++-----
man3/strncat.3 | 147 +++-----
man3/strncpy.3 | 130 +------
man3/ustpcpy.3 | 1 +
man3/ustr2stp.3 | 1 +
man3/zustr2stp.3 | 1 +
man3/zustr2ustp.3 | 1 +
man7/string_copy.7 | 869 +++++++++++++++++++++++++++++++++++++++++++++
13 files changed, 1162 insertions(+), 553 deletions(-)
create mode 100644 man3/stpecpy.3
create mode 100644 man3/stpecpyx.3
create mode 100644 man3/ustpcpy.3
create mode 100644 man3/ustr2stp.3
create mode 100644 man3/zustr2stp.3
create mode 100644 man3/zustr2ustp.3
create mode 100644 man7/string_copy.7
--
2.38.1
* [PATCH v5 1/5] string_copy.7: Add page to document all string-copying functions
2022-12-14 16:17 ` [PATCH v4 " Alejandro Colomar
2022-12-15 0:26 ` [PATCH v5 0/5] Rewrite pages about " Alejandro Colomar
@ 2022-12-15 0:26 ` Alejandro Colomar
2022-12-15 0:30 ` Alejandro Colomar
2022-12-15 0:26 ` [PATCH v5 2/5] stpecpy.3, stpecpyx.3, ustpcpy.3, ustr2stp.3, zustr2stp.3, zustr2ustp.3: Add new links to string_copy(7) Alejandro Colomar
` (3 subsequent siblings)
5 siblings, 1 reply; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-15 0:26 UTC (permalink / raw)
To: linux-man
Cc: Alejandro Colomar, Martin Sebor, G. Branden Robinson,
Douglas McIlroy, Jakub Wilk, Serge Hallyn, Iker Pedrosa,
Andrew Pinski
This is an opportunity to use consistent language across the
documentation for all string-copying functions.
It is also easier to show the similarities and differences between all
of the functions, so that a reader can use this page to know which
function is needed for a given task.
Alternative functions not provided by libc have been given in the same
page, with reference implementations.
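
For readers of this thread, the following is a minimal sketch of what such a reference implementation could look like for the truncating chain-copy function. It is not the code from the patch: the name stpecpy_sketch() is hypothetical, and the body is only derived from the semantics the page describes (null-terminated result, pointer to the terminating null byte on success, past_end on truncation). It assumes dst <= past_end and that both point into the same buffer.

```c
#include <stddef.h>
#include <string.h>

/* Sketch of a truncating chain-copy, following the semantics
 * described for stpecpy(3) in this page (assumed, not copied). */
static char *
stpecpy_sketch(char *dst, char *past_end, const char *src)
{
	size_t      room, len;
	const char  *nul;

	if (dst == past_end)            /* an earlier call truncated */
		return past_end;

	room = (size_t)(past_end - dst);
	nul = memchr(src, '\0', room);
	if (nul != NULL) {              /* src fits, null byte included */
		len = (size_t)(nul - src);
		memcpy(dst, src, len + 1);
		return dst + len;
	}

	memcpy(dst, src, room - 1);     /* truncate ... */
	past_end[-1] = '\0';            /* ... but keep it a string */
	return past_end;
}
```

Because a truncated call returns past_end and later calls pass it through unchanged, the caller needs a single `p == past_end` check after the last chained call.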
Cc: Martin Sebor <msebor@redhat.com>
Cc: "G. Branden Robinson" <g.branden.robinson@gmail.com>
Cc: Douglas McIlroy <douglas.mcilroy@dartmouth.edu>
Cc: Jakub Wilk <jwilk@jwilk.net>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Iker Pedrosa <ipedrosa@redhat.com>
Cc: Andrew Pinski <pinskia@gmail.com>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
man7/string_copy.7 | 869 +++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 869 insertions(+)
create mode 100644 man7/string_copy.7
diff --git a/man7/string_copy.7 b/man7/string_copy.7
new file mode 100644
index 000000000..be8b841e2
--- /dev/null
+++ b/man7/string_copy.7
@@ -0,0 +1,869 @@
+.\" Copyright 2022 Alejandro Colomar <alx@kernel.org>
+.\"
+.\" SPDX-License-Identifier: BSD-3-Clause
+.\"
+.TH string_copy 7 (date) "Linux man-pages (unreleased)"
+.\" ----- NAME :: -----------------------------------------------------/
+.SH NAME
+stpcpy,
+strcpy, strcat,
+stpecpy, stpecpyx,
+strlcpy, strlcat,
+stpncpy,
+strncpy,
+zustr2ustp, zustr2stp,
+strncat,
+ustpcpy, ustr2stp
+\- copy strings and character sequences
+.\" ----- SYNOPSIS :: -------------------------------------------------/
+.SH SYNOPSIS
+.\" ----- SYNOPSIS :: (Null-terminated) strings -----------------------/
+.SS Strings
+.nf
+// Chain-copy a string.
+.BI "char *stpcpy(char *restrict " dst ", const char *restrict " src );
+.PP
+// Copy/catenate a string.
+.BI "char *strcpy(char *restrict " dst ", const char *restrict " src );
+.BI "char *strcat(char *restrict " dst ", const char *restrict " src );
+.PP
+// Chain-copy a string with truncation.
+.BI "char *stpecpy(char *" dst ", char " past_end "[0], \
+const char *restrict " src );
+.PP
+// Chain-copy a string with truncation and SIGSEGV on UB.
+.BI "char *stpecpyx(char *" dst ", char " past_end "[0], \
+const char *restrict " src );
+.PP
+// Copy/catenate a string with truncation and SIGSEGV on UB.
+.BI "size_t strlcpy(char " dst "[restrict ." sz "], \
+const char *restrict " src ,
+.BI " size_t " sz );
+.BI "size_t strlcat(char " dst "[restrict ." sz "], \
+const char *restrict " src ,
+.BI " size_t " sz );
+.fi
+.\" ----- SYNOPSIS :: Null-padded character sequences --------/
+.SS Null-padded character sequences
+.nf
+// Zero a fixed-width buffer, and
+// copy a string into a character sequence with truncation.
+.BI "char *stpncpy(char " dst "[restrict ." sz "], \
+const char *restrict " src ,
+.BI " size_t " sz );
+.PP
+// Zero a fixed-width buffer, and
+// copy a string into a character sequence with truncation.
+.BI "char *strncpy(char " dest "[restrict ." sz "], \
+const char *restrict " src ,
+.BI " size_t " sz );
+.PP
+// Chain-copy a null-padded character sequence into a character sequence.
+.BI "char *zustr2ustp(char *restrict " dst ", \
+const char " src "[restrict ." sz ],
+.BI " size_t " sz );
+.PP
+// Chain-copy a null-padded character sequence into a string.
+.BI "char *zustr2stp(char *restrict " dst ", \
+const char " src "[restrict ." sz ],
+.BI " size_t " sz );
+.PP
+// Catenate a null-padded character sequence into a string.
+.BI "char *strncat(char *restrict " dst ", const char " src "[restrict ." sz ],
+.BI " size_t " sz );
+.fi
+.\" ----- SYNOPSIS :: Measured character sequences --------------------/
+.SS Measured character sequences
+.nf
+// Chain-copy a measured character sequence.
+.BI "char *ustpcpy(char *restrict " dst ", \
+const char " src "[restrict ." len ],
+.BI " size_t " len );
+.PP
+// Chain-copy a measured character sequence into a string.
+.BI "char *ustr2stp(char *restrict " dst ", \
+const char " src "[restrict ." len ],
+.BI " size_t " len );
+.fi
+.SH DESCRIPTION
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: -----------------/
+.SS Terms (and abbreviations)
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: string (str) ----/
+.TP
+.IR "string " ( str )
+is a sequence of zero or more non-null characters followed by a null byte.
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: null-padded character seq
+.TP
+.I character sequence
+is a sequence of zero or more non-null characters.
+A program should never use a character sequence where a string is required.
+However, with appropriate care,
+a string can be used in the place of a character sequence.
+.RS
+.TP
+.IR "null-padded character sequence " ( zustr )
+Character sequences can be contained in fixed-width buffers,
+which contain padding null bytes after the character sequence,
+to fill the rest of the buffer
+without affecting the character sequence;
+however, those padding null bytes are not part of the character sequence.
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: measured character sequence
+.TP
+.IR "measured character sequence " ( ustr )
+Character sequence delimited by its length.
+It may be a slice of a larger character sequence,
+or even of a string.
+.RE
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: length (len) ----/
+.TP
+.IR "length " ( len )
+is the number of non-null characters in a string or character sequence.
+It is the return value of
+.I strlen(str)
+and of
+.IR "strnlen(ustr, sz)" .
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: size (sz) -------/
+.TP
+.IR "size " ( sz )
+refers to the entire buffer
+where the string or character sequence is contained.
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: end -------------/
+.TP
+.I end
+is the name of a pointer to the terminating null byte of a string,
+or a pointer to one past the last character of a character sequence.
+This is the return value of functions that allow chaining.
+It is equivalent to
+.IR &str[len] .
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: past_end --------/
+.TP
+.I past_end
+is the name of a pointer to one past the end of the buffer
+that contains a string or character sequence.
+It is equivalent to
+.IR &str[sz] .
+It is used as a sentinel value,
+to be able to truncate strings or character sequences
+instead of overrunning the containing buffer.
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: copy ------------/
+.TP
+.I copy
+This term is used when
+the writing starts at the first element pointed to by
+.IR dst .
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: catenate --------/
+.TP
+.I catenate
+This term is used when
+a function first finds the terminating null byte in
+.IR dst ,
+and then starts writing at that position.
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: chain -----------/
+.TP
+.I chain
+This term is used when
+it's the programmer who provides a pointer to the
+.I end
+in
+.IR dst ,
+and the function starts writing at that location.
+The function returns a pointer to the new
+.I end
+after the call,
+so that the programmer can use it to chain such calls.
+.\" ----- DESCRIPTION :: Copy, catenate, and chain-copy ---------------/
+.SS Copy, catenate, and chain-copy
+Originally,
+there was a distinction between functions that copy and those that catenate.
+However, newer functions that copy while allowing chaining
+cover both use cases with a single API.
+They are also algorithmically faster,
+since they don't need to search for the end of the existing string.
+However, functions that catenate have a much simpler use,
+so if performance is not important,
+it can make sense to use them for improving readability.
+.PP
+For copy functions to be chainable,
+they need to return a pointer to the
+.IR end .
+That's a byproduct of the copy operation,
+so it has no performance costs.
+Functions that return such a pointer,
+and thus can be chained,
+have names of the form
+.RB * stp *(),
+since it's also common to name the pointer just
+.IR p .
+.PP
+Chain-copying functions that truncate
+should accept a pointer to one past the end of the destination buffer,
+and have names of the form
+.RB * stpe *().
+This allows not having to recalculate the remaining size after each call.
+.\" ----- DESCRIPTION :: Truncate or not? -----------------------------/
+.SS Truncate or not?
+The first thing to note is that programmers should be careful with buffers,
+so they always have the correct size,
+and truncation is not necessary.
+.PP
+In most cases,
+truncation is not desired,
+and it is simpler to just do the copy.
+Simpler code is safer code.
+Programming against programming mistakes by adding more code
+just adds more points where mistakes can be made.
+.PP
+Nowadays,
+compilers can detect most programmer errors with features like
+compiler warnings,
+static analyzers, and
+.BR \%_FORTIFY_SOURCE
+(see
+.BR ftm (7)).
+Keeping the code simple
+helps these overflow-detection features be more precise.
+.PP
+When validating user input,
+however,
+it makes sense to truncate.
+Remember to check the return value of such function calls.
+.PP
+Functions that truncate:
+.IP \(bu 3
+.BR stpecpy (3)
+is the most efficient string copy function that performs truncation.
+It requires checking for truncation only once, after all chained calls.
+.IP \(bu
+.BR stpecpyx (3)
+is a variant of
+.BR stpecpy (3)
+that consumes the entire source string,
+to catch bugs in the program
+by forcing a segmentation fault (as
+.BR strlcpy (3bsd)
+and
+.BR strlcat (3bsd)
+do).
+.IP \(bu
+.BR strlcpy (3bsd)
+and
+.BR strlcat (3bsd)
+are designed to crash if the input string is invalid
+(doesn't contain a terminating null byte).
+.IP \(bu
+.BR stpncpy (3)
+and
+.BR strncpy (3)
+also truncate, but they write null-padded character sequences
+rather than strings.
+.\" ----- DESCRIPTION :: Null-padded character sequences --------------/
+.SS Null-padded character sequences
+For historic reasons,
+some standard APIs,
+such as
+.BR utmpx (5),
+use null-padded character sequences in fixed-width buffers.
+To interface with them,
+specialized functions need to be used.
+.PP
+To copy strings into them, use
+.BR stpncpy (3).
+.PP
+To copy from a character sequence within a fixed-width buffer into a string,
+ignoring any trailing null bytes in the source fixed-width buffer,
+you should use
+.BR zustr2stp (3)
+or
+.BR strncat (3).
+.PP
+To copy from a character sequence within a fixed-width buffer
+into a character sequence,
+ignoring any trailing null bytes in the source fixed-width buffer,
+you should use
+.BR zustr2ustp (3).
+.\" ----- DESCRIPTION :: Measured character sequences -----------------/
+.SS Measured character sequences
+The simplest character sequence copying function is
+.BR mempcpy (3).
+It requires always knowing the length of your character sequences,
+for which structures can be used.
+It makes the code much faster,
+since you always know the length of your character sequences,
+and can do the minimal copies and length measurements.
+.BR mempcpy (3)
+copies character sequences,
+so you need to explicitly set the terminating null byte if you need a string.
+.PP
+However,
+for keeping type safety,
+it's good to add a wrapper that uses
+.I char\~*
+instead of
+.IR void\~* :
+.BR ustpcpy (3).
+.PP
+In programs that make considerable use of strings or character sequences,
+and need the best performance,
+using overlapping character sequences can make a big difference.
+It allows holding subsequences of a larger character sequence,
+while not duplicating memory
+nor using time to do a copy.
+.PP
+However, this is delicate,
+since it requires using character sequences.
+C library APIs use strings,
+so programs that use character sequences
+will have to take care of differentiating strings from character sequences.
+.PP
+To copy a measured character sequence, use
+.BR ustpcpy (3).
+.PP
+To copy a measured character sequence into a string, use
+.BR ustr2stp (3).
+.PP
+Because these functions ask for the length,
+and a string is by nature composed of a character sequence of the same length
+plus a terminating null byte,
+a string is also accepted as input.
+.\" ----- DESCRIPTION :: String vs character sequence -----------------/
+.SS String vs character sequence
+Some functions only operate on strings.
+Those require that the input
+.I src
+is a string,
+and guarantee an output string
+(even when truncation occurs).
+Functions that catenate
+also require that
+.I dst
+holds a string before the call.
+List of functions:
+.IP \(bu 3
+.PD 0
+.BR stpcpy (3)
+.IP \(bu
+.BR strcpy "(3), \c"
+.BR strcat (3)
+.IP \(bu
+.BR stpecpy "(3), \c"
+.BR stpecpyx (3)
+.IP \(bu
+.BR strlcpy "(3bsd), \c"
+.BR strlcat (3bsd)
+.PD
+.PP
+Other functions require an input string,
+but create a character sequence as output.
+These functions have confusing names,
+and have a long history of misuse.
+List of functions:
+.IP \(bu 3
+.PD 0
+.BR stpncpy (3)
+.IP \(bu
+.BR strncpy (3)
+.PD
+.PP
+Other functions operate on an input character sequence,
+and create an output string.
+Functions that catenate
+also require that
+.I dst
+holds a string before the call.
+.BR strncat (3)
+has an even more misleading name than the functions above.
+List of functions:
+.IP \(bu 3
+.PD 0
+.BR zustr2stp (3)
+.IP \(bu
+.BR strncat (3)
+.IP \(bu
+.BR ustr2stp (3)
+.PD
+.PP
+Other functions operate on an input character sequence
+to create an output character sequence.
+List of functions:
+.IP \(bu 3
+.PD 0
+.BR ustpcpy (3)
+.IP \(bu
+.BR zustr2ustp (3)
+.PD
+.\" ----- DESCRIPTION :: Functions :: ---------------------------------/
+.SS Functions
+.\" ----- DESCRIPTION :: Functions :: stpcpy(3) -----------------------/
+.TP
+.BR stpcpy (3)
+This function copies the input string into a destination string.
+The programmer is responsible for allocating a buffer large enough.
+It returns a pointer suitable for chaining.
+.\" ----- DESCRIPTION :: Functions :: strcpy(3), strcat(3) ------------/
+.TP
+.BR strcpy (3)
+.TQ
+.BR strcat (3)
+These functions copy and catenate the input string into a destination string.
+The programmer is responsible for allocating a buffer large enough.
+The return value is useless.
+.IP
+.BR stpcpy (3)
+is a faster alternative to these functions.
+.\" ----- DESCRIPTION :: Functions :: stpecpy(3), stpecpyx(3) ---------/
+.TP
+.BR stpecpy (3)
+.TQ
+.BR stpecpyx (3)
+These functions copy the input string into a destination string.
+If the destination buffer,
+limited by a pointer to one past the end of it,
+isn't large enough to hold the copy,
+the resulting string is truncated
+(but it is guaranteed to be null-terminated).
+They return a pointer suitable for chaining.
+Truncation needs to be detected only once after the last chained call.
+.BR stpecpyx (3)
+has identical semantics to
+.BR stpecpy (3),
+except that it forces a SIGSEGV if the
+.I src
+pointer does not point to a string.
+.IP
+These functions are not provided by any library;
+see EXAMPLES for a reference implementation.
+.\" ----- DESCRIPTION :: Functions :: strlcpy(3bsd), strlcat(3bsd) ----/
+.TP
+.BR strlcpy (3bsd)
+.TQ
+.BR strlcat (3bsd)
+These functions copy and catenate the input string into a destination string.
+If the destination buffer,
+limited by its size,
+isn't large enough to hold the copy,
+the resulting string is truncated
+(but it is guaranteed to be null-terminated).
+They return the length of the total string they tried to create.
+These functions force a SIGSEGV if the
+.I src
+pointer does not point to a string.
+.IP
+.BR stpecpyx (3)
+is a faster alternative to these functions.
+.\" ----- DESCRIPTION :: Functions :: stpncpy(3) ----------------------/
+.TP
+.BR stpncpy (3)
+This function copies the input string into
+a destination null-padded character sequence in a fixed-width buffer.
+If the destination buffer,
+limited by its size,
+isn't large enough to hold the copy,
+the resulting character sequence is truncated.
+Since it creates a character sequence,
+it doesn't need to write a terminating null byte.
+It's impossible to distinguish truncation after the call,
+from a character sequence that just fits the destination buffer;
+truncation should be detected from the length of the original string.
+.\" ----- DESCRIPTION :: Functions :: strncpy(3) ----------------------/
+.TP
+.BR strncpy (3)
+This function is identical to
+.BR stpncpy (3)
+except for the useless return value.
+.IP
+.BR stpncpy (3)
+is a more useful alternative to this function.
+.\" ----- DESCRIPTION :: Functions :: zustr2ustp(3) --------------------/
+.TP
+.BR zustr2ustp (3)
+This function copies the input character sequence
+contained in a null-padded fixed-width buffer,
+into a destination character sequence.
+The programmer is responsible for allocating a buffer large enough.
+It returns a pointer suitable for chaining.
+.IP
+A truncating version of this function doesn't exist,
+since the size of the original character sequence is always known,
+so it wouldn't be very useful.
+.IP
+This function is not provided by any library;
+see EXAMPLES for a reference implementation.
+.\" ----- DESCRIPTION :: Functions :: zustr2stp(3) --------------------/
+.TP
+.BR zustr2stp (3)
+This function copies the input character sequence
+contained in a null-padded fixed-width buffer,
+into a destination string.
+The programmer is responsible for allocating a buffer large enough.
+It returns a pointer suitable for chaining.
+.IP
+A truncating version of this function doesn't exist,
+since the size of the original character sequence is always known,
+so it wouldn't be very useful.
+.IP
+This function is not provided by any library;
+see EXAMPLES for a reference implementation.
+.\" ----- DESCRIPTION :: Functions :: strncat(3) ----------------------/
+.TP
+.BR strncat (3)
+Do not confuse this function with
+.BR strncpy (3);
+they are not related at all.
+.IP
+This function catenates the input character sequence
+contained in a null-padded fixed-width buffer,
+into a destination string.
+The programmer is responsible for allocating a buffer large enough.
+The return value is useless.
+.IP
+.BR zustr2stp (3)
+is a faster alternative to this function.
+.\" ----- DESCRIPTION :: Functions :: ustpcpy(3) ----------------------/
+.TP
+.BR ustpcpy (3)
+This function copies the input character sequence,
+limited by its length,
+into a destination character sequence.
+The programmer is responsible for allocating a buffer large enough.
+It returns a pointer suitable for chaining.
+.\" ----- DESCRIPTION :: Functions :: ustr2stp(3) ---------------------/
+.TP
+.BR ustr2stp (3)
+This function copies the input character sequence,
+limited by its length,
+into a destination string.
+The programmer is responsible for allocating a buffer large enough.
+It returns a pointer suitable for chaining.
+.\" ----- RETURN VALUE :: ---------------------------------------------/
+.SH RETURN VALUE
+The following functions return
+a pointer to the terminating null byte in the destination string.
+.IP \(bu 3
+.PD 0
+.BR stpcpy (3)
+.IP \(bu
+.BR ustr2stp (3)
+.IP \(bu
+.BR zustr2stp (3)
+.PD
+.PP
+The following functions return
+a pointer to the terminating null byte in the destination string,
+except when truncation occurs;
+if truncation occurs,
+they return a pointer to one past the end of the destination buffer
+.RI ( past_end ).
+.IP \(bu 3
+.BR stpecpy (3),
+.BR stpecpyx (3)
+.PP
+The following function returns
+a pointer to one after the last character
+in the destination character sequence;
+if truncation occurs,
+that pointer is equivalent to
+a pointer to one past the end of the destination buffer.
+.IP \(bu 3
+.BR stpncpy (3)
+.PP
+The following functions return
+a pointer to one after the last character
+in the destination character sequence.
+.IP \(bu 3
+.PD 0
+.BR zustr2ustp (3)
+.IP \(bu
+.BR ustpcpy (3)
+.PD
+.PP
+The following functions return
+the length of the total string that they tried to create
+(as if truncation didn't occur).
+.IP \(bu 3
+.BR strlcpy (3bsd),
+.BR strlcat (3bsd)
+.PP
+The following functions return the
+.I dst
+pointer,
+which is useless.
+.IP \(bu 3
+.PD 0
+.BR strcpy (3),
+.BR strcat (3)
+.IP \(bu
+.BR strncpy (3)
+.IP \(bu
+.BR strncat (3)
+.PD
+.\" ----- NOTES :: strscpy(9) -----------------------------------------/
+.SH NOTES
+The Linux kernel has an internal function for copying strings,
+which is similar to
+.BR stpecpy (3),
+except that it can't be chained:
+.TP
+.BR strscpy (9)
+This function copies the input string into a destination string.
+If the destination buffer,
+limited by its size,
+isn't large enough to hold the copy,
+the resulting string is truncated
+(but it is guaranteed to be null-terminated).
+It returns the length of the destination string, or
+.B \-E2BIG
+on truncation.
+.IP
+.BR stpecpy (3)
+is a simpler and faster alternative to this function.
+.RE
+.\" ----- CAVEATS :: --------------------------------------------------/
+.SH CAVEATS
+Don't mix chain calls to truncating and non-truncating functions.
+It is conceptually wrong
+unless you know that the first part of a copy will always fit.
+Anyway, the performance difference will probably be negligible,
+so the code will probably be clearer if you use consistent semantics:
+either truncating or non-truncating.
+Calling a non-truncating function after a truncating one is necessarily wrong.
+.\" ----- BUGS :: -----------------------------------------------------/
+.SH BUGS
+All catenation functions share the same performance problem:
+.UR https://www.joelonsoftware.com/\:2001/12/11/\:back\-to\-basics/
+Shlemiel the painter
+.UE .
+.\" ----- EXAMPLES :: -------------------------------------------------/
+.SH EXAMPLES
+The following are examples of correct use of each of these functions.
+.\" ----- EXAMPLES :: stpcpy(3) ---------------------------------------/
+.TP
+.BR stpcpy (3)
+.EX
+p = buf;
+p = stpcpy(p, "Hello ");
+p = stpcpy(p, "world");
+p = stpcpy(p, "!");
+len = p \- buf;
+puts(buf);
+.EE
+.\" ----- EXAMPLES :: strcpy(3), strcat(3) ----------------------------/
+.TP
+.BR strcpy (3)
+.TQ
+.BR strcat (3)
+.EX
+strcpy(buf, "Hello ");
+strcat(buf, "world");
+strcat(buf, "!");
+len = strlen(buf);
+puts(buf);
+.EE
+.\" ----- EXAMPLES :: stpecpy(3), stpecpyx(3) -------------------------/
+.TP
+.BR stpecpy (3)
+.TQ
+.BR stpecpyx (3)
+.EX
+past_end = buf + sizeof(buf);
+p = buf;
+p = stpecpy(p, past_end, "Hello ");
+p = stpecpy(p, past_end, "world");
+p = stpecpy(p, past_end, "!");
+if (p == past_end) {
+ p\-\-;
+ goto toolong;
+}
+len = p \- buf;
+puts(buf);
+.EE
+.\" ----- EXAMPLES :: strlcpy(3bsd), strlcat(3bsd) --------------------/
+.TP
+.BR strlcpy (3bsd)
+.TQ
+.BR strlcat (3bsd)
+.EX
+if (strlcpy(buf, "Hello ", sizeof(buf)) >= sizeof(buf))
+ goto toolong;
+if (strlcat(buf, "world", sizeof(buf)) >= sizeof(buf))
+ goto toolong;
+len = strlcat(buf, "!", sizeof(buf));
+if (len >= sizeof(buf))
+ goto toolong;
+puts(buf);
+.EE
+.\" ----- EXAMPLES :: strscpy(9) --------------------------------------/
+.TP
+.BR strscpy (9)
+.EX
+len = strscpy(buf, "Hello world!", sizeof(buf));
+if (len == \-E2BIG)
+ goto toolong;
+puts(buf);
+.EE
+.\" ----- EXAMPLES :: stpncpy(3) --------------------------------------/
+.TP
+.BR stpncpy (3)
+.EX
+end = stpncpy(buf, "Hello world!", sizeof(buf));
+if (sizeof(buf) < strlen("Hello world!"))
+ goto toolong;
+len = end \- buf;
+for (size_t i = 0; i < sizeof(buf); i++)
+ putchar(buf[i]);
+.EE
+.\" ----- EXAMPLES :: strncpy(3) --------------------------------------/
+.TP
+.BR strncpy (3)
+.EX
+strncpy(buf, "Hello world!", sizeof(buf));
+if (sizeof(buf) < strlen("Hello world!"))
+ goto toolong;
+len = strnlen(buf, sizeof(buf));
+for (size_t i = 0; i < sizeof(buf); i++)
+ putchar(buf[i]);
+.EE
+.\" ----- EXAMPLES :: zustr2ustp(3) -----------------------------------/
+.TP
+.BR zustr2ustp (3)
+.EX
+p = buf;
+p = zustr2ustp(p, "Hello ", 6);
+p = zustr2ustp(p, "world", 42); // Padding null bytes ignored.
+p = zustr2ustp(p, "!", 1);
+len = p \- buf;
+printf("%.*s\en", (int) len, buf);
+.EE
+.\" ----- EXAMPLES :: zustr2stp(3) ------------------------------------/
+.TP
+.BR zustr2stp (3)
+.EX
+p = buf;
+p = zustr2stp(p, "Hello ", 6);
+p = zustr2stp(p, "world", 42); // Padding null bytes ignored.
+p = zustr2stp(p, "!", 1);
+len = p \- buf;
+puts(buf);
+.EE
+.\" ----- EXAMPLES :: strncat(3) --------------------------------------/
+.TP
+.BR strncat (3)
+.EX
+buf[0] = \(aq\e0\(aq; // There's no 'cpy' function to this 'cat'.
+strncat(buf, "Hello ", 6);
+strncat(buf, "world", 42); // Padding null bytes ignored.
+strncat(buf, "!", 1);
+len = strlen(buf);
+puts(buf);
+.EE
+.\" ----- EXAMPLES :: ustpcpy(3) --------------------------------------/
+.TP
+.BR ustpcpy (3)
+.EX
+p = buf;
+p = ustpcpy(p, "Hello ", 6);
+p = ustpcpy(p, "world", 5);
+p = ustpcpy(p, "!", 1);
+len = p \- buf;
+printf("%.*s\en", (int) len, buf);
+.EE
+.\" ----- EXAMPLES :: ustr2stp(3) -------------------------------------/
+.TP
+.BR ustr2stp (3)
+.EX
+p = buf;
+p = ustr2stp(p, "Hello ", 6);
+p = ustr2stp(p, "world", 5);
+p = ustr2stp(p, "!", 1);
+len = p \- buf;
+puts(buf);
+.EE
+.\" ----- EXAMPLES :: Implementations :: ------------------------------/
+.SS Implementations
+Here are reference implementations for functions not provided by libc.
+.PP
+.in +4n
+.EX
+/* This code is in the public domain. */
+
+.\" ----- EXAMPLES :: Implementations :: stpecpy(3) -------------------/
+char *
+.IR stpecpy "(char *dst, char past_end[0], const char *restrict src)"
+{
+ char *p;
+
+ if (dst == past_end)
+ return past_end;
+
+ p = memccpy(dst, src, \(aq\e0\(aq, past_end \- dst);
+ if (p != NULL)
+ return p \- 1;
+
+ /* truncation detected */
+ past_end[\-1] = \(aq\e0\(aq;
+ return past_end;
+}
+
+.\" ----- EXAMPLES :: Implementations :: stpecpyx(3) ------------------/
+char *
+.IR stpecpyx "(char *dst, char past_end[0], const char *restrict src)"
+{
+ if (src[strlen(src)] != \(aq\e0\(aq)
+ raise(SIGSEGV);
+
+ return stpecpy(dst, past_end, src);
+}
+
+.\" ----- EXAMPLES :: Implementations :: zustr2ustp(3) ----------------/
+char *
+.IR zustr2ustp "(char *restrict dst, const char *restrict src, size_t sz)"
+{
+ return ustpcpy(dst, src, strnlen(src, sz));
+}
+
+.\" ----- EXAMPLES :: Implementations :: zustr2stp(3) -----------------/
+char *
+.IR zustr2stp "(char *restrict dst, const char *restrict src, size_t sz)"
+{
+ char *end;
+
+ end = zustr2ustp(dst, src, sz);
+ *end = \(aq\e0\(aq;
+
+ return end;
+}
+
+.\" ----- EXAMPLES :: Implementations :: ustpcpy(3) -------------------/
+char *
+.IR ustpcpy "(char *restrict dst, const char *restrict src, size_t len)"
+{
+ return mempcpy(dst, src, len);
+}
+
+.\" ----- EXAMPLES :: Implementations :: ustr2stp(3) ------------------/
+char *
+.IR ustr2stp "(char *restrict dst, const char *restrict src, size_t len)"
+{
+ char *end;
+
+ end = ustpcpy(dst, src, len);
+ *end = \(aq\e0\(aq;
+
+ return end;
+}
+.EE
+.in
+.EE
+.in
+.EE
+.in
+.\" ----- SEE ALSO :: -------------------------------------------------/
+.SH SEE ALSO
+.BR bzero (3),
+.BR memcpy (3),
+.BR memccpy (3),
+.BR mempcpy (3),
+.BR stpcpy (3),
+.BR strlcpy (3bsd),
+.BR strncat (3),
+.BR strncpy (3),
+.BR string (3)
--
2.38.1
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v5 2/5] stpecpy.3, stpecpyx.3, ustpcpy.3, ustr2stp.3, zustr2stp.3, zustr2ustp.3: Add new links to string_copy(7)
2022-12-14 16:17 ` [PATCH v4 " Alejandro Colomar
2022-12-15 0:26 ` [PATCH v5 0/5] Rewrite pages about " Alejandro Colomar
2022-12-15 0:26 ` [PATCH v5 1/5] string_copy.7: Add page to document all string-copying functions Alejandro Colomar
@ 2022-12-15 0:26 ` Alejandro Colomar
2022-12-15 0:27 ` Alejandro Colomar
2022-12-15 0:26 ` [PATCH v5 3/5] stpcpy.3, strcpy.3, strcat.3: Document in a single page Alejandro Colomar
` (2 subsequent siblings)
5 siblings, 1 reply; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-15 0:26 UTC (permalink / raw)
To: linux-man
Cc: Alejandro Colomar, Martin Sebor, G. Branden Robinson,
Douglas McIlroy, Jakub Wilk, Serge Hallyn, Iker Pedrosa,
Andrew Pinski
Cc: Martin Sebor <msebor@redhat.com>
Cc: "G. Branden Robinson" <g.branden.robinson@gmail.com>
Cc: Douglas McIlroy <douglas.mcilroy@dartmouth.edu>
Cc: Jakub Wilk <jwilk@jwilk.net>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Iker Pedrosa <ipedrosa@redhat.com>
Cc: Andrew Pinski <pinskia@gmail.com>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
man3/stpecpy.3 | 1 +
man3/stpecpyx.3 | 1 +
man3/ustpcpy.3 | 1 +
man3/ustr2stp.3 | 1 +
man3/zustr2stp.3 | 1 +
man3/zustr2ustp.3 | 1 +
6 files changed, 6 insertions(+)
create mode 100644 man3/stpecpy.3
create mode 100644 man3/stpecpyx.3
create mode 100644 man3/ustpcpy.3
create mode 100644 man3/ustr2stp.3
create mode 100644 man3/zustr2stp.3
create mode 100644 man3/zustr2ustp.3
diff --git a/man3/stpecpy.3 b/man3/stpecpy.3
new file mode 100644
index 000000000..6ff53887b
--- /dev/null
+++ b/man3/stpecpy.3
@@ -0,0 +1 @@
+.so man7/string_copy.7
diff --git a/man3/stpecpyx.3 b/man3/stpecpyx.3
new file mode 100644
index 000000000..6ff53887b
--- /dev/null
+++ b/man3/stpecpyx.3
@@ -0,0 +1 @@
+.so man7/string_copy.7
diff --git a/man3/ustpcpy.3 b/man3/ustpcpy.3
new file mode 100644
index 000000000..6ff53887b
--- /dev/null
+++ b/man3/ustpcpy.3
@@ -0,0 +1 @@
+.so man7/string_copy.7
diff --git a/man3/ustr2stp.3 b/man3/ustr2stp.3
new file mode 100644
index 000000000..6ff53887b
--- /dev/null
+++ b/man3/ustr2stp.3
@@ -0,0 +1 @@
+.so man7/string_copy.7
diff --git a/man3/zustr2stp.3 b/man3/zustr2stp.3
new file mode 100644
index 000000000..6ff53887b
--- /dev/null
+++ b/man3/zustr2stp.3
@@ -0,0 +1 @@
+.so man7/string_copy.7
diff --git a/man3/zustr2ustp.3 b/man3/zustr2ustp.3
new file mode 100644
index 000000000..6ff53887b
--- /dev/null
+++ b/man3/zustr2ustp.3
@@ -0,0 +1 @@
+.so man7/string_copy.7
--
2.38.1
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v5 3/5] stpcpy.3, strcpy.3, strcat.3: Document in a single page
2022-12-14 16:17 ` [PATCH v4 " Alejandro Colomar
` (2 preceding siblings ...)
2022-12-15 0:26 ` [PATCH v5 2/5] stpecpy.3, stpecpyx.3, ustpcpy.3, ustr2stp.3, zustr2stp.3, zustr2ustp.3: Add new links to string_copy(7) Alejandro Colomar
@ 2022-12-15 0:26 ` Alejandro Colomar
2022-12-16 14:46 ` Alejandro Colomar
2022-12-15 0:26 ` [PATCH v5 4/5] stpncpy.3, strncpy.3: " Alejandro Colomar
2022-12-15 0:26 ` [PATCH v5 5/5] strncat.3: Rewrite to be consistent with string_copy.7 Alejandro Colomar
5 siblings, 1 reply; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-15 0:26 UTC (permalink / raw)
To: linux-man
Cc: Alejandro Colomar, Martin Sebor, G. Branden Robinson,
Douglas McIlroy, Jakub Wilk, Serge Hallyn, Iker Pedrosa,
Andrew Pinski
Rewrite to be consistent with the new string_copy.7 page.
Cc: Martin Sebor <msebor@redhat.com>
Cc: "G. Branden Robinson" <g.branden.robinson@gmail.com>
Cc: Douglas McIlroy <douglas.mcilroy@dartmouth.edu>
Cc: Jakub Wilk <jwilk@jwilk.net>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Iker Pedrosa <ipedrosa@redhat.com>
Cc: Andrew Pinski <pinskia@gmail.com>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
man3/stpcpy.3 | 13 ---
man3/strcat.3 | 161 +----------------------------------
man3/strcpy.3 | 226 +++++++++++++++++++++++++++++++-------------------
3 files changed, 143 insertions(+), 257 deletions(-)
diff --git a/man3/stpcpy.3 b/man3/stpcpy.3
index 5770790fc..d01c0239b 100644
--- a/man3/stpcpy.3
+++ b/man3/stpcpy.3
@@ -14,19 +14,6 @@ .SH SYNOPSIS
.PP
.BI "char *stpcpy(char *restrict " dest ", const char *restrict " src );
.fi
-.PP
-.RS -4
-Feature Test Macro Requirements for glibc (see
-.BR feature_test_macros (7)):
-.RE
-.PP
-.BR stpcpy ():
-.nf
- Since glibc 2.10:
- _POSIX_C_SOURCE >= 200809L
- Before glibc 2.10:
- _GNU_SOURCE
-.fi
.SH DESCRIPTION
The
.BR stpcpy ()
diff --git a/man3/strcat.3 b/man3/strcat.3
index 277e5b1e4..ff7476a84 100644
--- a/man3/strcat.3
+++ b/man3/strcat.3
@@ -1,160 +1 @@
-.\" Copyright 1993 David Metcalfe (david@prism.demon.co.uk)
-.\"
-.\" SPDX-License-Identifier: Linux-man-pages-copyleft
-.\"
-.\" References consulted:
-.\" Linux libc source code
-.\" Lewine's _POSIX Programmer's Guide_ (O'Reilly & Associates, 1991)
-.\" 386BSD man pages
-.\" Modified Sat Jul 24 18:11:47 1993 by Rik Faith (faith@cs.unc.edu)
-.\" 2007-06-15, Marc Boyer <marc.boyer@enseeiht.fr> + mtk
-.\" Improve discussion of strncat().
-.TH strcat 3 (date) "Linux man-pages (unreleased)"
-.SH NAME
-strcat \- concatenate two strings
-.SH LIBRARY
-Standard C library
-.RI ( libc ", " \-lc )
-.SH SYNOPSIS
-.nf
-.B #include <string.h>
-.PP
-.BI "char *strcat(char *restrict " dest ", const char *restrict " src );
-.fi
-.SH DESCRIPTION
-The
-.BR strcat ()
-function appends the
-.I src
-string to the
-.I dest
-string,
-overwriting the terminating null byte (\(aq\e0\(aq) at the end of
-.IR dest ,
-and then adds a terminating null byte.
-The strings may not overlap, and the
-.I dest
-string must have
-enough space for the result.
-If
-.I dest
-is not large enough, program behavior is unpredictable;
-.IR "buffer overruns are a favorite avenue for attacking secure programs" .
-.SH RETURN VALUE
-The
-.BR strcat ()
-function returns a pointer to the resulting string
-.IR dest .
-.SH ATTRIBUTES
-For an explanation of the terms used in this section, see
-.BR attributes (7).
-.ad l
-.nh
-.TS
-allbox;
-lbx lb lb
-l l l.
-Interface Attribute Value
-T{
-.BR strcat (),
-.BR strncat ()
-T} Thread safety MT-Safe
-.TE
-.hy
-.ad
-.sp 1
-.SH STANDARDS
-POSIX.1-2001, POSIX.1-2008, C89, C99, SVr4, 4.3BSD.
-.SH NOTES
-Some systems (the BSDs, Solaris, and others) provide the following function:
-.PP
-.in +4n
-.EX
-size_t strlcat(char *dest, const char *src, size_t size);
-.EE
-.in
-.PP
-This function appends the null-terminated string
-.I src
-to the string
-.IR dest ,
-copying at most
-.I size\-strlen(dest)\-1
-from
-.IR src ,
-and adds a terminating null byte to the result,
-.I unless
-.I size
-is less than
-.IR strlen(dest) .
-This function fixes the buffer overrun problem of
-.BR strcat (),
-but the caller must still handle the possibility of data loss if
-.I size
-is too small.
-The function returns the length of the string
-.BR strlcat ()
-tried to create; if the return value is greater than or equal to
-.IR size ,
-data loss occurred.
-If data loss matters, the caller
-.I must
-either check the arguments before the call, or test the function return value.
-.BR strlcat ()
-is not present in glibc and is not standardized by POSIX,
-.\" https://lwn.net/Articles/506530/
-but is available on Linux via the
-.I libbsd
-library.
-.\"
-.SH EXAMPLES
-Because
-.BR strcat ()
-must find the null byte that terminates the string
-.I dest
-using a search that starts at the beginning of the string,
-the execution time of this function
-scales according to the length of the string
-.IR dest .
-This can be demonstrated by running the program below.
-(If the goal is to concatenate many strings to one target,
-then manually copying the bytes from each source string
-while maintaining a pointer to the end of the target string
-will provide better performance.)
-.\"
-.SS Program source
-\&
-.\" SRC BEGIN (strcat.c)
-.EX
-#include <stdint.h>
-#include <stdio.h>
-#include <string.h>
-#include <time.h>
-
-int
-main(void)
-{
-#define LIM 4000000
- char p[LIM + 1]; /* +1 for terminating null byte */
- time_t base;
-
- base = time(NULL);
- p[0] = \(aq\e0\(aq;
-
- for (unsigned int j = 0; j < LIM; j++) {
- if ((j % 10000) == 0)
- printf("%u %jd\en", j, (intmax_t) (time(NULL) \- base));
- strcat(p, "a");
- }
-}
-.EE
-.\" SRC END
-.SH SEE ALSO
-.BR bcopy (3),
-.BR memccpy (3),
-.BR memcpy (3),
-.BR strcpy (3),
-.BR string (3),
-.BR strlcat (3bsd),
-.BR wcscat (3),
-.BR wcsncat (3)
+.so man3/strcpy.3
diff --git a/man3/strcpy.3 b/man3/strcpy.3
index 74c3180ae..424648c46 100644
--- a/man3/strcpy.3
+++ b/man3/strcpy.3
@@ -1,20 +1,10 @@
-.\" Copyright (C) 1993 David Metcalfe (david@prism.demon.co.uk)
+.\" Copyright 2022 Alejandro Colomar <alx@kernel.org>
.\"
.\" SPDX-License-Identifier: Linux-man-pages-copyleft
.\"
-.\" References consulted:
-.\" Linux libc source code
-.\" Lewine's _POSIX Programmer's Guide_ (O'Reilly & Associates, 1991)
-.\" 386BSD man pages
-.\" Modified Sat Jul 24 18:06:49 1993 by Rik Faith (faith@cs.unc.edu)
-.\" Modified Fri Aug 25 23:17:51 1995 by Andries Brouwer (aeb@cwi.nl)
-.\" Modified Wed Dec 18 00:47:18 1996 by Andries Brouwer (aeb@cwi.nl)
-.\" 2007-06-15, Marc Boyer <marc.boyer@enseeiht.fr> + mtk
-.\" Improve discussion of strncpy().
-.\"
.TH strcpy 3 (date) "Linux man-pages (unreleased)"
.SH NAME
-strcpy \- copy a string
+strcpy \- copy or catenate a string
.SH LIBRARY
Standard C library
.RI ( libc ", " \-lc )
@@ -22,26 +12,87 @@ .SH SYNOPSIS
.nf
.B #include <string.h>
.PP
-.BI "char *strcpy(char *restrict " dest ", const char *restrict " src );
+.BI "char *stpcpy(char *restrict " dst ", const char *restrict " src );
+.BI "char *strcpy(char *restrict " dst ", const char *restrict " src );
+.BI "char *strcat(char *restrict " dst ", const char *restrict " src );
+.fi
+.PP
+.RS -4
+Feature Test Macro Requirements for glibc (see
+.BR feature_test_macros (7)):
+.RE
+.PP
+.BR stpcpy ():
+.nf
+ Since glibc 2.10:
+ _POSIX_C_SOURCE >= 200809L
+ Before glibc 2.10:
+ _GNU_SOURCE
.fi
.SH DESCRIPTION
-The
+.TP
+.BR stpcpy ()
+.TQ
.BR strcpy ()
-function copies the string pointed to by
+These functions copy the string pointed to by
.IR src ,
-including the terminating null byte (\(aq\e0\(aq),
-to the buffer pointed to by
-.IR dest .
-The strings may not overlap, and the destination string
-.I dest
-must be large enough to receive the copy.
-.I Beware of buffer overruns!
-(See BUGS.)
+into a string
+at the buffer pointed to by
+.IR dst .
+The programmer is responsible for allocating a buffer large enough,
+that is,
+.IR "strlen(src) + 1" .
+They only differ in the return value.
+.TP
+.BR strcat ()
+This function catenates the string pointed to by
+.IR src ,
+at the end of the string pointed to by
+.IR dst .
+The programmer is responsible for allocating a buffer large enough,
+that is,
+.IR "strlen(dst) + strlen(src) + 1" .
+.PP
+An implementation of these functions might be:
+.PP
+.in +4n
+.EX
+char *
+stpcpy(char *restrict dst, const char *restrict src)
+{
+ char *end;
+
+ end = mempcpy(dst, src, strlen(src));
+ *end = \(aq\e0\(aq;
+
+ return end;
+}
+
+char *
+strcpy(char *restrict dst, const char *restrict src)
+{
+ stpcpy(dst, src);
+ return dst;
+}
+
+char *
+strcat(char *restrict dst, const char *restrict src)
+{
+ stpcpy(dst + strlen(dst), src);
+ return dst;
+}
+.EE
+.in
.SH RETURN VALUE
-The
+.TP
+.BR stpcpy ()
+This function returns
+a pointer to the terminating null byte at the end of the copied string.
+.TP
.BR strcpy ()
-function returns a pointer to
-the destination string
+.TQ
+.BR strcat ()
+These functions return
.IR dest .
.SH ATTRIBUTES
For an explanation of the terms used in this section, see
@@ -54,73 +105,80 @@ .SH ATTRIBUTES
l l l.
Interface Attribute Value
T{
-.BR strcpy ()
+.BR stpcpy (),
+.BR strcpy (),
+.BR strcat ()
T} Thread safety MT-Safe
.TE
.hy
.ad
.sp 1
.SH STANDARDS
+.TP
+.BR stpcpy ()
+POSIX.1-2008.
+.TP
+.BR strcpy ()
+.TQ
+.BR strcat ()
POSIX.1-2001, POSIX.1-2008, C89, C99, SVr4, 4.3BSD.
-.SH NOTES
-.SS strlcpy()
-Some systems (the BSDs, Solaris, and others) provide the following function:
+.SH CAVEATS
+The strings
+.I src
+and
+.I dst
+may not overlap.
.PP
-.in +4n
-.EX
-size_t strlcpy(char *dest, const char *src, size_t size);
-.EE
-.in
-.PP
-.\" http://static.usenix.org/event/usenix99/full_papers/millert/millert_html/index.html
-.\" "strlcpy and strlcat - consistent, safe, string copy and concatenation"
-.\" 1999 USENIX Annual Technical Conference
-This function is similar to
-.BR strcpy (),
-but it copies at most
-.I size\-1
-bytes to
-.IR dest ,
-truncating the string as necessary.
-It always adds a terminating null byte.
-This function fixes some of the problems of
-.BR strcpy ()
-but the caller must still handle the possibility of data loss if
-.I size
-is too small.
-The return value of the function is the length of
-.IR src ,
-which allows truncation to be easily detected:
-if the return value is greater than or equal to
-.IR size ,
-truncation occurred.
-If loss of data matters, the caller
-.I must
-either check the arguments before the call,
-or test the function return value.
-.BR strlcpy ()
-is not present in glibc and is not standardized by POSIX,
-.\" https://lwn.net/Articles/506530/
-but is available on Linux via the
-.I libbsd
-library.
+If the destination buffer is not large enough,
+the behavior is undefined.
+See
+.B _FORTIFY_SOURCE
+in
+.BR feature_test_macros (7).
.SH BUGS
-If the destination string of a
-.BR strcpy ()
-is not large enough, then anything might happen.
-Overflowing fixed-length string buffers is a favorite cracker technique
-for taking complete control of the machine.
-Any time a program reads or copies data into a buffer,
-the program first needs to check that there's enough space.
-This may be unnecessary if you can show that overflow is impossible,
-but be careful: programs can get changed over time,
-in ways that may make the impossible possible.
+.TP
+.BR strcat ()
+This function can be very inefficient.
+Read about
+.UR https://www.joelonsoftware.com/\:2001/12/11/\:back\-to\-basics/
+Shlemiel the painter
+.UE .
+.SH EXAMPLES
+.EX
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+int
+main(void)
+{
+ char *p;
+ char buf1[BUFSIZ];
+ char buf2[BUFSIZ];
+ size_t len;
+
+ p = buf1;
+ p = stpcpy(p, "Hello ");
+ p = stpcpy(p, "world");
+ p = stpcpy(p, "!");
+ len = p \- buf1;
+
+ printf("[len = %zu]: ", len);
+ puts(buf1); // "Hello world!"
+
+ strcpy(buf2, "Hello ");
+ strcat(buf2, "world");
+ strcat(buf2, "!");
+ len = strlen(buf2);
+
+ printf("[len = %zu]: ", len);
+ puts(buf2); // "Hello world!"
+
+ exit(EXIT_SUCCESS);
+}
+.EE
.SH SEE ALSO
-.BR bcopy (3),
-.BR memccpy (3),
-.BR memcpy (3),
-.BR memmove (3),
-.BR stpcpy (3),
.BR strdup (3),
.BR string (3),
-.BR wcscpy (3)
+.BR wcscpy (3),
+.BR string_copy (7)
--
2.38.1
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v5 4/5] stpncpy.3, strncpy.3: Document in a single page
2022-12-14 16:17 ` [PATCH v4 " Alejandro Colomar
` (3 preceding siblings ...)
2022-12-15 0:26 ` [PATCH v5 3/5] stpcpy.3, strcpy.3, strcat.3: Document in a single page Alejandro Colomar
@ 2022-12-15 0:26 ` Alejandro Colomar
2022-12-15 0:28 ` Alejandro Colomar
2022-12-15 0:26 ` [PATCH v5 5/5] strncat.3: Rewrite to be consistent with string_copy.7 Alejandro Colomar
5 siblings, 1 reply; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-15 0:26 UTC (permalink / raw)
To: linux-man
Cc: Alejandro Colomar, Martin Sebor, G. Branden Robinson,
Douglas McIlroy, Jakub Wilk, Serge Hallyn, Iker Pedrosa,
Andrew Pinski
Rewrite to be consistent with the new string_copy.7 page.
Cc: Martin Sebor <msebor@redhat.com>
Cc: "G. Branden Robinson" <g.branden.robinson@gmail.com>
Cc: Douglas McIlroy <douglas.mcilroy@dartmouth.edu>
Cc: Jakub Wilk <jwilk@jwilk.net>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Iker Pedrosa <ipedrosa@redhat.com>
Cc: Andrew Pinski <pinskia@gmail.com>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
man3/stpncpy.3 | 163 +++++++++++++++++++++++++++++--------------------
man3/strncpy.3 | 130 +--------------------------------------
2 files changed, 99 insertions(+), 194 deletions(-)
diff --git a/man3/stpncpy.3 b/man3/stpncpy.3
index 0a62e3055..ab69be8ec 100644
--- a/man3/stpncpy.3
+++ b/man3/stpncpy.3
@@ -1,15 +1,13 @@
-.\" Copyright (c) Bruno Haible <haible@clisp.cons.org>
-.\" Copyright (c) 2022 Alejandro Colomar <alx@kernel.org>
+.\" Copyright 2022 Alejandro Colomar <alx@kernel.org>
.\"
-.\" SPDX-License-Identifier: GPL-2.0-or-later
+.\" SPDX-License-Identifier: Linux-man-pages-copyleft
.\"
-.\" References consulted:
-.\" GNU glibc-2 source code and manual
-.\"
-.\" Corrected, aeb, 990824
.TH stpncpy 3 (date) "Linux man-pages (unreleased)"
.SH NAME
-stpncpy \- copy string into a fixed-length buffer and zero the rest of it
+stpncpy, strncpy
+\- copy a string into a character sequence with truncation
+in a fixed-width buffer,
+and zero the rest of it
.SH LIBRARY
Standard C library
.RI ( libc ", " \-lc )
@@ -17,9 +15,12 @@ .SH SYNOPSIS
.nf
.B #include <string.h>
.PP
-.BI "char *stpncpy(char " dest "[restrict ." n "], \
-const char " src "[restrict ." n ],
-.BI " size_t " n );
+.BI "char *stpncpy(char " dst "[restrict ." sz "], \
+const char *restrict " src ,
+.BI " size_t " sz );
+.BI "char *strncpy(char " dst "[restrict ." sz "], \
+const char *restrict " src ,
+.BI " size_t " sz );
.fi
.PP
.RS -4
@@ -35,67 +36,44 @@ .SH SYNOPSIS
_GNU_SOURCE
.fi
.SH DESCRIPTION
-.IR Note :
-This is probably not the function you want to use.
-For string copying with truncation, see
-.BR strlcpy (3bsd).
-.PP
-The
-.BR stpncpy ()
-function copies at most
-.I n
-characters of
+These functions copy the string pointed to by
.I src
-and fills the rest of the
-.I dest
-buffer with null bytes.
-.BR Warning :
-If there is no null character among the first
-.I n
-bytes of
-.IR src ,
-the string placed in
-.I dest
-will not be null-terminated.
+into a null-padded character sequence at the fixed-width buffer pointed to by
+.IR dst .
+If the destination buffer,
+limited by its size,
+isn't large enough to hold the copy,
+the resulting character sequence is truncated.
+They only differ in the return value.
.PP
-A simple implementation of
-.BR strncpy ()
-might be:
+An implementation of these functions might be:
.PP
.in +4n
.EX
char *
-stpncpy(char *dest, const char *src, size_t n)
+stpncpy(char *restrict dst, const char *restrict src, size_t sz)
{
- char *p
+ bzero(dst, sz);
+ return mempcpy(dst, src, strnlen(src, sz));
+}
- bzero(dest, n);
- p = memccpy(dest, src, \(aq\e0\(aq, n);
- if (p == NULL)
- return dest + n;
-
- return p - 1;
+char *
+strncpy(char *restrict dst, const char *restrict src, size_t sz)
+{
+ stpncpy(dst, src, sz);
+ return dst;
}
.EE
.in
-.PP
-The use of
-.BR strncpy ()
-is to copy a C string to a fixed-length buffer
-while ensuring that unused bytes in the destination buffer are zeroed out
-(perhaps to prevent information leaks if the buffer is to be
-written to media or transmitted to another process via an
-interprocess communication technique).
.SH RETURN VALUE
+.TP
.BR stpncpy ()
-returns a pointer to the terminating null byte
-in
-.IR dest ,
-or, if
-.I dest
-is not null-terminated,
-.IR dest + n
-(that is, a pointer to one-past-the-end of the array).
+returns a pointer to
+one after the last character in the destination character sequence.
+.TP
+.BR strncpy ()
+returns
+.IR dst .
.SH ATTRIBUTES
For an explanation of the terms used in this section, see
.BR attributes (7).
@@ -107,16 +85,71 @@ .SH ATTRIBUTES
l l l.
Interface Attribute Value
T{
-.BR stpncpy ()
+.BR stpncpy (),
+.BR strncpy ()
T} Thread safety MT-Safe
.TE
.hy
.ad
.sp 1
.SH STANDARDS
-This function was added to POSIX.1-2008.
-Before that, it was a GNU extension.
-It first appeared in glibc 1.07 in 1993.
+.TP
+.BR stpncpy ()
+POSIX.1-2008.
+.\" Before that, it was a GNU extension.
+.\" It first appeared in glibc 1.07 in 1993.
+.TP
+.BR strncpy ()
+POSIX.1-2001, POSIX.1-2008, C89, C99, SVr4, 4.3BSD.
+.SH CAVEATS
+The name of these functions is confusing.
+These functions produce a null-padded character sequence,
+not a string (see
+.BR string_copy (7)).
+.PP
+Truncation should be determined by
+comparing the length of the input string
+with the size of the destination buffer.
+.PP
+If you're going to use this function in chained calls,
+it would be useful to develop a similar function that accepts
+a pointer to one past the end of the destination buffer instead of its size.
+.SH EXAMPLES
+.\" SRC BEGIN (stpncpy.c)
+.EX
+#include <err.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+int
+main(void)
+{
+ char *end;
+ char buf1[20];
+ char buf2[20];
+ size_t len;
+
+ if (sizeof(buf1) < strlen("Hello world!"))
+ warnx("stpncpy: truncating character sequence");
+ end = stpncpy(buf1, "Hello world!", sizeof(buf1));
+ len = end \- buf1;
+
+ printf("[len = %zu]: ", len);
+ printf("%.*s\en", (int) len, buf1); // "Hello world!"
+
+ if (sizeof(buf2) < strlen("Hello world!"))
+ warnx("strncpy: truncating character sequence");
+ strncpy(buf2, "Hello world!", sizeof(buf2));
+ len = strnlen(buf2, sizeof(buf2));
+
+ printf("[len = %zu]: ", len);
+ printf("%.*s\en", (int) len, buf2); // "Hello world!"
+
+ exit(EXIT_SUCCESS);
+}
+.EE
+.\" SRC END
.SH SEE ALSO
-.BR strlcpy (3bsd)
-.BR wcpncpy (3)
+.BR wcpncpy (3),
+.BR string_copy (7)
diff --git a/man3/strncpy.3 b/man3/strncpy.3
index e2ffc683f..4710b0201 100644
--- a/man3/strncpy.3
+++ b/man3/strncpy.3
@@ -1,129 +1 @@
-.\" Copyright (C) 1993 David Metcalfe <david@prism.demon.co.uk>
-.\" Copyright (C) 2022 Alejandro Colomar <alx@kernel.org>
-.\"
-.\" SPDX-License-Identifier: Linux-man-pages-copyleft
-.\"
-.\" References consulted:
-.\" Linux libc source code
-.\" Lewine's _POSIX Programmer's Guide_ (O'Reilly & Associates, 1991)
-.\" 386BSD man pages
-.\" Modified Sat Jul 24 18:06:49 1993 by Rik Faith (faith@cs.unc.edu)
-.\" Modified Fri Aug 25 23:17:51 1995 by Andries Brouwer (aeb@cwi.nl)
-.\" Modified Wed Dec 18 00:47:18 1996 by Andries Brouwer (aeb@cwi.nl)
-.\" 2007-06-15, Marc Boyer <marc.boyer@enseeiht.fr> + mtk
-.\" Improve discussion of strncpy().
-.\"
-.TH strncpy 3 (date) "Linux man-pages (unreleased)"
-.SH NAME
-strncpy \- copy a string into a fixed-length buffer and zero the rest of it
-.SH LIBRARY
-Standard C library
-.RI ( libc ", " \-lc )
-.SH SYNOPSIS
-.nf
-.B #include <string.h>
-.PP
-.BI "[[deprecated]] char *strncpy(char " dest "[restrict ." n ],
-.BI " const char " src "[restrict ." n "], \
-size_t " n );
-.fi
-.SH DESCRIPTION
-.BI Note: " This is not the function you want to use."
-For string copying with truncation, see
-.BR strlcpy (3bsd).
-For copying a string into a fixed-length buffer with zeroing of the rest,
-see
-.BR stpncpy (3).
-.PP
-.BR strncpy ()
-copies at most
-.I n
-bytes of
-.IR src ,
-and fills the rest of the
-.I dest
-buffer with null bytes.
-.BR Warning :
-If there is no null byte
-among the first
-.I n
-bytes of
-.IR src ,
-the string placed in
-.I dest
-will not be null-terminated.
-.PP
-A simple implementation of
-.BR strncpy ()
-might be:
-.PP
-.in +4n
-.EX
-char *
-strncpy(char *dest, const char *src, size_t n)
-{
- bzero(dest, n);
- memccpy(dest, src, \(aq\e0\(aq, n);
-
- return dest;
-}
-.EE
-.in
-.PP
-The use of
-.BR strncpy ()
-is to copy a C string to a fixed-length buffer
-while ensuring that unused bytes in the destination buffer are zeroed out
-(perhaps to prevent information leaks if the buffer is to be
-written to media or transmitted to another process via an
-interprocess communication technique).
-But
-.BR stpncpy (3)
-is better for this purpose,
-since it detects truncation.
-See BUGS below.
-.SH RETURN VALUE
-The
-.BR strncpy ()
-function returns a pointer to
-the destination buffer
-.IR dest .
-.SH ATTRIBUTES
-For an explanation of the terms used in this section, see
-.BR attributes (7).
-.ad l
-.nh
-.TS
-allbox;
-lbx lb lb
-l l l.
-Interface Attribute Value
-T{
-.BR strncpy ()
-T} Thread safety MT-Safe
-.TE
-.hy
-.ad
-.sp 1
-.SH STANDARDS
-POSIX.1-2001, POSIX.1-2008, C89, C99, SVr4, 4.3BSD.
-.SH BUGS
-.BR strncpy ()
-has a misleading name.
-It doesn't produce a (null-terminated) string;
-and it should never be used for producing a string.
-.PP
-It can't detect truncation.
-It's probably better to explicitly call
-.BR bzero (3)
-and
-.BR memccpy (3),
-or
-.BR stpncpy (3)
-since they allow detecting truncation.
-.SH SEE ALSO
-.BR bzero (3),
-.BR memccpy (3),
-.BR stpncpy (3),
-.BR string (3),
-.BR wcsncpy (3)
+.so man3/stpncpy.3
--
2.38.1
* [PATCH v5 5/5] strncat.3: Rewrite to be consistent with string_copy.7.
2022-12-14 16:17 ` [PATCH v4 " Alejandro Colomar
` (4 preceding siblings ...)
2022-12-15 0:26 ` [PATCH v5 4/5] stpncpy.3, strncpy.3: " Alejandro Colomar
@ 2022-12-15 0:26 ` Alejandro Colomar
2022-12-15 0:29 ` Alejandro Colomar
5 siblings, 1 reply; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-15 0:26 UTC (permalink / raw)
To: linux-man
Cc: Alejandro Colomar, Martin Sebor, G. Branden Robinson,
Douglas McIlroy, Jakub Wilk, Serge Hallyn, Iker Pedrosa,
Andrew Pinski
Cc: Martin Sebor <msebor@redhat.com>
Cc: "G. Branden Robinson" <g.branden.robinson@gmail.com>
Cc: Douglas McIlroy <douglas.mcilroy@dartmouth.edu>
Cc: Jakub Wilk <jwilk@jwilk.net>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Iker Pedrosa <ipedrosa@redhat.com>
Cc: Andrew Pinski <pinskia@gmail.com>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
man3/strncat.3 | 147 +++++++++++++++----------------------------------
1 file changed, 45 insertions(+), 102 deletions(-)
diff --git a/man3/strncat.3 b/man3/strncat.3
index 6e4bf6d78..108a9c450 100644
--- a/man3/strncat.3
+++ b/man3/strncat.3
@@ -4,7 +4,7 @@
.\"
.TH strncat 3 (date) "Linux man-pages (unreleased)"
.SH NAME
-strncat \- concatenate an unterminated string into a string
+strncat \- concatenate a null-padded character sequence into a string
.SH LIBRARY
Standard C library
.RI ( libc ", " \-lc )
@@ -12,53 +12,39 @@ .SH SYNOPSIS
.nf
.B #include <string.h>
.PP
-.BI "char *strncat(char " dest "[restrict strlen(." dest ") + ." n " + 1],"
-.BI " const char " src "[restrict ." n ],
-.BI " size_t " n );
+.BI "char *strncat(char *restrict " dst ", const char " src "[restrict ." sz ],
+.BI " size_t " sz );
.fi
.SH DESCRIPTION
-.IR Note :
-This is probably not the function you want to use.
-For string concatenation with truncation, see
-.BR strlcat (3bsd).
-For copying or concatenating a string into a fixed-length buffer
-with zeroing of the rest, see
-.BR stpncpy (3).
-.PP
-.BR strncat ()
-appends at most
-.I n
-characters of
-.I src
-to the end of
+This function catenates the input character sequence
+contained in a null-padded fixed-width buffer,
+into a string at the buffer pointed to by
.IR dst .
-It always terminates with a null character the string placed in
-.IR dest .
+The programmer is responsible for allocating a buffer large enough, that is,
+.IR "strlen(dst) + strnlen(src, sz) + 1" .
.PP
-An implementation of
-.BR strncat ()
-might be:
+An implementation of this function might be:
.PP
.in +4n
.EX
char *
-strncat(char *dest, const char *src, size_t n)
+strncat(char *restrict dst, const char *restrict src, size_t sz)
{
- char *cat;
- size_t len;
+ size_t len;
+ char *end;
- cat = dest + strlen(dest);
- len = strnlen(src, n);
- memcpy(cat, src, len);
- cat[len] = \(aq\e0\(aq;
+ len = strnlen(src, sz);
+ end = dst + strlen(dst);
+ end = mempcpy(end, src, len);
+ *end = \(aq\e0\(aq;
- return dest;
+ return dst;
}
.EE
.in
.SH RETURN VALUE
.BR strncat ()
-returns a pointer to the resulting string
+returns
.IR dest .
.SH ATTRIBUTES
For an explanation of the terms used in this section, see
@@ -79,93 +65,50 @@ .SH ATTRIBUTES
.sp 1
.SH STANDARDS
POSIX.1-2001, POSIX.1-2008, C89, C99, SVr4, 4.3BSD.
-.SH NOTES
-.SS ustr2stpe()
-You may want to write your own function similar to
-.BR strncpy (),
-with the following improvements:
-.IP \(bu 3
-Copy, instead of concatenating.
-There's no equivalent of
-.BR strncat ()
-that copies instead of concatenating.
-.IP \(bu
-Allow chaining the function,
-by returning a suitable pointer.
-Copy chaining is faster than concatenating.
-.IP \(bu
-Don't check for null characters in the middle of the unterminated string.
-If the string is terminated, this function should not be used.
-If the string is unterminated, it is unnecessary.
-.IP \(bu
-A name that tells what it does:
-Copy from an
-.IR u nterminated
-.IR str ing
-to a
-.IR st ring,
-and return a
-.IR p ointer
-to its end.
-.PP
-.in +4n
-.EX
-/* This code is in the public domain.
- *
- * char *ustr2stp(char dst[restrict .n+1],
- * const char src[restrict .n],
- * size_t len);
- */
-char *
-ustr2stp(char *restrict dst, const char *restrict src, size_t len)
-{
- memcpy(dst, src, len);
- dst[len] = \(aq\e0\(aq;
-
- return dst + len;
-}
-.EE
-.in
.SH CAVEATS
-This function doesn't know the size of the destination buffer,
-so it can overrun the buffer if the programmer wasn't careful enough.
-.SH BUGS
-.BR strncat (3)
-has a misleading name;
-it has no relationship with
+The name of this function is confusing.
+This function has no relation to
.BR strncpy (3).
+.PP
+If the destination buffer is not large enough,
+the behavior is undefined.
+See
+.B _FORTIFY_SOURCE
+in
+.BR feature_test_macros (7).
+.SH BUGS
+This function can be very inefficient.
+Read about
+.UR https://www.joelonsoftware.com/\:2001/12/11/\:back\-to\-basics/
+Shlemiel the painter
+.UE .
.SH EXAMPLES
-The following program creates a string
-from a concatenation of unterminated strings.
.\" SRC BEGIN (strncpy.c)
.EX
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
-#define nitems(arr) (sizeof((arr)) / sizeof((arr)[0]))
-
int
main(void)
{
- char pre[4] = "pre.";
- char *post = ".post";
- char *src = "some_long_body.post";
- char dest[100];
+ char buf[BUFSIZ];
+ size_t len;
- dest[0] = \(aq\e0\(aq;
- strncat(dest, pre, nitems(pre));
- strncat(dest, src, strlen(src) \- strlen(post));
+ buf[0] = \(aq\e0\(aq; // There's no 'cpy' function to this 'cat'.
+ strncat(buf, "Hello XXX", 6);
+ strncat(buf, "world", 42);
+ strncat(buf, "!", 1);
+ len = strlen(buf);
+
+ printf("[len = %zu]: ", len);
+ puts(buf); // "Hello world!"
- puts(dest); // "pre.some_long_body"
exit(EXIT_SUCCESS);
}
.EE
.\" SRC END
.in
.SH SEE ALSO
-.BR memccpy (3),
-.BR memcpy (3),
-.BR mempcpy (3),
-.BR strcpy (3),
-.BR string (3)
+.BR string (3),
+.BR string_copy (7)
--
2.38.1
* Re: [PATCH v5 2/5] stpecpy.3, stpecpyx.3, ustpcpy.3, ustr2stp.3, zustr2stp.3, zustr2ustp.3: Add new links to string_copy(7)
2022-12-15 0:26 ` [PATCH v5 2/5] stpecpy.3, stpecpyx.3, ustpcpy.3, ustr2stp.3, zustr2stp.3, zustr2ustp.3: Add new links to string_copy(7) Alejandro Colomar
@ 2022-12-15 0:27 ` Alejandro Colomar
2022-12-16 18:47 ` Stefan Puiu
0 siblings, 1 reply; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-15 0:27 UTC (permalink / raw)
To: linux-man
Cc: Alejandro Colomar, Martin Sebor, G. Branden Robinson,
Douglas McIlroy, Jakub Wilk, Serge Hallyn, Iker Pedrosa,
Andrew Pinski
[-- Attachment #1.1: Type: text/plain, Size: 4853 bytes --]
Formatted strcpy(3):
strcpy(3) Library Functions Manual strcpy(3)
NAME
strcpy - copy or catenate a string
LIBRARY
Standard C library (libc, -lc)
SYNOPSIS
#include <string.h>
char *stpcpy(char *restrict dst, const char *restrict src);
char *strcpy(char *restrict dst, const char *restrict src);
char *strcat(char *restrict dst, const char *restrict src);
Feature Test Macro Requirements for glibc (see feature_test_macros(7)):
stpcpy():
Since glibc 2.10:
_POSIX_C_SOURCE >= 200809L
Before glibc 2.10:
_GNU_SOURCE
DESCRIPTION
stpcpy()
strcpy()
These functions copy the string pointed to by src, into a string
at the buffer pointed to by dst. The programmer is responsible
for allocating a buffer large enough, that is, strlen(src) + 1.
They only differ in the return value.
strcat()
This function catenates the string pointed to by src, at the end
of the string pointed to by dst. The programmer is responsible
for allocating a buffer large enough, that is, strlen(dst) +
strlen(src) + 1.
An implementation of these functions might be:
char *
stpcpy(char *restrict dst, const char *restrict src)
{
char *end;
end = mempcpy(dst, src, strlen(src));
*end = '\0';
return end;
}
char *
strcpy(char *restrict dst, const char *restrict src)
{
stpcpy(dst, src);
return dst;
}
char *
strcat(char *restrict dst, const char *restrict src)
{
stpcpy(dst + strlen(dst), src);
return dst;
}
RETURN VALUE
stpcpy()
This function returns a pointer to the terminating null byte at
the end of the copied string.
strcpy()
strcat()
These functions return dst.
ATTRIBUTES
For an explanation of the terms used in this section, see attrib‐
utes(7).
┌────────────────────────────────────────────┬───────────────┬─────────┐
│Interface │ Attribute │ Value │
├────────────────────────────────────────────┼───────────────┼─────────┤
│stpcpy(), strcpy(), strcat() │ Thread safety │ MT‐Safe │
└────────────────────────────────────────────┴───────────────┴─────────┘
STANDARDS
stpcpy()
POSIX.1‐2008.
strcpy()
strcat()
POSIX.1‐2001, POSIX.1‐2008, C89, C99, SVr4, 4.3BSD.
CAVEATS
The strings src and dst may not overlap.
If the destination buffer is not large enough, the behavior is unde‐
fined. See _FORTIFY_SOURCE in feature_test_macros(7).
BUGS
strcat()
This function can be very inefficient. Read about Shlemiel
the painter ⟨https://www.joelonsoftware.com/2001/12/11/
back-to-basics/⟩.
EXAMPLES
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int
main(void)
{
char *p;
char buf1[BUFSIZ];
char buf2[BUFSIZ];
size_t len;
p = buf1;
p = stpcpy(p, "Hello ");
p = stpcpy(p, "world");
p = stpcpy(p, "!");
len = p - buf1;
printf("[len = %zu]: ", len);
puts(buf1); // "Hello world!"
strcpy(buf2, "Hello ");
strcat(buf2, "world");
strcat(buf2, "!");
len = strlen(buf2);
printf("[len = %zu]: ", len);
puts(buf2); // "Hello world!"
exit(EXIT_SUCCESS);
}
SEE ALSO
strdup(3), string(3), wcscpy(3), string_copy(7)
Linux man‐pages (unreleased) (date) strcpy(3)
--
<http://www.alejandro-colomar.es/>
* Re: [PATCH v5 4/5] stpncpy.3, strncpy.3: Document in a single page
2022-12-15 0:26 ` [PATCH v5 4/5] stpncpy.3, strncpy.3: " Alejandro Colomar
@ 2022-12-15 0:28 ` Alejandro Colomar
0 siblings, 0 replies; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-15 0:28 UTC (permalink / raw)
To: linux-man
Cc: Alejandro Colomar, Martin Sebor, G. Branden Robinson,
Douglas McIlroy, Jakub Wilk, Serge Hallyn, Iker Pedrosa,
Andrew Pinski
[-- Attachment #1.1: Type: text/plain, Size: 4732 bytes --]
Formatted stpncpy(3):
stpncpy(3) Library Functions Manual stpncpy(3)
NAME
stpncpy, strncpy - zero a fixed‐width buffer and copy a string into a
character sequence with truncation
LIBRARY
Standard C library (libc, -lc)
SYNOPSIS
#include <string.h>
char *stpncpy(char dst[restrict .sz], const char *restrict src,
size_t sz);
char *strncpy(char dst[restrict .sz], const char *restrict src,
size_t sz);
Feature Test Macro Requirements for glibc (see feature_test_macros(7)):
stpncpy():
Since glibc 2.10:
_POSIX_C_SOURCE >= 200809L
Before glibc 2.10:
_GNU_SOURCE
DESCRIPTION
These functions copy the string pointed to by src into a null‐padded
character sequence at the fixed‐width buffer pointed to by dst. If the
destination buffer, limited by its size, isn’t large enough to hold the
copy, the resulting character sequence is truncated. They only differ
in the return value.
An implementation of these functions might be:
char *
stpncpy(char *restrict dst, const char *restrict src, size_t sz)
{
bzero(dst, sz);
return mempcpy(dst, src, strnlen(src, sz));
}
char *
strncpy(char *restrict dst, const char *restrict src, size_t sz)
{
stpncpy(dst, src, sz);
return dst;
}
RETURN VALUE
stpncpy()
returns a pointer to one after the last character in the desti‐
nation character sequence.
strncpy()
returns dst.
ATTRIBUTES
For an explanation of the terms used in this section, see attrib‐
utes(7).
┌────────────────────────────────────────────┬───────────────┬─────────┐
│Interface │ Attribute │ Value │
├────────────────────────────────────────────┼───────────────┼─────────┤
│stpncpy(), strncpy() │ Thread safety │ MT‐Safe │
└────────────────────────────────────────────┴───────────────┴─────────┘
STANDARDS
stpncpy()
POSIX.1‐2008.
strncpy()
POSIX.1‐2001, POSIX.1‐2008, C89, C99, SVr4, 4.3BSD.
CAVEATS
The name of these functions is confusing. These functions produce a
null‐padded character sequence, not a string (see string_copy(7)).
Truncation should be determined by comparing the length of the input
string with the size of the destination buffer.
If you’re going to use this function in chained calls, it would be use‐
ful to develop a similar function that accepts a pointer to one past
the end of the destination buffer instead of its size.
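The truncation check described above can be sketched as a small helper. The name copy_to_field() is hypothetical and is not provided by any library; the sketch uses only ISO C calls (strncpy() and memchr()) and assumes that src is a valid string.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a libc function): fill the fixed-width
 * buffer dst[sz] with a null-padded character sequence copied from
 * the string src.  Returns true if src did not fit and was truncated.
 * If len is not NULL, the length of the resulting character sequence
 * is stored in *len.
 */
static bool
copy_to_field(char *restrict dst, size_t sz, const char *restrict src,
              size_t *restrict len)
{
    const char  *nul;
    size_t      n;

    assert(dst != NULL && src != NULL);

    strncpy(dst, src, sz);          /* Copies and null-pads dst[sz]. */

    /* Measure the character sequence that ended up in dst. */
    nul = memchr(dst, '\0', sz);
    n = (nul != NULL) ? (size_t) (nul - dst) : sz;
    if (len != NULL)
        *len = n;

    /* A sequence that exactly fills dst is not truncation, so the
       decision has to look at the source string: when dst is full,
       src[sz] is either the terminator or the first lost byte. */
    return n == sz && src[sz] != '\0';
}
```

A caller would typically warn or fail when the helper returns true, and otherwise use the stored length to print the field with "%.*s".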
EXAMPLES
#include <err.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int
main(void)
{
char *end;
char buf1[20];
char buf2[20];
size_t len;
if (sizeof(buf1) < strlen("Hello world!"))
warnx("stpncpy: truncating character sequence");
end = stpncpy(buf1, "Hello world!", sizeof(buf1));
len = end - buf1;
printf("[len = %zu]: ", len);
printf("%.*s\n", (int) len, buf1); // "Hello world!"
if (sizeof(buf2) < strlen("Hello world!"))
warnx("strncpy: truncating character sequence");
strncpy(buf2, "Hello world!", sizeof(buf2));
len = strnlen(buf2, sizeof(buf2));
printf("[len = %zu]: ", len);
printf("%.*s\n", (int) len, buf2); // "Hello world!"
exit(EXIT_SUCCESS);
}
SEE ALSO
wcpncpy(3), string_copy(7)
Linux man‐pages (unreleased) (date) stpncpy(3)
--
<http://www.alejandro-colomar.es/>
* Re: [PATCH v5 5/5] strncat.3: Rewrite to be consistent with string_copy.7.
2022-12-15 0:26 ` [PATCH v5 5/5] strncat.3: Rewrite to be consistent with string_copy.7 Alejandro Colomar
@ 2022-12-15 0:29 ` Alejandro Colomar
0 siblings, 0 replies; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-15 0:29 UTC (permalink / raw)
To: linux-man
Cc: Alejandro Colomar, Martin Sebor, G. Branden Robinson,
Douglas McIlroy, Jakub Wilk, Serge Hallyn, Iker Pedrosa,
Andrew Pinski
[-- Attachment #1.1: Type: text/plain, Size: 3433 bytes --]
Formatted strncat(3):
strncat(3) Library Functions Manual strncat(3)
NAME
strncat - concatenate a null‐padded character sequence into a string
LIBRARY
Standard C library (libc, -lc)
SYNOPSIS
#include <string.h>
char *strncat(char *restrict dst, const char src[restrict .sz],
size_t sz);
DESCRIPTION
This function catenates the input character sequence contained in a
null‐padded fixed‐width buffer, into a string at the buffer pointed to
by dst. The programmer is responsible for allocating a buffer large
enough, that is, strlen(dst) + strnlen(src, sz) + 1.
An implementation of this function might be:
char *
strncat(char *restrict dst, const char *restrict src, size_t sz)
{
size_t len;
char *end;
len = strnlen(src, sz);
end = dst + strlen(dst);
end = mempcpy(end, src, len);
*end = '\0';
return dst;
}
RETURN VALUE
strncat() returns dst.
ATTRIBUTES
For an explanation of the terms used in this section, see attrib‐
utes(7).
┌────────────────────────────────────────────┬───────────────┬─────────┐
│Interface │ Attribute │ Value │
├────────────────────────────────────────────┼───────────────┼─────────┤
│strncat() │ Thread safety │ MT‐Safe │
└────────────────────────────────────────────┴───────────────┴─────────┘
STANDARDS
POSIX.1‐2001, POSIX.1‐2008, C89, C99, SVr4, 4.3BSD.
CAVEATS
The name of this function is confusing. This function has no relation
to strncpy(3).
If the destination buffer is not large enough, the behavior is unde‐
fined. See _FORTIFY_SOURCE in feature_test_macros(7).
BUGS
This function can be very inefficient. Read about Shlemiel the painter
⟨https://www.joelonsoftware.com/2001/12/11/back-to-basics/⟩.
EXAMPLES
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int
main(void)
{
char buf[BUFSIZ];
size_t len;
buf[0] = '\0'; // There’s no ’cpy’ function to this ’cat’.
strncat(buf, "Hello XXX", 6);
strncat(buf, "world", 42);
strncat(buf, "!", 1);
len = strlen(buf);
printf("[len = %zu]: ", len);
puts(buf); // "Hello world!"
exit(EXIT_SUCCESS);
}
SEE ALSO
string(3), string_copy(7)
Linux man‐pages (unreleased) (date) strncat(3)
--
<http://www.alejandro-colomar.es/>
* Re: [PATCH v5 1/5] string_copy.7: Add page to document all string-copying functions
2022-12-15 0:26 ` [PATCH v5 1/5] string_copy.7: Add page to document all string-copying functions Alejandro Colomar
@ 2022-12-15 0:30 ` Alejandro Colomar
0 siblings, 0 replies; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-15 0:30 UTC (permalink / raw)
To: linux-man
Cc: Alejandro Colomar, Martin Sebor, G. Branden Robinson,
Douglas McIlroy, Jakub Wilk, Serge Hallyn, Iker Pedrosa,
Andrew Pinski
[-- Attachment #1.1: Type: text/plain, Size: 26068 bytes --]
Formatted string_copy(7):
string_copy(7) Miscellaneous Information Manual string_copy(7)
NAME
stpcpy, strcpy, strcat, stpecpy, stpecpyx, strlcpy, strlcat, stpncpy,
strncpy, zustr2ustp, zustr2stp, strncat, ustpcpy, ustr2stp - copy
strings and character sequences
SYNOPSIS
Strings
// Chain‐copy a string.
char *stpcpy(char *restrict dst, const char *restrict src);
// Copy/catenate a string.
char *strcpy(char *restrict dst, const char *restrict src);
char *strcat(char *restrict dst, const char *restrict src);
// Chain‐copy a string with truncation.
char *stpecpy(char *dst, char past_end[0], const char *restrict src);
// Chain‐copy a string with truncation and SIGSEGV on UB.
char *stpecpyx(char *dst, char past_end[0], const char *restrict src);
// Copy/catenate a string with truncation and SIGSEGV on UB.
size_t strlcpy(char dst[restrict .sz], const char *restrict src,
size_t sz);
size_t strlcat(char dst[restrict .sz], const char *restrict src,
size_t sz);
Null‐padded character sequences
// Zero a fixed‐width buffer, and
// copy a string into a character sequence with truncation.
char *stpncpy(char dst[restrict .sz], const char *restrict src,
size_t sz);
// Zero a fixed‐width buffer, and
// copy a string into a character sequence with truncation.
char *strncpy(char dest[restrict .sz], const char *restrict src,
size_t sz);
// Chain‐copy a null‐padded character sequence into a character sequence.
char *zustr2ustp(char *restrict dst, const char src[restrict .sz],
size_t sz);
// Chain‐copy a null‐padded character sequence into a string.
char *zustr2stp(char *restrict dst, const char src[restrict .sz],
size_t sz);
// Catenate a null‐padded character sequence into a string.
char *strncat(char *restrict dst, const char src[restrict .sz],
size_t sz);
Measured character sequences
// Chain‐copy a measured character sequence.
char *ustpcpy(char *restrict dst, const char src[restrict .len],
size_t len);
// Chain‐copy a measured character sequence into a string.
char *ustr2stp(char *restrict dst, const char src[restrict .len],
size_t len);
DESCRIPTION
Terms (and abbreviations)
string (str)
is a sequence of zero or more non‐null characters followed by a
null byte.
character sequence
is a sequence of zero or more non‐null characters. A program
should never use a character sequence where a string is required.
However, with appropriate care, a string can be used in
the place of a character sequence.
null‐padded character sequence (zustr)
Character sequences can be contained in fixed‐width
buffers, which contain padding null bytes after the char‐
acter sequence, to fill the rest of the buffer without
affecting the character sequence; however, those padding
null bytes are not part of the character sequence.
measured character sequence (ustr)
Character sequence delimited by its length. It may be a
slice of a larger character sequence, or even of a
string.
length (len)
is the number of non‐null characters in a string or character
sequence. It is the return value of strlen(str) and of
strnlen(ustr, sz).
size (sz)
refers to the entire buffer where the string or character se‐
quence is contained.
end is the name of a pointer to the terminating null byte of a
string, or a pointer to one past the last character of a charac‐
ter sequence. This is the return value of functions that allow
chaining. It is equivalent to &str[len].
past_end
is the name of a pointer to one past the end of the buffer that
contains a string or character sequence. It is equivalent to
&str[sz]. It is used as a sentinel value, to be able to trun‐
cate strings or character sequences instead of overrunning the
containing buffer.
copy This term is used when the writing starts at the first element
pointed to by dst.
catenate
This term is used when a function first finds the terminating
null byte in dst, and then starts writing at that position.
chain This term is used when it’s the programmer who provides a
pointer to the end in dst, and the function starts writing at
that location. The function returns a pointer to the new end
after the call, so that the programmer can use it to chain such
calls.
Copy, catenate, and chain‐copy
Originally, there was a distinction between functions that copy and
those that catenate. However, newer functions that copy while allowing
chaining cover both use cases with a single API. They are also algo‐
rithmically faster, since they don’t need to search for the end of the
existing string. However, functions that catenate have a much simpler
use, so if performance is not important, it can make sense to use them
for improving readability.
To chain copy functions, they need to return a pointer to the end.
That’s a byproduct of the copy operation, so it has no performance
costs. Functions that return such a pointer, and thus can be chained,
have names of the form *stp*(), since it’s also common to name the
pointer just p.
Chain‐copying functions that truncate should accept a pointer to one
past the end of the destination buffer, and have names of the form
*stpe*(). This allows not having to recalculate the remaining size af‐
ter each call.
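As an illustration of the past_end convention, here is a minimal sketch of a truncating chain-copy function. It is not the reference implementation that the page points to in EXAMPLES; the name sketch_stpecpy() is made up for this example, and for simplicity it measures the whole source with strlen(), which is closer to what the page describes for stpecpyx().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Illustrative sketch only (hypothetical name, not a libc function).
 * Copies the string src to dst, never writing at or beyond past_end.
 * Returns a pointer to the terminating null byte, or past_end if the
 * string was truncated (or if an earlier chained call already was).
 */
static char *
sketch_stpecpy(char *dst, char *past_end, const char *restrict src)
{
    size_t  room, len;

    assert(dst != NULL && src != NULL && dst <= past_end);

    if (dst == past_end)            /* An earlier call truncated. */
        return past_end;

    room = past_end - dst;          /* Bytes left, including the '\0'. */
    len = strlen(src);

    if (len >= room) {              /* Does not fit: truncate. */
        memcpy(dst, src, room - 1);
        past_end[-1] = '\0';
        return past_end;
    }

    memcpy(dst, src, len + 1);      /* Fits: copy with terminator. */
    return dst + len;
}
```

With this shape, a caller computes past_end once (buf + sizeof(buf)), chains any number of calls through the returned pointer, and compares the final pointer against past_end a single time to detect truncation.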
Truncate or not?
The first thing to note is that programmers should be careful with
buffers, so they always have the correct size, and truncation is not
necessary.
In most cases, truncation is not desired, and it is simpler to just do
the copy. Simpler code is safer code. Programming against programming
mistakes by adding more code just adds more points where mistakes can
be made.
Nowadays, compilers can detect most programmer errors with features
like compiler warnings, static analyzers, and _FORTIFY_SOURCE (see
ftm(7)). Keeping the code simple helps these overflow‐detection fea‐
tures be more precise.
When validating user input, however, it makes sense to truncate. Re‐
member to check the return value of such function calls.
Functions that truncate:
• stpecpy(3) is the most efficient string copy function that performs
truncation. It only requires checking for truncation once, after all
chained calls.
• stpecpyx(3) is a variant of stpecpy(3) that consumes the entire
source string, to catch bugs in the program by forcing a segmenta‐
tion fault (as strlcpy(3bsd) and strlcat(3bsd) do).
• strlcpy(3bsd) and strlcat(3bsd) are designed to crash if the input
string is invalid (doesn’t contain a terminating null byte).
• stpncpy(3) and strncpy(3) also truncate, but they don’t write
strings, but rather null‐padded character sequences.
Null‐padded character sequences
For historic reasons, some standard APIs, such as utmpx(5), use null‐
padded character sequences in fixed‐width buffers. To interface with
them, specialized functions need to be used.
To copy strings into them, use stpncpy(3).
To copy from an unterminated string within a fixed‐width buffer into a
string, ignoring any trailing null bytes in the source fixed‐width
buffer, you should use zustr2stp(3) or strncat(3).
To copy from an unterminated string within a fixed‐width buffer into a
character sequence, ignoring any trailing null bytes in the source
fixed‐width buffer, you should use zustr2ustp(3).
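The conversion from a null-padded fixed-width field to a string can be sketched as follows. This is an assumption-laden illustration rather than the page's reference implementation: sketch_zustr2stp() is a hypothetical name, and memchr() stands in for strnlen(3) to keep the example within ISO C.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Illustrative sketch only (hypothetical name, not a libc function).
 * Copies the character sequence held in the null-padded buffer
 * src[sz] into dst as a string.  dst needs room for up to sz + 1
 * bytes.  Returns a pointer to the terminating null byte, so that
 * calls can be chained.
 */
static char *
sketch_zustr2stp(char *restrict dst, const char *restrict src, size_t sz)
{
    const char  *nul;
    size_t      len;

    assert(dst != NULL && src != NULL);

    /* The sequence ends at the first padding byte, or at sz. */
    nul = memchr(src, '\0', sz);
    len = (nul != NULL) ? (size_t) (nul - src) : sz;

    memcpy(dst, src, len);
    dst[len] = '\0';
    return dst + len;
}
```

The interesting case is a field that is completely full, such as a 4-byte buffer holding "root" with no null byte at all: the size bounds the read, and the terminator is added only in the destination.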
Measured character sequences
The simplest character sequence copying function is mempcpy(3). It re‐
quires always knowing the length of your character sequences, for which
structures can be used. It makes the code much faster, since you al‐
ways know the length of your character sequences, and can do the mini‐
mal copies and length measurements. mempcpy(3) copies character se‐
quences, so you need to explicitly set the terminating null byte if you
need a string.
However, for keeping type safety, it’s good to add a wrapper that uses
char * instead of void *: ustpcpy(3).
In programs that make considerable use of strings or character se‐
quences, and need the best performance, using overlapping character se‐
quences can make a big difference. It allows holding subsequences of a
larger character sequence, while not duplicating memory nor using time
to do a copy.
However, this is delicate, since it requires using character sequences.
C library APIs use strings, so programs that use character sequences
will have to take care of differentiating strings from character se‐
quences.
To copy a measured character sequence, use ustpcpy(3).
To copy a measured character sequence into a string, use ustr2stp(3).
Because these functions ask for the length, and a string is by nature
composed of a character sequence of the same length plus a terminating
null byte, a string is also accepted as input.
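A minimal sketch of the two measured-sequence functions follows. The names carry a sketch_ prefix because they are illustrations, not the page's reference implementations, and struct ustr is a hypothetical type invented here to show how a length can travel with a pointer. memcpy() plus pointer arithmetic is used in place of mempcpy(3), which is a GNU extension.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type: a character sequence delimited by its length.
   It may point into the middle of a larger sequence or string. */
struct ustr {
    const char  *s;
    size_t      len;
};

/* Sketch: copy len bytes and return a pointer to one past the last
   byte written (what mempcpy(3) would return, but typed as char *). */
static char *
sketch_ustpcpy(char *restrict dst, const char *restrict src, size_t len)
{
    assert(dst != NULL && src != NULL);

    memcpy(dst, src, len);
    return dst + len;
}

/* Sketch: as above, but also terminate dst so that it holds a string.
   Returns a pointer to the terminating null byte. */
static char *
sketch_ustr2stp(char *restrict dst, const char *restrict src, size_t len)
{
    char  *end;

    end = sketch_ustpcpy(dst, src, len);
    *end = '\0';
    return end;
}
```

Because the source is described by a pointer and a length, a slice such as the "share" component of "/usr/share/man" can be copied without first duplicating it into its own buffer.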
String vs character sequence
Some functions only operate on strings. Those require that the input
src is a string, and guarantee an output string (even when truncation
occurs). Functions that catenate also require that dst holds a string
before the call. List of functions:
• stpcpy(3)
• strcpy(3), strcat(3)
• stpecpy(3), stpecpyx(3)
• strlcpy(3bsd), strlcat(3bsd)
Other functions require an input string, but create a character se‐
quence as output. These functions have confusing names, and have a
long history of misuse. List of functions:
• stpncpy(3)
• strncpy(3)
Other functions operate on an input character sequence, and create an
output string. Functions that catenate also require that dst holds a
string before the call. strncat(3) has an even more misleading name
than the functions above. List of functions:
• zustr2stp(3)
• strncat(3)
• ustr2stp(3)
Other functions operate on an input character sequence to create an
output character sequence. List of functions:
• ustpcpy(3)
• zustr2ustp(3)
Functions
stpcpy(3)
This function copies the input string into a destination string.
The programmer is responsible for allocating a buffer large
enough. It returns a pointer suitable for chaining.
strcpy(3)
strcat(3)
These functions copy and catenate the input string into a desti‐
nation string. The programmer is responsible for allocating a
buffer large enough. The return value is useless.
stpcpy(3) is a faster alternative to these functions.
stpecpy(3)
stpecpyx(3)
These functions copy the input string into a destination string.
If the destination buffer, limited by a pointer to one past the
end of it, isn’t large enough to hold the copy, the resulting
string is truncated (but it is guaranteed to be null‐termi‐
nated). They return a pointer suitable for chaining. Trunca‐
tion needs to be detected only once after the last chained call.
stpecpyx(3) has identical semantics to stpecpy(3), except that
it forces a SIGSEGV if the src pointer is not a string.
These functions are not provided by any library; see EXAMPLES
for a reference implementation.
strlcpy(3bsd)
strlcat(3bsd)
These functions copy and catenate the input string into a desti‐
nation string. If the destination buffer, limited by its size,
isn’t large enough to hold the copy, the resulting string is
truncated (but it is guaranteed to be null‐terminated). They
return the length of the total string they tried to create.
These functions force a SIGSEGV if the src pointer is not a
string.
stpecpyx(3) is a faster alternative to these functions.
stpncpy(3)
This function copies the input string into a destination null‐
padded character sequence in a fixed‐width buffer. If the des‐
tination buffer, limited by its size, isn’t large enough to hold
the copy, the resulting character sequence is truncated. Since
it creates a character sequence, it doesn’t need to write a ter‐
minating null byte. It’s impossible to distinguish truncation
after the call, from a character sequence that just fits the
destination buffer; truncation should be detected from the
length of the original string.
strncpy(3)
This function is identical to stpncpy(3) except for the useless
return value.
stpncpy(3) is a more useful alternative to this function.
zustr2ustp(3)
This function copies the input character sequence contained in a
null‐padded fixed‐width buffer, into a destination character
sequence. The programmer is responsible for allocating a buffer
large enough. It returns a pointer suitable for chaining.
A truncating version of this function doesn’t exist, since the
size of the original character sequence is always known, so it
wouldn’t be very useful.
This function is not provided by any library; see EXAMPLES for a
reference implementation.
zustr2stp(3)
This function copies the input character sequence contained in a
null‐padded fixed‐width buffer, into a destination string. The
programmer is responsible for allocating a buffer large enough.
It returns a pointer suitable for chaining.
A truncating version of this function doesn’t exist, since the
size of the original character sequence is always known, so it
wouldn’t be very useful.
This function is not provided by any library; see EXAMPLES for a
reference implementation.
strncat(3)
Do not confuse this function with strncpy(3); they are not re‐
lated at all.
This function catenates the input character sequence contained
in a null‐padded fixed‐width buffer, into a destination string.
The programmer is responsible for allocating a buffer large
enough. The return value is useless.
zustr2stp(3) is a faster alternative to this function.
ustpcpy(3)
This function copies the input character sequence, limited by
its length, into a destination character sequence. The program‐
mer is responsible for allocating a buffer large enough. It re‐
turns a pointer suitable for chaining.
ustr2stp(3)
This function copies the input character sequence, limited by
its length, into a destination string. The programmer is re‐
sponsible for allocating a buffer large enough. It returns a
pointer suitable for chaining.
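As an illustration of what a null‐padded fixed‐width buffer is, here is a
minimal sketch that turns such a field into a string. The struct record and
the helper field_to_str() are hypothetical names invented for this example;
zustr2stp() follows the same logic as the reference implementation in
EXAMPLES, but is written with memcpy(3) instead of mempcpy(3) so that it
needs no GNU extensions.

```c
#define _POSIX_C_SOURCE 200809L  /* strnlen(3) */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record with a null-padded fixed-width field:
   name holds up to 8 characters and is NOT null-terminated
   when all 8 are in use. */
struct record {
    char name[8];
};

/* Same logic as the reference zustr2stp(); not provided by libc. */
char *
zustr2stp(char *restrict dst, const char *restrict src, size_t sz)
{
    size_t len;

    len = strnlen(src, sz);  /* stop at the padding, or at sz */
    memcpy(dst, src, len);
    dst[len] = '\0';
    return dst + len;
}

/* Hypothetical helper: dst needs room for sizeof(r->name) + 1 bytes.
   Returns the length of the resulting string. */
size_t
field_to_str(char *dst, const struct record *r)
{
    assert(dst != NULL && r != NULL);
    return zustr2stp(dst, r->name, sizeof(r->name)) - dst;
}
```

The destination is sized from the field width, not from the data, which is
why a truncating variant would add little.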
RETURN VALUE
The following functions return a pointer to the terminating null byte
in the destination string.
• stpcpy(3)
• ustr2stp(3)
• zustr2stp(3)
The following functions return a pointer to the terminating null byte
in the destination string, except when truncation occurs; if truncation
occurs, they return a pointer to one past the end of the destination
buffer (past_end).
• stpecpy(3), stpecpyx(3)
The following function returns a pointer to one after the last charac‐
ter in the destination character sequence; if truncation occurs, that
pointer is equivalent to a pointer to one past the end of the destina‐
tion buffer.
• stpncpy(3)
The following functions return a pointer to one after the last charac‐
ter in the destination character sequence.
• zustr2ustp(3)
• ustpcpy(3)
The following functions return the length of the total string that they
tried to create (as if truncation didn’t occur).
• strlcpy(3bsd), strlcat(3bsd)
The following functions return the dst pointer, which is useless.
• strcpy(3), strcat(3)
• strncpy(3)
• strncat(3)
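The practical difference between these return values can be seen in a small
sketch. The helpers off_strcpy(), off_stpcpy(), and off_stpncpy() are
hypothetical names; each one returns the distance from dst to the pointer
returned by the function it wraps.

```c
#define _POSIX_C_SOURCE 200809L  /* stpcpy(3), stpncpy(3) */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* strcpy(3) returns dst itself, so the offset is always 0. */
ptrdiff_t
off_strcpy(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    return strcpy(dst, src) - dst;
}

/* stpcpy(3) returns a pointer to the terminating null byte,
   so the offset is the length of the copied string. */
ptrdiff_t
off_stpcpy(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    return stpcpy(dst, src) - dst;
}

/* stpncpy(3) returns one after the last character written;
   on truncation that is dst + sz. */
ptrdiff_t
off_stpncpy(char *dst, const char *src, size_t sz)
{
    assert(dst != NULL && src != NULL);
    return stpncpy(dst, src, sz) - dst;
}
```

With an 8‐byte buffer, copying "world" gives offsets 0 and 5 for the first
two helpers, while off_stpncpy() gives 2 for "Hi" and 8 for a source that
does not fit.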
NOTES
The Linux kernel has an internal function for copying strings, which is
similar to stpecpy(3), except that it can’t be chained:
strscpy(9)
This function copies the input string into a destination string.
If the destination buffer, limited by its size, isn’t large
enough to hold the copy, the resulting string is truncated (but
it is guaranteed to be null‐terminated). It returns the length
of the destination string, or -E2BIG on truncation.
stpecpy(3) is a simpler and faster alternative to this function.
CAVEATS
Don’t mix chain calls to truncating and non‐truncating functions. It
is conceptually wrong unless you know that the first part of a copy
will always fit. Anyway, the performance difference will probably be
negligible, so it will probably be more clear if you use consistent se‐
mantics: either truncating or non‐truncating. Calling a non‐truncating
function after a truncating one is necessarily wrong.
BUGS
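All catenation functions share the same performance problem: each call
must first scan the destination string to find its terminating null
byte, so building a long string by repeated catenation takes time
quadratic in its length. See Shlemiel the painter
⟨https://www.joelonsoftware.com/2001/12/11/back-to-basics/⟩.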
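A rough sketch of the problem, assuming a destination buffer that is large
enough for the result. The helpers join_cat() and join_stp() are hypothetical
names, not part of any library; they produce the same string, but only the
first one rescans dst on every call.

```c
#define _POSIX_C_SOURCE 200809L  /* stpcpy(3) */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: join n words with strcat(3).
   Every call walks dst from its start to find the null byte,
   so the total work grows quadratically with the result length. */
size_t
join_cat(char *dst, const char *const words[], size_t n)
{
    assert(dst != NULL);

    dst[0] = '\0';
    for (size_t i = 0; i < n; i++)
        strcat(dst, words[i]);

    return strlen(dst);
}

/* Hypothetical helper: the same join with stpcpy(3).
   The returned end pointer is reused, so nothing is rescanned
   and the length comes for free. */
size_t
join_stp(char *dst, const char *const words[], size_t n)
{
    char *p;

    assert(dst != NULL);

    p = dst;
    *p = '\0';
    for (size_t i = 0; i < n; i++)
        p = stpcpy(p, words[i]);

    return p - dst;
}
```

For three short words the difference is invisible; it only matters when many
pieces are appended to a long string.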
EXAMPLES
The following are examples of correct use of each of these functions.
stpcpy(3)
    p = buf;
    p = stpcpy(p, "Hello ");
    p = stpcpy(p, "world");
    p = stpcpy(p, "!");
    len = p - buf;
    puts(buf);

strcpy(3)
strcat(3)
    strcpy(buf, "Hello ");
    strcat(buf, "world");
    strcat(buf, "!");
    len = strlen(buf);
    puts(buf);

stpecpy(3)
stpecpyx(3)
    past_end = buf + sizeof(buf);
    p = buf;
    p = stpecpy(p, past_end, "Hello ");
    p = stpecpy(p, past_end, "world");
    p = stpecpy(p, past_end, "!");
    if (p == past_end) {
        p--;
        goto toolong;
    }
    len = p - buf;
    puts(buf);

strlcpy(3bsd)
strlcat(3bsd)
    if (strlcpy(buf, "Hello ", sizeof(buf)) >= sizeof(buf))
        goto toolong;
    if (strlcat(buf, "world", sizeof(buf)) >= sizeof(buf))
        goto toolong;
    len = strlcat(buf, "!", sizeof(buf));
    if (len >= sizeof(buf))
        goto toolong;
    puts(buf);

strscpy(9)
    len = strscpy(buf, "Hello world!", sizeof(buf));
    if (len == -E2BIG)
        goto toolong;
    puts(buf);

stpncpy(3)
    end = stpncpy(buf, "Hello world!", sizeof(buf));
    if (sizeof(buf) < strlen("Hello world!"))
        goto toolong;
    len = end - buf;
    for (size_t i = 0; i < sizeof(buf); i++)
        putchar(buf[i]);

strncpy(3)
    strncpy(buf, "Hello world!", sizeof(buf));
    if (sizeof(buf) < strlen("Hello world!"))
        goto toolong;
    len = strnlen(buf, sizeof(buf));
    for (size_t i = 0; i < sizeof(buf); i++)
        putchar(buf[i]);

zustr2ustp(3)
    p = buf;
    p = zustr2ustp(p, "Hello ", 6);
    p = zustr2ustp(p, "world", 42);  // Padding null bytes ignored.
    p = zustr2ustp(p, "!", 1);
    len = p - buf;
    printf("%.*s\n", (int) len, buf);

zustr2stp(3)
    p = buf;
    p = zustr2stp(p, "Hello ", 6);
    p = zustr2stp(p, "world", 42);  // Padding null bytes ignored.
    p = zustr2stp(p, "!", 1);
    len = p - buf;
    puts(buf);

strncat(3)
    buf[0] = '\0';  // There’s no ’cpy’ function to this ’cat’.
    strncat(buf, "Hello ", 6);
    strncat(buf, "world", 42);  // Padding null bytes ignored.
    strncat(buf, "!", 1);
    len = strlen(buf);
    puts(buf);

ustpcpy(3)
    p = buf;
    p = ustpcpy(p, "Hello ", 6);
    p = ustpcpy(p, "world", 5);
    p = ustpcpy(p, "!", 1);
    len = p - buf;
    printf("%.*s\n", (int) len, buf);

ustr2stp(3)
    p = buf;
    p = ustr2stp(p, "Hello ", 6);
    p = ustr2stp(p, "world", 5);
    p = ustr2stp(p, "!", 1);
    len = p - buf;
    puts(buf);
Implementations
Here are reference implementations for functions not provided by libc.
/* This code is in the public domain. */

char *
stpecpy(char *dst, char past_end[0], const char *restrict src)
{
    char *p;

    if (dst == past_end)
        return past_end;

    p = memccpy(dst, src, '\0', past_end - dst);
    if (p != NULL)
        return p - 1;

    /* truncation detected */
    past_end[-1] = '\0';
    return past_end;
}

char *
stpecpyx(char *dst, char past_end[0], const char *restrict src)
{
    if (src[strlen(src)] != '\0')
        raise(SIGSEGV);

    return stpecpy(dst, past_end, src);
}

char *
zustr2ustp(char *restrict dst, const char *restrict src, size_t sz)
{
    return ustpcpy(dst, src, strnlen(src, sz));
}

char *
zustr2stp(char *restrict dst, const char *restrict src, size_t sz)
{
    char *end;

    end = zustr2ustp(dst, src, sz);
    *end = '\0';

    return end;
}

char *
ustpcpy(char *restrict dst, const char *restrict src, size_t len)
{
    return mempcpy(dst, src, len);
}

char *
ustr2stp(char *restrict dst, const char *restrict src, size_t len)
{
    char *end;

    end = ustpcpy(dst, src, len);
    *end = '\0';

    return end;
}
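To see how the truncation report of stpecpy() propagates through a chain,
here is a minimal sketch. It repeats the reference stpecpy() from above, with
past_end spelled as a plain pointer (an assumption made here so that the code
compiles without the zero‐length array extension), and adds a hypothetical
helper greet() that is not part of the page.

```c
#define _XOPEN_SOURCE 700  /* memccpy(3) */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reference stpecpy() from above; not provided by libc. */
char *
stpecpy(char *dst, char *past_end, const char *restrict src)
{
    char *p;

    if (dst == past_end)
        return past_end;  /* an earlier call already truncated */

    p = memccpy(dst, src, '\0', past_end - dst);
    if (p != NULL)
        return p - 1;     /* points at the terminating null byte */

    /* truncation detected */
    past_end[-1] = '\0';
    return past_end;
}

/* Hypothetical helper: build "Hello <name>!" in buf[size].
   Returns the string length, or -1 if the result was truncated
   (buf still holds a null-terminated prefix in that case). */
ptrdiff_t
greet(char *buf, size_t size, const char *name)
{
    char *p, *past_end;

    assert(buf != NULL && size > 0);

    past_end = buf + size;
    p = buf;
    p = stpecpy(p, past_end, "Hello ");
    p = stpecpy(p, past_end, name);
    p = stpecpy(p, past_end, "!");
    if (p == past_end)
        return -1;        /* checked once, after the whole chain */

    return p - buf;
}
```

Only one check is needed at the end of the chain, because once a call has
returned past_end every later call returns it unchanged without writing.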
SEE ALSO
bzero(3), memcpy(3), memccpy(3), mempcpy(3), stpcpy(3), strlcpy(3bsd),
strncat(3), strcpy(3), string(3)
Linux man‐pages (unreleased) (date) string_copy(7)
--
<http://www.alejandro-colomar.es/>
* Re: [PATCH v5 3/5] stpcpy.3, strcpy.3, strcat.3: Document in a single page
2022-12-15 0:26 ` [PATCH v5 3/5] stpcpy.3, strcpy.3, strcat.3: Document in a single page Alejandro Colomar
@ 2022-12-16 14:46 ` Alejandro Colomar
2022-12-16 14:47 ` Alejandro Colomar
0 siblings, 1 reply; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-16 14:46 UTC (permalink / raw)
To: linux-man
Cc: Alejandro Colomar, Martin Sebor, G. Branden Robinson,
Douglas McIlroy, Jakub Wilk, Serge Hallyn, Iker Pedrosa,
Andrew Pinski
Hi!
The formatted version of this page was sent accidentally as reply to 2/3.
Since 2/5 are only link pages, there's no formatted page for them.
Cheers,
Alex
On 12/15/22 01:26, Alejandro Colomar wrote:
> Rewrite to be consistent with the new string_copy.7 page.
>
> Cc: Martin Sebor <msebor@redhat.com>
> Cc: "G. Branden Robinson" <g.branden.robinson@gmail.com>
> Cc: Douglas McIlroy <douglas.mcilroy@dartmouth.edu>
> Cc: Jakub Wilk <jwilk@jwilk.net>
> Cc: Serge Hallyn <serge@hallyn.com>
> Cc: Iker Pedrosa <ipedrosa@redhat.com>
> Cc: Andrew Pinski <pinskia@gmail.com>
> Signed-off-by: Alejandro Colomar <alx@kernel.org>
> ---
> man3/stpcpy.3 | 13 ---
> man3/strcat.3 | 161 +----------------------------------
> man3/strcpy.3 | 226 +++++++++++++++++++++++++++++++-------------------
> 3 files changed, 143 insertions(+), 257 deletions(-)
>
> diff --git a/man3/stpcpy.3 b/man3/stpcpy.3
> index 5770790fc..d01c0239b 100644
> --- a/man3/stpcpy.3
> +++ b/man3/stpcpy.3
> @@ -14,19 +14,6 @@ .SH SYNOPSIS
> .PP
> .BI "char *stpcpy(char *restrict " dest ", const char *restrict " src );
> .fi
> -.PP
> -.RS -4
> -Feature Test Macro Requirements for glibc (see
> -.BR feature_test_macros (7)):
> -.RE
> -.PP
> -.BR stpcpy ():
> -.nf
> - Since glibc 2.10:
> - _POSIX_C_SOURCE >= 200809L
> - Before glibc 2.10:
> - _GNU_SOURCE
> -.fi
> .SH DESCRIPTION
> The
> .BR stpcpy ()
> diff --git a/man3/strcat.3 b/man3/strcat.3
> index 277e5b1e4..ff7476a84 100644
> --- a/man3/strcat.3
> +++ b/man3/strcat.3
> @@ -1,160 +1 @@
> -.\" Copyright 1993 David Metcalfe (david@prism.demon.co.uk)
> -.\"
> -.\" SPDX-License-Identifier: Linux-man-pages-copyleft
> -.\"
> -.\" References consulted:
> -.\" Linux libc source code
> -.\" Lewine's _POSIX Programmer's Guide_ (O'Reilly & Associates, 1991)
> -.\" 386BSD man pages
> -.\" Modified Sat Jul 24 18:11:47 1993 by Rik Faith (faith@cs.unc.edu)
> -.\" 2007-06-15, Marc Boyer <marc.boyer@enseeiht.fr> + mtk
> -.\" Improve discussion of strncat().
> -.TH strcat 3 (date) "Linux man-pages (unreleased)"
> -.SH NAME
> -strcat \- concatenate two strings
> -.SH LIBRARY
> -Standard C library
> -.RI ( libc ", " \-lc )
> -.SH SYNOPSIS
> -.nf
> -.B #include <string.h>
> -.PP
> -.BI "char *strcat(char *restrict " dest ", const char *restrict " src );
> -.fi
> -.SH DESCRIPTION
> -The
> -.BR strcat ()
> -function appends the
> -.I src
> -string to the
> -.I dest
> -string,
> -overwriting the terminating null byte (\(aq\e0\(aq) at the end of
> -.IR dest ,
> -and then adds a terminating null byte.
> -The strings may not overlap, and the
> -.I dest
> -string must have
> -enough space for the result.
> -If
> -.I dest
> -is not large enough, program behavior is unpredictable;
> -.IR "buffer overruns are a favorite avenue for attacking secure programs" .
> -.SH RETURN VALUE
> -The
> -.BR strcat ()
> -function returns a pointer to the resulting string
> -.IR dest .
> -.SH ATTRIBUTES
> -For an explanation of the terms used in this section, see
> -.BR attributes (7).
> -.ad l
> -.nh
> -.TS
> -allbox;
> -lbx lb lb
> -l l l.
> -Interface Attribute Value
> -T{
> -.BR strcat (),
> -.BR strncat ()
> -T} Thread safety MT-Safe
> -.TE
> -.hy
> -.ad
> -.sp 1
> -.SH STANDARDS
> -POSIX.1-2001, POSIX.1-2008, C89, C99, SVr4, 4.3BSD.
> -.SH NOTES
> -Some systems (the BSDs, Solaris, and others) provide the following function:
> -.PP
> -.in +4n
> -.EX
> -size_t strlcat(char *dest, const char *src, size_t size);
> -.EE
> -.in
> -.PP
> -This function appends the null-terminated string
> -.I src
> -to the string
> -.IR dest ,
> -copying at most
> -.I size\-strlen(dest)\-1
> -from
> -.IR src ,
> -and adds a terminating null byte to the result,
> -.I unless
> -.I size
> -is less than
> -.IR strlen(dest) .
> -This function fixes the buffer overrun problem of
> -.BR strcat (),
> -but the caller must still handle the possibility of data loss if
> -.I size
> -is too small.
> -The function returns the length of the string
> -.BR strlcat ()
> -tried to create; if the return value is greater than or equal to
> -.IR size ,
> -data loss occurred.
> -If data loss matters, the caller
> -.I must
> -either check the arguments before the call, or test the function return value.
> -.BR strlcat ()
> -is not present in glibc and is not standardized by POSIX,
> -.\" https://lwn.net/Articles/506530/
> -but is available on Linux via the
> -.I libbsd
> -library.
> -.\"
> -.SH EXAMPLES
> -Because
> -.BR strcat ()
> -must find the null byte that terminates the string
> -.I dest
> -using a search that starts at the beginning of the string,
> -the execution time of this function
> -scales according to the length of the string
> -.IR dest .
> -This can be demonstrated by running the program below.
> -(If the goal is to concatenate many strings to one target,
> -then manually copying the bytes from each source string
> -while maintaining a pointer to the end of the target string
> -will provide better performance.)
> -.\"
> -.SS Program source
> -\&
> -.\" SRC BEGIN (strcat.c)
> -.EX
> -#include <stdint.h>
> -#include <stdio.h>
> -#include <string.h>
> -#include <time.h>
> -
> -int
> -main(void)
> -{
> -#define LIM 4000000
> - char p[LIM + 1]; /* +1 for terminating null byte */
> - time_t base;
> -
> - base = time(NULL);
> - p[0] = \(aq\e0\(aq;
> -
> - for (unsigned int j = 0; j < LIM; j++) {
> - if ((j % 10000) == 0)
> - printf("%u %jd\en", j, (intmax_t) (time(NULL) \- base));
> - strcat(p, "a");
> - }
> -}
> -.EE
> -.\" SRC END
> -.SH SEE ALSO
> -.BR bcopy (3),
> -.BR memccpy (3),
> -.BR memcpy (3),
> -.BR strcpy (3),
> -.BR string (3),
> -.BR strlcat (3bsd),
> -.BR wcscat (3),
> -.BR wcsncat (3)
> +.so man3/strcpy.3
> diff --git a/man3/strcpy.3 b/man3/strcpy.3
> index 74c3180ae..424648c46 100644
> --- a/man3/strcpy.3
> +++ b/man3/strcpy.3
> @@ -1,20 +1,10 @@
> -.\" Copyright (C) 1993 David Metcalfe (david@prism.demon.co.uk)
> +.\" Copyright 2022 Alejandro Colomar <alx@kernel.org>
> .\"
> .\" SPDX-License-Identifier: Linux-man-pages-copyleft
> .\"
> -.\" References consulted:
> -.\" Linux libc source code
> -.\" Lewine's _POSIX Programmer's Guide_ (O'Reilly & Associates, 1991)
> -.\" 386BSD man pages
> -.\" Modified Sat Jul 24 18:06:49 1993 by Rik Faith (faith@cs.unc.edu)
> -.\" Modified Fri Aug 25 23:17:51 1995 by Andries Brouwer (aeb@cwi.nl)
> -.\" Modified Wed Dec 18 00:47:18 1996 by Andries Brouwer (aeb@cwi.nl)
> -.\" 2007-06-15, Marc Boyer <marc.boyer@enseeiht.fr> + mtk
> -.\" Improve discussion of strncpy().
> -.\"
> .TH strcpy 3 (date) "Linux man-pages (unreleased)"
> .SH NAME
> -strcpy \- copy a string
> +strcpy \- copy or catenate a string
> .SH LIBRARY
> Standard C library
> .RI ( libc ", " \-lc )
> @@ -22,26 +12,87 @@ .SH SYNOPSIS
> .nf
> .B #include <string.h>
> .PP
> -.BI "char *strcpy(char *restrict " dest ", const char *restrict " src );
> +.BI "char *stpcpy(char *restrict " dst ", const char *restrict " src );
> +.BI "char *strcpy(char *restrict " dst ", const char *restrict " src );
> +.BI "char *strcat(char *restrict " dst ", const char *restrict " src );
> +.fi
> +.PP
> +.RS -4
> +Feature Test Macro Requirements for glibc (see
> +.BR feature_test_macros (7)):
> +.RE
> +.PP
> +.BR stpcpy ():
> +.nf
> + Since glibc 2.10:
> + _POSIX_C_SOURCE >= 200809L
> + Before glibc 2.10:
> + _GNU_SOURCE
> .fi
> .SH DESCRIPTION
> -The
> +.TP
> +.BR stpcpy ()
> +.TQ
> .BR strcpy ()
> -function copies the string pointed to by
> +These functions copy the string pointed to by
> .IR src ,
> -including the terminating null byte (\(aq\e0\(aq),
> -to the buffer pointed to by
> -.IR dest .
> -The strings may not overlap, and the destination string
> -.I dest
> -must be large enough to receive the copy.
> -.I Beware of buffer overruns!
> -(See BUGS.)
> +into a string
> +at the buffer pointed to by
> +.IR dst .
> +The programmer is responsible for allocating a buffer large enough,
> +that is,
> +.IR "strlen(src) + 1" .
> +They only differ in the return value.
> +.TP
> +.BR strcat ()
> +This function catenates the string pointed to by
> +.IR src ,
> +at the end of the string pointed to by
> +.IR dst .
> +The programmer is responsible for allocating a buffer large enough,
> +that is,
> +.IR "strlen(dst) + strlen(src) + 1" .
> +.PP
> +An implementation of these functions might be:
> +.PP
> +.in +4n
> +.EX
> +char *
> +stpcpy(char *restrict dst, const char *restrict src)
> +{
> + char *end;
> +
> + end = mempcpy(dst, src, strlen(src));
> + *end = \(aq\e0\(aq;
> +
> + return end;
> +}
> +
> +char *
> +strcpy(char *restrict dst, const char *restrict src)
> +{
> + stpcpy(dst, src);
> + return dst;
> +}
> +
> +char *
> +strcat(char *restrict dst, const char *restrict src)
> +{
> + stpcpy(dst + strlen(dst), src);
> + return dst;
> +}
> +.EE
> +.in
> .SH RETURN VALUE
> -The
> +.TP
> +.BR stpcpy ()
> +This function returns
> +a pointer to the terminating null byte at the end of the copied string.
> +.TP
> .BR strcpy ()
> -function returns a pointer to
> -the destination string
> +.TQ
> +.BR strcat ()
> +These functions return
> .IR dest .
> .SH ATTRIBUTES
> For an explanation of the terms used in this section, see
> @@ -54,73 +105,80 @@ .SH ATTRIBUTES
> l l l.
> Interface Attribute Value
> T{
> -.BR strcpy ()
> +.BR stpcpy (),
> +.BR strcpy (),
> +.BR strcat ()
> T} Thread safety MT-Safe
> .TE
> .hy
> .ad
> .sp 1
> .SH STANDARDS
> +.TP
> +.BR stpcpy ()
> +POSIX.1-2008.
> +.TP
> +.BR strcpy ()
> +.TQ
> +.BR strcat ()
> POSIX.1-2001, POSIX.1-2008, C89, C99, SVr4, 4.3BSD.
> -.SH NOTES
> -.SS strlcpy()
> -Some systems (the BSDs, Solaris, and others) provide the following function:
> +.SH CAVEATS
> +The strings
> +.I src
> +and
> +.I dst
> +may not overlap.
> .PP
> -.in +4n
> -.EX
> -size_t strlcpy(char *dest, const char *src, size_t size);
> -.EE
> -.in
> -.PP
> -.\" http://static.usenix.org/event/usenix99/full_papers/millert/millert_html/index.html
> -.\" "strlcpy and strlcat - consistent, safe, string copy and concatenation"
> -.\" 1999 USENIX Annual Technical Conference
> -This function is similar to
> -.BR strcpy (),
> -but it copies at most
> -.I size\-1
> -bytes to
> -.IR dest ,
> -truncating the string as necessary.
> -It always adds a terminating null byte.
> -This function fixes some of the problems of
> -.BR strcpy ()
> -but the caller must still handle the possibility of data loss if
> -.I size
> -is too small.
> -The return value of the function is the length of
> -.IR src ,
> -which allows truncation to be easily detected:
> -if the return value is greater than or equal to
> -.IR size ,
> -truncation occurred.
> -If loss of data matters, the caller
> -.I must
> -either check the arguments before the call,
> -or test the function return value.
> -.BR strlcpy ()
> -is not present in glibc and is not standardized by POSIX,
> -.\" https://lwn.net/Articles/506530/
> -but is available on Linux via the
> -.I libbsd
> -library.
> +If the destination buffer is not large enough,
> +the behavior is undefined.
> +See
> +.B _FORTIFY_SOURCE
> +in
> +.BR feature_test_macros (7).
> .SH BUGS
> -If the destination string of a
> -.BR strcpy ()
> -is not large enough, then anything might happen.
> -Overflowing fixed-length string buffers is a favorite cracker technique
> -for taking complete control of the machine.
> -Any time a program reads or copies data into a buffer,
> -the program first needs to check that there's enough space.
> -This may be unnecessary if you can show that overflow is impossible,
> -but be careful: programs can get changed over time,
> -in ways that may make the impossible possible.
> +.TP
> +.BR strcat ()
> +This function can be very inefficient.
> +Read about
> +.UR https://www.joelonsoftware.com/\:2001/12/11/\:back\-to\-basics/
> +Shlemiel the painter
> +.UE .
> +.SH EXAMPLES
> +.EX
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +
> +int
> +main(void)
> +{
> + char *p;
> + char buf1[BUFSIZ];
> + char buf2[BUFSIZ];
> + size_t len;
> +
> + p = buf1;
> + p = stpcpy(p, "Hello ");
> + p = stpcpy(p, "world");
> + p = stpcpy(p, "!");
> + len = p \- buf1;
> +
> + printf("[len = %zu]: ", len);
> + puts(buf1); // "Hello world!"
> +
> + strcpy(buf2, "Hello ");
> + strcat(buf2, "world");
> + strcat(buf2, "!");
> + len = strlen(buf2);
> +
> + printf("[len = %zu]: ", len);
> + puts(buf2); // "Hello world!"
> +
> + exit(EXIT_SUCCESS);
> +}
> +.EE
> .SH SEE ALSO
> -.BR bcopy (3),
> -.BR memccpy (3),
> -.BR memcpy (3),
> -.BR memmove (3),
> -.BR stpcpy (3),
> .BR strdup (3),
> .BR string (3),
> -.BR wcscpy (3)
> +.BR wcscpy (3),
> +.BR string_copy (7)
--
<http://www.alejandro-colomar.es/>
* Re: [PATCH v5 3/5] stpcpy.3, strcpy.3, strcat.3: Document in a single page
2022-12-16 14:46 ` Alejandro Colomar
@ 2022-12-16 14:47 ` Alejandro Colomar
0 siblings, 0 replies; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-16 14:47 UTC (permalink / raw)
To: linux-man
Cc: Alejandro Colomar, Martin Sebor, G. Branden Robinson,
Douglas McIlroy, Jakub Wilk, Serge Hallyn, Iker Pedrosa,
Andrew Pinski
On 12/16/22 15:46, Alejandro Colomar wrote:
> Hi!
>
> The formatted version of this page was sent accidentally as reply to 2/3.
D'oh! I meant 2/5.
> Since 2/5 are only link pages, there's no formatted page for them.
--
<http://www.alejandro-colomar.es/>
* Re: [PATCH v5 2/5] stpecpy.3, stpecpyx.3, ustpcpy.3, ustr2stp.3, zustr2stp.3, zustr2ustp.3: Add new links to string_copy(7)
2022-12-15 0:27 ` Alejandro Colomar
@ 2022-12-16 18:47 ` Stefan Puiu
2022-12-16 19:03 ` Alejandro Colomar
0 siblings, 1 reply; 53+ messages in thread
From: Stefan Puiu @ 2022-12-16 18:47 UTC (permalink / raw)
To: Alejandro Colomar
Cc: linux-man, Alejandro Colomar, Martin Sebor, G. Branden Robinson,
Douglas McIlroy, Jakub Wilk, Serge Hallyn, Iker Pedrosa,
Andrew Pinski
Hi Alex!
On Thu, Dec 15, 2022 at 2:46 AM Alejandro Colomar
<alx.manpages@gmail.com> wrote:
>
> Formatted strcpy(3):
>
> strcpy(3) Library Functions Manual strcpy(3)
>
> NAME
> strcpy - copy or catenate a string
>
> LIBRARY
> Standard C library (libc, -lc)
>
> SYNOPSIS
> #include <string.h>
>
> char *stpcpy(char *restrict dst, const char *restrict src);
> char *strcpy(char *restrict dst, const char *restrict src);
> char *strcat(char *restrict dst, const char *restrict src);
>
> Feature Test Macro Requirements for glibc (see feature_test_macros(7)):
>
> stpcpy():
> Since glibc 2.10:
> _POSIX_C_SOURCE >= 200809L
> Before glibc 2.10:
> _GNU_SOURCE
>
> DESCRIPTION
> stpcpy()
> strcpy()
> These functions copy the string pointed to by src, into a string
> at the buffer pointed to by dst. The programmer is responsible
> for allocating a buffer large enough, that is, strlen(src) + 1.
> They only differ in the return value.
A destination buffer large enough? It's not that obvious to me from
the text, but maybe I'm tired :).
I was also a bit at a loss about the difference between the two; maybe
you can say "For the difference between the two, see RETURN VALUE"?
>
> strcat()
> This function catenates the string pointed to by src, at the end
> of the string pointed to by dst. The programmer is responsible
> for allocating a buffer large enough, that is, strlen(dst) +
> strlen(src) + 1.
Ditto here.
>
> An implementation of these functions might be:
>
> char *
> stpcpy(char *restrict dst, const char *restrict src)
> {
> char *end;
>
> end = mempcpy(dst, src, strlen(src));
> *end = '\0';
>
> return end;
> }
>
> char *
> strcpy(char *restrict dst, const char *restrict src)
> {
> stpcpy(dst, src);
> return dst;
> }
>
> char *
> strcat(char *restrict dst, const char *restrict src)
> {
> stpcpy(dst + strlen(dst), src);
> return dst;
> }
Are you sure this section adds any value? I think good documentation
should explain how a function works without delving into the
interpretation. Also, people might get confused and think this is the
actual implementation.
>
> RETURN VALUE
> stpcpy()
> This function returns a pointer to the terminating null byte at
> the end of the copied string.
>
> strcpy()
> strcat()
> These functions return dest.
>
> ATTRIBUTES
> For an explanation of the terms used in this section, see attrib‐
> utes(7).
> ┌────────────────────────────────────────────┬───────────────┬─────────┐
> │Interface │ Attribute │ Value │
> ├────────────────────────────────────────────┼───────────────┼─────────┤
> │stpcpy(), strcpy(), strcat() │ Thread safety │ MT‐Safe │
> └────────────────────────────────────────────┴───────────────┴─────────┘
>
> STANDARDS
> stpcpy()
> POSIX.1‐2008.
>
> strcpy()
> strcat()
> POSIX.1‐2001, POSIX.1‐2008, C89, C99, SVr4, 4.3BSD.
>
> CAVEATS
> The strings src and dst may not overlap.
>
> If the destination buffer is not large enough, the behavior is unde‐
> fined. See _FORTIFY_SOURCE in feature_test_macros(7).
>
> BUGS
> strcat()
> This function can be very inefficient. Read about Shlemiel
> the painter ⟨https://www.joelonsoftware.com/2001/12/11/
> back-to-basics/⟩.
I'm not sure this is a bug, rather a design limitation. Maybe it
belongs in NOTES or CAVEATS? Also, I think this can be summarized
along the lines of 'strcat needs to walk the destination buffer to
find the null terminator, so it has linear complexity with respect to
the size of the destination buffer up to the terminator' (hmm, I'm
sure this can be expressed more concisely), so the page is more self
contained. Outside links sometimes go dead, like on Wikipedia, so I
think just in case, it helps to make explicit the point that you want
the reader to study further in the URL.
Regards,
Stefan.
>
> EXAMPLES
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
>
> int
> main(void)
> {
> char *p;
> char buf1[BUFSIZ];
> char buf2[BUFSIZ];
> size_t len;
>
> p = buf1;
> p = stpcpy(p, "Hello ");
> p = stpcpy(p, "world");
> p = stpcpy(p, "!");
> len = p - buf1;
>
> printf("[len = %zu]: ", len);
> puts(buf1); // "Hello world!"
>
> strcpy(buf2, "Hello ");
> strcat(buf2, "world");
> strcat(buf2, "!");
> len = strlen(buf2);
>
> printf("[len = %zu]: ", len);
> puts(buf2); // "Hello world!"
>
> exit(EXIT_SUCCESS);
> }
>
> SEE ALSO
> strdup(3), string(3), wcscpy(3), string_copy(7)
>
> Linux man‐pages (unreleased) (date) strcpy(3)
>
> --
> <http://www.alejandro-colomar.es/>
* Re: [PATCH v5 2/5] stpecpy.3, stpecpyx.3, ustpcpy.3, ustr2stp.3, zustr2stp.3, zustr2ustp.3: Add new links to string_copy(7)
2022-12-16 18:47 ` Stefan Puiu
@ 2022-12-16 19:03 ` Alejandro Colomar
2022-12-16 19:09 ` Alejandro Colomar
0 siblings, 1 reply; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-16 19:03 UTC (permalink / raw)
To: Stefan Puiu
Cc: linux-man, Alejandro Colomar, Martin Sebor, G. Branden Robinson,
Douglas McIlroy, Jakub Wilk, Serge Hallyn, Iker Pedrosa,
Andrew Pinski
Hi Stefan,
On 12/16/22 19:47, Stefan Puiu wrote:
> Hi Alex!
>
> On Thu, Dec 15, 2022 at 2:46 AM Alejandro Colomar
> <alx.manpages@gmail.com> wrote:
>>
>> Formatted strcpy(3):
>>
>> strcpy(3) Library Functions Manual strcpy(3)
>>
>> NAME
>> strcpy - copy or catenate a string
>>
>> LIBRARY
>> Standard C library (libc, -lc)
>>
>> SYNOPSIS
>> #include <string.h>
>>
>> char *stpcpy(char *restrict dst, const char *restrict src);
>> char *strcpy(char *restrict dst, const char *restrict src);
>> char *strcat(char *restrict dst, const char *restrict src);
>>
>> Feature Test Macro Requirements for glibc (see feature_test_macros(7)):
>>
>> stpcpy():
>> Since glibc 2.10:
>> _POSIX_C_SOURCE >= 200809L
>> Before glibc 2.10:
>> _GNU_SOURCE
>>
>> DESCRIPTION
>> stpcpy()
>> strcpy()
>> These functions copy the string pointed to by src, into a string
>> at the buffer pointed to by dst. The programmer is responsible
>> for allocating a buffer large enough, that is, strlen(src) + 1.
>> They only differ in the return value.
>
> A destination buffer large enough? It's not that obvious to me from
> the text, but maybe I'm tired :).
Sure. Thanks!
> I was also a bit at a loss about the difference between the two; maybe
> you can say "For the difference between the two, see RETURN VALUE"?
That can make sense, yes.
>
>>
>> strcat()
>> This function catenates the string pointed to by src, at the end
>> of the string pointed to by dst. The programmer is responsible
>> for allocating a buffer large enough, that is, strlen(dst) +
>> strlen(src) + 1.
>
> Ditto here.
:)
>
>>
>> An implementation of these functions might be:
>>
>> char *
>> stpcpy(char *restrict dst, const char *restrict src)
>> {
>> char *end;
>>
>> end = mempcpy(dst, src, strlen(src));
>> *end = '\0';
>>
>> return end;
>> }
>>
>> char *
>> strcpy(char *restrict dst, const char *restrict src)
>> {
>> stpcpy(dst, src);
>> return dst;
>> }
>>
>> char *
>> strcat(char *restrict dst, const char *restrict src)
>> {
>> stpcpy(dst + strlen(dst), src);
>> return dst;
>> }
>
> Are you sure this section adds any value? I think good documentation
> should explain how a function works without delving into the
> interpretation.
To be honest, this page doesn't benefit too much from it. strcpy(3)/strcat(3)
are dead simple, and the explanations above should be enough.
However, the same thing in strncpy(3) and strncat(3) is very helpful, IMO. For
consistency I just showed trivial implementations in all of the pages. (And in
fact, there was an example implementation in the old strncat(3) and maybe a few
others, IIRC.)
> Also, people might get confused and think this is the
> actual implementation.
I don't think there's any problem if one believes this is the implementation.
Except for stpcpy(3), in which I preferred readability, they are actually quite
good implementations. A faster implementation of stpcpy(3) might be done in
terms of memccpy(3).
Funnily enough, I just checked what musl libc does, and it's the same as shown here:
alx@debian:~/src/musl/musl$ grepc -tfd strcpy
./src/string/strcpy.c:3:
char *strcpy(char *restrict dest, const char *restrict src)
{
__stpcpy(dest, src);
return dest;
}
alx@debian:~/src/musl/musl$ grepc -tfd strcat
./src/string/strcat.c:3:
char *strcat(char *restrict dest, const char *restrict src)
{
strcpy(dest + strlen(dest), src);
return dest;
}
>
>>
>> RETURN VALUE
>> stpcpy()
>> This function returns a pointer to the terminating null byte at
>> the end of the copied string.
>>
>> strcpy()
>> strcat()
>> These functions return dest.
>>
>> ATTRIBUTES
>> For an explanation of the terms used in this section, see attrib‐
>> utes(7).
>> ┌────────────────────────────────────────────┬───────────────┬─────────┐
>> │Interface │ Attribute │ Value │
>> ├────────────────────────────────────────────┼───────────────┼─────────┤
>> │stpcpy(), strcpy(), strcat() │ Thread safety │ MT‐Safe │
>> └────────────────────────────────────────────┴───────────────┴─────────┘
>>
>> STANDARDS
>> stpcpy()
>> POSIX.1‐2008.
>>
>> strcpy()
>> strcat()
>> POSIX.1‐2001, POSIX.1‐2008, C89, C99, SVr4, 4.3BSD.
>>
>> CAVEATS
>> The strings src and dst may not overlap.
>>
>> If the destination buffer is not large enough, the behavior is unde‐
>> fined. See _FORTIFY_SOURCE in feature_test_macros(7).
>>
>> BUGS
>> strcat()
>> This function can be very inefficient. Read about Shlemiel
>> the painter ⟨https://www.joelonsoftware.com/2001/12/11/
>> back-to-basics/⟩.
>
> I'm not sure this is a bug, rather a design limitation. Maybe it
> belongs in NOTES or CAVEATS?
Yeah, I had been thinking of downgrading it. I'll do it.
> Also, I think this can be summarized
> along the lines of 'strcat needs to walk the destination buffer to
> find the null terminator, so it has linear complexity with respect to
> the size of the destination buffer up to the terminator' (hmm, I'm
> sure this can be expressed more concisely), so the page is more self
> contained. Outside links sometimes go dead, like on Wikipedia, so I
> think just in case, it helps to make explicit the point that you want
> the reader to study further in the URL.
I wasn't inspired to write it short enough to not be too verbose. Maybe I'll
write something based on your suggestion.
>
> Regards,
> Stefan.
Thanks for the review!
Cheers,
Alex
>
>>
>> EXAMPLES
>> #include <stdio.h>
>> #include <stdlib.h>
>> #include <string.h>
>>
>> int
>> main(void)
>> {
>> char *p;
>> char buf1[BUFSIZ];
>> char buf2[BUFSIZ];
>> size_t len;
>>
>> p = buf1;
>> p = stpcpy(p, "Hello ");
>> p = stpcpy(p, "world");
>> p = stpcpy(p, "!");
>> len = p - buf1;
>>
>> printf("[len = %zu]: ", len);
>> puts(buf1); // "Hello world!"
>>
>> strcpy(buf2, "Hello ");
>> strcat(buf2, "world");
>> strcat(buf2, "!");
>> len = strlen(buf2);
>>
>> printf("[len = %zu]: ", len);
>> puts(buf2); // "Hello world!"
>>
>> exit(EXIT_SUCCESS);
>> }
>>
>> SEE ALSO
>> strdup(3), string(3), wcscpy(3), string_copy(7)
>>
>> Linux man‐pages (unreleased) (date) strcpy(3)
>>
>> --
>> <http://www.alejandro-colomar.es/>
--
<http://www.alejandro-colomar.es/>
* Re: [PATCH v5 2/5] stpecpy.3, stpecpyx.3, ustpcpy.3, ustr2stp.3, zustr2stp.3, zustr2ustp.3: Add new links to string_copy(7)
2022-12-16 19:03 ` Alejandro Colomar
@ 2022-12-16 19:09 ` Alejandro Colomar
0 siblings, 0 replies; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-16 19:09 UTC (permalink / raw)
To: Stefan Puiu
Cc: linux-man, Alejandro Colomar, Martin Sebor, G. Branden Robinson,
Douglas McIlroy, Jakub Wilk, Serge Hallyn, Iker Pedrosa,
Andrew Pinski
[-- Attachment #1.1: Type: text/plain, Size: 8054 bytes --]
On 12/16/22 20:03, Alejandro Colomar wrote:
> On 12/16/22 19:47, Stefan Puiu wrote:
>> On Thu, Dec 15, 2022 at 2:46 AM Alejandro Colomar
>> <alx.manpages@gmail.com> wrote:
>>> An implementation of these functions might be:
>>>
>>> char *
>>> stpcpy(char *restrict dst, const char *restrict src)
>>> {
>>> char *end;
>>>
>>> end = mempcpy(dst, src, strlen(src));
>>> *end = '\0';
>>>
>>> return end;
>>> }
>>>
>>> char *
>>> strcpy(char *restrict dst, const char *restrict src)
>>> {
>>> stpcpy(dst, src);
>>> return dst;
>>> }
>>>
>>> char *
>>> strcat(char *restrict dst, const char *restrict src)
>>> {
>>> stpcpy(dst + strlen(dst), src);
>>> return dst;
>>> }
>>
>> Are you sure this section adds any value? I think good documentation
>> should explain how a function works without delving into the
>> interpretation.
>
> To be honest, this page doesn't benefit too much from it. strcpy(3)/strcat(3)
> are dead simple, and the explanations above should be enough.
>
> However, the same thing in strncpy(3) and strncat(3) is very helpful, IMO. For
> consistency I just showed trivial implementations in all of the pages. (And in
> fact, there was an example implementation in the old strncat(3) and maybe a few
> others, IIRC.)
>
>> Also, people might get confused and think this is the
>> actual implementation.
>
> I don't think there's any problem if one believes this is the implementation.
> Except for stpcpy(3), in which I preferred readability, they are actually quite
> good implementations. A faster implementation of stpcpy(3) might be done in
> terms of memccpy(3).
>
> Funnily enough, I just checked what musl libc does, and it's the same as shown
> here:
>
>
> alx@debian:~/src/musl/musl$ grepc -tfd strcpy
> ./src/string/strcpy.c:3:
> char *strcpy(char *restrict dest, const char *restrict src)
> {
> __stpcpy(dest, src);
> return dest;
> }
> alx@debian:~/src/musl/musl$ grepc -tfd strcat
> ./src/string/strcat.c:3:
> char *strcat(char *restrict dest, const char *restrict src)
> {
> strcpy(dest + strlen(dest), src);
> return dest;
> }
>
>
And considering memccpy(3) is defined in terms of memchr(3) and mempcpy(3) in
glibc, I don't feel so bad about my own stpcpy(3) :). See:
alx@debian:~/src/gnu/glibc$ grepc -tfd __memccpy
./string/memccpy.c:30:
void *
__memccpy (void *dest, const void *src, int c, size_t n)
{
void *p = memchr (src, c, n);
if (p != NULL)
return __mempcpy (dest, src, p - src + 1);
memcpy (dest, src, n);
return NULL;
}
Cheers,
Alex
>>
>>>
>>> RETURN VALUE
>>> stpcpy()
>>> This function returns a pointer to the terminating null byte at
>>> the end of the copied string.
>>>
>>> strcpy()
>>> strcat()
>>> These functions return dest.
>>>
>>> ATTRIBUTES
>>> For an explanation of the terms used in this section, see attrib‐
>>> utes(7).
>>>
>>> ┌────────────────────────────────────────────┬───────────────┬─────────┐
>>> │Interface                                   │ Attribute     │ Value   │
>>> ├────────────────────────────────────────────┼───────────────┼─────────┤
>>> │stpcpy(), strcpy(), strcat()                │ Thread safety │ MT‐Safe │
>>> └────────────────────────────────────────────┴───────────────┴─────────┘
>>>
>>> STANDARDS
>>> stpcpy()
>>> POSIX.1‐2008.
>>>
>>> strcpy()
>>> strcat()
>>> POSIX.1‐2001, POSIX.1‐2008, C89, C99, SVr4, 4.3BSD.
>>>
>>> CAVEATS
>>> The strings src and dst may not overlap.
>>>
>>> If the destination buffer is not large enough, the behavior is unde‐
>>> fined. See _FORTIFY_SOURCE in feature_test_macros(7).
>>>
>>> BUGS
>>> strcat()
>>> This function can be very inefficient. Read about Shlemiel
>>> the painter ⟨https://www.joelonsoftware.com/2001/12/11/
>>> back-to-basics/⟩.
>>
>> I'm not sure this is a bug, rather a design limitation. Maybe it
>> belongs in NOTES or CAVEATS?
>
> Yeah, I had been thinking of downgrading it. I'll do it.
>
>> Also, I think this can be summarized
>> along the lines of 'strcat needs to walk the destination buffer to
>> find the null terminator, so it has linear complexity with respect to
>> the size of the destination buffer up to the terminator' (hmm, I'm
>> sure this can be expressed more concisely), so the page is more self
>> contained. Outside links sometimes go dead, like on Wikipedia, so I
>> think just in case, it helps to make explicit the point that you want
>> the reader to study further in the URL.
>
> I wasn't inspired to write it short enough to not be too verbose. Maybe I'll
> write something based on your suggestion.
>
>>
>> Regards,
>> Stefan.
>
> Thanks for the review!
>
> Cheers,
>
> Alex
>
>>
>>>
>>> EXAMPLES
>>> #include <stdio.h>
>>> #include <stdlib.h>
>>> #include <string.h>
>>>
>>> int
>>> main(void)
>>> {
>>> char *p;
>>> char buf1[BUFSIZ];
>>> char buf2[BUFSIZ];
>>> size_t len;
>>>
>>> p = buf1;
>>> p = stpcpy(p, "Hello ");
>>> p = stpcpy(p, "world");
>>> p = stpcpy(p, "!");
>>> len = p - buf1;
>>>
>>> printf("[len = %zu]: ", len);
>>> puts(buf1); // "Hello world!"
>>>
>>> strcpy(buf2, "Hello ");
>>> strcat(buf2, "world");
>>> strcat(buf2, "!");
>>> len = strlen(buf2);
>>>
>>> printf("[len = %zu]: ", len);
>>> puts(buf2); // "Hello world!"
>>>
>>> exit(EXIT_SUCCESS);
>>> }
>>>
>>> SEE ALSO
>>> strdup(3), string(3), wcscpy(3), string_copy(7)
>>>
>>> Linux man‐pages (unreleased) (date) strcpy(3)
>>>
>>> --
>>> <http://www.alejandro-colomar.es/>
>
--
<http://www.alejandro-colomar.es/>
* [PATCH v6 0/5] Rewrite documentation for string-copying functions
2022-12-15 0:26 ` [PATCH v5 0/5] Rewrite pages about " Alejandro Colomar
@ 2022-12-19 21:02 ` Alejandro Colomar
2022-12-19 21:02 ` [PATCH v6 1/5] string_copy.7: Add page to document all " Alejandro Colomar
` (4 subsequent siblings)
5 siblings, 0 replies; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-19 21:02 UTC (permalink / raw)
To: linux-man, Martin Sebor, G. Branden Robinson, Douglas McIlroy,
Jakub Wilk, Serge Hallyn, Iker Pedrosa, Andrew Pinski,
Stefan Puiu
Cc: Alejandro Colomar
Hi,
Yet another revision of this patch set.
v6:
- Fixed a link page (stpcpy(3)).
- Use malloc(3) in the examples, to show that buffers need to be
properly allocated before these calls.
- Return to the example program of strncat(3) that showed a more
realistic use (based on groff(1)'s source code, plus some
imagination).
- Use the term 'end' for one after the last element of an array, to be
consistent with C++ (as Andrew pointed out). It is also less to
type, and using end for the end of the string and past_end for the
buffer was a bit confusing, since it wasn't true that
'end == past_end - 1'. Now, I don't have a term for the end of a
string, so I used the description instead of a term. Such pointers
are named 'p', following tradition (and the names of
mempcpy(3), stpcpy(3), and others).
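
To illustrate the 'end' and 'p' conventions together with the malloc(3)
point above, here is a rough sketch. It is not the reference
implementation from the page: stpecpy_sketch() and greet() are names
invented for this example, and the only behavior assumed is the one the
page describes (return a pointer to the terminating null byte, or 'end'
if truncation occurred, and do nothing if a previous call already
truncated).

```c
#define _POSIX_C_SOURCE 200809L  /* strnlen(3) */
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Sketch of a truncating chain-copy.  'end' points to one past the
 * last element of the destination buffer.
 */
char *
stpecpy_sketch(char *dst, char *end, const char *restrict src)
{
	size_t  room, len;

	if (dst == end)                 /* an earlier call truncated */
		return end;

	room = end - dst;               /* includes space for the '\0' */
	len = strnlen(src, room);
	if (len == room) {              /* src doesn't fit with its '\0' */
		memcpy(dst, src, room - 1);
		end[-1] = '\0';
		return end;
	}
	memcpy(dst, src, len);
	dst[len] = '\0';
	return dst + len;
}

/*
 * Hypothetical usage: build "Hello world!" in a malloc(3)ed buffer of
 * 'size' bytes.  Returns NULL on allocation failure or truncation;
 * otherwise the caller must free(3) the result.
 */
char *
greet(size_t size)
{
	char  *buf, *p, *end;

	buf = malloc(size);
	if (buf == NULL)
		return NULL;

	end = buf + size;
	p = buf;
	p = stpecpy_sketch(p, end, "Hello ");
	p = stpecpy_sketch(p, end, "world");
	p = stpecpy_sketch(p, end, "!");
	if (p == end) {                 /* checked once, after the chain */
		free(buf);
		return NULL;
	}
	return buf;
}
```

Since 'end' stays the same across the chain, there is no remaining size
to recalculate between calls, and truncation is checked a single time
after the last one.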
This is likely to be the last revision before pushing. I don't expect
important changes to occur, and I think we can improve the page once
it's been published. This is already a big improvement over what we've
had for many years, and worthy of being released to the public.
Cheers,
Alex
P.S.: I'm writing a library that implements the functions suggested here
that are not part of libc. The code is already done, and I'm now
working on the build system. After that, manual pages and Debian
packaging (I'll need help for the latter), and it'll be done.
Alejandro Colomar (5):
string_copy.7: Add page to document all string-copying functions
stpecpy.3, stpecpyx.3, ustpcpy.3, ustr2stp.3, zustr2stp.3,
zustr2ustp.3: Add new links to string_copy(7)
stpcpy.3, strcpy.3, strcat.3: Document in a single page
stpncpy.3, strncpy.3: Document in a single page
strncat.3: Rewrite to be consistent with string_copy.7.
man3/stpcpy.3 | 116 +-----
man3/stpecpy.3 | 1 +
man3/stpecpyx.3 | 1 +
man3/stpncpy.3 | 166 +++++----
man3/strcat.3 | 162 +--------
man3/strcpy.3 | 234 ++++++++-----
man3/strncat.3 | 157 +++------
man3/strncpy.3 | 130 +------
man3/ustpcpy.3 | 1 +
man3/ustr2stp.3 | 1 +
man3/zustr2stp.3 | 1 +
man3/zustr2ustp.3 | 1 +
man7/string_copy.7 | 855 +++++++++++++++++++++++++++++++++++++++++++++
13 files changed, 1172 insertions(+), 654 deletions(-)
create mode 100644 man3/stpecpy.3
create mode 100644 man3/stpecpyx.3
create mode 100644 man3/ustpcpy.3
create mode 100644 man3/ustr2stp.3
create mode 100644 man3/zustr2stp.3
create mode 100644 man3/zustr2ustp.3
create mode 100644 man7/string_copy.7
--
2.39.0
* [PATCH v6 1/5] string_copy.7: Add page to document all string-copying functions
2022-12-15 0:26 ` [PATCH v5 0/5] Rewrite pages about " Alejandro Colomar
2022-12-19 21:02 ` [PATCH v6 0/5] Rewrite documentation for " Alejandro Colomar
@ 2022-12-19 21:02 ` Alejandro Colomar
2022-12-20 15:00 ` Stefan Puiu
2023-01-20 3:43 ` Eric Biggers
2022-12-19 21:02 ` [PATCH v6 2/5] stpecpy.3, stpecpyx.3, ustpcpy.3, ustr2stp.3, zustr2stp.3, zustr2ustp.3: Add new links to string_copy(7) Alejandro Colomar
` (3 subsequent siblings)
5 siblings, 2 replies; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-19 21:02 UTC (permalink / raw)
To: linux-man
Cc: Alejandro Colomar, Martin Sebor, G. Branden Robinson,
Douglas McIlroy, Jakub Wilk, Serge Hallyn, Iker Pedrosa,
Andrew Pinski, Stefan Puiu
This is an opportunity to use consistent language across the
documentation for all string-copying functions.
It is also easier to show the similarities and differences between all
of the functions, so that a reader can use this page to know which
function is needed for a given task.
Alternative functions not provided by libc have been given in the same
page, with reference implementations.
Cc: Martin Sebor <msebor@redhat.com>
Cc: "G. Branden Robinson" <g.branden.robinson@gmail.com>
Cc: Douglas McIlroy <douglas.mcilroy@dartmouth.edu>
Cc: Jakub Wilk <jwilk@jwilk.net>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Iker Pedrosa <ipedrosa@redhat.com>
Cc: Andrew Pinski <pinskia@gmail.com>
Cc: Stefan Puiu <stefan.puiu@gmail.com>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
man7/string_copy.7 | 855 +++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 855 insertions(+)
create mode 100644 man7/string_copy.7
diff --git a/man7/string_copy.7 b/man7/string_copy.7
new file mode 100644
index 000000000..a32b93c01
--- /dev/null
+++ b/man7/string_copy.7
@@ -0,0 +1,855 @@
+.\" Copyright 2022 Alejandro Colomar <alx@kernel.org>
+.\"
+.\" SPDX-License-Identifier: BSD-3-Clause
+.\"
+.TH string_copy 7 (date) "Linux man-pages (unreleased)"
+.\" ----- NAME :: -----------------------------------------------------/
+.SH NAME
+stpcpy,
+strcpy, strcat,
+stpecpy, stpecpyx,
+strlcpy, strlcat,
+stpncpy,
+strncpy,
+zustr2ustp, zustr2stp,
+strncat,
+ustpcpy, ustr2stp
+\- copy strings and character sequences
+.\" ----- SYNOPSIS :: -------------------------------------------------/
+.SH SYNOPSIS
+.\" ----- SYNOPSIS :: (Null-terminated) strings -----------------------/
+.SS Strings
+.nf
+// Chain-copy a string.
+.BI "char *stpcpy(char *restrict " dst ", const char *restrict " src );
+.PP
+// Copy/catenate a string.
+.BI "char *strcpy(char *restrict " dst ", const char *restrict " src );
+.BI "char *strcat(char *restrict " dst ", const char *restrict " src );
+.PP
+// Chain-copy a string with truncation.
+.BI "char *stpecpy(char *" dst ", char " end "[0], const char *restrict " src );
+.PP
+// Chain-copy a string with truncation and SIGSEGV on UB.
+.BI "char *stpecpyx(char *" dst ", char " end "[0], const char *restrict " src );
+.PP
+// Copy/catenate a string with truncation and SIGSEGV on UB.
+.BI "size_t strlcpy(char " dst "[restrict ." sz "], \
+const char *restrict " src ,
+.BI " size_t " sz );
+.BI "size_t strlcat(char " dst "[restrict ." sz "], \
+const char *restrict " src ,
+.BI " size_t " sz );
+.fi
+.\" ----- SYNOPSIS :: Null-padded character sequences --------/
+.SS Null-padded character sequences
+.nf
+// Zero a fixed-width buffer, and
+// copy a string into a character sequence with truncation.
+.BI "char *stpncpy(char " dst "[restrict ." sz "], \
+const char *restrict " src ,
+.BI " size_t " sz );
+.PP
+// Zero a fixed-width buffer, and
+// copy a string into a character sequence with truncation.
+.BI "char *strncpy(char " dest "[restrict ." sz "], \
+const char *restrict " src ,
+.BI " size_t " sz );
+.PP
+// Chain-copy a null-padded character sequence into a character sequence.
+.BI "char *zustr2ustp(char *restrict " dst ", \
+const char " src "[restrict ." sz ],
+.BI " size_t " sz );
+.PP
+// Chain-copy a null-padded character sequence into a string.
+.BI "char *zustr2stp(char *restrict " dst ", \
+const char " src "[restrict ." sz ],
+.BI " size_t " sz );
+.PP
+// Catenate a null-padded character sequence into a string.
+.BI "char *strncat(char *restrict " dst ", const char " src "[restrict ." sz ],
+.BI " size_t " sz );
+.fi
+.\" ----- SYNOPSIS :: Measured character sequences --------------------/
+.SS Measured character sequences
+.nf
+// Chain-copy a measured character sequence.
+.BI "char *ustpcpy(char *restrict " dst ", \
+const char " src "[restrict ." len ],
+.BI " size_t " len );
+.PP
+// Chain-copy a measured character sequence into a string.
+.BI "char *ustr2stp(char *restrict " dst ", \
+const char " src "[restrict ." len ],
+.BI " size_t " len );
+.fi
+.SH DESCRIPTION
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: -----------------/
+.SS Terms (and abbreviations)
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: string (str) ----/
+.TP
+.IR "string " ( str )
+is a sequence of zero or more non-null characters followed by a null byte.
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: null-padded character seq
+.TP
+.I character sequence
+is a sequence of zero or more non-null characters.
+A program should never use a character sequence where a string is required.
+However, with appropriate care,
+a string can be used in the place of a character sequence.
+.RS
+.TP
+.IR "null-padded character sequence " ( zustr )
+Character sequences can be contained in fixed-width buffers,
+which contain padding null bytes after the character sequence,
+to fill the rest of the buffer
+without affecting the character sequence;
+however, those padding null bytes are not part of the character sequence.
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: measured character sequence
+.TP
+.IR "measured character sequence " ( ustr )
+Character sequence delimited by its length.
+It may be a slice of a larger character sequence,
+or even of a string.
+.RE
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: length (len) ----/
+.TP
+.IR "length " ( len )
+is the number of non-null characters in a string or character sequence.
+It is the return value of
+.I strlen(str)
+and of
+.IR "strnlen(ustr, sz)" .
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: size (sz) -------/
+.TP
+.IR "size " ( sz )
+refers to the entire buffer
+where the string or character sequence is contained.
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: end -------------/
+.TP
+.I end
+is the name of a pointer to one past the last element of a buffer.
+It is equivalent to
+.IR &str[sz] .
+It is used as a sentinel value,
+to be able to truncate strings or character sequences
+instead of overrunning the containing buffer.
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: copy ------------/
+.TP
+.I copy
+This term is used when
+the writing starts at the first element pointed to by
+.IR dst .
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: catenate --------/
+.TP
+.I catenate
+This term is used when
+a function first finds the terminating null byte in
+.IR dst ,
+and then starts writing at that position.
+.\" ----- DESCRIPTION :: Terms (and abbreviations) :: chain -----------/
+.TP
+.I chain
+This term is used when
+it's the programmer who provides
+a pointer to the terminating null byte in the string
+.I dst
+(or one after the last character in a character sequence),
+and the function starts writing at that location.
+The function returns
+a pointer to the new location of the terminating null byte
+(or one after the last character in a character sequence)
+after the call,
+so that the programmer can use it to chain such calls.
+.\" ----- DESCRIPTION :: Copy, catenate, and chain-copy ---------------/
+.SS Copy, catenate, and chain-copy
+Originally,
+there was a distinction between functions that copy and those that catenate.
+However, newer functions that copy while allowing chaining
+cover both use cases with a single API.
+They are also algorithmically faster,
+since they don't need to search for
+the terminating null byte of the existing string.
+However, functions that catenate have a much simpler use,
+so if performance is not important,
+it can make sense to use them for improving readability.
+.PP
+The pointer returned by functions that allow chaining
+is a byproduct of the copy operation,
+so it has no performance costs.
+Functions that return such a pointer,
+and thus can be chained,
+have names of the form
+.RB * stp *(),
+since it's common to name the pointer just
+.IR p .
+.PP
+Chain-copying functions that truncate
+should accept a pointer to the end of the destination buffer,
+and have names of the form
+.RB * stpe *().
+This allows not having to recalculate the remaining size after each call.
+.\" ----- DESCRIPTION :: Truncate or not? -----------------------------/
+.SS Truncate or not?
+The first thing to note is that programmers should be careful with buffers,
+so they always have the correct size,
+and truncation is not necessary.
+.PP
+In most cases,
+truncation is not desired,
+and it is simpler to just do the copy.
+Simpler code is safer code.
+Programming against programming mistakes by adding more code
+just adds more points where mistakes can be made.
+.PP
+Nowadays,
+compilers can detect most programmer errors with features like
+compiler warnings,
+static analyzers, and
+.BR \%_FORTIFY_SOURCE
+(see
+.BR ftm (7)).
+Keeping the code simple
+helps these overflow-detection features be more precise.
+.PP
+When validating user input,
+however,
+it makes sense to truncate.
+Remember to check the return value of such function calls.
+.PP
+Functions that truncate:
+.IP \(bu 3
+.BR stpecpy (3)
+is the most efficient string copy function that performs truncation.
+It only requires checking for truncation once, after all chained calls.
+.IP \(bu
+.BR stpecpyx (3)
+is a variant of
+.BR stpecpy (3)
+that consumes the entire source string,
+to catch bugs in the program
+by forcing a segmentation fault (as
+.BR strlcpy (3bsd)
+and
+.BR strlcat (3bsd)
+do).
+.IP \(bu
+.BR strlcpy (3bsd)
+and
+.BR strlcat (3bsd)
+are designed to crash if the input string is invalid
+(doesn't contain a terminating null byte).
+.IP \(bu
+.BR stpncpy (3)
+and
+.BR strncpy (3)
+also truncate, but they write null-padded character sequences
+rather than strings.
+.\" ----- DESCRIPTION :: Null-padded character sequences --------------/
+.SS Null-padded character sequences
+For historic reasons,
+some standard APIs,
+such as
+.BR utmpx (5),
+use null-padded character sequences in fixed-width buffers.
+To interface with them,
+specialized functions need to be used.
+.PP
+To copy strings into them, use
+.BR stpncpy (3).
+.PP
+To copy from an unterminated string within a fixed-width buffer into a string,
+ignoring any trailing null bytes in the source fixed-width buffer,
+you should use
+.BR zustr2stp (3)
+or
+.BR strncat (3).
+.PP
+To copy from an unterminated string within a fixed-width buffer
+into a character sequence,
+ignoring any trailing null bytes in the source fixed-width buffer,
+you should use
+.BR zustr2ustp (3).
+.\" ----- DESCRIPTION :: Measured character sequences -----------------/
+.SS Measured character sequences
+The simplest character sequence copying function is
+.BR mempcpy (3).
+It requires always knowing the length of your character sequences,
+for which structures can be used.
+It makes the code much faster,
+since you always know the length of your character sequences,
+and can do the minimal copies and length measurements.
+.BR mempcpy (3)
+copies character sequences,
+so you need to explicitly set the terminating null byte if you need a string.
+.PP
+However,
+for keeping type safety,
+it's good to add a wrapper that uses
+.I char\~*
+instead of
+.IR void\~* :
+.BR ustpcpy (3).
+.PP
+In programs that make considerable use of strings or character sequences,
+and need the best performance,
+using overlapping character sequences can make a big difference.
+It allows holding subsequences of a larger character sequence,
+while not duplicating memory
+nor using time to do a copy.
+.PP
+However, this is delicate,
+since it requires using character sequences.
+C library APIs use strings,
+so programs that use character sequences
+will have to take care of differentiating strings from character sequences.
+.PP
+To copy a measured character sequence, use
+.BR ustpcpy (3).
+.PP
+To copy a measured character sequence into a string, use
+.BR ustr2stp (3).
+.PP
+Because these functions ask for the length,
+and a string is by nature composed of a character sequence of the same length
+plus a terminating null byte,
+a string is also accepted as input.
+.\" ----- DESCRIPTION :: String vs character sequence -----------------/
+.SS String vs character sequence
+Some functions only operate on strings.
+Those require that the input
+.I src
+is a string,
+and guarantee an output string
+(even when truncation occurs).
+Functions that catenate
+also require that
+.I dst
+holds a string before the call.
+List of functions:
+.IP \(bu 3
+.PD 0
+.BR stpcpy (3)
+.IP \(bu
+.BR strcpy "(3), \c"
+.BR strcat (3)
+.IP \(bu
+.BR stpecpy "(3), \c"
+.BR stpecpyx (3)
+.IP \(bu
+.BR strlcpy "(3bsd), \c"
+.BR strlcat (3bsd)
+.PD
+.PP
+Other functions require an input string,
+but create a character sequence as output.
+These functions have confusing names,
+and have a long history of misuse.
+List of functions:
+.IP \(bu 3
+.PD 0
+.BR stpncpy (3)
+.IP \(bu
+.BR strncpy (3)
+.PD
+.PP
+Other functions operate on an input character sequence,
+and create an output string.
+Functions that catenate
+also require that
+.I dst
+holds a string before the call.
+.BR strncat (3)
+has an even more misleading name than the functions above.
+List of functions:
+.IP \(bu 3
+.PD 0
+.BR zustr2stp (3)
+.IP \(bu
+.BR strncat (3)
+.IP \(bu
+.BR ustr2stp (3)
+.PD
+.PP
+Other functions operate on an input character sequence
+to create an output character sequence.
+List of functions:
+.IP \(bu 3
+.PD 0
+.BR ustpcpy (3)
+.IP \(bu
+.BR zustr2ustp (3)
+.PD
+.\" ----- DESCRIPTION :: Functions :: ---------------------------------/
+.SS Functions
+.\" ----- DESCRIPTION :: Functions :: stpcpy(3) -----------------------/
+.TP
+.BR stpcpy (3)
+This function copies the input string into a destination string.
+The programmer is responsible for allocating a buffer large enough.
+It returns a pointer suitable for chaining.
+.\" ----- DESCRIPTION :: Functions :: strcpy(3), strcat(3) ------------/
+.TP
+.BR strcpy (3)
+.TQ
+.BR strcat (3)
+These functions copy and catenate the input string into a destination string.
+The programmer is responsible for allocating a buffer large enough.
+The return value is useless.
+.IP
+.BR stpcpy (3)
+is a faster alternative to these functions.
+.\" ----- DESCRIPTION :: Functions :: stpecpy(3), stpecpyx(3) ---------/
+.TP
+.BR stpecpy (3)
+.TQ
+.BR stpecpyx (3)
+These functions copy the input string into a destination string.
+If the destination buffer,
+limited by a pointer to its end,
+isn't large enough to hold the copy,
+the resulting string is truncated
+(but it is guaranteed to be null-terminated).
+They return a pointer suitable for chaining.
+Truncation needs to be detected only once after the last chained call.
+.BR stpecpyx (3)
+has identical semantics to
+.BR stpecpy (3),
+except that it forces a SIGSEGV if the
+.I src
+pointer is not a string.
+.IP
+These functions are not provided by any library;
+see EXAMPLES for a reference implementation.
+.\" ----- DESCRIPTION :: Functions :: strlcpy(3bsd), strlcat(3bsd) ----/
+.TP
+.BR strlcpy (3bsd)
+.TQ
+.BR strlcat (3bsd)
+These functions copy and catenate the input string into a destination string.
+If the destination buffer,
+limited by its size,
+isn't large enough to hold the copy,
+the resulting string is truncated
+(but it is guaranteed to be null-terminated).
+They return the length of the total string they tried to create.
+These functions force a SIGSEGV if the
+.I src
+pointer is not a string.
+.IP
+.BR stpecpyx (3)
+is a faster alternative to these functions.
+.\" ----- DESCRIPTION :: Functions :: stpncpy(3) ----------------------/
+.TP
+.BR stpncpy (3)
+This function copies the input string into
+a destination null-padded character sequence in a fixed-width buffer.
+If the destination buffer,
+limited by its size,
+isn't large enough to hold the copy,
+the resulting character sequence is truncated.
+Since it creates a character sequence,
+it doesn't need to write a terminating null byte.
+It's impossible to distinguish truncation by the result of the call,
+from a character sequence that just fits the destination buffer;
+truncation should be detected by
+comparing the length of the input string
+with the size of the destination buffer.
+.\" ----- DESCRIPTION :: Functions :: strncpy(3) ----------------------/
+.TP
+.BR strncpy (3)
+This function is identical to
+.BR stpncpy (3)
+except for the useless return value.
+.IP
+.BR stpncpy (3)
+is a more useful alternative to this function.
+.\" ----- DESCRIPTION :: Functions :: zustr2ustp(3) --------------------/
+.TP
+.BR zustr2ustp (3)
+This function copies the input character sequence
+contained in a null-padded fixed-width buffer,
+into a destination character sequence.
+The programmer is responsible for allocating a buffer large enough.
+It returns a pointer suitable for chaining.
+.IP
+A truncating version of this function doesn't exist,
+since the size of the original character sequence is always known,
+so it wouldn't be very useful.
+.IP
+This function is not provided by any library;
+see EXAMPLES for a reference implementation.
+.\" ----- DESCRIPTION :: Functions :: zustr2stp(3) --------------------/
+.TP
+.BR zustr2stp (3)
+This function copies the input character sequence
+contained in a null-padded fixed-width buffer,
+into a destination string.
+The programmer is responsible for allocating a buffer large enough.
+It returns a pointer suitable for chaining.
+.IP
+A truncating version of this function doesn't exist,
+since the size of the original character sequence is always known,
+so it wouldn't be very useful.
+.IP
+This function is not provided by any library;
+see EXAMPLES for a reference implementation.
+.\" ----- DESCRIPTION :: Functions :: strncat(3) ----------------------/
+.TP
+.BR strncat (3)
+Do not confuse this function with
+.BR strncpy (3);
+they are not related at all.
+.IP
+This function catenates the input character sequence
+contained in a null-padded fixed-width buffer,
+into a destination string.
+The programmer is responsible for allocating a buffer large enough.
+The return value is useless.
+.IP
+.BR zustr2stp (3)
+is a faster alternative to this function.
+.\" ----- DESCRIPTION :: Functions :: ustpcpy(3) ----------------------/
+.TP
+.BR ustpcpy (3)
+This function copies the input character sequence,
+limited by its length,
+into a destination character sequence.
+The programmer is responsible for allocating a buffer large enough.
+It returns a pointer suitable for chaining.
+.\" ----- DESCRIPTION :: Functions :: ustr2stp(3) ---------------------/
+.TP
+.BR ustr2stp (3)
+This function copies the input character sequence,
+limited by its length,
+into a destination string.
+The programmer is responsible for allocating a buffer large enough.
+It returns a pointer suitable for chaining.
+.\" ----- RETURN VALUE :: ---------------------------------------------/
+.SH RETURN VALUE
+The following functions return
+a pointer to the terminating null byte in the destination string.
+.IP \(bu 3
+.PD 0
+.BR stpcpy (3)
+.IP \(bu
+.BR ustr2stp (3)
+.IP \(bu
+.BR zustr2stp (3)
+.PD
+.PP
+The following functions return
+a pointer to the terminating null byte in the destination string,
+except when truncation occurs;
+if truncation occurs,
+they return a pointer to the end of the destination buffer.
+.IP \(bu 3
+.BR stpecpy (3),
+.BR stpecpyx (3)
+.PP
+The following function returns
+a pointer to one after the last character
+in the destination character sequence;
+if truncation occurs,
+that pointer is equivalent to
+a pointer to the end of the destination buffer.
+.IP \(bu 3
+.BR stpncpy (3)
+.PP
+The following functions return
+a pointer to one after the last character
+in the destination character sequence.
+.IP \(bu 3
+.PD 0
+.BR zustr2ustp (3)
+.IP \(bu
+.BR ustpcpy (3)
+.PD
+.PP
+The following functions return
+the length of the total string that they tried to create
+(as if truncation didn't occur).
+.IP \(bu 3
+.BR strlcpy (3bsd),
+.BR strlcat (3bsd)
+.PP
+The following functions return the
+.I dst
+pointer,
+which is useless.
+.IP \(bu 3
+.PD 0
+.BR strcpy (3),
+.BR strcat (3)
+.IP \(bu
+.BR strncpy (3)
+.IP \(bu
+.BR strncat (3)
+.PD
+.\" ----- NOTES :: strscpy(9) -----------------------------------------/
+.SH NOTES
+The Linux kernel has an internal function for copying strings,
+which is similar to
+.BR stpecpy (3),
+except that it can't be chained:
+.TP
+.BR strscpy (9)
+This function copies the input string into a destination string.
+If the destination buffer,
+limited by its size,
+isn't large enough to hold the copy,
+the resulting string is truncated
+(but it is guaranteed to be null-terminated).
+It returns the length of the destination string, or
+.B \-E2BIG
+on truncation.
+.IP
+.BR stpecpy (3)
+is a simpler and faster alternative to this function.
+.RE
+.\" ----- CAVEATS :: --------------------------------------------------/
+.SH CAVEATS
+Don't mix chain calls to truncating and non-truncating functions.
+It is conceptually wrong
+unless you know that the first part of a copy will always fit.
+Anyway, the performance difference will probably be negligible,
+so it will probably be clearer if you use consistent semantics:
+either truncating or non-truncating.
+Calling a non-truncating function after a truncating one is necessarily wrong.
+.\" ----- BUGS :: -----------------------------------------------------/
+.SH BUGS
+All catenation functions share the same performance problem:
+.UR https://www.joelonsoftware.com/\:2001/12/11/\:back\-to\-basics/
+Shlemiel the painter
+.UE .
+.\" ----- EXAMPLES :: -------------------------------------------------/
+.SH EXAMPLES
+The following are examples of correct use of each of these functions.
+.\" ----- EXAMPLES :: stpcpy(3) ---------------------------------------/
+.TP
+.BR stpcpy (3)
+.EX
+p = buf;
+p = stpcpy(p, "Hello ");
+p = stpcpy(p, "world");
+p = stpcpy(p, "!");
+len = p \- buf;
+puts(buf);
+.EE
+.\" ----- EXAMPLES :: strcpy(3), strcat(3) ----------------------------/
+.TP
+.BR strcpy (3)
+.TQ
+.BR strcat (3)
+.EX
+strcpy(buf, "Hello ");
+strcat(buf, "world");
+strcat(buf, "!");
+len = strlen(buf);
+puts(buf);
+.EE
+.\" ----- EXAMPLES :: stpecpy(3), stpecpyx(3) -------------------------/
+.TP
+.BR stpecpy (3)
+.TQ
+.BR stpecpyx (3)
+.EX
+end = buf + sizeof(buf);
+p = buf;
+p = stpecpy(p, end, "Hello ");
+p = stpecpy(p, end, "world");
+p = stpecpy(p, end, "!");
+if (p == end) {
+ p\-\-;
+ goto toolong;
+}
+len = p \- buf;
+puts(buf);
+.EE
+.\" ----- EXAMPLES :: strlcpy(3bsd), strlcat(3bsd) --------------------/
+.TP
+.BR strlcpy (3bsd)
+.TQ
+.BR strlcat (3bsd)
+.EX
+if (strlcpy(buf, "Hello ", sizeof(buf)) >= sizeof(buf))
+ goto toolong;
+if (strlcat(buf, "world", sizeof(buf)) >= sizeof(buf))
+ goto toolong;
+len = strlcat(buf, "!", sizeof(buf));
+if (len >= sizeof(buf))
+ goto toolong;
+puts(buf);
+.EE
+.\" ----- EXAMPLES :: strscpy(9) --------------------------------------/
+.TP
+.BR strscpy (9)
+.EX
+len = strscpy(buf, "Hello world!", sizeof(buf));
+if (len == \-E2BIG)
+ goto toolong;
+puts(buf);
+.EE
+.\" ----- EXAMPLES :: stpncpy(3) --------------------------------------/
+.TP
+.BR stpncpy (3)
+.EX
+p = stpncpy(buf, "Hello world!", sizeof(buf));
+if (sizeof(buf) < strlen("Hello world!"))
+ goto toolong;
+len = p \- buf;
+for (size_t i = 0; i < sizeof(buf); i++)
+ putchar(buf[i]);
+.EE
+.\" ----- EXAMPLES :: strncpy(3) --------------------------------------/
+.TP
+.BR strncpy (3)
+.EX
+strncpy(buf, "Hello world!", sizeof(buf));
+if (sizeof(buf) < strlen("Hello world!"))
+ goto toolong;
+len = strnlen(buf, sizeof(buf));
+for (size_t i = 0; i < sizeof(buf); i++)
+ putchar(buf[i]);
+.EE
+.\" ----- EXAMPLES :: zustr2ustp(3) -----------------------------------/
+.TP
+.BR zustr2ustp (3)
+.EX
+p = buf;
+p = zustr2ustp(p, "Hello ", 6);
+p = zustr2ustp(p, "world", 42); // Padding null bytes ignored.
+p = zustr2ustp(p, "!", 1);
+len = p \- buf;
+printf("%.*s\en", (int) len, buf);
+.EE
+.\" ----- EXAMPLES :: zustr2stp(3) ------------------------------------/
+.TP
+.BR zustr2stp (3)
+.EX
+p = buf;
+p = zustr2stp(p, "Hello ", 6);
+p = zustr2stp(p, "world", 42); // Padding null bytes ignored.
+p = zustr2stp(p, "!", 1);
+len = p \- buf;
+puts(buf);
+.EE
+.\" ----- EXAMPLES :: strncat(3) --------------------------------------/
+.TP
+.BR strncat (3)
+.EX
+buf[0] = \(aq\e0\(aq; // There's no 'cpy' function to this 'cat'.
+strncat(buf, "Hello ", 6);
+strncat(buf, "world", 42); // Padding null bytes ignored.
+strncat(buf, "!", 1);
+len = strlen(buf);
+puts(buf);
+.EE
+.\" ----- EXAMPLES :: ustpcpy(3) --------------------------------------/
+.TP
+.BR ustpcpy (3)
+.EX
+p = buf;
+p = ustpcpy(p, "Hello ", 6);
+p = ustpcpy(p, "world", 5);
+p = ustpcpy(p, "!", 1);
+len = p \- buf;
+printf("%.*s\en", (int) len, buf);
+.EE
+.\" ----- EXAMPLES :: ustr2stp(3) -------------------------------------/
+.TP
+.BR ustr2stp (3)
+.EX
+p = buf;
+p = ustr2stp(p, "Hello ", 6);
+p = ustr2stp(p, "world", 5);
+p = ustr2stp(p, "!", 1);
+len = p \- buf;
+puts(buf);
+.EE
+.\" ----- EXAMPLES :: Implementations :: ------------------------------/
+.SS Implementations
+Here are reference implementations for functions not provided by libc.
+.PP
+.in +4n
+.EX
+/* This code is in the public domain. */
+
+.\" ----- EXAMPLES :: Implementations :: stpecpy(3) -------------------/
+char *
+.IR stpecpy "(char *dst, char end[0], const char *restrict src)"
+{
+ char *p;
+
+ if (dst == end)
+ return end;
+
+ p = memccpy(dst, src, \(aq\e0\(aq, end \- dst);
+ if (p != NULL)
+ return p \- 1;
+
+ /* truncation detected */
+ end[\-1] = \(aq\e0\(aq;
+ return end;
+}
+
+.\" ----- EXAMPLES :: Implementations :: stpecpyx(3) ------------------/
+char *
+.IR stpecpyx "(char *dst, char end[0], const char *restrict src)"
+{
+ if (src[strlen(src)] != \(aq\e0\(aq)
+ raise(SIGSEGV);
+
+ return stpecpy(dst, end, src);
+}
+
+.\" ----- EXAMPLES :: Implementations :: zustr2ustp(3) ----------------/
+char *
+.IR zustr2ustp "(char *restrict dst, const char *restrict src, size_t sz)"
+{
+ return ustpcpy(dst, src, strnlen(src, sz));
+}
+
+.\" ----- EXAMPLES :: Implementations :: zustr2stp(3) -----------------/
+char *
+.IR zustr2stp "(char *restrict dst, const char *restrict src, size_t sz)"
+{
+ char *p;
+
+ p = zustr2ustp(dst, src, sz);
+ *p = \(aq\e0\(aq;
+
+ return p;
+}
+
+.\" ----- EXAMPLES :: Implementations :: ustpcpy(3) -------------------/
+char *
+.IR ustpcpy "(char *restrict dst, const char *restrict src, size_t len)"
+{
+ return mempcpy(dst, src, len);
+}
+
+.\" ----- EXAMPLES :: Implementations :: ustr2stp(3) ------------------/
+char *
+.IR ustr2stp "(char *restrict dst, const char *restrict src, size_t len)"
+{
+ char *p;
+
+ p = ustpcpy(dst, src, len);
+ *p = \(aq\e0\(aq;
+
+ return p;
+}
+.EE
+.in
+.\" ----- SEE ALSO :: -------------------------------------------------/
+.SH SEE ALSO
+.BR bzero (3),
+.BR memcpy (3),
+.BR memccpy (3),
+.BR mempcpy (3),
+.BR stpcpy (3),
+.BR strlcpy (3bsd),
+.BR strncat (3),
+.BR stpncpy (3),
+.BR string (3)
--
2.39.0
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v6 2/5] stpecpy.3, stpecpyx.3, ustpcpy.3, ustr2stp.3, zustr2stp.3, zustr2ustp.3: Add new links to string_copy(7)
2022-12-15 0:26 ` [PATCH v5 0/5] Rewrite pages about " Alejandro Colomar
2022-12-19 21:02 ` [PATCH v6 0/5] Rewrite documentation for " Alejandro Colomar
2022-12-19 21:02 ` [PATCH v6 1/5] string_copy.7: Add page to document all " Alejandro Colomar
@ 2022-12-19 21:02 ` Alejandro Colomar
2022-12-19 21:02 ` [PATCH v6 3/5] stpcpy.3, strcpy.3, strcat.3: Document in a single page Alejandro Colomar
` (2 subsequent siblings)
5 siblings, 0 replies; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-19 21:02 UTC (permalink / raw)
To: linux-man
Cc: Alejandro Colomar, Martin Sebor, G. Branden Robinson,
Douglas McIlroy, Jakub Wilk, Serge Hallyn, Iker Pedrosa,
Andrew Pinski, Stefan Puiu
Cc: Martin Sebor <msebor@redhat.com>
Cc: "G. Branden Robinson" <g.branden.robinson@gmail.com>
Cc: Douglas McIlroy <douglas.mcilroy@dartmouth.edu>
Cc: Jakub Wilk <jwilk@jwilk.net>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Iker Pedrosa <ipedrosa@redhat.com>
Cc: Andrew Pinski <pinskia@gmail.com>
Cc: Stefan Puiu <stefan.puiu@gmail.com>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
man3/stpecpy.3 | 1 +
man3/stpecpyx.3 | 1 +
man3/ustpcpy.3 | 1 +
man3/ustr2stp.3 | 1 +
man3/zustr2stp.3 | 1 +
man3/zustr2ustp.3 | 1 +
6 files changed, 6 insertions(+)
create mode 100644 man3/stpecpy.3
create mode 100644 man3/stpecpyx.3
create mode 100644 man3/ustpcpy.3
create mode 100644 man3/ustr2stp.3
create mode 100644 man3/zustr2stp.3
create mode 100644 man3/zustr2ustp.3
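
Not part of the patch: a minimal sketch of how the chained, truncating
copy that these link pages point to is meant to be used. stpecpy() is
not in libc, so the sketch carries its own copy, modeled on the reference
implementation in string_copy(7) from patch 1/5; the 'end' parameter is
spelled 'char *' here rather than 'char end[0]', and join3() is a
hypothetical helper invented only for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Modeled on the reference implementation in string_copy(7). */
static char *
stpecpy(char *dst, char *end, const char *restrict src)
{
    char *p;

    if (dst == end)
        return end;  /* An earlier call in the chain already truncated. */

    p = memccpy(dst, src, '\0', end - dst);
    if (p != NULL)
        return p - 1;  /* Pointer to the terminating null byte. */

    /* Truncation: terminate the string and report 'end'. */
    end[-1] = '\0';
    return end;
}

/* Hypothetical helper: join three strings into buf[size].
 * Returns the resulting length, or -1 if the result was truncated. */
static ptrdiff_t
join3(char *buf, size_t size, const char *a, const char *b, const char *c)
{
    char *p, *end;

    assert(size > 0);

    end = buf + size;
    p = buf;
    p = stpecpy(p, end, a);
    p = stpecpy(p, end, b);
    p = stpecpy(p, end, c);
    if (p == end)
        return -1;  /* buf still holds a valid, truncated string. */

    return p - buf;
}
```

Truncation is checked once, after the whole chain, because a truncated
call returns 'end' and every later call then returns 'end' unchanged.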
diff --git a/man3/stpecpy.3 b/man3/stpecpy.3
new file mode 100644
index 000000000..6ff53887b
--- /dev/null
+++ b/man3/stpecpy.3
@@ -0,0 +1 @@
+.so man7/string_copy.7
diff --git a/man3/stpecpyx.3 b/man3/stpecpyx.3
new file mode 100644
index 000000000..6ff53887b
--- /dev/null
+++ b/man3/stpecpyx.3
@@ -0,0 +1 @@
+.so man7/string_copy.7
diff --git a/man3/ustpcpy.3 b/man3/ustpcpy.3
new file mode 100644
index 000000000..6ff53887b
--- /dev/null
+++ b/man3/ustpcpy.3
@@ -0,0 +1 @@
+.so man7/string_copy.7
diff --git a/man3/ustr2stp.3 b/man3/ustr2stp.3
new file mode 100644
index 000000000..6ff53887b
--- /dev/null
+++ b/man3/ustr2stp.3
@@ -0,0 +1 @@
+.so man7/string_copy.7
diff --git a/man3/zustr2stp.3 b/man3/zustr2stp.3
new file mode 100644
index 000000000..6ff53887b
--- /dev/null
+++ b/man3/zustr2stp.3
@@ -0,0 +1 @@
+.so man7/string_copy.7
diff --git a/man3/zustr2ustp.3 b/man3/zustr2ustp.3
new file mode 100644
index 000000000..6ff53887b
--- /dev/null
+++ b/man3/zustr2ustp.3
@@ -0,0 +1 @@
+.so man7/string_copy.7
--
2.39.0
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v6 3/5] stpcpy.3, strcpy.3, strcat.3: Document in a single page
2022-12-15 0:26 ` [PATCH v5 0/5] Rewrite pages about " Alejandro Colomar
` (2 preceding siblings ...)
2022-12-19 21:02 ` [PATCH v6 2/5] stpecpy.3, stpecpyx.3, ustpcpy.3, ustr2stp.3, zustr2stp.3, zustr2ustp.3: Add new links to string_copy(7) Alejandro Colomar
@ 2022-12-19 21:02 ` Alejandro Colomar
2022-12-19 21:02 ` [PATCH v6 4/5] stpncpy.3, strncpy.3: " Alejandro Colomar
2022-12-19 21:02 ` [PATCH v6 5/5] strncat.3: Rewrite to be consistent with string_copy.7 Alejandro Colomar
5 siblings, 0 replies; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-19 21:02 UTC (permalink / raw)
To: linux-man
Cc: Alejandro Colomar, Martin Sebor, G. Branden Robinson,
Douglas McIlroy, Jakub Wilk, Serge Hallyn, Iker Pedrosa,
Andrew Pinski, Stefan Puiu
Rewrite to be consistent with the new string_copy.7 page.
Cc: Martin Sebor <msebor@redhat.com>
Cc: "G. Branden Robinson" <g.branden.robinson@gmail.com>
Cc: Douglas McIlroy <douglas.mcilroy@dartmouth.edu>
Cc: Jakub Wilk <jwilk@jwilk.net>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Iker Pedrosa <ipedrosa@redhat.com>
Cc: Andrew Pinski <pinskia@gmail.com>
Cc: Stefan Puiu <stefan.puiu@gmail.com>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
man3/stpcpy.3 | 116 +------------------------
man3/strcat.3 | 162 +---------------------------------
man3/strcpy.3 | 234 ++++++++++++++++++++++++++++++++------------------
3 files changed, 152 insertions(+), 360 deletions(-)
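
Not part of the patch: a minimal sketch of the sizing rule that the new
DESCRIPTION states (the sum of the strlen() of every piece, plus 1),
combined with chained stpcpy() calls so that no call has to rescan the
destination. cat_all() is a hypothetical helper, not something the page
documents.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: catenate n strings into a newly allocated buffer.
 * Returns NULL on allocation failure; the caller frees the result. */
static char *
cat_all(size_t n, const char *const strs[])
{
    char *buf, *p;
    size_t size;

    /* One byte for the terminating null byte, plus every length. */
    size = 1;
    for (size_t i = 0; i < n; i++)
        size += strlen(strs[i]);

    buf = malloc(size);
    if (buf == NULL)
        return NULL;

    p = buf;
    *p = '\0';  /* Yields an empty string when n == 0. */
    for (size_t i = 0; i < n; i++)
        p = stpcpy(p, strs[i]);  /* p always points at the current end. */

    return buf;
}
```

The same loop written with strcat() would walk the destination from its
start on every iteration, which is the cost the CAVEATS section warns
about.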
diff --git a/man3/stpcpy.3 b/man3/stpcpy.3
index 42751d356..ff7476a84 100644
--- a/man3/stpcpy.3
+++ b/man3/stpcpy.3
@@ -1,115 +1 @@
-'\" t
-.\" Copyright 1995 James R. Van Zandt <jrv@vanzandt.mv.com>
-.\"
-.\" SPDX-License-Identifier: Linux-man-pages-copyleft
-.\"
-.TH stpcpy 3 (date) "Linux man-pages (unreleased)"
-.SH NAME
-stpcpy \- copy a string returning a pointer to its end
-.SH LIBRARY
-Standard C library
-.RI ( libc ", " \-lc )
-.SH SYNOPSIS
-.nf
-.B #include <string.h>
-.PP
-.BI "char *stpcpy(char *restrict " dest ", const char *restrict " src );
-.fi
-.PP
-.RS -4
-Feature Test Macro Requirements for glibc (see
-.BR feature_test_macros (7)):
-.RE
-.PP
-.BR stpcpy ():
-.nf
- Since glibc 2.10:
- _POSIX_C_SOURCE >= 200809L
- Before glibc 2.10:
- _GNU_SOURCE
-.fi
-.SH DESCRIPTION
-The
-.BR stpcpy ()
-function copies the string pointed to by
-.I src
-(including the terminating null byte (\(aq\e0\(aq)) to the array pointed to by
-.IR dest .
-The strings may not overlap, and the destination string
-.I dest
-must be large enough to receive the copy.
-.SH RETURN VALUE
-.BR stpcpy ()
-returns a pointer to the
-.B end
-of the string
-.I dest
-(that is, the address of the terminating null byte)
-rather than the beginning.
-.SH ATTRIBUTES
-For an explanation of the terms used in this section, see
-.BR attributes (7).
-.ad l
-.nh
-.TS
-allbox;
-lbx lb lb
-l l l.
-Interface Attribute Value
-T{
-.BR stpcpy ()
-T} Thread safety MT-Safe
-.TE
-.hy
-.ad
-.sp 1
-.SH STANDARDS
-This function was added to POSIX.1-2008.
-Before that, it was not part of
-the C or POSIX.1 standards, nor customary on UNIX systems.
-It first appeared at least as early as 1986,
-in the Lattice C AmigaDOS compiler,
-then in the GNU fileutils and GNU textutils in 1989,
-and in the GNU C library by 1992.
-It is also present on the BSDs.
-.SH BUGS
-This function may overrun the buffer
-.IR dest .
-.SH EXAMPLES
-For example, this program uses
-.BR stpcpy ()
-to concatenate
-.B foo
-and
-.B bar
-to produce
-.BR foobar ,
-which it then prints.
-.PP
-.\" SRC BEGIN (stpcpy.c)
-.EX
-#define _GNU_SOURCE
-#include <stdio.h>
-#include <string.h>
-
-int
-main(void)
-{
- char buffer[20];
- char *to = buffer;
-
- to = stpcpy(to, "foo");
- to = stpcpy(to, "bar");
- printf("%s\en", buffer);
-}
-.EE
-.\" SRC END
-.SH SEE ALSO
-.BR bcopy (3),
-.BR memccpy (3),
-.BR memcpy (3),
-.BR memmove (3),
-.BR stpncpy (3),
-.BR strcpy (3),
-.BR string (3),
-.BR wcpcpy (3)
+.so man3/strcpy.3
diff --git a/man3/strcat.3 b/man3/strcat.3
index 90b9d260d..ff7476a84 100644
--- a/man3/strcat.3
+++ b/man3/strcat.3
@@ -1,161 +1 @@
-'\" t
-.\" Copyright 1993 David Metcalfe (david@prism.demon.co.uk)
-.\"
-.\" SPDX-License-Identifier: Linux-man-pages-copyleft
-.\"
-.\" References consulted:
-.\" Linux libc source code
-.\" Lewine's _POSIX Programmer's Guide_ (O'Reilly & Associates, 1991)
-.\" 386BSD man pages
-.\" Modified Sat Jul 24 18:11:47 1993 by Rik Faith (faith@cs.unc.edu)
-.\" 2007-06-15, Marc Boyer <marc.boyer@enseeiht.fr> + mtk
-.\" Improve discussion of strncat().
-.TH strcat 3 (date) "Linux man-pages (unreleased)"
-.SH NAME
-strcat \- concatenate two strings
-.SH LIBRARY
-Standard C library
-.RI ( libc ", " \-lc )
-.SH SYNOPSIS
-.nf
-.B #include <string.h>
-.PP
-.BI "char *strcat(char *restrict " dest ", const char *restrict " src );
-.fi
-.SH DESCRIPTION
-The
-.BR strcat ()
-function appends the
-.I src
-string to the
-.I dest
-string,
-overwriting the terminating null byte (\(aq\e0\(aq) at the end of
-.IR dest ,
-and then adds a terminating null byte.
-The strings may not overlap, and the
-.I dest
-string must have
-enough space for the result.
-If
-.I dest
-is not large enough, program behavior is unpredictable;
-.IR "buffer overruns are a favorite avenue for attacking secure programs" .
-.SH RETURN VALUE
-The
-.BR strcat ()
-function returns a pointer to the resulting string
-.IR dest .
-.SH ATTRIBUTES
-For an explanation of the terms used in this section, see
-.BR attributes (7).
-.ad l
-.nh
-.TS
-allbox;
-lbx lb lb
-l l l.
-Interface Attribute Value
-T{
-.BR strcat (),
-.BR strncat ()
-T} Thread safety MT-Safe
-.TE
-.hy
-.ad
-.sp 1
-.SH STANDARDS
-POSIX.1-2001, POSIX.1-2008, C89, C99, SVr4, 4.3BSD.
-.SH NOTES
-Some systems (the BSDs, Solaris, and others) provide the following function:
-.PP
-.in +4n
-.EX
-size_t strlcat(char *dest, const char *src, size_t size);
-.EE
-.in
-.PP
-This function appends the null-terminated string
-.I src
-to the string
-.IR dest ,
-copying at most
-.I size\-strlen(dest)\-1
-from
-.IR src ,
-and adds a terminating null byte to the result,
-.I unless
-.I size
-is less than
-.IR strlen(dest) .
-This function fixes the buffer overrun problem of
-.BR strcat (),
-but the caller must still handle the possibility of data loss if
-.I size
-is too small.
-The function returns the length of the string
-.BR strlcat ()
-tried to create; if the return value is greater than or equal to
-.IR size ,
-data loss occurred.
-If data loss matters, the caller
-.I must
-either check the arguments before the call, or test the function return value.
-.BR strlcat ()
-is not present in glibc and is not standardized by POSIX,
-.\" https://lwn.net/Articles/506530/
-but is available on Linux via the
-.I libbsd
-library.
-.\"
-.SH EXAMPLES
-Because
-.BR strcat ()
-must find the null byte that terminates the string
-.I dest
-using a search that starts at the beginning of the string,
-the execution time of this function
-scales according to the length of the string
-.IR dest .
-This can be demonstrated by running the program below.
-(If the goal is to concatenate many strings to one target,
-then manually copying the bytes from each source string
-while maintaining a pointer to the end of the target string
-will provide better performance.)
-.\"
-.SS Program source
-\&
-.\" SRC BEGIN (strcat.c)
-.EX
-#include <stdint.h>
-#include <stdio.h>
-#include <string.h>
-#include <time.h>
-
-int
-main(void)
-{
-#define LIM 4000000
- char p[LIM + 1]; /* +1 for terminating null byte */
- time_t base;
-
- base = time(NULL);
- p[0] = \(aq\e0\(aq;
-
- for (unsigned int j = 0; j < LIM; j++) {
- if ((j % 10000) == 0)
- printf("%u %jd\en", j, (intmax_t) (time(NULL) \- base));
- strcat(p, "a");
- }
-}
-.EE
-.\" SRC END
-.SH SEE ALSO
-.BR bcopy (3),
-.BR memccpy (3),
-.BR memcpy (3),
-.BR strcpy (3),
-.BR string (3),
-.BR strlcat (3bsd),
-.BR wcscat (3),
-.BR wcsncat (3)
+.so man3/strcpy.3
diff --git a/man3/strcpy.3 b/man3/strcpy.3
index 685a8e77a..ba6820dab 100644
--- a/man3/strcpy.3
+++ b/man3/strcpy.3
@@ -1,21 +1,11 @@
'\" t
-.\" Copyright (C) 1993 David Metcalfe (david@prism.demon.co.uk)
+.\" Copyright 2022 Alejandro Colomar <alx@kernel.org>
.\"
.\" SPDX-License-Identifier: Linux-man-pages-copyleft
.\"
-.\" References consulted:
-.\" Linux libc source code
-.\" Lewine's _POSIX Programmer's Guide_ (O'Reilly & Associates, 1991)
-.\" 386BSD man pages
-.\" Modified Sat Jul 24 18:06:49 1993 by Rik Faith (faith@cs.unc.edu)
-.\" Modified Fri Aug 25 23:17:51 1995 by Andries Brouwer (aeb@cwi.nl)
-.\" Modified Wed Dec 18 00:47:18 1996 by Andries Brouwer (aeb@cwi.nl)
-.\" 2007-06-15, Marc Boyer <marc.boyer@enseeiht.fr> + mtk
-.\" Improve discussion of strncpy().
-.\"
.TH strcpy 3 (date) "Linux man-pages (unreleased)"
.SH NAME
-strcpy \- copy a string
+stpcpy, strcpy, strcat \- copy or catenate a string
.SH LIBRARY
Standard C library
.RI ( libc ", " \-lc )
@@ -23,27 +13,89 @@ .SH SYNOPSIS
.nf
.B #include <string.h>
.PP
-.BI "char *strcpy(char *restrict " dest ", const char *restrict " src );
+.BI "char *stpcpy(char *restrict " dst ", const char *restrict " src );
+.BI "char *strcpy(char *restrict " dst ", const char *restrict " src );
+.BI "char *strcat(char *restrict " dst ", const char *restrict " src );
+.fi
+.PP
+.RS -4
+Feature Test Macro Requirements for glibc (see
+.BR feature_test_macros (7)):
+.RE
+.PP
+.BR stpcpy ():
+.nf
+ Since glibc 2.10:
+ _POSIX_C_SOURCE >= 200809L
+ Before glibc 2.10:
+ _GNU_SOURCE
.fi
.SH DESCRIPTION
-The
+.TP
+.BR stpcpy ()
+.TQ
.BR strcpy ()
-function copies the string pointed to by
+These functions copy the string pointed to by
.IR src ,
-including the terminating null byte (\(aq\e0\(aq),
-to the buffer pointed to by
-.IR dest .
-The strings may not overlap, and the destination string
-.I dest
-must be large enough to receive the copy.
-.I Beware of buffer overruns!
-(See BUGS.)
+into a string
+at the buffer pointed to by
+.IR dst .
+The programmer is responsible for allocating a destination buffer large enough,
+that is,
+.IR "strlen(src) + 1" .
+For the difference between the two functions, see RETURN VALUE.
+.TP
+.BR strcat ()
+This function catenates the string pointed to by
+.IR src ,
+after the string pointed to by
+.I dst
+(overwriting its terminating null byte).
+The programmer is responsible for allocating a destination buffer large enough,
+that is,
+.IR "strlen(dst) + strlen(src) + 1" .
+.PP
+An implementation of these functions might be:
+.PP
+.in +4n
+.EX
+char *
+stpcpy(char *restrict dst, const char *restrict src)
+{
+ char *p;
+
+ p = mempcpy(dst, src, strlen(src));
+ *p = \(aq\e0\(aq;
+
+ return p;
+}
+
+char *
+strcpy(char *restrict dst, const char *restrict src)
+{
+ stpcpy(dst, src);
+ return dst;
+}
+
+char *
+strcat(char *restrict dst, const char *restrict src)
+{
+ stpcpy(dst + strlen(dst), src);
+ return dst;
+}
+.EE
+.in
.SH RETURN VALUE
-The
+.TP
+.BR stpcpy ()
+This function returns
+a pointer to the terminating null byte of the copied string.
+.TP
.BR strcpy ()
-function returns a pointer to
-the destination string
-.IR dest .
+.TQ
+.BR strcat ()
+These functions return
+.IR dst .
.SH ATTRIBUTES
For an explanation of the terms used in this section, see
.BR attributes (7).
@@ -55,73 +107,87 @@ .SH ATTRIBUTES
l l l.
Interface Attribute Value
T{
-.BR strcpy ()
+.BR stpcpy (),
+.BR strcpy (),
+.BR strcat ()
T} Thread safety MT-Safe
.TE
.hy
.ad
.sp 1
.SH STANDARDS
+.TP
+.BR stpcpy ()
+POSIX.1-2008.
+.TP
+.BR strcpy ()
+.TQ
+.BR strcat ()
POSIX.1-2001, POSIX.1-2008, C89, C99, SVr4, 4.3BSD.
-.SH NOTES
-.SS strlcpy()
-Some systems (the BSDs, Solaris, and others) provide the following function:
+.SH CAVEATS
+The strings
+.I src
+and
+.I dst
+must not overlap.
.PP
-.in +4n
+If the destination buffer is not large enough,
+the behavior is undefined.
+See
+.B _FORTIFY_SOURCE
+in
+.BR feature_test_macros (7).
+.PP
+.BR strcat ()
+can be very inefficient.
+Read about
+.UR https://www.joelonsoftware.com/\:2001/12/11/\:back\-to\-basics/
+Shlemiel the painter
+.UE .
+.SH EXAMPLES
+.\" SRC BEGIN (strcpy.c)
.EX
-size_t strlcpy(char *dest, const char *src, size_t size);
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+int
+main(void)
+{
+ char *p;
+ char *buf1;
+ char *buf2;
+ size_t len, maxsize;
+
+ maxsize = strlen("Hello ") + strlen("world") + strlen("!") + 1;
+ buf1 = malloc(sizeof(*buf1) * maxsize);
+ buf2 = malloc(sizeof(*buf2) * maxsize);
+
+ p = buf1;
+ p = stpcpy(p, "Hello ");
+ p = stpcpy(p, "world");
+ p = stpcpy(p, "!");
+ len = p \- buf1;
+
+ printf("[len = %zu]: ", len);
+ puts(buf1); // "Hello world!"
+ free(buf1);
+
+ strcpy(buf2, "Hello ");
+ strcat(buf2, "world");
+ strcat(buf2, "!");
+ len = strlen(buf2);
+
+ printf("[len = %zu]: ", len);
+ puts(buf2); // "Hello world!"
+ free(buf2);
+
+ exit(EXIT_SUCCESS);
+}
.EE
-.in
-.PP
-.\" http://static.usenix.org/event/usenix99/full_papers/millert/millert_html/index.html
-.\" "strlcpy and strlcat - consistent, safe, string copy and concatenation"
-.\" 1999 USENIX Annual Technical Conference
-This function is similar to
-.BR strcpy (),
-but it copies at most
-.I size\-1
-bytes to
-.IR dest ,
-truncating the string as necessary.
-It always adds a terminating null byte.
-This function fixes some of the problems of
-.BR strcpy ()
-but the caller must still handle the possibility of data loss if
-.I size
-is too small.
-The return value of the function is the length of
-.IR src ,
-which allows truncation to be easily detected:
-if the return value is greater than or equal to
-.IR size ,
-truncation occurred.
-If loss of data matters, the caller
-.I must
-either check the arguments before the call,
-or test the function return value.
-.BR strlcpy ()
-is not present in glibc and is not standardized by POSIX,
-.\" https://lwn.net/Articles/506530/
-but is available on Linux via the
-.I libbsd
-library.
-.SH BUGS
-If the destination string of a
-.BR strcpy ()
-is not large enough, then anything might happen.
-Overflowing fixed-length string buffers is a favorite cracker technique
-for taking complete control of the machine.
-Any time a program reads or copies data into a buffer,
-the program first needs to check that there's enough space.
-This may be unnecessary if you can show that overflow is impossible,
-but be careful: programs can get changed over time,
-in ways that may make the impossible possible.
+.\" SRC END
.SH SEE ALSO
-.BR bcopy (3),
-.BR memccpy (3),
-.BR memcpy (3),
-.BR memmove (3),
-.BR stpcpy (3),
.BR strdup (3),
.BR string (3),
-.BR wcscpy (3)
+.BR wcscpy (3),
+.BR string_copy (7)
--
2.39.0
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v6 4/5] stpncpy.3, strncpy.3: Document in a single page
2022-12-15 0:26 ` [PATCH v5 0/5] Rewrite pages about " Alejandro Colomar
` (3 preceding siblings ...)
2022-12-19 21:02 ` [PATCH v6 3/5] stpcpy.3, strcpy.3, strcat.3: Document in a single page Alejandro Colomar
@ 2022-12-19 21:02 ` Alejandro Colomar
2022-12-19 21:02 ` [PATCH v6 5/5] strncat.3: Rewrite to be consistent with string_copy.7 Alejandro Colomar
5 siblings, 0 replies; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-19 21:02 UTC (permalink / raw)
To: linux-man
Cc: Alejandro Colomar, Martin Sebor, G. Branden Robinson,
Douglas McIlroy, Jakub Wilk, Serge Hallyn, Iker Pedrosa,
Andrew Pinski, Stefan Puiu
Rewrite to be consistent with the new string_copy.7 page.
Cc: Martin Sebor <msebor@redhat.com>
Cc: "G. Branden Robinson" <g.branden.robinson@gmail.com>
Cc: Douglas McIlroy <douglas.mcilroy@dartmouth.edu>
Cc: Jakub Wilk <jwilk@jwilk.net>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Iker Pedrosa <ipedrosa@redhat.com>
Cc: Andrew Pinski <pinskia@gmail.com>
Cc: Stefan Puiu <stefan.puiu@gmail.com>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
man3/stpncpy.3 | 166 ++++++++++++++++++++++++++++++-------------------
man3/strncpy.3 | 130 +-------------------------------------
2 files changed, 102 insertions(+), 194 deletions(-)
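
Not part of the patch: a minimal sketch of the truncation check that the
new CAVEATS section describes. Since stpncpy() cannot report truncation
through its return value, the sketch compares the input length with the
buffer size. fill_field() is a hypothetical helper invented for
illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: fill the fixed-width field dst[sz] from src.
 * Stores the length of the resulting character sequence in *len and
 * returns true if src did not fit (the sequence was truncated). */
static bool
fill_field(char *restrict dst, size_t sz, const char *restrict src,
           size_t *len)
{
    char *p;

    p = stpncpy(dst, src, sz);  /* Copies and null-pads up to sz bytes. */
    *len = p - dst;

    /* A sequence that exactly fills the field is not truncated,
     * but it is not null-terminated either. */
    return strlen(src) > sz;
}
```

The field is a null-padded character sequence, not a string, so readers
of the field should bound their reads with its width (for example with
strnlen() or a "%.*s" conversion).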
diff --git a/man3/stpncpy.3 b/man3/stpncpy.3
index e7b24036b..e80ec2fd4 100644
--- a/man3/stpncpy.3
+++ b/man3/stpncpy.3
@@ -1,16 +1,14 @@
'\" t
-.\" Copyright (c) Bruno Haible <haible@clisp.cons.org>
-.\" Copyright (c) 2022 Alejandro Colomar <alx@kernel.org>
+.\" Copyright 2022 Alejandro Colomar <alx@kernel.org>
.\"
-.\" SPDX-License-Identifier: GPL-2.0-or-later
+.\" SPDX-License-Identifier: Linux-man-pages-copyleft
.\"
-.\" References consulted:
-.\" GNU glibc-2 source code and manual
-.\"
-.\" Corrected, aeb, 990824
.TH stpncpy 3 (date) "Linux man-pages (unreleased)"
.SH NAME
-stpncpy \- copy string into a fixed-length buffer and zero the rest of it
+stpncpy, strncpy
+\- zero a fixed-width buffer
+and copy a string into a character sequence
+with truncation
.SH LIBRARY
Standard C library
.RI ( libc ", " \-lc )
@@ -18,9 +16,12 @@ .SH SYNOPSIS
.nf
.B #include <string.h>
.PP
-.BI "char *stpncpy(char " dest "[restrict ." n "], \
-const char " src "[restrict ." n ],
-.BI " size_t " n );
+.BI "char *stpncpy(char " dst "[restrict ." sz "], \
+const char *restrict " src ,
+.BI " size_t " sz );
+.BI "char *strncpy(char " dst "[restrict ." sz "], \
+const char *restrict " src ,
+.BI " size_t " sz );
.fi
.PP
.RS -4
@@ -36,67 +37,44 @@ .SH SYNOPSIS
_GNU_SOURCE
.fi
.SH DESCRIPTION
-.IR Note :
-This is probably not the function you want to use.
-For string copying with truncation, see
-.BR strlcpy (3bsd).
-.PP
-The
-.BR stpncpy ()
-function copies at most
-.I n
-characters of
+These functions copy the string pointed to by
.I src
-and fills the rest of the
-.I dest
-buffer with null bytes.
-.BR Warning :
-If there is no null character among the first
-.I n
-bytes of
-.IR src ,
-the string placed in
-.I dest
-will not be null-terminated.
+into a null-padded character sequence at the fixed-width buffer pointed to by
+.IR dst .
+If the destination buffer,
+limited by its size,
+isn't large enough to hold the copy,
+the resulting character sequence is truncated.
+For the difference between the two functions, see RETURN VALUE.
.PP
-A simple implementation of
-.BR strncpy ()
-might be:
+An implementation of these functions might be:
.PP
.in +4n
.EX
char *
-stpncpy(char *dest, const char *src, size_t n)
+stpncpy(char *restrict dst, const char *restrict src, size_t sz)
{
- char *p
+ bzero(dst, sz);
+ return mempcpy(dst, src, strnlen(src, sz));
+}
- bzero(dest, n);
- p = memccpy(dest, src, \(aq\e0\(aq, n);
- if (p == NULL)
- return dest + n;
-
- return p - 1;
+char *
+strncpy(char *restrict dst, const char *restrict src, size_t sz)
+{
+ stpncpy(dst, src, sz);
+ return dst;
}
.EE
.in
-.PP
-The use of
-.BR strncpy ()
-is to copy a C string to a fixed-length buffer
-while ensuring that unused bytes in the destination buffer are zeroed out
-(perhaps to prevent information leaks if the buffer is to be
-written to media or transmitted to another process via an
-interprocess communication technique).
.SH RETURN VALUE
+.TP
.BR stpncpy ()
-returns a pointer to the terminating null byte
-in
-.IR dest ,
-or, if
-.I dest
-is not null-terminated,
-.IR dest + n
-(that is, a pointer to one-past-the-end of the array).
+returns a pointer to
+one after the last character in the destination character sequence.
+.TP
+.BR strncpy ()
+returns
+.IR dst .
.SH ATTRIBUTES
For an explanation of the terms used in this section, see
.BR attributes (7).
@@ -108,16 +86,74 @@ .SH ATTRIBUTES
l l l.
Interface Attribute Value
T{
-.BR stpncpy ()
+.BR stpncpy (),
+.BR strncpy ()
T} Thread safety MT-Safe
.TE
.hy
.ad
.sp 1
.SH STANDARDS
-This function was added to POSIX.1-2008.
-Before that, it was a GNU extension.
-It first appeared in glibc 1.07 in 1993.
+.TP
+.BR stpncpy ()
+POSIX.1-2008.
+.\" Before that, it was a GNU extension.
+.\" It first appeared in glibc 1.07 in 1993.
+.TP
+.BR strncpy ()
+POSIX.1-2001, POSIX.1-2008, C89, C99, SVr4, 4.3BSD.
+.SH CAVEATS
+The names of these functions are confusing.
+These functions produce a null-padded character sequence,
+not a string (see
+.BR string_copy (7)).
+.PP
+The result of the call alone can't distinguish truncation
+from a character sequence that just fits the destination buffer;
+truncation should be detected by
+comparing the length of the input string
+with the size of the destination buffer.
+.PP
+If you're going to use this function in chained calls,
+it would be useful to develop a similar function that accepts
+a pointer to the end (one after the last element) of the destination buffer
+instead of its size.
+.SH EXAMPLES
+.\" SRC BEGIN (stpncpy.c)
+.EX
+#include <err.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+int
+main(void)
+{
+ char *p;
+ char buf1[20];
+ char buf2[20];
+ size_t len;
+
+ if (sizeof(buf1) < strlen("Hello world!"))
+ warnx("stpncpy: truncating character sequence");
+ p = stpncpy(buf1, "Hello world!", sizeof(buf1));
+ len = p \- buf1;
+
+ printf("[len = %zu]: ", len);
+ printf("%.*s\en", (int) len, buf1); // "Hello world!"
+
+ if (sizeof(buf2) < strlen("Hello world!"))
+ warnx("strncpy: truncating character sequence");
+ strncpy(buf2, "Hello world!", sizeof(buf2));
+ len = strnlen(buf2, sizeof(buf2));
+
+ printf("[len = %zu]: ", len);
+ printf("%.*s\en", (int) len, buf2); // "Hello world!"
+
+ exit(EXIT_SUCCESS);
+}
+.EE
+.\" SRC END
.SH SEE ALSO
-.BR strlcpy (3bsd)
-.BR wcpncpy (3)
+.BR wcpncpy (3),
+.BR string_copy (7)
diff --git a/man3/strncpy.3 b/man3/strncpy.3
index e2ffc683f..4710b0201 100644
--- a/man3/strncpy.3
+++ b/man3/strncpy.3
@@ -1,129 +1 @@
-.\" Copyright (C) 1993 David Metcalfe <david@prism.demon.co.uk>
-.\" Copyright (C) 2022 Alejandro Colomar <alx@kernel.org>
-.\"
-.\" SPDX-License-Identifier: Linux-man-pages-copyleft
-.\"
-.\" References consulted:
-.\" Linux libc source code
-.\" Lewine's _POSIX Programmer's Guide_ (O'Reilly & Associates, 1991)
-.\" 386BSD man pages
-.\" Modified Sat Jul 24 18:06:49 1993 by Rik Faith (faith@cs.unc.edu)
-.\" Modified Fri Aug 25 23:17:51 1995 by Andries Brouwer (aeb@cwi.nl)
-.\" Modified Wed Dec 18 00:47:18 1996 by Andries Brouwer (aeb@cwi.nl)
-.\" 2007-06-15, Marc Boyer <marc.boyer@enseeiht.fr> + mtk
-.\" Improve discussion of strncpy().
-.\"
-.TH strncpy 3 (date) "Linux man-pages (unreleased)"
-.SH NAME
-strncpy \- copy a string into a fixed-length buffer and zero the rest of it
-.SH LIBRARY
-Standard C library
-.RI ( libc ", " \-lc )
-.SH SYNOPSIS
-.nf
-.B #include <string.h>
-.PP
-.BI "[[deprecated]] char *strncpy(char " dest "[restrict ." n ],
-.BI " const char " src "[restrict ." n "], \
-size_t " n );
-.fi
-.SH DESCRIPTION
-.BI Note: " This is not the function you want to use."
-For string copying with truncation, see
-.BR strlcpy (3bsd).
-For copying a string into a fixed-length buffer with zeroing of the rest,
-see
-.BR stpncpy (3).
-.PP
-.BR strncpy ()
-copies at most
-.I n
-bytes of
-.IR src ,
-and fills the rest of the
-.I dest
-buffer with null bytes.
-.BR Warning :
-If there is no null byte
-among the first
-.I n
-bytes of
-.IR src ,
-the string placed in
-.I dest
-will not be null-terminated.
-.PP
-A simple implementation of
-.BR strncpy ()
-might be:
-.PP
-.in +4n
-.EX
-char *
-strncpy(char *dest, const char *src, size_t n)
-{
- bzero(dest, n);
- memccpy(dest, src, \(aq\e0\(aq, n);
-
- return dest;
-}
-.EE
-.in
-.PP
-The use of
-.BR strncpy ()
-is to copy a C string to a fixed-length buffer
-while ensuring that unused bytes in the destination buffer are zeroed out
-(perhaps to prevent information leaks if the buffer is to be
-written to media or transmitted to another process via an
-interprocess communication technique).
-But
-.BR stpncpy (3)
-is better for this purpose,
-since it detects truncation.
-See BUGS below.
-.SH RETURN VALUE
-The
-.BR strncpy ()
-function returns a pointer to
-the destination buffer
-.IR dest .
-.SH ATTRIBUTES
-For an explanation of the terms used in this section, see
-.BR attributes (7).
-.ad l
-.nh
-.TS
-allbox;
-lbx lb lb
-l l l.
-Interface Attribute Value
-T{
-.BR strncpy ()
-T} Thread safety MT-Safe
-.TE
-.hy
-.ad
-.sp 1
-.SH STANDARDS
-POSIX.1-2001, POSIX.1-2008, C89, C99, SVr4, 4.3BSD.
-.SH BUGS
-.BR strncpy ()
-has a misleading name.
-It doesn't produce a (null-terminated) string;
-and it should never be used for producing a string.
-.PP
-It can't detect truncation.
-It's probably better to explicitly call
-.BR bzero (3)
-and
-.BR memccpy (3),
-or
-.BR stpncpy (3)
-since they allow detecting truncation.
-.SH SEE ALSO
-.BR bzero (3),
-.BR memccpy (3),
-.BR stpncpy (3),
-.BR string (3),
-.BR wcsncpy (3)
+.so man3/stpncpy.3
--
2.39.0
^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v6 5/5] strncat.3: Rewrite to be consistent with string_copy.7.
2022-12-15 0:26 ` [PATCH v5 0/5] Rewrite pages about " Alejandro Colomar
` (4 preceding siblings ...)
2022-12-19 21:02 ` [PATCH v6 4/5] stpncpy.3, strncpy.3: " Alejandro Colomar
@ 2022-12-19 21:02 ` Alejandro Colomar
5 siblings, 0 replies; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-19 21:02 UTC (permalink / raw)
To: linux-man
Cc: Alejandro Colomar, Martin Sebor, G. Branden Robinson,
Douglas McIlroy, Jakub Wilk, Serge Hallyn, Iker Pedrosa,
Andrew Pinski, Stefan Puiu
Cc: Martin Sebor <msebor@redhat.com>
Cc: "G. Branden Robinson" <g.branden.robinson@gmail.com>
Cc: Douglas McIlroy <douglas.mcilroy@dartmouth.edu>
Cc: Jakub Wilk <jwilk@jwilk.net>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Iker Pedrosa <ipedrosa@redhat.com>
Cc: Andrew Pinski <pinskia@gmail.com>
Cc: Stefan Puiu <stefan.puiu@gmail.com>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
man3/strncat.3 | 157 ++++++++++++++++++-------------------------------
1 file changed, 57 insertions(+), 100 deletions(-)
diff --git a/man3/strncat.3 b/man3/strncat.3
index 6e4bf6d78..45fe0575c 100644
--- a/man3/strncat.3
+++ b/man3/strncat.3
@@ -1,10 +1,11 @@
+'\" t
.\" Copyright 2022 Alejandro Colomar <alx@kernel.org>
.\"
.\" SPDX-License-Identifier: Linux-man-pages-copyleft
.\"
.TH strncat 3 (date) "Linux man-pages (unreleased)"
.SH NAME
-strncat \- concatenate an unterminated string into a string
+strncat \- concatenate a null-padded character sequence into a string
.SH LIBRARY
Standard C library
.RI ( libc ", " \-lc )
@@ -12,54 +13,41 @@ .SH SYNOPSIS
.nf
.B #include <string.h>
.PP
-.BI "char *strncat(char " dest "[restrict strlen(." dest ") + ." n " + 1],"
-.BI " const char " src "[restrict ." n ],
-.BI " size_t " n );
+.BI "char *strncat(char *restrict " dst ", const char " src "[restrict ." sz ],
+.BI " size_t " sz );
.fi
.SH DESCRIPTION
-.IR Note :
-This is probably not the function you want to use.
-For string concatenation with truncation, see
-.BR strlcat (3bsd).
-For copying or concatenating a string into a fixed-length buffer
-with zeroing of the rest, see
-.BR stpncpy (3).
-.PP
-.BR strncat ()
-appends at most
-.I n
-characters of
-.I src
-to the end of
+This function catenates the input character sequence
+contained in a null-padded fixed-width buffer,
+into a string at the buffer pointed to by
.IR dst .
-It always terminates with a null character the string placed in
-.IR dest .
+The programmer is responsible for allocating a destination buffer large enough,
+that is,
+.IR "strlen(dst) + strnlen(src, sz) + 1" .
.PP
-An implementation of
-.BR strncat ()
-might be:
+An implementation of this function might be:
.PP
.in +4n
.EX
char *
-strncat(char *dest, const char *src, size_t n)
+strncat(char *restrict dst, const char *restrict src, size_t sz)
{
- char *cat;
- size_t len;
+ size_t len;
+ char *p;
- cat = dest + strlen(dest);
- len = strnlen(src, n);
- memcpy(cat, src, len);
- cat[len] = \(aq\e0\(aq;
+ len = strnlen(src, sz);
+ p = dst + strlen(dst);
+ p = mempcpy(p, src, len);
+ *p = \(aq\e0\(aq;
- return dest;
+ return dst;
}
.EE
.in
.SH RETURN VALUE
.BR strncat ()
-returns a pointer to the resulting string
-.IR dest .
+returns
+.IR dst .
.SH ATTRIBUTES
For an explanation of the terms used in this section, see
.BR attributes (7).
@@ -79,65 +67,25 @@ .SH ATTRIBUTES
.sp 1
.SH STANDARDS
POSIX.1-2001, POSIX.1-2008, C89, C99, SVr4, 4.3BSD.
-.SH NOTES
-.SS ustr2stpe()
-You may want to write your own function similar to
-.BR strncpy (),
-with the following improvements:
-.IP \(bu 3
-Copy, instead of concatenating.
-There's no equivalent of
-.BR strncat ()
-that copies instead of concatenating.
-.IP \(bu
-Allow chaining the function,
-by returning a suitable pointer.
-Copy chaining is faster than concatenating.
-.IP \(bu
-Don't check for null characters in the middle of the unterminated string.
-If the string is terminated, this function should not be used.
-If the string is unterminated, it is unnecessary.
-.IP \(bu
-A name that tells what it does:
-Copy from an
-.IR u nterminated
-.IR str ing
-to a
-.IR st ring,
-and return a
-.IR p ointer
-to its end.
-.PP
-.in +4n
-.EX
-/* This code is in the public domain.
- *
- * char *ustr2stp(char dst[restrict .n+1],
- * const char src[restrict .n],
- * size_t len);
- */
-char *
-ustr2stp(char *restrict dst, const char *restrict src, size_t len)
-{
- memcpy(dst, src, len);
- dst[len] = \(aq\e0\(aq;
-
- return dst + len;
-}
-.EE
-.in
.SH CAVEATS
-This function doesn't know the size of the destination buffer,
-so it can overrun the buffer if the programmer wasn't careful enough.
-.SH BUGS
-.BR strncat (3)
-has a misleading name;
-it has no relationship with
+The name of this function is confusing.
+This function has no relation to
.BR strncpy (3).
+.PP
+If the destination buffer is not large enough,
+the behavior is undefined.
+See
+.B _FORTIFY_SOURCE
+in
+.BR feature_test_macros (7).
+.SH BUGS
+This function can be very inefficient.
+Read about
+.UR https://www.joelonsoftware.com/\:2001/12/11/\:back\-to\-basics/
+Shlemiel the painter
+.UE .
.SH EXAMPLES
-The following program creates a string
-from a concatenation of unterminated strings.
-.\" SRC BEGIN (strncpy.c)
+.\" SRC BEGIN (strncat.c)
.EX
#include <stdio.h>
#include <stdlib.h>
@@ -148,24 +96,33 @@ .SH EXAMPLES
int
main(void)
{
- char pre[4] = "pre.";
- char *post = ".post";
- char *src = "some_long_body.post";
- char dest[100];
+ size_t maxsize;
- dest[0] = \(aq\e0\(aq;
+ // Null-padded fixed-width character sequences
+ char pre[4] = "pre.";
+ char new_post[50] = ".foo.bar";
+
+ // Strings
+ char post[] = ".post";
+ char src[] = "some_long_body.post";
+ char *dest;
+
+ maxsize = nitems(pre) + strlen(src) \- strlen(post) +
+ nitems(new_post) + 1;
+ dest = malloc(sizeof(*dest) * maxsize);
+
+ dest[0] = \(aq\e0\(aq; // There's no 'cpy' function to this 'cat'.
strncat(dest, pre, nitems(pre));
strncat(dest, src, strlen(src) \- strlen(post));
+ strncat(dest, new_post, nitems(new_post));
- puts(dest); // "pre.some_long_body"
+ puts(dest); // "pre.some_long_body.foo.bar"
+ free(dest);
exit(EXIT_SUCCESS);
}
.EE
.\" SRC END
.in
.SH SEE ALSO
-.BR memccpy (3),
-.BR memcpy (3),
-.BR mempcpy (3),
-.BR strcpy (3),
-.BR string (3)
+.BR string (3),
+.BR string_copy (7)
--
2.39.0
^ permalink raw reply related [flat|nested] 53+ messages in thread
* Re: [PATCH v6 1/5] string_copy.7: Add page to document all string-copying functions
2022-12-19 21:02 ` [PATCH v6 1/5] string_copy.7: Add page to document all " Alejandro Colomar
@ 2022-12-20 15:00 ` Stefan Puiu
2022-12-20 15:03 ` Alejandro Colomar
2023-01-20 3:43 ` Eric Biggers
1 sibling, 1 reply; 53+ messages in thread
From: Stefan Puiu @ 2022-12-20 15:00 UTC (permalink / raw)
To: Alejandro Colomar
Cc: linux-man, Alejandro Colomar, Martin Sebor, G. Branden Robinson,
Douglas McIlroy, Jakub Wilk, Serge Hallyn, Iker Pedrosa,
Andrew Pinski
Hi,
Noticed a typo below
On Mon, Dec 19, 2022 at 11:02 PM Alejandro Colomar
<alx.manpages@gmail.com> wrote:
>
> This is an opportunity to use consistent language across the
> documentation for all string-copying functions.
>
> It is also easier to show the similarities and differences between all
> of the functions, so that a reader can use this page to know which
> function is needed for a given task.
>
> Alternative functions not provided by libc have been given in the same
> page, with reference implementations.
>
> Cc: Martin Sebor <msebor@redhat.com>
> Cc: "G. Branden Robinson" <g.branden.robinson@gmail.com>
> Cc: Douglas McIlroy <douglas.mcilroy@dartmouth.edu>
> Cc: Jakub Wilk <jwilk@jwilk.net>
> Cc: Serge Hallyn <serge@hallyn.com>
> Cc: Iker Pedrosa <ipedrosa@redhat.com>
> Cc: Andrew Pinski <pinskia@gmail.com>
> Cc: Stefan Puiu <stefan.puiu@gmail.com>
> Signed-off-by: Alejandro Colomar <alx@kernel.org>
> ---
> man7/string_copy.7 | 855 +++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 855 insertions(+)
> create mode 100644 man7/string_copy.7
>
> diff --git a/man7/string_copy.7 b/man7/string_copy.7
> new file mode 100644
> index 000000000..a32b93c01
> --- /dev/null
> +++ b/man7/string_copy.7
> @@ -0,0 +1,855 @@
> +.\" Copyright 2022 Alejandro Colomar <alx@kernel.org>
> +.\"
> +.\" SPDX-License-Identifier: BSD-3-Clause
> +.\"
> +.TH string_copy 7 (date) "Linux man-pages (unreleased)"
> +.\" ----- NAME :: -----------------------------------------------------/
> +.SH NAME
> +stpcpy,
> +strcpy, strcat,
> +stpecpy, stpecpyx,
> +strlcpy, strlcat,
> +stpncpy,
> +strncpy,
> +zustr2ustp, zustr2stp,
> +strncat,
> +ustpcpy, ustr2stp
> +\- copy strings and character sequences
> +.\" ----- SYNOPSIS :: -------------------------------------------------/
> +.SH SYNOPSIS
> +.\" ----- SYNOPSIS :: (Null-terminated) strings -----------------------/
> +.SS Strings
> +.nf
> +// Chain-copy a string.
> +.BI "char *stpcpy(char *restrict " dst ", const char *restrict " src );
> +.PP
> +// Copy/catenate a string.
> +.BI "char *strcpy(char *restrict " dst ", const char *restrict " src );
> +.BI "char *strcat(char *restrict " dst ", const char *restrict " src );
> +.PP
> +// Chain-copy a string with truncation.
> +.BI "char *stpecpy(char *" dst ", char " end "[0], const char *restrict " src );
> +.PP
> +// Chain-copy a string with truncation and SIGSEGV on UB.
> +.BI "char *stpecpyx(char *" dst ", char " end "[0], const char *restrict " src );
> +.PP
> +// Copy/catenate a string with truncation and SIGSEGV on UB.
> +.BI "size_t strlcpy(char " dst "[restrict ." sz "], \
> +const char *restrict " src ,
> +.BI " size_t " sz );
> +.BI "size_t strlcat(char " dst "[restrict ." sz "], \
> +const char *restrict " src ,
> +.BI " size_t " sz );
> +.fi
> +.\" ----- SYNOPSIS :: Null-padded character sequences --------/
> +.SS Null-padded character sequences
> +.nf
> +// Zero a fixed-width buffer, and
> +// copy a string into a character sequence with truncation.
> +.BI "char *stpncpy(char " dst "[restrict ." sz "], \
> +const char *restrict " src ,
> +.BI " size_t " sz );
> +.PP
> +// Zero a fixed-width buffer, and
> +// copy a string into a character sequence with truncation.
> +.BI "char *strncpy(char " dest "[restrict ." sz "], \
> +const char *restrict " src ,
> +.BI " size_t " sz );
> +.PP
> +// Chain-copy a null-padded character sequence into a character sequence.
> +.BI "char *zustr2ustp(char *restrict " dst ", \
> +const char " src "[restrict ." sz ],
> +.BI " size_t " sz );
> +.PP
> +// Chain-copy a null-padded character sequence into a string.
> +.BI "char *zustr2stp(char *restrict " dst ", \
> +const char " src "[restrict ." sz ],
> +.BI " size_t " sz );
> +.PP
> +// Catenate a null-padded character sequence into a string.
> +.BI "char *strncat(char *restrict " dst ", const char " src "[restrict ." sz ],
> +.BI " size_t " sz );
> +.fi
> +.\" ----- SYNOPSIS :: Measured character sequences --------------------/
> +.SS Measured character sequences
> +.nf
> +// Chain-copy a measured character sequence.
> +.BI "char *ustpcpy(char *restrict " dst ", \
> +const char " src "[restrict ." len ],
> +.BI " size_t " len );
> +.PP
> +// Chain-copy a measured character sequence into a string.
> +.BI "char *ustr2stp(char *restrict " dst ", \
> +const char " src "[restrict ." len ],
> +.BI " size_t " len );
> +.fi
> +.SH DESCRIPTION
> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: -----------------/
> +.SS Terms (and abbreviations)
> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: string (str) ----/
> +.TP
> +.IR "string " ( str )
> +is a sequence of zero or more non-null characters followed by a null byte.
> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: null-padded character seq
> +.TP
> +.I character sequence
> +is a sequence of zero or more non-null characters.
> +A program should never usa a character sequence where a string is required.
Here I think you want s/usa/use above.
Thanks,
Stefan.
> +However, with appropriate care,
> +a string can be used in the place of a character sequence.
> +.RS
> +.TP
> +.IR "null-padded character sequence " ( zustr )
> +Character sequences can be contained in fixed-width buffers,
> +which contain padding null bytes after the character sequence,
> +to fill the rest of the buffer
> +without affecting the character sequence;
> +however, those padding null bytes are not part of the character sequence.
> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: measured character sequence
> +.TP
> +.IR "measured character sequence " ( ustr )
> +Character sequence delimited by its length.
> +It may be a slice of a larger character sequence,
> +or even of a string.
> +.RE
> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: length (len) ----/
> +.TP
> +.IR "length " ( len )
> +is the number of non-null characters in a string or character sequence.
> +It is the return value of
> +.I strlen(str)
> +and of
> +.IR "strnlen(ustr, sz)" .
> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: size (sz) -------/
> +.TP
> +.IR "size " ( sz )
> +refers to the entire buffer
> +where the string or character sequence is contained.
> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: end -------------/
> +.TP
> +.I end
> +is the name of a pointer to one past the last element of a buffer.
> +It is equivalent to
> +.IR &str[sz] .
> +It is used as a sentinel value,
> +to be able to truncate strings or character sequences
> +instead of overrunning the containing buffer.
> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: copy ------------/
> +.TP
> +.I copy
> +This term is used when
> +the writing starts at the first element pointed to by
> +.IR dst .
> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: catenate --------/
> +.TP
> +.I catenate
> +This term is used when
> +a function first finds the terminating null byte in
> +.IR dst ,
> +and then starts writing at that position.
> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: chain -----------/
> +.TP
> +.I chain
> +This term is used when
> +it's the programmer who provides
> +a pointer to the terminating null byte in the string
> +.I dst
> +(or one after the last character in a character sequence),
> +and the function starts writing at that location.
> +The function returns
> +a pointer to the new location of the terminating null byte
> +(or one after the last character in a character sequence)
> +after the call,
> +so that the programmer can use it to chain such calls.
> +.\" ----- DESCRIPTION :: Copy, catenate, and chain-copy ---------------/
> +.SS Copy, catenate, and chain-copy
> +Originally,
> +there was a distinction between functions that copy and those that catenate.
> +However, newer functions that copy while allowing chaining
> +cover both use cases with a single API.
> +They are also algorithmically faster,
> +since they don't need to search for
> +the terminating null byte of the existing string.
> +However, functions that catenate have a much simpler use,
> +so if performance is not important,
> +it can make sense to use them for improving readability.
> +.PP
> +The pointer returned by functions that allow chaining
> +is a byproduct of the copy operation,
> +so it has no performance costs.
> +Functions that return such a pointer,
> +and thus can be chained,
> +have names of the form
> +.RB * stp *(),
> +since it's common to name the pointer just
> +.IR p .
> +.PP
> +Chain-copying functions that truncate
> +should accept a pointer to the end of the destination buffer,
> +and have names of the form
> +.RB * stpe *().
> +This allows not having to recalculate the remaining size after each call.
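
The difference between catenating and chain-copying described above can
be sketched as follows.  This is only an illustration: join_cat() and
join_chain() are hypothetical helper names, and both assume the caller
has already made dst large enough for the result.

```c
#define _XOPEN_SOURCE 700  // for stpcpy(3)
#include <stddef.h>
#include <string.h>

// Catenating: every strcat() call rescans dst from the beginning
// to find the terminating null byte before it can write.
size_t
join_cat(char *dst, const char *a, const char *b, const char *c)
{
    strcpy(dst, a);
    strcat(dst, b);
    strcat(dst, c);
    return strlen(dst);  // one more scan to learn the length
}

// Chain-copying: each stpcpy() call returns a pointer to the new
// terminating null byte, so the next call starts writing right there.
size_t
join_chain(char *dst, const char *a, const char *b, const char *c)
{
    char *p = dst;

    p = stpcpy(p, a);
    p = stpcpy(p, b);
    p = stpcpy(p, c);
    return (size_t) (p - dst);  // the length comes for free
}
```

Both helpers produce the same string; the chained version simply never
walks over bytes it has already written.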
> +.\" ----- DESCRIPTION :: Truncate or not? -----------------------------/
> +.SS Truncate or not?
> +The first thing to note is that programmers should be careful with buffers,
> +so they always have the correct size,
> +and truncation is not necessary.
> +.PP
> +In most cases,
> +truncation is not desired,
> +and it is simpler to just do the copy.
> +Simpler code is safer code.
> +Programming against programming mistakes by adding more code
> +just adds more points where mistakes can be made.
> +.PP
> +Nowadays,
> +compilers can detect most programmer errors with features like
> +compiler warnings,
> +static analyzers, and
> +.BR \%_FORTIFY_SOURCE
> +(see
> +.BR ftm (7)).
> +Keeping the code simple
> +helps these overflow-detection features be more precise.
> +.PP
> +When validating user input,
> +however,
> +it makes sense to truncate.
> +Remember to check the return value of such function calls.
> +.PP
> +Functions that truncate:
> +.IP \(bu 3
> +.BR stpecpy (3)
> +is the most efficient string copy function that performs truncation.
> +It only requires checking for truncation once after all chained calls.
> +.IP \(bu
> +.BR stpecpyx (3)
> +is a variant of
> +.BR stpecpy (3)
> +that consumes the entire source string,
> +to catch bugs in the program
> +by forcing a segmentation fault (as
> +.BR strlcpy (3bsd)
> +and
> +.BR strlcat (3bsd)
> +do).
> +.IP \(bu
> +.BR strlcpy (3bsd)
> +and
> +.BR strlcat (3bsd)
> +are designed to crash if the input string is invalid
> +(doesn't contain a terminating null byte).
> +.IP \(bu
> +.BR stpncpy (3)
> +and
> +.BR strncpy (3)
> +also truncate, but they don't write strings,
> +but rather null-padded character sequences.
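
To make the "check once" property concrete, here is a minimal sketch.
The stpecpy() body is modeled on the reference implementation given in
the EXAMPLES section of this page (with "char *end" in place of
"char end[0]", to stay within ISO C); build_address() is a hypothetical
caller invented for this illustration.

```c
#define _XOPEN_SOURCE 700  // for memccpy(3)
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Chain-copy src into [dst, end), truncating if it does not fit.
// Returns a pointer to the terminating null byte, or end on truncation.
char *
stpecpy(char *dst, char *end, const char *restrict src)
{
    char *p;

    if (dst == end)  // an earlier call already truncated
        return end;

    p = memccpy(dst, src, '\0', end - dst);
    if (p != NULL)
        return p - 1;

    /* truncation detected */
    end[-1] = '\0';
    return end;
}

// Hypothetical caller: builds "<user>@<host>" in buf[size].
// Returns false if the result had to be truncated.
bool
build_address(char *buf, size_t size, const char *user, const char *host)
{
    char *end = buf + size;
    char *p = buf;

    p = stpecpy(p, end, user);
    p = stpecpy(p, end, "@");
    p = stpecpy(p, end, host);

    return p != end;  // a single check covers all three calls
}
```

Once any call truncates, it returns end, and every later call in the
chain becomes a no-op that also returns end, which is why one comparison
at the bottom is enough.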
> +.\" ----- DESCRIPTION :: Null-padded character sequences --------------/
> +.SS Null-padded character sequences
> +For historic reasons,
> +some standard APIs,
> +such as
> +.BR utmpx (5),
> +use null-padded character sequences in fixed-width buffers.
> +To interface with them,
> +specialized functions need to be used.
> +.PP
> +To copy strings into them, use
> +.BR stpncpy (3).
> +.PP
> +To copy from an unterminated string within a fixed-width buffer into a string,
> +ignoring any trailing null bytes in the source fixed-width buffer,
> +you should use
> +.BR zustr2stp (3)
> +or
> +.BR strncat (3).
> +.PP
> +To copy from an unterminated string within a fixed-width buffer
> +into a character sequence,
> +ignoring any trailing null bytes in the source fixed-width buffer,
> +you should use
> +.BR zustr2ustp (3).
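
A sketch of what reading such a field back into a string might look
like.  struct record is a hypothetical type standing in for something
like the ut_user field of utmpx(5), and this zustr2stp() body is an
assumption derived from the strnlen()-based strncat() implementation
shown in this series, not necessarily the page's final reference code.

```c
#define _XOPEN_SOURCE 700  // for strnlen(3)
#include <stddef.h>
#include <string.h>

// Hypothetical record with a fixed-width, null-padded field.
struct record {
    char name[8];  // holds up to 8 characters; may lack a null byte
};

// Copy the character sequence held in the null-padded buffer src[sz]
// into dst as a string.  dst needs room for strnlen(src, sz) + 1 bytes.
// Returns a pointer to the terminating null byte, suitable for chaining.
char *
zustr2stp(char *restrict dst, const char *restrict src, size_t sz)
{
    size_t len = strnlen(src, sz);  // padding null bytes are ignored

    memcpy(dst, src, len);
    dst[len] = '\0';
    return dst + len;
}
```

A destination of sizeof(r.name) + 1 bytes is always enough for one
field, since the character sequence can never be longer than the buffer
that contains it.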
> +.\" ----- DESCRIPTION :: Measured character sequences -----------------/
> +.SS Measured character sequences
> +The simplest character sequence copying function is
> +.BR mempcpy (3).
> +It requires always knowing the length of your character sequences,
> +for which structures can be used.
> +It makes the code much faster,
> +since you always know the length of your character sequences,
> +and can do the minimal copies and length measurements.
> +.BR mempcpy (3)
> +copies character sequences,
> +so you need to explicitly set the terminating null byte if you need a string.
> +.PP
> +However,
> +for keeping type safety,
> +it's good to add a wrapper that uses
> +.I char\~*
> +instead of
> +.IR void\~* :
> +.BR ustpcpy (3).
> +.PP
> +In programs that make considerable use of strings or character sequences,
> +and need the best performance,
> +using overlapping character sequences can make a big difference.
> +It allows holding subsequences of a larger character sequence,
> +while not duplicating memory
> +nor using time to do a copy.
> +.PP
> +However, this is delicate,
> +since it requires using character sequences.
> +C library APIs use strings,
> +so programs that use character sequences
> +will have to take care of differentiating strings from character sequences.
> +.PP
> +To copy a measured character sequence, use
> +.BR ustpcpy (3).
> +.PP
> +To copy a measured character sequence into a string, use
> +.BR ustr2stp (3).
> +.PP
> +Because these functions ask for the length,
> +and a string is by nature composed of a character sequence of the same length
> +plus a terminating null byte,
> +a string is also accepted as input.
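
For illustration, the two measured-sequence functions could be sketched
like this.  The ustr2stp() body follows the one in the NOTES that this
series removes from strncat.3; the ustpcpy() body is an assumption that
uses memcpy(3) plus pointer arithmetic rather than mempcpy(3), so that
it does not depend on a GNU extension.

```c
#include <stddef.h>
#include <string.h>

// Chain-copy a measured character sequence; no null byte is written.
// Returns a pointer to one past the last character written.
char *
ustpcpy(char *restrict dst, const char *restrict src, size_t len)
{
    memcpy(dst, src, len);
    return dst + len;
}

// Chain-copy a measured character sequence into a string.
// dst needs room for len + 1 bytes.
// Returns a pointer to the terminating null byte.
char *
ustr2stp(char *restrict dst, const char *restrict src, size_t len)
{
    char *p = ustpcpy(dst, src, len);

    *p = '\0';
    return p;
}
```

Because only the pointer and the length matter, src may be a slice in
the middle of a larger string, for example ustr2stp(p, path + 1, 3) to
extract "usr" from "/usr/share/man".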
> +.\" ----- DESCRIPTION :: String vs character sequence -----------------/
> +.SS String vs character sequence
> +Some functions only operate on strings.
> +Those require that the input
> +.I src
> +is a string,
> +and guarantee an output string
> +(even when truncation occurs).
> +Functions that catenate
> +also require that
> +.I dst
> +holds a string before the call.
> +List of functions:
> +.IP \(bu 3
> +.PD 0
> +.BR stpcpy (3)
> +.IP \(bu
> +.BR strcpy "(3), \c"
> +.BR strcat (3)
> +.IP \(bu
> +.BR stpecpy "(3), \c"
> +.BR stpecpyx (3)
> +.IP \(bu
> +.BR strlcpy "(3bsd), \c"
> +.BR strlcat (3bsd)
> +.PD
> +.PP
> +Other functions require an input string,
> +but create a character sequence as output.
> +These functions have confusing names,
> +and have a long history of misuse.
> +List of functions:
> +.IP \(bu 3
> +.PD 0
> +.BR stpncpy (3)
> +.IP \(bu
> +.BR strncpy (3)
> +.PD
> +.PP
> +Other functions operate on an input character sequence,
> +and create an output string.
> +Functions that catenate
> +also require that
> +.I dst
> +holds a string before the call.
> +.BR strncat (3)
> +has an even more misleading name than the functions above.
> +List of functions:
> +.IP \(bu 3
> +.PD 0
> +.BR zustr2stp (3)
> +.IP \(bu
> +.BR strncat (3)
> +.IP \(bu
> +.BR ustr2stp (3)
> +.PD
> +.PP
> +Other functions operate on an input character sequence
> +to create an output character sequence.
> +List of functions:
> +.IP \(bu 3
> +.PD 0
> +.BR ustpcpy (3)
> +.IP \(bu
> +.BR zustr2ustp (3)
> +.PD
> +.\" ----- DESCRIPTION :: Functions :: ---------------------------------/
> +.SS Functions
> +.\" ----- DESCRIPTION :: Functions :: stpcpy(3) -----------------------/
> +.TP
> +.BR stpcpy (3)
> +This function copies the input string into a destination string.
> +The programmer is responsible for allocating a buffer large enough.
> +It returns a pointer suitable for chaining.
> +.\" ----- DESCRIPTION :: Functions :: strcpy(3), strcat(3) ------------/
> +.TP
> +.BR strcpy (3)
> +.TQ
> +.BR strcat (3)
> +These functions copy and catenate the input string into a destination string.
> +The programmer is responsible for allocating a buffer large enough.
> +The return value is useless.
> +.IP
> +.BR stpcpy (3)
> +is a faster alternative to these functions.
> +.\" ----- DESCRIPTION :: Functions :: stpecpy(3), stpecpyx(3) ---------/
> +.TP
> +.BR stpecpy (3)
> +.TQ
> +.BR stpecpyx (3)
> +These functions copy the input string into a destination string.
> +If the destination buffer,
> +limited by a pointer to its end,
> +isn't large enough to hold the copy,
> +the resulting string is truncated
> +(but it is guaranteed to be null-terminated).
> +They return a pointer suitable for chaining.
> +Truncation needs to be detected only once after the last chained call.
> +.BR stpecpyx (3)
> +has identical semantics to
> +.BR stpecpy (3),
> +except that it forces a SIGSEGV if the
> +.I src
> +pointer is not a string.
> +.IP
> +These functions are not provided by any library;
> +see EXAMPLES for a reference implementation.
> +.\" ----- DESCRIPTION :: Functions :: strlcpy(3bsd), strlcat(3bsd) ----/
> +.TP
> +.BR strlcpy (3bsd)
> +.TQ
> +.BR strlcat (3bsd)
> +These functions copy and catenate the input string into a destination string.
> +If the destination buffer,
> +limited by its size,
> +isn't large enough to hold the copy,
> +the resulting string is truncated
> +(but it is guaranteed to be null-terminated).
> +They return the length of the total string they tried to create.
> +These functions force a SIGSEGV if the
> +.I src
> +pointer is not a string.
> +.IP
> +.BR stpecpyx (3)
> +is a faster alternative to these functions.
> +.\" ----- DESCRIPTION :: Functions :: stpncpy(3) ----------------------/
> +.TP
> +.BR stpncpy (3)
> +This function copies the input string into
> +a destination null-padded character sequence in a fixed-width buffer.
> +If the destination buffer,
> +limited by its size,
> +isn't large enough to hold the copy,
> +the resulting character sequence is truncated.
> +Since it creates a character sequence,
> +it doesn't need to write a terminating null byte.
> +It's impossible to distinguish truncation by the result of the call,
> +from a character sequence that just fits the destination buffer;
> +truncation should be detected by
> +comparing the length of the input string
> +with the size of the destination buffer.
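
The truncation check described here can be sketched as below.
fill_field() is a hypothetical wrapper written for this illustration;
it relies only on the documented behavior of stpncpy(3).

```c
#define _XOPEN_SOURCE 700  // for stpncpy(3)
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Fill the fixed-width field dst[sz] from the string src.
// Stores the length of the resulting character sequence in *len.
// Returns false if src did not fit and was truncated.
bool
fill_field(char *restrict dst, size_t sz, const char *restrict src,
           size_t *len)
{
    char *p = stpncpy(dst, src, sz);  // null-pads the rest of dst

    *len = (size_t) (p - dst);

    // The returned pointer is dst + sz both for an exact fit and for
    // truncation, so the input length has to be compared explicitly.
    return strlen(src) <= sz;
}
```

Note that a source of exactly sz characters is not truncated: the field
is a character sequence, so it needs no room for a terminating null
byte.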
> +.\" ----- DESCRIPTION :: Functions :: strncpy(3) ----------------------/
> +.TP
> +.BR strncpy (3)
> +This function is identical to
> +.BR stpncpy (3)
> +except for the useless return value.
> +.IP
> +.BR stpncpy (3)
> +is a more useful alternative to this function.
> +.\" ----- DESCRIPTION :: Functions :: zustr2ustp(3) --------------------/
> +.TP
> +.BR zustr2ustp (3)
> +This function copies the input character sequence
> +contained in a null-padded fixed-width buffer,
> +into a destination character sequence.
> +The programmer is responsible for allocating a buffer large enough.
> +It returns a pointer suitable for chaining.
> +.IP
> +A truncating version of this function doesn't exist,
> +since the size of the original character sequence is always known,
> +so it wouldn't be very useful.
> +.IP
> +This function is not provided by any library;
> +see EXAMPLES for a reference implementation.
> +.\" ----- DESCRIPTION :: Functions :: zustr2stp(3) --------------------/
> +.TP
> +.BR zustr2stp (3)
> +This function copies the input character sequence
> +contained in a null-padded fixed-width buffer,
> +into a destination string.
> +The programmer is responsible for allocating a buffer large enough.
> +It returns a pointer suitable for chaining.
> +.IP
> +A truncating version of this function doesn't exist,
> +since the size of the original character sequence is always known,
> +so it wouldn't be very useful.
> +.IP
> +This function is not provided by any library;
> +see EXAMPLES for a reference implementation.
> +.\" ----- DESCRIPTION :: Functions :: strncat(3) ----------------------/
> +.TP
> +.BR strncat (3)
> +Do not confuse this function with
> +.BR strncpy (3);
> +they are not related at all.
> +.IP
> +This function catenates the input character sequence
> +contained in a null-padded fixed-width buffer,
> +into a destination string.
> +The programmer is responsible for allocating a buffer large enough.
> +The return value is useless.
> +.IP
> +.BR zustr2stp (3)
> +is a faster alternative to this function.
> +.\" ----- DESCRIPTION :: Functions :: ustpcpy(3) ----------------------/
> +.TP
> +.BR ustpcpy (3)
> +This function copies the input character sequence,
> +limited by its length,
> +into a destination character sequence.
> +The programmer is responsible for allocating a buffer large enough.
> +It returns a pointer suitable for chaining.
> +.\" ----- DESCRIPTION :: Functions :: ustr2stp(3) ---------------------/
> +.TP
> +.BR ustr2stp (3)
> +This function copies the input character sequence,
> +limited by its length,
> +into a destination string.
> +The programmer is responsible for allocating a buffer large enough.
> +It returns a pointer suitable for chaining.
> +.\" ----- RETURN VALUE :: ---------------------------------------------/
> +.SH RETURN VALUE
> +The following functions return
> +a pointer to the terminating null byte in the destination string.
> +.IP \(bu 3
> +.PD 0
> +.BR stpcpy (3)
> +.IP \(bu
> +.BR ustr2stp (3)
> +.IP \(bu
> +.BR zustr2stp (3)
> +.PD
> +.PP
> +The following functions return
> +a pointer to the terminating null byte in the destination string,
> +except when truncation occurs;
> +if truncation occurs,
> +they return a pointer to the end of the destination buffer.
> +.IP \(bu 3
> +.BR stpecpy (3),
> +.BR stpecpyx (3)
> +.PP
> +The following function returns
> +a pointer to one after the last character
> +in the destination character sequence;
> +if truncation occurs,
> +that pointer is equivalent to
> +a pointer to the end of the destination buffer.
> +.IP \(bu 3
> +.BR stpncpy (3)
> +.PP
> +The following functions return
> +a pointer to one after the last character
> +in the destination character sequence.
> +.IP \(bu 3
> +.PD 0
> +.BR zustr2ustp (3)
> +.IP \(bu
> +.BR ustpcpy (3)
> +.PD
> +.PP
> +The following functions return
> +the length of the total string that they tried to create
> +(as if truncation didn't occur).
> +.IP \(bu 3
> +.BR strlcpy (3bsd),
> +.BR strlcat (3bsd)
> +.PP
> +The following functions return the
> +.I dst
> +pointer,
> +which is useless.
> +.IP \(bu 3
> +.PD 0
> +.BR strcpy (3),
> +.BR strcat (3)
> +.IP \(bu
> +.BR strncpy (3)
> +.IP \(bu
> +.BR strncat (3)
> +.PD
> +.\" ----- NOTES :: strscpy(9) -----------------------------------------/
> +.SH NOTES
> +The Linux kernel has an internal function for copying strings,
> +which is similar to
> +.BR stpecpy (3),
> +except that it can't be chained:
> +.TP
> +.BR strscpy (9)
> +This function copies the input string into a destination string.
> +If the destination buffer,
> +limited by its size,
> +isn't large enough to hold the copy,
> +the resulting string is truncated
> +(but it is guaranteed to be null-terminated).
> +It returns the length of the destination string, or
> +.B \-E2BIG
> +on truncation.
> +.IP
> +.BR stpecpy (3)
> +is a simpler and faster alternative to this function.
> +.RE
> +.\" ----- CAVEATS :: --------------------------------------------------/
> +.SH CAVEATS
> +Don't mix chain calls to truncating and non-truncating functions.
> +It is conceptually wrong
> +unless you know that the first part of a copy will always fit.
> +Anyway, the performance difference will probably be negligible,
> +so it will probably be clearer if you use consistent semantics:
> +either truncating or non-truncating.
> +Calling a non-truncating function after a truncating one is necessarily wrong.
> +.\" ----- BUGS :: -----------------------------------------------------/
> +.SH BUGS
> +All catenation functions share the same performance problem:
> +.UR https://www.joelonsoftware.com/\:2001/12/11/\:back\-to\-basics/
> +Shlemiel the painter
> +.UE .
> +.\" ----- EXAMPLES :: -------------------------------------------------/
> +.SH EXAMPLES
> +The following are examples of correct use of each of these functions.
> +.\" ----- EXAMPLES :: stpcpy(3) ---------------------------------------/
> +.TP
> +.BR stpcpy (3)
> +.EX
> +p = buf;
> +p = stpcpy(p, "Hello ");
> +p = stpcpy(p, "world");
> +p = stpcpy(p, "!");
> +len = p \- buf;
> +puts(buf);
> +.EE
> +.\" ----- EXAMPLES :: strcpy(3), strcat(3) ----------------------------/
> +.TP
> +.BR strcpy (3)
> +.TQ
> +.BR strcat (3)
> +.EX
> +strcpy(buf, "Hello ");
> +strcat(buf, "world");
> +strcat(buf, "!");
> +len = strlen(buf);
> +puts(buf);
> +.EE
> +.\" ----- EXAMPLES :: stpecpy(3), stpecpyx(3) -------------------------/
> +.TP
> +.BR stpecpy (3)
> +.TQ
> +.BR stpecpyx (3)
> +.EX
> +end = buf + sizeof(buf);
> +p = buf;
> +p = stpecpy(p, end, "Hello ");
> +p = stpecpy(p, end, "world");
> +p = stpecpy(p, end, "!");
> +if (p == end) {
> + p\-\-;
> + goto toolong;
> +}
> +len = p \- buf;
> +puts(buf);
> +.EE
> +.\" ----- EXAMPLES :: strlcpy(3bsd), strlcat(3bsd) --------------------/
> +.TP
> +.BR strlcpy (3bsd)
> +.TQ
> +.BR strlcat (3bsd)
> +.EX
> +if (strlcpy(buf, "Hello ", sizeof(buf)) >= sizeof(buf))
> + goto toolong;
> +if (strlcat(buf, "world", sizeof(buf)) >= sizeof(buf))
> + goto toolong;
> +len = strlcat(buf, "!", sizeof(buf));
> +if (len >= sizeof(buf))
> + goto toolong;
> +puts(buf);
> +.EE
> +.\" ----- EXAMPLES :: strscpy(9) --------------------------------------/
> +.TP
> +.BR strscpy (9)
> +.EX
> +len = strscpy(buf, "Hello world!", sizeof(buf));
> +if (len == \-E2BIG)
> + goto toolong;
> +puts(buf);
> +.EE
> +.\" ----- EXAMPLES :: stpncpy(3) --------------------------------------/
> +.TP
> +.BR stpncpy (3)
> +.EX
> +p = stpncpy(buf, "Hello world!", sizeof(buf));
> +if (sizeof(buf) < strlen("Hello world!"))
> + goto toolong;
> +len = p \- buf;
> +for (size_t i = 0; i < sizeof(buf); i++)
> + putchar(buf[i]);
> +.EE
> +.\" ----- EXAMPLES :: strncpy(3) --------------------------------------/
> +.TP
> +.BR strncpy (3)
> +.EX
> +strncpy(buf, "Hello world!", sizeof(buf));
> +if (sizeof(buf) < strlen("Hello world!"))
> + goto toolong;
> +len = strnlen(buf, sizeof(buf));
> +for (size_t i = 0; i < sizeof(buf); i++)
> + putchar(buf[i]);
> +.EE
> +.\" ----- EXAMPLES :: zustr2ustp(3) -----------------------------------/
> +.TP
> +.BR zustr2ustp (3)
> +.EX
> +p = buf;
> +p = zustr2ustp(p, "Hello ", 6);
> +p = zustr2ustp(p, "world", 42); // Padding null bytes ignored.
> +p = zustr2ustp(p, "!", 1);
> +len = p \- buf;
> +printf("%.*s\en", (int) len, buf);
> +.EE
> +.\" ----- EXAMPLES :: zustr2stp(3) ------------------------------------/
> +.TP
> +.BR zustr2stp (3)
> +.EX
> +p = buf;
> +p = zustr2stp(p, "Hello ", 6);
> +p = zustr2stp(p, "world", 42); // Padding null bytes ignored.
> +p = zustr2stp(p, "!", 1);
> +len = p \- buf;
> +puts(buf);
> +.EE
> +.\" ----- EXAMPLES :: strncat(3) --------------------------------------/
> +.TP
> +.BR strncat (3)
> +.EX
> +buf[0] = \(aq\e0\(aq; // There's no 'cpy' function to this 'cat'.
> +strncat(buf, "Hello ", 6);
> +strncat(buf, "world", 42); // Padding null bytes ignored.
> +strncat(buf, "!", 1);
> +len = strlen(buf);
> +puts(buf);
> +.EE
> +.\" ----- EXAMPLES :: ustpcpy(3) --------------------------------------/
> +.TP
> +.BR ustpcpy (3)
> +.EX
> +p = buf;
> +p = ustpcpy(p, "Hello ", 6);
> +p = ustpcpy(p, "world", 5);
> +p = ustpcpy(p, "!", 1);
> +len = p \- buf;
> +printf("%.*s\en", (int) len, buf);
> +.EE
> +.\" ----- EXAMPLES :: ustr2stp(3) -------------------------------------/
> +.TP
> +.BR ustr2stp (3)
> +.EX
> +p = buf;
> +p = ustr2stp(p, "Hello ", 6);
> +p = ustr2stp(p, "world", 5);
> +p = ustr2stp(p, "!", 1);
> +len = p \- buf;
> +puts(buf);
> +.EE
> +.\" ----- EXAMPLES :: Implementations :: ------------------------------/
> +.SS Implementations
> +Here are reference implementations for functions not provided by libc.
> +.PP
> +.in +4n
> +.EX
> +/* This code is in the public domain. */
> +
> +.\" ----- EXAMPLES :: Implementations :: stpecpy(3) -------------------/
> +char *
> +.IR stpecpy "(char *dst, char end[0], const char *restrict src)"
> +{
> + char *p;
> +
> + if (dst == end)
> + return end;
> +
> + p = memccpy(dst, src, \(aq\e0\(aq, end \- dst);
> + if (p != NULL)
> + return p \- 1;
> +
> + /* truncation detected */
> + end[\-1] = \(aq\e0\(aq;
> + return end;
> +}
> +
> +.\" ----- EXAMPLES :: Implementations :: stpecpyx(3) ------------------/
> +char *
> +.IR stpecpyx "(char *dst, char end[0], const char *restrict src)"
> +{
> + if (src[strlen(src)] != \(aq\e0\(aq)
> + raise(SIGSEGV);
> +
> + return stpecpy(dst, end, src);
> +}
> +
> +.\" ----- EXAMPLES :: Implementations :: zustr2ustp(3) ----------------/
> +char *
> +.IR zustr2ustp "(char *restrict dst, const char *restrict src, size_t sz)"
> +{
> + return ustpcpy(dst, src, strnlen(src, sz));
> +}
> +
> +.\" ----- EXAMPLES :: Implementations :: zustr2stp(3) -----------------/
> +char *
> +.IR zustr2stp "(char *restrict dst, const char *restrict src, size_t sz)"
> +{
> + char *p;
> +
> + p = zustr2ustp(dst, src, sz);
> + *p = \(aq\e0\(aq;
> +
> + return p;
> +}
> +
> +.\" ----- EXAMPLES :: Implementations :: ustpcpy(3) -------------------/
> +char *
> +.IR ustpcpy "(char *restrict dst, const char *restrict src, size_t len)"
> +{
> + return mempcpy(dst, src, len);
> +}
> +
> +.\" ----- EXAMPLES :: Implementations :: ustr2stp(3) ------------------/
> +char *
> +.IR ustr2stp "(char *restrict dst, const char *restrict src, size_t len)"
> +{
> + char *p;
> +
> + p = ustpcpy(dst, src, len);
> + *p = \(aq\e0\(aq;
> +
> + return p;
> +}
> +.EE
> +.in
> +.\" ----- SEE ALSO :: -------------------------------------------------/
> +.SH SEE ALSO
> +.BR bzero (3),
> +.BR memcpy (3),
> +.BR memccpy (3),
> +.BR mempcpy (3),
> +.BR stpcpy (3),
> +.BR strlcpy (3bsd),
> +.BR strncat (3),
> +.BR stpncpy (3),
> +.BR string (3)
> --
> 2.39.0
>
^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v6 1/5] string_copy.7: Add page to document all string-copying functions
2022-12-20 15:00 ` Stefan Puiu
@ 2022-12-20 15:03 ` Alejandro Colomar
0 siblings, 0 replies; 53+ messages in thread
From: Alejandro Colomar @ 2022-12-20 15:03 UTC (permalink / raw)
To: Stefan Puiu
Cc: linux-man, Alejandro Colomar, Martin Sebor, G. Branden Robinson,
Douglas McIlroy, Jakub Wilk, Serge Hallyn, Iker Pedrosa,
Andrew Pinski
[-- Attachment #1.1: Type: text/plain, Size: 31116 bytes --]
Hi Stefan,
On 12/20/22 16:00, Stefan Puiu wrote:
> Hi,
>
> Noticed a typo below
Typo fixed. Thanks,
Alex
<https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/commit/?id=3d395282860f7b86f65c6735351f24b52c486718>
>
> On Mon, Dec 19, 2022 at 11:02 PM Alejandro Colomar
> <alx.manpages@gmail.com> wrote:
>>
>> This is an opportunity to use consistent language across the
>> documentation for all string-copying functions.
>>
>> It is also easier to show the similarities and differences between all
>> of the functions, so that a reader can use this page to know which
>> function is needed for a given task.
>>
>> Alternative functions not provided by libc have been given in the same
>> page, with reference implementations.
>>
>> Cc: Martin Sebor <msebor@redhat.com>
>> Cc: "G. Branden Robinson" <g.branden.robinson@gmail.com>
>> Cc: Douglas McIlroy <douglas.mcilroy@dartmouth.edu>
>> Cc: Jakub Wilk <jwilk@jwilk.net>
>> Cc: Serge Hallyn <serge@hallyn.com>
>> Cc: Iker Pedrosa <ipedrosa@redhat.com>
>> Cc: Andrew Pinski <pinskia@gmail.com>
>> Cc: Stefan Puiu <stefan.puiu@gmail.com>
>> Signed-off-by: Alejandro Colomar <alx@kernel.org>
>> ---
>> man7/string_copy.7 | 855 +++++++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 855 insertions(+)
>> create mode 100644 man7/string_copy.7
>>
>> diff --git a/man7/string_copy.7 b/man7/string_copy.7
>> new file mode 100644
>> index 000000000..a32b93c01
>> --- /dev/null
>> +++ b/man7/string_copy.7
>> @@ -0,0 +1,855 @@
>> +.\" Copyright 2022 Alejandro Colomar <alx@kernel.org>
>> +.\"
>> +.\" SPDX-License-Identifier: BSD-3-Clause
>> +.\"
>> +.TH string_copy 7 (date) "Linux man-pages (unreleased)"
>> +.\" ----- NAME :: -----------------------------------------------------/
>> +.SH NAME
>> +stpcpy,
>> +strcpy, strcat,
>> +stpecpy, stpecpyx,
>> +strlcpy, strlcat,
>> +stpncpy,
>> +strncpy,
>> +zustr2ustp, zustr2stp,
>> +strncat,
>> +ustpcpy, ustr2stp
>> +\- copy strings and character sequences
>> +.\" ----- SYNOPSIS :: -------------------------------------------------/
>> +.SH SYNOPSIS
>> +.\" ----- SYNOPSIS :: (Null-terminated) strings -----------------------/
>> +.SS Strings
>> +.nf
>> +// Chain-copy a string.
>> +.BI "char *stpcpy(char *restrict " dst ", const char *restrict " src );
>> +.PP
>> +// Copy/catenate a string.
>> +.BI "char *strcpy(char *restrict " dst ", const char *restrict " src );
>> +.BI "char *strcat(char *restrict " dst ", const char *restrict " src );
>> +.PP
>> +// Chain-copy a string with truncation.
>> +.BI "char *stpecpy(char *" dst ", char " end "[0], const char *restrict " src );
>> +.PP
>> +// Chain-copy a string with truncation and SIGSEGV on UB.
>> +.BI "char *stpecpyx(char *" dst ", char " end "[0], const char *restrict " src );
>> +.PP
>> +// Copy/catenate a string with truncation and SIGSEGV on UB.
>> +.BI "size_t strlcpy(char " dst "[restrict ." sz "], \
>> +const char *restrict " src ,
>> +.BI " size_t " sz );
>> +.BI "size_t strlcat(char " dst "[restrict ." sz "], \
>> +const char *restrict " src ,
>> +.BI " size_t " sz );
>> +.fi
>> +.\" ----- SYNOPSIS :: Null-padded character sequences --------/
>> +.SS Null-padded character sequences
>> +.nf
>> +// Zero a fixed-width buffer, and
>> +// copy a string into a character sequence with truncation.
>> +.BI "char *stpncpy(char " dst "[restrict ." sz "], \
>> +const char *restrict " src ,
>> +.BI " size_t " sz );
>> +.PP
>> +// Zero a fixed-width buffer, and
>> +// copy a string into a character sequence with truncation.
>> +.BI "char *strncpy(char " dst "[restrict ." sz "], \
>> +const char *restrict " src ,
>> +.BI " size_t " sz );
>> +.PP
>> +// Chain-copy a null-padded character sequence into a character sequence.
>> +.BI "char *zustr2ustp(char *restrict " dst ", \
>> +const char " src "[restrict ." sz ],
>> +.BI " size_t " sz );
>> +.PP
>> +// Chain-copy a null-padded character sequence into a string.
>> +.BI "char *zustr2stp(char *restrict " dst ", \
>> +const char " src "[restrict ." sz ],
>> +.BI " size_t " sz );
>> +.PP
>> +// Catenate a null-padded character sequence into a string.
>> +.BI "char *strncat(char *restrict " dst ", const char " src "[restrict ." sz ],
>> +.BI " size_t " sz );
>> +.fi
>> +.\" ----- SYNOPSIS :: Measured character sequences --------------------/
>> +.SS Measured character sequences
>> +.nf
>> +// Chain-copy a measured character sequence.
>> +.BI "char *ustpcpy(char *restrict " dst ", \
>> +const char " src "[restrict ." len ],
>> +.BI " size_t " len );
>> +.PP
>> +// Chain-copy a measured character sequence into a string.
>> +.BI "char *ustr2stp(char *restrict " dst ", \
>> +const char " src "[restrict ." len ],
>> +.BI " size_t " len );
>> +.fi
>> +.SH DESCRIPTION
>> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: -----------------/
>> +.SS Terms (and abbreviations)
>> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: string (str) ----/
>> +.TP
>> +.IR "string " ( str )
>> +is a sequence of zero or more non-null characters followed by a null byte.
>> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: null-padded character seq
>> +.TP
>> +.I character sequence
>> +is a sequence of zero or more non-null characters.
>> +A program should never usa a character sequence where a string is required.
>
> Here I think you want s/usa/use above.
>
> Thanks,
> Stefan.
>
>> +However, with appropriate care,
>> +a string can be used in the place of a character sequence.
>> +.RS
>> +.TP
>> +.IR "null-padded character sequence " ( zustr )
>> +Character sequences can be contained in fixed-width buffers,
>> +which contain padding null bytes after the character sequence,
>> +to fill the rest of the buffer
>> +without affecting the character sequence;
>> +however, those padding null bytes are not part of the character sequence.
>> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: measured character sequence
>> +.TP
>> +.IR "measured character sequence " ( ustr )
>> +Character sequence delimited by its length.
>> +It may be a slice of a larger character sequence,
>> +or even of a string.
>> +.RE
>> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: length (len) ----/
>> +.TP
>> +.IR "length " ( len )
>> +is the number of non-null characters in a string or character sequence.
>> +It is the return value of
>> +.I strlen(str)
>> +and of
>> +.IR "strnlen(ustr, sz)" .
>> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: size (sz) -------/
>> +.TP
>> +.IR "size " ( sz )
>> +refers to the entire buffer
>> +where the string or character sequence is contained.
>> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: end -------------/
>> +.TP
>> +.I end
>> +is the name of a pointer to one past the last element of a buffer.
>> +It is equivalent to
>> +.IR &str[sz] .
>> +It is used as a sentinel value,
>> +to be able to truncate strings or character sequences
>> +instead of overrunning the containing buffer.
>> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: copy ------------/
>> +.TP
>> +.I copy
>> +This term is used when
>> +the writing starts at the first element pointed to by
>> +.IR dst .
>> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: catenate --------/
>> +.TP
>> +.I catenate
>> +This term is used when
>> +a function first finds the terminating null byte in
>> +.IR dst ,
>> +and then starts writing at that position.
>> +.\" ----- DESCRIPTION :: Terms (and abbreviations) :: chain -----------/
>> +.TP
>> +.I chain
>> +This term is used when
>> +it's the programmer who provides
>> +a pointer to the terminating null byte in the string
>> +.I dst
>> +(or one after the last character in a character sequence),
>> +and the function starts writing at that location.
>> +The function returns
>> +a pointer to the new location of the terminating null byte
>> +(or one after the last character in a character sequence)
>> +after the call,
>> +so that the programmer can use it to chain such calls.
>> +.\" ----- DESCRIPTION :: Copy, catenate, and chain-copy ---------------/
>> +.SS Copy, catenate, and chain-copy
>> +Originally,
>> +there was a distinction between functions that copy and those that catenate.
>> +However, newer functions that copy while allowing chaining
>> +cover both use cases with a single API.
>> +They are also algorithmically faster,
>> +since they don't need to search for
>> +the terminating null byte of the existing string.
>> +However, functions that catenate have a much simpler use,
>> +so if performance is not important,
>> +it can make sense to use them for improving readability.
>> +.PP
>> +The pointer returned by functions that allow chaining
>> +is a byproduct of the copy operation,
>> +so it has no performance costs.
>> +Functions that return such a pointer,
>> +and thus can be chained,
>> +have names of the form
>> +.RB * stp *(),
>> +since it's common to name the pointer just
>> +.IR p .
>> +.PP
>> +Chain-copying functions that truncate
>> +should accept a pointer to the end of the destination buffer,
>> +and have names of the form
>> +.RB * stpe *().
>> +This allows not having to recalculate the remaining size after each call.
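To illustrate the difference between catenating and chaining described
above, here is a minimal sketch (not part of the patch).  It assumes the
caller provides a buffer large enough for the result; build_cat() and
build_chain() are hypothetical names used only for this example.

```c
#define _POSIX_C_SOURCE 200809L  /* for stpcpy() */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Catenation: every strcat() call rescans dst to find its null byte. */
size_t
build_cat(char *buf)
{
	strcpy(buf, "Hello ");
	strcat(buf, "world");
	strcat(buf, "!");
	return strlen(buf);
}

/* Chaining: every call starts writing where the previous one stopped. */
size_t
build_chain(char *buf)
{
	char *p = buf;

	p = stpcpy(p, "Hello ");
	p = stpcpy(p, "world");
	p = stpcpy(p, "!");
	assert(*p == '\0');  /* p points at the terminating null byte */
	return (size_t) (p - buf);
}
```

Both helpers produce the same string; the chained version also gets the
length for free from the returned pointer.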
>> +.\" ----- DESCRIPTION :: Truncate or not? -----------------------------/
>> +.SS Truncate or not?
>> +The first thing to note is that programmers should be careful with buffers,
>> +so they always have the correct size,
>> +and truncation is not necessary.
>> +.PP
>> +In most cases,
>> +truncation is not desired,
>> +and it is simpler to just do the copy.
>> +Simpler code is safer code.
>> +Programming against programming mistakes by adding more code
>> +just adds more points where mistakes can be made.
>> +.PP
>> +Nowadays,
>> +compilers can detect most programmer errors with features like
>> +compiler warnings,
>> +static analyzers, and
>> +.BR \%_FORTIFY_SOURCE
>> +(see
>> +.BR ftm (7)).
>> +Keeping the code simple
>> +helps these overflow-detection features be more precise.
>> +.PP
>> +When validating user input,
>> +however,
>> +it makes sense to truncate.
>> +Remember to check the return value of such function calls.
>> +.PP
>> +Functions that truncate:
>> +.IP \(bu 3
>> +.BR stpecpy (3)
>> +is the most efficient string copy function that performs truncation.
>> +It requires checking for truncation only once, after all chained calls.
>> +.IP \(bu
>> +.BR stpecpyx (3)
>> +is a variant of
>> +.BR stpecpy (3)
>> +that consumes the entire source string,
>> +to catch bugs in the program
>> +by forcing a segmentation fault (as
>> +.BR strlcpy (3bsd)
>> +and
>> +.BR strlcat (3bsd)
>> +do).
>> +.IP \(bu
>> +.BR strlcpy (3bsd)
>> +and
>> +.BR strlcat (3bsd)
>> +are designed to crash if the input string is invalid
>> +(doesn't contain a terminating null byte).
>> +.IP \(bu
>> +.BR stpncpy (3)
>> +and
>> +.BR strncpy (3)
>> +also truncate, but they write null-padded character sequences
>> +rather than strings.
>> +.\" ----- DESCRIPTION :: Null-padded character sequences --------------/
>> +.SS Null-padded character sequences
>> +For historic reasons,
>> +some standard APIs,
>> +such as
>> +.BR utmpx (5),
>> +use null-padded character sequences in fixed-width buffers.
>> +To interface with them,
>> +specialized functions need to be used.
>> +.PP
>> +To copy strings into them, use
>> +.BR stpncpy (3).
>> +.PP
>> +To copy from an unterminated string within a fixed-width buffer into a string,
>> +ignoring any trailing null bytes in the source fixed-width buffer,
>> +you should use
>> +.BR zustr2stp (3)
>> +or
>> +.BR strncat (3).
>> +.PP
>> +To copy from an unterminated string within a fixed-width buffer
>> +into a character sequence,
>> +ignoring any trailing null bytes in the source fixed-width buffer,
>> +you should use
>> +.BR zustr2ustp (3).
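A possible round trip through a fixed-width field, as a sketch only (not
part of the patch).  struct rec, rec_set_name() and rec_get_name() are
hypothetical; zustr2stp() follows the reference implementation given in
EXAMPLES, except that memcpy(3) stands in for mempcpy(3) to avoid the GNU
extension.

```c
#define _POSIX_C_SOURCE 200809L  /* for stpncpy(), strnlen() */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record with a fixed-width, null-padded field. */
struct rec {
	char name[8];
};

/* Copy a null-padded character sequence into a string. */
char *
zustr2stp(char *restrict dst, const char *restrict src, size_t sz)
{
	size_t len = strnlen(src, sz);  /* padding null bytes are ignored */

	memcpy(dst, src, len);
	dst[len] = '\0';
	return dst + len;
}

/* Store a string in the field; returns nonzero if it was truncated. */
int
rec_set_name(struct rec *r, const char *s)
{
	stpncpy(r->name, s, sizeof(r->name));  /* zero-fills the remainder */
	return strlen(s) > sizeof(r->name);
}

/* Read the field back into a string of at least sizeof(name) + 1 bytes. */
size_t
rec_get_name(const struct rec *r, char *dst)
{
	char *p = zustr2stp(dst, r->name, sizeof(r->name));

	assert(*p == '\0');
	return (size_t) (p - dst);
}
```

Note that truncation is detected by comparing the input length with the
field size, since the result of stpncpy(3) alone can't tell an exact fit
from a truncated copy.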
>> +.\" ----- DESCRIPTION :: Measured character sequences -----------------/
>> +.SS Measured character sequences
>> +The simplest character sequence copying function is
>> +.BR mempcpy (3).
>> +It requires always knowing the length of your character sequences,
>> +for which structures can be used.
>> +It makes the code much faster,
>> +since you always know the length of your character sequences,
>> +and can do the minimal copies and length measurements.
>> +.BR mempcpy (3)
>> +copies character sequences,
>> +so you need to explicitly set the terminating null byte if you need a string.
>> +.PP
>> +However,
>> +for keeping type safety,
>> +it's good to add a wrapper that uses
>> +.I char\~*
>> +instead of
>> +.IR void\~* :
>> +.BR ustpcpy (3).
>> +.PP
>> +In programs that make considerable use of strings or character sequences,
>> +and need the best performance,
>> +using overlapping character sequences can make a big difference.
>> +It allows holding subsequences of a larger character sequence,
>> +while not duplicating memory
>> +nor using time to do a copy.
>> +.PP
>> +However, this is delicate,
>> +since it requires using character sequences.
>> +C library APIs use strings,
>> +so programs that use character sequences
>> +will have to take care of differentiating strings from character sequences.
>> +.PP
>> +To copy a measured character sequence, use
>> +.BR ustpcpy (3).
>> +.PP
>> +To copy a measured character sequence into a string, use
>> +.BR ustr2stp (3).
>> +.PP
>> +Because these functions ask for the length,
>> +and a string is by nature composed of a character sequence of the same length
>> +plus a terminating null byte,
>> +a string is also accepted as input.
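As a sketch of using a slice of a string as a measured character sequence
(not part of the patch): copy_stem() is a hypothetical helper that copies
everything before the first '.' into a new string.  ustr2stp() follows
the reference implementation in EXAMPLES, with memcpy(3) standing in for
mempcpy(3).  The caller is assumed to provide a large enough buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy a measured character sequence into a string. */
char *
ustr2stp(char *restrict dst, const char *restrict src, size_t len)
{
	memcpy(dst, src, len);
	dst[len] = '\0';
	return dst + len;
}

/* Hypothetical helper: copy the part of name before the first '.'. */
size_t
copy_stem(char *dst, const char *name)
{
	size_t len = strcspn(name, ".");  /* length of the slice */
	char *p = ustr2stp(dst, name, len);

	assert(*p == '\0');
	return (size_t) (p - dst);
}
```

The source is never modified or re-measured; only the slice length is
computed, once.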
>> +.\" ----- DESCRIPTION :: String vs character sequence -----------------/
>> +.SS String vs character sequence
>> +Some functions only operate on strings.
>> +Those require that the input
>> +.I src
>> +is a string,
>> +and guarantee an output string
>> +(even when truncation occurs).
>> +Functions that catenate
>> +also require that
>> +.I dst
>> +holds a string before the call.
>> +List of functions:
>> +.IP \(bu 3
>> +.PD 0
>> +.BR stpcpy (3)
>> +.IP \(bu
>> +.BR strcpy "(3), \c"
>> +.BR strcat (3)
>> +.IP \(bu
>> +.BR stpecpy "(3), \c"
>> +.BR stpecpyx (3)
>> +.IP \(bu
>> +.BR strlcpy "(3bsd), \c"
>> +.BR strlcat (3bsd)
>> +.PD
>> +.PP
>> +Other functions require an input string,
>> +but create a character sequence as output.
>> +These functions have confusing names,
>> +and have a long history of misuse.
>> +List of functions:
>> +.IP \(bu 3
>> +.PD 0
>> +.BR stpncpy (3)
>> +.IP \(bu
>> +.BR strncpy (3)
>> +.PD
>> +.PP
>> +Other functions operate on an input character sequence,
>> +and create an output string.
>> +Functions that catenate
>> +also require that
>> +.I dst
>> +holds a string before the call.
>> +.BR strncat (3)
>> +has an even more misleading name than the functions above.
>> +List of functions:
>> +.IP \(bu 3
>> +.PD 0
>> +.BR zustr2stp (3)
>> +.IP \(bu
>> +.BR strncat (3)
>> +.IP \(bu
>> +.BR ustr2stp (3)
>> +.PD
>> +.PP
>> +Other functions operate on an input character sequence
>> +to create an output character sequence.
>> +List of functions:
>> +.IP \(bu 3
>> +.PD 0
>> +.BR ustpcpy (3)
>> +.IP \(bu
>> +.BR zustr2ustp (3)
>> +.PD
>> +.\" ----- DESCRIPTION :: Functions :: ---------------------------------/
>> +.SS Functions
>> +.\" ----- DESCRIPTION :: Functions :: stpcpy(3) -----------------------/
>> +.TP
>> +.BR stpcpy (3)
>> +This function copies the input string into a destination string.
>> +The programmer is responsible for allocating a buffer large enough.
>> +It returns a pointer suitable for chaining.
>> +.\" ----- DESCRIPTION :: Functions :: strcpy(3), strcat(3) ------------/
>> +.TP
>> +.BR strcpy (3)
>> +.TQ
>> +.BR strcat (3)
>> +These functions copy and catenate the input string into a destination string.
>> +The programmer is responsible for allocating a buffer large enough.
>> +The return value is useless.
>> +.IP
>> +.BR stpcpy (3)
>> +is a faster alternative to these functions.
>> +.\" ----- DESCRIPTION :: Functions :: stpecpy(3), stpecpyx(3) ---------/
>> +.TP
>> +.BR stpecpy (3)
>> +.TQ
>> +.BR stpecpyx (3)
>> +These functions copy the input string into a destination string.
>> +If the destination buffer,
>> +limited by a pointer to its end,
>> +isn't large enough to hold the copy,
>> +the resulting string is truncated
>> +(but it is guaranteed to be null-terminated).
>> +They return a pointer suitable for chaining.
>> +Truncation needs to be detected only once after the last chained call.
>> +.BR stpecpyx (3)
>> +has identical semantics to
>> +.BR stpecpy (3),
>> +except that it forces a SIGSEGV if the
>> +.I src
>> +pointer is not a string.
>> +.IP
>> +These functions are not provided by any library;
>> +see EXAMPLES for a reference implementation.
>> +.\" ----- DESCRIPTION :: Functions :: strlcpy(3bsd), strlcat(3bsd) ----/
>> +.TP
>> +.BR strlcpy (3bsd)
>> +.TQ
>> +.BR strlcat (3bsd)
>> +These functions copy and catenate the input string into a destination string.
>> +If the destination buffer,
>> +limited by its size,
>> +isn't large enough to hold the copy,
>> +the resulting string is truncated
>> +(but it is guaranteed to be null-terminated).
>> +They return the length of the total string they tried to create.
>> +These functions force a SIGSEGV if the
>> +.I src
>> +pointer is not a string.
>> +.IP
>> +.BR stpecpyx (3)
>> +is a faster alternative to these functions.
>> +.\" ----- DESCRIPTION :: Functions :: stpncpy(3) ----------------------/
>> +.TP
>> +.BR stpncpy (3)
>> +This function copies the input string into
>> +a destination null-padded character sequence in a fixed-width buffer.
>> +If the destination buffer,
>> +limited by its size,
>> +isn't large enough to hold the copy,
>> +the resulting character sequence is truncated.
>> +Since it creates a character sequence,
>> +it doesn't need to write a terminating null byte.
>> +It's impossible to distinguish truncation by the result of the call,
>> +from a character sequence that just fits the destination buffer;
>> +truncation should be detected by
>> +comparing the length of the input string
>> +with the size of the destination buffer.
>> +.\" ----- DESCRIPTION :: Functions :: strncpy(3) ----------------------/
>> +.TP
>> +.BR strncpy (3)
>> +This function is identical to
>> +.BR stpncpy (3)
>> +except for the useless return value.
>> +.IP
>> +.BR stpncpy (3)
>> +is a more useful alternative to this function.
>> +.\" ----- DESCRIPTION :: Functions :: zustr2ustp(3) --------------------/
>> +.TP
>> +.BR zustr2ustp (3)
>> +This function copies the input character sequence
>> +contained in a null-padded fixed-width buffer,
>> +into a destination character sequence.
>> +The programmer is responsible for allocating a buffer large enough.
>> +It returns a pointer suitable for chaining.
>> +.IP
>> +A truncating version of this function doesn't exist,
>> +since the size of the original character sequence is always known,
>> +so it wouldn't be very useful.
>> +.IP
>> +This function is not provided by any library;
>> +see EXAMPLES for a reference implementation.
>> +.\" ----- DESCRIPTION :: Functions :: zustr2stp(3) --------------------/
>> +.TP
>> +.BR zustr2stp (3)
>> +This function copies the input character sequence
>> +contained in a null-padded fixed-width buffer,
>> +into a destination string.
>> +The programmer is responsible for allocating a buffer large enough.
>> +It returns a pointer suitable for chaining.
>> +.IP
>> +A truncating version of this function doesn't exist,
>> +since the size of the original character sequence is always known,
>> +so it wouldn't be very useful.
>> +.IP
>> +This function is not provided by any library;
>> +see EXAMPLES for a reference implementation.
>> +.\" ----- DESCRIPTION :: Functions :: strncat(3) ----------------------/
>> +.TP
>> +.BR strncat (3)
>> +Do not confuse this function with
>> +.BR strncpy (3);
>> +they are not related at all.
>> +.IP
>> +This function catenates the input character sequence
>> +contained in a null-padded fixed-width buffer,
>> +into a destination string.
>> +The programmer is responsible for allocating a buffer large enough.
>> +The return value is useless.
>> +.IP
>> +.BR zustr2stp (3)
>> +is a faster alternative to this function.
>> +.\" ----- DESCRIPTION :: Functions :: ustpcpy(3) ----------------------/
>> +.TP
>> +.BR ustpcpy (3)
>> +This function copies the input character sequence,
>> +limited by its length,
>> +into a destination character sequence.
>> +The programmer is responsible for allocating a buffer large enough.
>> +It returns a pointer suitable for chaining.
>> +.\" ----- DESCRIPTION :: Functions :: ustr2stp(3) ---------------------/
>> +.TP
>> +.BR ustr2stp (3)
>> +This function copies the input character sequence,
>> +limited by its length,
>> +into a destination string.
>> +The programmer is responsible for allocating a buffer large enough.
>> +It returns a pointer suitable for chaining.
>> +.\" ----- RETURN VALUE :: ---------------------------------------------/
>> +.SH RETURN VALUE
>> +The following functions return
>> +a pointer to the terminating null byte in the destination string.
>> +.IP \(bu 3
>> +.PD 0
>> +.BR stpcpy (3)
>> +.IP \(bu
>> +.BR ustr2stp (3)
>> +.IP \(bu
>> +.BR zustr2stp (3)
>> +.PD
>> +.PP
>> +The following functions return
>> +a pointer to the terminating null byte in the destination string,
>> +except when truncation occurs;
>> +if truncation occurs,
>> +they return a pointer to the end of the destination buffer.
>> +.IP \(bu 3
>> +.BR stpecpy (3),
>> +.BR stpecpyx (3)
>> +.PP
>> +The following function returns
>> +a pointer to one after the last character
>> +in the destination character sequence;
>> +if truncation occurs,
>> +that pointer is equivalent to
>> +a pointer to the end of the destination buffer.
>> +.IP \(bu 3
>> +.BR stpncpy (3)
>> +.PP
>> +The following functions return
>> +a pointer to one after the last character
>> +in the destination character sequence.
>> +.IP \(bu 3
>> +.PD 0
>> +.BR zustr2ustp (3)
>> +.IP \(bu
>> +.BR ustpcpy (3)
>> +.PD
>> +.PP
>> +The following functions return
>> +the length of the total string that they tried to create
>> +(as if truncation didn't occur).
>> +.IP \(bu 3
>> +.BR strlcpy (3bsd),
>> +.BR strlcat (3bsd)
>> +.PP
>> +The following functions return the
>> +.I dst
>> +pointer,
>> +which is useless.
>> +.IP \(bu 3
>> +.PD 0
>> +.BR strcpy (3),
>> +.BR strcat (3)
>> +.IP \(bu
>> +.BR strncpy (3)
>> +.IP \(bu
>> +.BR strncat (3)
>> +.PD
>> +.\" ----- NOTES :: strscpy(9) -----------------------------------------/
>> +.SH NOTES
>> +The Linux kernel has an internal function for copying strings,
>> +which is similar to
>> +.BR stpecpy (3),
>> +except that it can't be chained:
>> +.TP
>> +.BR strscpy (9)
>> +This function copies the input string into a destination string.
>> +If the destination buffer,
>> +limited by its size,
>> +isn't large enough to hold the copy,
>> +the resulting string is truncated
>> +(but it is guaranteed to be null-terminated).
>> +It returns the length of the destination string, or
>> +.B \-E2BIG
>> +on truncation.
>> +.IP
>> +.BR stpecpy (3)
>> +is a simpler and faster alternative to this function.
>> +.RE
>> +.\" ----- CAVEATS :: --------------------------------------------------/
>> +.SH CAVEATS
>> +Don't mix chain calls to truncating and non-truncating functions.
>> +It is conceptually wrong
>> +unless you know that the first part of a copy will always fit.
>> +Anyway, the performance difference will probably be negligible,
>> +so it will probably be clearer if you use consistent semantics:
>> +either truncating or non-truncating.
>> +Calling a non-truncating function after a truncating one is necessarily wrong.
>> +.\" ----- BUGS :: -----------------------------------------------------/
>> +.SH BUGS
>> +All catenation functions share the same performance problem:
>> +.UR https://www.joelonsoftware.com/\:2001/12/11/\:back\-to\-basics/
>> +Shlemiel the painter
>> +.UE .
>> +.\" ----- EXAMPLES :: -------------------------------------------------/
>> +.SH EXAMPLES
>> +The following are examples of correct use of each of these functions.
>> +.\" ----- EXAMPLES :: stpcpy(3) ---------------------------------------/
>> +.TP
>> +.BR stpcpy (3)
>> +.EX
>> +p = buf;
>> +p = stpcpy(p, "Hello ");
>> +p = stpcpy(p, "world");
>> +p = stpcpy(p, "!");
>> +len = p \- buf;
>> +puts(buf);
>> +.EE
>> +.\" ----- EXAMPLES :: strcpy(3), strcat(3) ----------------------------/
>> +.TP
>> +.BR strcpy (3)
>> +.TQ
>> +.BR strcat (3)
>> +.EX
>> +strcpy(buf, "Hello ");
>> +strcat(buf, "world");
>> +strcat(buf, "!");
>> +len = strlen(buf);
>> +puts(buf);
>> +.EE
>> +.\" ----- EXAMPLES :: stpecpy(3), stpecpyx(3) -------------------------/
>> +.TP
>> +.BR stpecpy (3)
>> +.TQ
>> +.BR stpecpyx (3)
>> +.EX
>> +end = buf + sizeof(buf);
>> +p = buf;
>> +p = stpecpy(p, end, "Hello ");
>> +p = stpecpy(p, end, "world");
>> +p = stpecpy(p, end, "!");
>> +if (p == end) {
>> + p\-\-;
>> + goto toolong;
>> +}
>> +len = p \- buf;
>> +puts(buf);
>> +.EE
>> +.\" ----- EXAMPLES :: strlcpy(3bsd), strlcat(3bsd) --------------------/
>> +.TP
>> +.BR strlcpy (3bsd)
>> +.TQ
>> +.BR strlcat (3bsd)
>> +.EX
>> +if (strlcpy(buf, "Hello ", sizeof(buf)) >= sizeof(buf))
>> + goto toolong;
>> +if (strlcat(buf, "world", sizeof(buf)) >= sizeof(buf))
>> + goto toolong;
>> +len = strlcat(buf, "!", sizeof(buf));
>> +if (len >= sizeof(buf))
>> + goto toolong;
>> +puts(buf);
>> +.EE
>> +.\" ----- EXAMPLES :: strscpy(9) --------------------------------------/
>> +.TP
>> +.BR strscpy (9)
>> +.EX
>> +len = strscpy(buf, "Hello world!", sizeof(buf));
>> +if (len == \-E2BIG)
>> + goto toolong;
>> +puts(buf);
>> +.EE
>> +.\" ----- EXAMPLES :: stpncpy(3) --------------------------------------/
>> +.TP
>> +.BR stpncpy (3)
>> +.EX
>> +p = stpncpy(buf, "Hello world!", sizeof(buf));
>> +if (sizeof(buf) < strlen("Hello world!"))
>> + goto toolong;
>> +len = p \- buf;
>> +for (size_t i = 0; i < sizeof(buf); i++)
>> + putchar(buf[i]);
>> +.EE
>> +.\" ----- EXAMPLES :: strncpy(3) --------------------------------------/
>> +.TP
>> +.BR strncpy (3)
>> +.EX
>> +strncpy(buf, "Hello world!", sizeof(buf));
>> +if (sizeof(buf) < strlen("Hello world!"))
>> + goto toolong;
>> +len = strnlen(buf, sizeof(buf));
>> +for (size_t i = 0; i < sizeof(buf); i++)
>> + putchar(buf[i]);
>> +.EE
>> +.\" ----- EXAMPLES :: zustr2ustp(3) -----------------------------------/
>> +.TP
>> +.BR zustr2ustp (3)
>> +.EX
>> +p = buf;
>> +p = zustr2ustp(p, "Hello ", 6);
>> +p = zustr2ustp(p, "world", 42); // Padding null bytes ignored.
>> +p = zustr2ustp(p, "!", 1);
>> +len = p \- buf;
>> +printf("%.*s\en", (int) len, buf);
>> +.EE
>> +.\" ----- EXAMPLES :: zustr2stp(3) ------------------------------------/
>> +.TP
>> +.BR zustr2stp (3)
>> +.EX
>> +p = buf;
>> +p = zustr2stp(p, "Hello ", 6);
>> +p = zustr2stp(p, "world", 42); // Padding null bytes ignored.
>> +p = zustr2stp(p, "!", 1);
>> +len = p \- buf;
>> +puts(buf);
>> +.EE
>> +.\" ----- EXAMPLES :: strncat(3) --------------------------------------/
>> +.TP
>> +.BR strncat (3)
>> +.EX
>> +buf[0] = \(aq\e0\(aq; // There's no 'cpy' function to this 'cat'.
>> +strncat(buf, "Hello ", 6);
>> +strncat(buf, "world", 42); // Padding null bytes ignored.
>> +strncat(buf, "!", 1);
>> +len = strlen(buf);
>> +puts(buf);
>> +.EE
>> +.\" ----- EXAMPLES :: ustpcpy(3) --------------------------------------/
>> +.TP
>> +.BR ustpcpy (3)
>> +.EX
>> +p = buf;
>> +p = ustpcpy(p, "Hello ", 6);
>> +p = ustpcpy(p, "world", 5);
>> +p = ustpcpy(p, "!", 1);
>> +len = p \- buf;
>> +printf("%.*s\en", (int) len, buf);
>> +.EE
>> +.\" ----- EXAMPLES :: ustr2stp(3) -------------------------------------/
>> +.TP
>> +.BR ustr2stp (3)
>> +.EX
>> +p = buf;
>> +p = ustr2stp(p, "Hello ", 6);
>> +p = ustr2stp(p, "world", 5);
>> +p = ustr2stp(p, "!", 1);
>> +len = p \- buf;
>> +puts(buf);
>> +.EE
>> +.\" ----- EXAMPLES :: Implementations :: ------------------------------/
>> +.SS Implementations
>> +Here are reference implementations for functions not provided by libc.
>> +.PP
>> +.in +4n
>> +.EX
>> +/* This code is in the public domain. */
>> +
>> +.\" ----- EXAMPLES :: Implementations :: stpecpy(3) -------------------/
>> +char *
>> +.IR stpecpy "(char *dst, char end[0], const char *restrict src)"
>> +{
>> + char *p;
>> +
>> + if (dst == end)
>> + return end;
>> +
>> + p = memccpy(dst, src, \(aq\e0\(aq, end \- dst);
>> + if (p != NULL)
>> + return p \- 1;
>> +
>> + /* truncation detected */
>> + end[\-1] = \(aq\e0\(aq;
>> + return end;
>> +}
>> +
>> +.\" ----- EXAMPLES :: Implementations :: stpecpyx(3) ------------------/
>> +char *
>> +.IR stpecpyx "(char *dst, char end[0], const char *restrict src)"
>> +{
>> + if (src[strlen(src)] != \(aq\e0\(aq)
>> + raise(SIGSEGV);
>> +
>> + return stpecpy(dst, end, src);
>> +}
>> +
>> +.\" ----- EXAMPLES :: Implementations :: zustr2ustp(3) ----------------/
>> +char *
>> +.IR zustr2ustp "(char *restrict dst, const char *restrict src, size_t sz)"
>> +{
>> + return ustpcpy(dst, src, strnlen(src, sz));
>> +}
>> +
>> +.\" ----- EXAMPLES :: Implementations :: zustr2stp(3) -----------------/
>> +char *
>> +.IR zustr2stp "(char *restrict dst, const char *restrict src, size_t sz)"
>> +{
>> + char *p;
>> +
>> + p = zustr2ustp(dst, src, sz);
>> + *p = \(aq\e0\(aq;
>> +
>> + return p;
>> +}
>> +
>> +.\" ----- EXAMPLES :: Implementations :: ustpcpy(3) -------------------/
>> +char *
>> +.IR ustpcpy "(char *restrict dst, const char *restrict src, size_t len)"
>> +{
>> + return mempcpy(dst, src, len);
>> +}
>> +
>> +.\" ----- EXAMPLES :: Implementations :: ustr2stp(3) ------------------/
>> +char *
>> +.IR ustr2stp "(char *restrict dst, const char *restrict src, size_t len)"
>> +{
>> + char *p;
>> +
>> + p = ustpcpy(dst, src, len);
>> + *p = \(aq\e0\(aq;
>> +
>> + return p;
>> +}
>> +.EE
>> +.in
>> +.\" ----- SEE ALSO :: -------------------------------------------------/
>> +.SH SEE ALSO
>> +.BR bzero (3),
>> +.BR memcpy (3),
>> +.BR memccpy (3),
>> +.BR mempcpy (3),
>> +.BR stpcpy (3),
>> +.BR strlcpy (3bsd),
>> +.BR strncat (3),
>> +.BR stpncpy (3),
>> +.BR string (3)
>> --
>> 2.39.0
>>
--
<http://www.alejandro-colomar.es/>
* Re: [PATCH v6 1/5] string_copy.7: Add page to document all string-copying functions
2022-12-19 21:02 ` [PATCH v6 1/5] string_copy.7: Add page to document all " Alejandro Colomar
2022-12-20 15:00 ` Stefan Puiu
@ 2023-01-20 3:43 ` Eric Biggers
2023-01-20 12:55 ` Alejandro Colomar
1 sibling, 1 reply; 53+ messages in thread
From: Eric Biggers @ 2023-01-20 3:43 UTC (permalink / raw)
To: Alejandro Colomar
Cc: linux-man, Alejandro Colomar, Martin Sebor, G. Branden Robinson,
Douglas McIlroy, Jakub Wilk, Serge Hallyn, Iker Pedrosa,
Andrew Pinski, Stefan Puiu
On Mon, Dec 19, 2022 at 10:02:05PM +0100, Alejandro Colomar wrote:
> diff --git a/man7/string_copy.7 b/man7/string_copy.7
> new file mode 100644
> index 000000000..a32b93c01
> --- /dev/null
> +++ b/man7/string_copy.7
> @@ -0,0 +1,855 @@
> +.\" Copyright 2022 Alejandro Colomar <alx@kernel.org>
> +.\"
> +.\" SPDX-License-Identifier: BSD-3-Clause
> +.\"
> +.TH string_copy 7 (date) "Linux man-pages (unreleased)"
> +.\" ----- NAME :: -----------------------------------------------------/
> +.SH NAME
> +stpcpy,
> +strcpy, strcat,
> +stpecpy, stpecpyx,
> +strlcpy, strlcat,
> +stpncpy,
> +strncpy,
> +zustr2ustp, zustr2stp,
> +strncat,
> +ustpcpy, ustr2stp
I happened to come across this new man page, and I'm confused by the inclusion
of functions like "ustpcpy". These functions don't seem to actually exist, so
why are they documented in the man page?
- Eric
* Re: [PATCH v6 1/5] string_copy.7: Add page to document all string-copying functions
2023-01-20 3:43 ` Eric Biggers
@ 2023-01-20 12:55 ` Alejandro Colomar
0 siblings, 0 replies; 53+ messages in thread
From: Alejandro Colomar @ 2023-01-20 12:55 UTC (permalink / raw)
To: Eric Biggers
Cc: linux-man, Alejandro Colomar, Martin Sebor, G. Branden Robinson,
Douglas McIlroy, Jakub Wilk, Serge Hallyn, Iker Pedrosa,
Andrew Pinski, Stefan Puiu
Hi Eric,
On 1/20/23 04:43, Eric Biggers wrote:
> On Mon, Dec 19, 2022 at 10:02:05PM +0100, Alejandro Colomar wrote:
>> diff --git a/man7/string_copy.7 b/man7/string_copy.7
>> new file mode 100644
>> index 000000000..a32b93c01
>> --- /dev/null
>> +++ b/man7/string_copy.7
>> @@ -0,0 +1,855 @@
>> +.\" Copyright 2022 Alejandro Colomar <alx@kernel.org>
>> +.\"
>> +.\" SPDX-License-Identifier: BSD-3-Clause
>> +.\"
>> +.TH string_copy 7 (date) "Linux man-pages (unreleased)"
>> +.\" ----- NAME :: -----------------------------------------------------/
>> +.SH NAME
>> +stpcpy,
>> +strcpy, strcat,
>> +stpecpy, stpecpyx,
>> +strlcpy, strlcat,
>> +stpncpy,
>> +strncpy,
>> +zustr2ustp, zustr2stp,
>> +strncat,
>> +ustpcpy, ustr2stp
>
> I happened to come across this new man page, and I'm confused by the inclusion
> of functions like "ustpcpy". These functions don't seem to actually exist, so
> why are they documented in the man page?
That page is not meant to document only the existing libc functions for copying
strings, but rather to explain all the alternatives, including those of other
systems (such as strlcpy(3bsd)) and custom ones that are not provided by any
system (yet). It tries to guide a programmer who knows nothing about string
copying toward producing quality code, independently of libc support. For
documentation of the libc functions we still have the separate pages for each,
which have also been updated.
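To illustrate what such a custom function looks like in use, here is a minimal
sketch of chain-copying with truncation detection, modeled on the stpecpy()
reference implementation quoted from the patch. Two things here are assumptions
of the sketch rather than part of the page: a plain byte loop stands in for
memccpy(3) so that only ISO C is needed, and join3() is a hypothetical helper
invented for the example. The end parameter is written as a plain pointer
instead of char end[0].

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy src into [dst, end), always terminating the result.
 * Returns a pointer to the terminating null byte, or 'end' on truncation.
 * Sketch only: a byte loop replaces the memccpy(3) call used in the patch.
 */
static char *
stpecpy(char *dst, char *end, const char *restrict src)
{
	if (dst == end)
		return end;  /* An earlier call already truncated. */

	while (dst < end) {
		*dst = *src++;
		if (*dst == '\0')
			return dst;
		dst++;
	}

	/* Truncation detected: terminate within the buffer. */
	end[-1] = '\0';
	return end;
}

/*
 * Hypothetical helper: concatenate three strings into buf.
 * Returns 0 on success, -1 if the result was truncated.
 */
static int
join3(char *buf, size_t size, const char *a, const char *b, const char *c)
{
	char *end = buf + size;
	char *p = buf;

	p = stpecpy(p, end, a);
	p = stpecpy(p, end, b);
	p = stpecpy(p, end, c);

	return (p == end) ? -1 : 0;  /* One check covers the whole chain. */
}
```

The point of returning 'end' on truncation is that the caller can chain any
number of calls and test for truncation once, after the last one.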
Those specific functions are similar to the old saying of "just use memcpy(3)
and forget about string copying functions", which is not such bad advice:
it's the fastest; however, those functions are a bit safer than directly calling
memcpy(3).
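As a rough sketch of that relationship, the two functions can be written as
thin wrappers over memcpy(3). This follows the reference implementations quoted
from the patch, except that memcpy(3) plus an offset replaces mempcpy(3) for
portability (an assumption of this sketch), and build_greeting() is a
hypothetical helper that exists only for the example.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy len bytes of character data (no terminator is read or written).
 * Returns a pointer one past the last byte written.
 */
static char *
ustpcpy(char *restrict dst, const char *restrict src, size_t len)
{
	memcpy(dst, src, len);
	return dst + len;
}

/*
 * Copy len bytes of character data and terminate the result.
 * Returns a pointer to the terminating null byte.
 */
static char *
ustr2stp(char *restrict dst, const char *restrict src, size_t len)
{
	char *p;

	p = ustpcpy(dst, src, len);
	*p = '\0';

	return p;
}

/*
 * Hypothetical helper: build "Hello world!" in buf and return its length.
 * The caller must supply at least 13 bytes.
 */
static size_t
build_greeting(char *buf)
{
	char *p;

	p = buf;
	p = ustr2stp(p, "Hello ", 6);
	p = ustr2stp(p, "world", 5);
	p = ustr2stp(p, "!", 1);

	return p - buf;
}
```

Compared with bare memcpy(3) calls, the returned pointer removes the manual
offset arithmetic between copies, and ustr2stp() cannot forget the terminator.
Neither function checks the destination size; that remains the caller's job.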
>
> - Eric
Cheers,
Alex
--
<http://www.alejandro-colomar.es/>
Thread overview: 53+ messages
2022-12-11 23:59 string_copy(7): New manual page documenting string copying functions Alejandro Colomar
2022-12-12 0:17 ` Alejandro Colomar
2022-12-12 0:25 ` Alejandro Colomar
2022-12-12 0:32 ` Alejandro Colomar
2022-12-12 14:24 ` [PATCH 1/3] strcpy.3: Rewrite page to document all string-copying functions Alejandro Colomar
2022-12-12 17:33 ` Alejandro Colomar
2022-12-12 18:38 ` groff man(7) extensions (was: [PATCH 1/3] strcpy.3: Rewrite page to document all string-copying functions) G. Branden Robinson
2022-12-13 15:45 ` a Q quotation macro for man(7) (was: groff man(7) extensions) G. Branden Robinson
2022-12-12 23:00 ` [PATCH v2 0/3] Rewrite strcpy(3) Alejandro Colomar
2022-12-13 20:56 ` Jakub Wilk
2022-12-13 20:57 ` Alejandro Colomar
2022-12-13 22:05 ` Alejandro Colomar
2022-12-13 22:46 ` Alejandro Colomar
2022-12-14 0:03 ` [PATCH v3 0/1] Rewritten page for string-copying functions Alejandro Colomar
2022-12-14 0:14 ` Alejandro Colomar
2022-12-14 0:16 ` Alejandro Colomar
2022-12-14 16:17 ` [PATCH v4 " Alejandro Colomar
2022-12-15 0:26 ` [PATCH v5 0/5] Rewrite pages about " Alejandro Colomar
2022-12-19 21:02 ` [PATCH v6 0/5] Rewrite documentation for " Alejandro Colomar
2022-12-19 21:02 ` [PATCH v6 1/5] string_copy.7: Add page to document all " Alejandro Colomar
2022-12-20 15:00 ` Stefan Puiu
2022-12-20 15:03 ` Alejandro Colomar
2023-01-20 3:43 ` Eric Biggers
2023-01-20 12:55 ` Alejandro Colomar
2022-12-19 21:02 ` [PATCH v6 2/5] stpecpy.3, stpecpyx.3, ustpcpy.3, ustr2stp.3, zustr2stp.3, zustr2ustp.3: Add new links to string_copy(7) Alejandro Colomar
2022-12-19 21:02 ` [PATCH v6 3/5] stpcpy.3, strcpy.3, strcat.3: Document in a single page Alejandro Colomar
2022-12-19 21:02 ` [PATCH v6 4/5] stpncpy.3, strncpy.3: " Alejandro Colomar
2022-12-19 21:02 ` [PATCH v6 5/5] strncat.3: Rewrite to be consistent with string_copy.7 Alejandro Colomar
2022-12-15 0:26 ` [PATCH v5 1/5] string_copy.7: Add page to document all string-copying functions Alejandro Colomar
2022-12-15 0:30 ` Alejandro Colomar
2022-12-15 0:26 ` [PATCH v5 2/5] stpecpy.3, stpecpyx.3, ustpcpy.3, ustr2stp.3, zustr2stp.3, zustr2ustp.3: Add new links to string_copy(7) Alejandro Colomar
2022-12-15 0:27 ` Alejandro Colomar
2022-12-16 18:47 ` Stefan Puiu
2022-12-16 19:03 ` Alejandro Colomar
2022-12-16 19:09 ` Alejandro Colomar
2022-12-15 0:26 ` [PATCH v5 3/5] stpcpy.3, strcpy.3, strcat.3: Document in a single page Alejandro Colomar
2022-12-16 14:46 ` Alejandro Colomar
2022-12-16 14:47 ` Alejandro Colomar
2022-12-15 0:26 ` [PATCH v5 4/5] stpncpy.3, strncpy.3: " Alejandro Colomar
2022-12-15 0:28 ` Alejandro Colomar
2022-12-15 0:26 ` [PATCH v5 5/5] strncat.3: Rewrite to be consistent with string_copy.7 Alejandro Colomar
2022-12-15 0:29 ` Alejandro Colomar
2022-12-14 16:17 ` [PATCH v4 1/1] strcpy.3: Rewrite page to document all string-copying functions Alejandro Colomar
2022-12-14 0:03 ` [PATCH v3 " Alejandro Colomar
2022-12-14 16:22 ` Douglas McIlroy
2022-12-14 16:36 ` Alejandro Colomar
2022-12-14 17:11 ` Alejandro Colomar
2022-12-14 17:19 ` Alejandro Colomar
2022-12-12 23:00 ` [PATCH v2 1/3] " Alejandro Colomar
2022-12-12 23:00 ` [PATCH v2 2/3] stpcpy.3, stpncpy.3, strcat.3, strncat.3, strncpy.3: Transform the old pages into links to strcpy(3) Alejandro Colomar
2022-12-12 23:00 ` [PATCH v2 3/3] stpecpy.3, stpecpyx.3, strlcat.3, strlcpy.3, strscpy.3: Add new " Alejandro Colomar
2022-12-12 14:24 ` [PATCH 2/3] stpcpy.3, stpncpy.3, strcat.3, strncat.3, strncpy.3: Transform the old pages into " Alejandro Colomar
2022-12-12 14:24 ` [PATCH 3/3] stpecpy.3, stpecpyx.3, strlcat.3, strlcpy.3, strscpy.3: Add new " Alejandro Colomar