All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] ctime.3: mktime() may modify tm_hour due to tm_isdst
@ 2021-10-10 10:52 Alejandro Colomar
  2021-10-11 10:27 ` Alejandro Colomar (man-pages)
  0 siblings, 1 reply; 18+ messages in thread
From: Alejandro Colomar @ 2021-10-10 10:52 UTC (permalink / raw)
  To: mtk.manpages, linux-man; +Cc: Alejandro Colomar, libc-alpha, Paul Eggert

If the input DST value is the opposite of the one that mktime()
uses, which comes from the current system timezone (see tzset(3)),
mktime() will modify the hour (and if it's in a day limit, it may
carry up to modify other fields) to normalize the time to the
correct DST.

If a user wants to avoid this, it should use UTC time, which never
has DST.  For that, setenv("TZ", "", 1); sets UTC time for the
program.  After that, the program should specify that DST is not
in effect, by setting tm_isdst to 0 (or let the system guess it
with -1), since setting tm_isdst to a positive value will result
in an error, (probably) due to mktime() considering that it is
invalid to have DST enabled for UTC times.

Cc: Paul Eggert <eggert@cs.ucla.edu>
Signed-off-by: Alejandro Colomar <alx.manpages@gmail.com>
---
 man3/ctime.3 | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/man3/ctime.3 b/man3/ctime.3
index 0e2068a09..7a5714be8 100644
--- a/man3/ctime.3
+++ b/man3/ctime.3
@@ -260,6 +260,13 @@ normalized (so that, for example, 40 October is changed into 9 November);
 is set (regardless of its initial value)
 to a positive value or to 0, respectively,
 to indicate whether DST is or is not in effect at the specified time.
+If the initial value of
+.I tm_isdst
+is inconsistent with the one set by
+.BR mktime (),
+.I tm_hour
+(and possibly other fields)
+will be modified to normalize the time to the correct DST.
 Calling
 .BR mktime ()
 also sets the external variable \fItzname\fP with
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH] ctime.3: mktime() may modify tm_hour due to tm_isdst
  2021-10-10 10:52 [PATCH] ctime.3: mktime() may modify tm_hour due to tm_isdst Alejandro Colomar
@ 2021-10-11 10:27 ` Alejandro Colomar (man-pages)
  2021-10-11 11:12   ` [PATCH v2 1/2] " Alejandro Colomar
                     ` (4 more replies)
  0 siblings, 5 replies; 18+ messages in thread
From: Alejandro Colomar (man-pages) @ 2021-10-11 10:27 UTC (permalink / raw)
  To: mtk.manpages, linux-man; +Cc: libc-alpha, Paul Eggert

Hi Michael,

Please don't apply this patch.  I'll resend with the same contents but 
with a different commit message.

On 10/10/21 12:52 PM, Alejandro Colomar wrote:
> If the input DST value is the opposite of the one that mktime()
> uses, which comes from the current system timezone (see tzset(3)),
> mktime() will modify the hour (and if it's in a day limit, it may
> carry up to modify other fields) to normalize the time to the
> correct DST.
> 
> If a user wants to avoid this, it should use UTC time, which never
> has DST.  For that, setenv("TZ", "", 1); sets UTC time for the
> program.  After that, the program should specify that DST is not
> in effect, by setting tm_isdst to 0 (or let the system guess it
> with -1), since setting tm_isdst to a positive value will result
> in an error, (probably) due to mktime() considering that it is
> invalid to have DST enabled for UTC times.

Today I found timegm(3), which does the same as mktime(3) with TZ set to 
UTC, but doesn't require the whole program to be using UTC times, so 
I'll add this to the commit message.

BTW, timegm(3) says that you should "avoid their use" because timegm(3) 
is a Linux and BSD extension, but its use can NOT be avoided (well, it 
can, but if not done very carefully, you are likely to introduce bugs 
due to setenv(3) not being thread-safe), so I'd remove that sentence 
from timegm(3).  I think it should be in POSIX.  As FreeBSD timegm(3) 
says, there's no (safe and easy) way to roll your own timegm() based on 
standard functions:

[
      The timegm() function is not specified by any standard; its 
function can-
      not be completely emulated	using the standard functions described 
above.
]

Thanks,

Alex


> 
> Cc: Paul Eggert <eggert@cs.ucla.edu>
> Signed-off-by: Alejandro Colomar <alx.manpages@gmail.com>
> ---
>   man3/ctime.3 | 7 +++++++
>   1 file changed, 7 insertions(+)
> 
> diff --git a/man3/ctime.3 b/man3/ctime.3
> index 0e2068a09..7a5714be8 100644
> --- a/man3/ctime.3
> +++ b/man3/ctime.3
> @@ -260,6 +260,13 @@ normalized (so that, for example, 40 October is changed into 9 November);
>   is set (regardless of its initial value)
>   to a positive value or to 0, respectively,
>   to indicate whether DST is or is not in effect at the specified time.
> +If the initial value of
> +.I tm_isdst
> +is inconsistent with the one set by
> +.BR mktime (),
> +.I tm_hour
> +(and possibly other fields)
> +will be modified to normalize the time to the correct DST.
>   Calling
>   .BR mktime ()
>   also sets the external variable \fItzname\fP with
> 


-- 
Alejandro Colomar
Linux man-pages comaintainer; https://www.kernel.org/doc/man-pages/
http://www.alejandro-colomar.es/

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2 1/2] ctime.3: mktime() may modify tm_hour due to tm_isdst
  2021-10-11 10:27 ` Alejandro Colomar (man-pages)
@ 2021-10-11 11:12   ` Alejandro Colomar
  2021-10-11 15:36     ` Paul Eggert
  2021-10-11 11:12   ` [PATCH v2 2/2] timegm.3: Remove recommendation against use of timegm() Alejandro Colomar
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 18+ messages in thread
From: Alejandro Colomar @ 2021-10-11 11:12 UTC (permalink / raw)
  To: mtk.manpages, linux-man; +Cc: Alejandro Colomar, libc-alpha, Paul Eggert

If the input DST value is the opposite of the one that mktime()
uses, which comes from the current system timezone (see tzset(3)),
mktime() will modify the hour (and if it's in a day limit, it may
carry up to modify other fields) to normalize the time to the
correct DST.

If a user wants to avoid this, the user probably wants to use UTC
time.  mktime(3) uses local time, so it's not suitable for that
(by default).  Consider the following solutions:

For that, the recommended solution is to use timegm(3), which
sadly is non-portable (it is present in Linux and the BSDs, but
not in POSIX).

A portable solution (untested) might be to implement your own
timegm(3):
	time_t portable_timegm(struct tm *tm)
	{
		tm->tm_isdst = 0;
		return mktime(tm) - timezone;
	}
and assuming no other thread will change the current timezone
during this call.

Another portable solution would involve setting the timezone
explicitly to UTC+0 with setenv() (see tzset(3)).  But this forces
all of the program to use UTC time, which might not be desirable.

Cc: Paul Eggert <eggert@cs.ucla.edu>
Signed-off-by: Alejandro Colomar <alx.manpages@gmail.com>
---
 man3/ctime.3 | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/man3/ctime.3 b/man3/ctime.3
index 0e2068a09..7a5714be8 100644
--- a/man3/ctime.3
+++ b/man3/ctime.3
@@ -260,6 +260,13 @@ normalized (so that, for example, 40 October is changed into 9 November);
 is set (regardless of its initial value)
 to a positive value or to 0, respectively,
 to indicate whether DST is or is not in effect at the specified time.
+If the initial value of
+.I tm_isdst
+is inconsistent with the one set by
+.BR mktime (),
+.I tm_hour
+(and possibly other fields)
+will be modified to normalize the time to the correct DST.
 Calling
 .BR mktime ()
 also sets the external variable \fItzname\fP with
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v2 2/2] timegm.3: Remove recommendation against use of timegm()
  2021-10-11 10:27 ` Alejandro Colomar (man-pages)
  2021-10-11 11:12   ` [PATCH v2 1/2] " Alejandro Colomar
@ 2021-10-11 11:12   ` Alejandro Colomar
  2021-10-11 15:40     ` Paul Eggert
  2021-10-11 11:54   ` [PATCH v3 1/2] ctime.3: mktime() may modify tm_hour due to tm_isdst Alejandro Colomar
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 18+ messages in thread
From: Alejandro Colomar @ 2021-10-11 11:12 UTC (permalink / raw)
  To: mtk.manpages, linux-man; +Cc: Alejandro Colomar, libc-alpha, Paul Eggert

It was straight after a note that they are nonstandard functions,
which already tells the user that if portability is in mind, they
shouldn't be used, so this recommendation adds nothing in that
sense.

Also, there's a note that timelocal() should _never_ be used, due
to mktime(3) being identical and in the POSIX standard (it is also
in C99), so this note would also add nothing in that sense.

So the only uses not covered by those other notes are non-portable
uses of timegm().  In that scenario, it is an excellent function,
since it avoids races, which a home-made timegm(3) implementation
by means of standard functions would have.

A trivial implementation of timegm(3) using only standard C could
be (I didn't test it; use on your own):

// portable_timegm.c
 #include <time.h>

time_t portable_timegm(struct tm *tm)
{
	tm->tm_isdst = 0;
	/*
	 * If another thread modifies the timezone during the
	 * execution of the line below, it will produce undefined
	 * behavior.
	 */
	return mktime(tm) - timezone;
}

Cc: Paul Eggert <eggert@cs.ucla.edu>
Signed-off-by: Alejandro Colomar <alx.manpages@gmail.com>
---
 man3/timegm.3 | 1 -
 1 file changed, 1 deletion(-)

diff --git a/man3/timegm.3 b/man3/timegm.3
index b848e83e1..0e8528b26 100644
--- a/man3/timegm.3
+++ b/man3/timegm.3
@@ -97,7 +97,6 @@ T}	Thread safety	MT-Safe env locale
 .SH CONFORMING TO
 These functions are nonstandard GNU extensions
 that are also present on the BSDs.
-Avoid their use.
 .SH NOTES
 The
 .BR timelocal ()
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 1/2] ctime.3: mktime() may modify tm_hour due to tm_isdst
  2021-10-11 10:27 ` Alejandro Colomar (man-pages)
  2021-10-11 11:12   ` [PATCH v2 1/2] " Alejandro Colomar
  2021-10-11 11:12   ` [PATCH v2 2/2] timegm.3: Remove recommendation against use of timegm() Alejandro Colomar
@ 2021-10-11 11:54   ` Alejandro Colomar
  2021-10-11 11:54   ` [PATCH v3 2/2] timegm.3: Remove recommendation against their use Alejandro Colomar
  2021-10-11 15:37   ` [PATCH] ctime.3: mktime() may modify tm_hour due to tm_isdst Paul Eggert
  4 siblings, 0 replies; 18+ messages in thread
From: Alejandro Colomar @ 2021-10-11 11:54 UTC (permalink / raw)
  To: mtk.manpages, linux-man; +Cc: Alejandro Colomar, libc-alpha, Paul Eggert

If the input DST value is the opposite of the one that mktime()
uses, which comes from the current system timezone (see tzset(3)),
mktime() will modify the hour (and if it's in a day limit, it may
carry up to modify other fields) to normalize the time to the
correct DST.

If a user wants to avoid this, the user probably wants to use UTC
time.  mktime(3) uses local time, so it's not suitable for that
(by itself).  Consider the following solutions:

For that, the easiest solution is to use timegm(3), which is
non-portable (it is present in Linux and the BSDs, but not in
POSIX).

A portable solution (untested) might be to implement your own
timegm(3):
	time_t portable_timegm(struct tm *tm)
	{
		tm->tm_isdst = 0;
		return mktime(tm) - timezone;
	}

Another portable solution would involve setting the timezone
explicitly to UTC+0 with setenv() (see tzset(3)).  But this forces
all of the program to use UTC time, which might not be desirable,
especially in multi-threaded programs.

Cc: Paul Eggert <eggert@cs.ucla.edu>
Signed-off-by: Alejandro Colomar <alx.manpages@gmail.com>
---
 man3/ctime.3 | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/man3/ctime.3 b/man3/ctime.3
index 0e2068a09..7a5714be8 100644
--- a/man3/ctime.3
+++ b/man3/ctime.3
@@ -260,6 +260,13 @@ normalized (so that, for example, 40 October is changed into 9 November);
 is set (regardless of its initial value)
 to a positive value or to 0, respectively,
 to indicate whether DST is or is not in effect at the specified time.
+If the initial value of
+.I tm_isdst
+is inconsistent with the one set by
+.BR mktime (),
+.I tm_hour
+(and possibly other fields)
+will be modified to normalize the time to the correct DST.
 Calling
 .BR mktime ()
 also sets the external variable \fItzname\fP with
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 2/2] timegm.3: Remove recommendation against their use
  2021-10-11 10:27 ` Alejandro Colomar (man-pages)
                     ` (2 preceding siblings ...)
  2021-10-11 11:54   ` [PATCH v3 1/2] ctime.3: mktime() may modify tm_hour due to tm_isdst Alejandro Colomar
@ 2021-10-11 11:54   ` Alejandro Colomar
  2021-10-11 15:37   ` [PATCH] ctime.3: mktime() may modify tm_hour due to tm_isdst Paul Eggert
  4 siblings, 0 replies; 18+ messages in thread
From: Alejandro Colomar @ 2021-10-11 11:54 UTC (permalink / raw)
  To: mtk.manpages, linux-man; +Cc: Alejandro Colomar, libc-alpha, Paul Eggert

It was straight after a note that they are nonstandard functions,
which already tells the user that if portability is in mind, they
shouldn't be used, so this recommendation adds nothing in that
sense.

Also, there's a note that timelocal() should _never_ be used, due
to mktime() being identical and in the POSIX standard (it is also
in C99), so this note would also add nothing in that sense.

So the only uses not covered by those other notes are non-portable
uses of timegm(3).  In that scenario, it is an excellent function.

When porting to other systems, it is trivial to port timegm(3)
using only standard C (I didn't test it; use on your own):

// timegm.c
 #include <time.h>

time_t timegm(struct tm *tm)
{
	tm->tm_isdst = 0;
	return mktime(tm) - timezone;
}

Cc: Paul Eggert <eggert@cs.ucla.edu>
Signed-off-by: Alejandro Colomar <alx.manpages@gmail.com>
---
 man3/timegm.3 | 1 -
 1 file changed, 1 deletion(-)

diff --git a/man3/timegm.3 b/man3/timegm.3
index b848e83e1..0e8528b26 100644
--- a/man3/timegm.3
+++ b/man3/timegm.3
@@ -97,7 +97,6 @@ T}	Thread safety	MT-Safe env locale
 .SH CONFORMING TO
 These functions are nonstandard GNU extensions
 that are also present on the BSDs.
-Avoid their use.
 .SH NOTES
 The
 .BR timelocal ()
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/2] ctime.3: mktime() may modify tm_hour due to tm_isdst
  2021-10-11 11:12   ` [PATCH v2 1/2] " Alejandro Colomar
@ 2021-10-11 15:36     ` Paul Eggert
  2021-10-15 21:49       ` Alejandro Colomar (man-pages)
  0 siblings, 1 reply; 18+ messages in thread
From: Paul Eggert @ 2021-10-11 15:36 UTC (permalink / raw)
  To: Alejandro Colomar, mtk.manpages, linux-man; +Cc: libc-alpha

On 10/11/21 4:12 AM, Alejandro Colomar wrote:
> +If the initial value of
> +.I tm_isdst
> +is inconsistent with the one set by
> +.BR mktime (),
> +.I tm_hour
> +(and possibly other fields)
> +will be modified to normalize the time to the correct DST.

I don't see why this change is necessary. mktime normalizes all its 
input fields: there's nothing special about tm_isdst and tm_hour.

If normalization isn't explained clearly enough elsewhere in the man 
page, that explanation should be fixed; there shouldn't be special-case 
wording that implies that this is the only special case.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] ctime.3: mktime() may modify tm_hour due to tm_isdst
  2021-10-11 10:27 ` Alejandro Colomar (man-pages)
                     ` (3 preceding siblings ...)
  2021-10-11 11:54   ` [PATCH v3 2/2] timegm.3: Remove recommendation against their use Alejandro Colomar
@ 2021-10-11 15:37   ` Paul Eggert
  2021-10-11 22:05     ` Joseph Myers
  4 siblings, 1 reply; 18+ messages in thread
From: Paul Eggert @ 2021-10-11 15:37 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages), mtk.manpages, linux-man; +Cc: libc-alpha

On 10/11/21 3:27 AM, Alejandro Colomar (man-pages) wrote:
> timegm(3) says that you should "avoid their use" because timegm(3) is a 
> Linux and BSD extension, but its use can NOT be avoided (well, it can, 
> but if not done very carefully, you are likely to introduce bugs due to 
> setenv(3) not being thread-safe), so I'd remove that sentence from 
> timegm(3).  I think it should be in POSIX.

No, NetBSD's mktime_z should be in POSIX, as it nicely generalizes both 
mktime and timegm.

mktime_z should also be in glibc, but that's another story....

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 2/2] timegm.3: Remove recommendation against use of timegm()
  2021-10-11 11:12   ` [PATCH v2 2/2] timegm.3: Remove recommendation against use of timegm() Alejandro Colomar
@ 2021-10-11 15:40     ` Paul Eggert
  2021-10-15 22:03       ` Alejandro Colomar (man-pages)
  0 siblings, 1 reply; 18+ messages in thread
From: Paul Eggert @ 2021-10-11 15:40 UTC (permalink / raw)
  To: Alejandro Colomar, mtk.manpages, linux-man; +Cc: libc-alpha

On 10/11/21 4:12 AM, Alejandro Colomar wrote:

> time_t portable_timegm(struct tm *tm)
> {
> 	tm->tm_isdst = 0;
> 	/*
> 	 * If another thread modifies the timezone during the
> 	 * execution of the line below, it will produce undefined
> 	 * behavior.
> 	 */
> 	return mktime(tm) - timezone;
> }

This doesn't work for multiple reasons: it's not thread-safe, mktime 
might set timezone even in a single-threaded app, and the subtraction 
might overflow.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] ctime.3: mktime() may modify tm_hour due to tm_isdst
  2021-10-11 15:37   ` [PATCH] ctime.3: mktime() may modify tm_hour due to tm_isdst Paul Eggert
@ 2021-10-11 22:05     ` Joseph Myers
  2021-10-15 21:55       ` Alejandro Colomar (man-pages)
  0 siblings, 1 reply; 18+ messages in thread
From: Joseph Myers @ 2021-10-11 22:05 UTC (permalink / raw)
  To: Paul Eggert
  Cc: Alejandro Colomar (man-pages), mtk.manpages, linux-man, libc-alpha

[-- Attachment #1: Type: text/plain, Size: 893 bytes --]

On Mon, 11 Oct 2021, Paul Eggert wrote:

> On 10/11/21 3:27 AM, Alejandro Colomar (man-pages) wrote:
> > timegm(3) says that you should "avoid their use" because timegm(3) is a
> > Linux and BSD extension, but its use can NOT be avoided (well, it can, but
> > if not done very carefully, you are likely to introduce bugs due to
> > setenv(3) not being thread-safe), so I'd remove that sentence from
> > timegm(3).  I think it should be in POSIX.
> 
> No, NetBSD's mktime_z should be in POSIX, as it nicely generalizes both mktime
> and timegm.

Arguably ISO C (there's no obvious dependence on any concepts that are in 
scope of POSIX but not of ISO C), but we're now past the deadline to 
request document numbers for proposals to C23 (and while there's a 
proposal to add timegm, there's no proposal to add functions using 
explicit time zones).

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 1/2] ctime.3: mktime() may modify tm_hour due to tm_isdst
  2021-10-11 15:36     ` Paul Eggert
@ 2021-10-15 21:49       ` Alejandro Colomar (man-pages)
  0 siblings, 0 replies; 18+ messages in thread
From: Alejandro Colomar (man-pages) @ 2021-10-15 21:49 UTC (permalink / raw)
  To: Paul Eggert, mtk.manpages, linux-man; +Cc: libc-alpha

Hi Paul,

On 10/11/21 5:36 PM, Paul Eggert wrote:
> On 10/11/21 4:12 AM, Alejandro Colomar wrote:
>> +If the initial value of
>> +.I tm_isdst
>> +is inconsistent with the one set by
>> +.BR mktime (),
>> +.I tm_hour
>> +(and possibly other fields)
>> +will be modified to normalize the time to the correct DST.
> 
> I don't see why this change is necessary. mktime normalizes all its 
> input fields: there's nothing special about tm_isdst and tm_hour.
> 
> If normalization isn't explained clearly enough elsewhere in the man 
> page, that explanation should be fixed; there shouldn't be special-case 
> wording that implies that this is the only special case.

Hmm, you're right.  I think I was misled by the wording "if structure 
members are outside their valid interval", which led me to think that 
since 08h was between 0 and 23 it shouldn´t be affected, but the 
normalization goes beyond that and interprets "valid range" as a more 
general concept so that the full time has to correspond to a valid time 
for a given timezone.

I'm not sure how to reword it.  I'll keep it with the original text, and 
keep thinking about it.

Thanks,

Alex


-- 
Alejandro Colomar
Linux man-pages comaintainer; https://www.kernel.org/doc/man-pages/
http://www.alejandro-colomar.es/

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] ctime.3: mktime() may modify tm_hour due to tm_isdst
  2021-10-11 22:05     ` Joseph Myers
@ 2021-10-15 21:55       ` Alejandro Colomar (man-pages)
  0 siblings, 0 replies; 18+ messages in thread
From: Alejandro Colomar (man-pages) @ 2021-10-15 21:55 UTC (permalink / raw)
  To: Joseph Myers, Paul Eggert; +Cc: mtk.manpages, linux-man, libc-alpha

Hi Josepf and Paul,

On 10/12/21 12:05 AM, Joseph Myers wrote:
> On Mon, 11 Oct 2021, Paul Eggert wrote:
> 
>> On 10/11/21 3:27 AM, Alejandro Colomar (man-pages) wrote:
>>> timegm(3) says that you should "avoid their use" because timegm(3) is a
>>> Linux and BSD extension, but its use can NOT be avoided (well, it can, but
>>> if not done very carefully, you are likely to introduce bugs due to
>>> setenv(3) not being thread-safe), so I'd remove that sentence from
>>> timegm(3).  I think it should be in POSIX.
>>
>> No, NetBSD's mktime_z should be in POSIX, as it nicely generalizes both mktime
>> and timegm.

Hmm, I didn´t know that one either...  Yes, it seems a nicer interface 
(and can be used to implement both mktime and timegm).

I'd still remove the warning against timegm(3) in the man page, though.

> 
> Arguably ISO C (there's no obvious dependence on any concepts that are in
> scope of POSIX but not of ISO C), but we're now past the deadline to
> request document numbers for proposals to C23 (and while there's a
> proposal to add timegm, there's no proposal to add functions using
> explicit time zones).
> 

Yes.

On 10/11/21 5:37 PM, Paul Eggert wrote:
 > mktime_z should also be in glibc, but that's another story....

BTW, I started implementing mktime_z(3) in my library, based mostly on 
glibc's mktime(3) code.  When I have something working, I'll tell you in 
case you want to pick it for glibc.

Cheers,

Alex

-- 
Alejandro Colomar
Linux man-pages comaintainer; https://www.kernel.org/doc/man-pages/
http://www.alejandro-colomar.es/

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 2/2] timegm.3: Remove recommendation against use of timegm()
  2021-10-11 15:40     ` Paul Eggert
@ 2021-10-15 22:03       ` Alejandro Colomar (man-pages)
  2021-10-16  0:20         ` Paul Eggert
  0 siblings, 1 reply; 18+ messages in thread
From: Alejandro Colomar (man-pages) @ 2021-10-15 22:03 UTC (permalink / raw)
  To: Paul Eggert; +Cc: libc-alpha, mtk.manpages, linux-man

Hi Paul,

On 10/11/21 5:40 PM, Paul Eggert wrote:
> On 10/11/21 4:12 AM, Alejandro Colomar wrote:
> 
>> time_t portable_timegm(struct tm *tm)
>> {
>>     tm->tm_isdst = 0;
>>     /*
>>      * If another thread modifies the timezone during the
>>      * execution of the line below, it will produce undefined
>>      * behavior.
>>      */
>>     return mktime(tm) - timezone;
>> }
> 
> This doesn't work for multiple reasons:


> it's not thread-safe,

Actually, since timegm(3) is implemented in terms of mktime(3), as far 
as I could read from glibc code, the problem will be the same, I think. 
  I don't understand why it wasn't the other way around; maybe it was 
more complex internally...  But timegm(3) shouldn't need to depend on 
environment variables.

> mktime might set timezone even in a single-threaded app,

Yes, I should have called tzset() before the return line.

> and the subtraction might overflow.

Yup, casting to int64_t needed.  BTW, I had a look at mktime source 
code, and it uses long, which might be 32 bits, and then there's a lot 
of checking for overflow.  Wouldn't it be simpler to just implement 
mktime(3) with int64_t internally?  Then, only at the return, cast it 
implicitly to whatever time_t means, but int64_t would simplify the code 
very much, I think.

Thanks,

Alex

-- 
Alejandro Colomar
Linux man-pages comaintainer; https://www.kernel.org/doc/man-pages/
http://www.alejandro-colomar.es/

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 2/2] timegm.3: Remove recommendation against use of timegm()
  2021-10-15 22:03       ` Alejandro Colomar (man-pages)
@ 2021-10-16  0:20         ` Paul Eggert
  2021-10-17 18:02           ` Alejandro Colomar (man-pages)
  0 siblings, 1 reply; 18+ messages in thread
From: Paul Eggert @ 2021-10-16  0:20 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: libc-alpha, mtk.manpages, linux-man

On 10/15/21 3:03 PM, Alejandro Colomar (man-pages) wrote:

> Actually, since timegm(3) is implemented in terms of mktime(3), as far as I could read from glibc code, the problem will be the same, I think.

No, because another thread could setenv ("TZ", ...) between the time 
that you call mktime and the time you look at the 'timezone' variable. 
So even though mktime itself is thread-safe, the expression 'mktime(tm) 
- timezone' is not.

> But timegm(3) shouldn't need to depend on environment variables.

It does depend, if leap seconds are involved.

>> and the subtraction might overflow.
> 
> Yup, casting to int64_t needed.

That would help, but it still wouldn't suffice. It'd mishandle -1 
returns, for example. Plus, we're better of not putting today's hardware 
assumptions into code (suppose int is 64 bits in future machines?).

> BTW, I had a look at mktime source 
> code, and it uses long, which might be 32 bits, and then there's a lot 
> of checking for overflow.

mktime uses long_int, which is not necessarily 'long'. And no matter 
what type you pick, it could overflow on some platform, even if it's an 
only-hypothetical platform now.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 2/2] timegm.3: Remove recommendation against use of timegm()
  2021-10-16  0:20         ` Paul Eggert
@ 2021-10-17 18:02           ` Alejandro Colomar (man-pages)
  2021-10-17 22:00             ` Paul Eggert
  0 siblings, 1 reply; 18+ messages in thread
From: Alejandro Colomar (man-pages) @ 2021-10-17 18:02 UTC (permalink / raw)
  To: Paul Eggert; +Cc: libc-alpha, mtk.manpages, linux-man

Hi Paul,

On 10/16/21 2:20 AM, Paul Eggert wrote:
> On 10/15/21 3:03 PM, Alejandro Colomar (man-pages) wrote:
> 
>> Actually, since timegm(3) is implemented in terms of mktime(3), as far 
>> as I could read from glibc code, the problem will be the same, I think.
> 
> No, because another thread could setenv ("TZ", ...) between the time 
> that you call mktime and the time you look at the 'timezone' variable. 
> So even though mktime itself is thread-safe, the expression 'mktime(tm) 
> - timezone' is not.

Yes, there are probably many bugs in that sample code I wrote, which 
glibc solves in its timegm(3) implementation.  That probably gives more 
force to the original point: timegm(3) is the only non-error-prone 
solution in glibc for using UTC times, so it should not be marked as 
"avoid using".  The standards should get a function that does that, be 
it timegm(), mktime_z(), or both.

Just for curiosity, I'm not sure about this, but from what I've seen, 
the only lock in glibc is in gmtime(), which is called repeatedly from 
within mktime().  If another thread calls setenv("TZ"...) in between one 
of those calls, wouldn't it produce the same problems?

> 
>> But timegm(3) shouldn't need to depend on environment variables.
> 
> It does depend, if leap seconds are involved.

Okay.  (I don't know too much about those.)

> 
>>> and the subtraction might overflow.
>>
>> Yup, casting to int64_t needed.
> 
> That would help, but it still wouldn't suffice. It'd mishandle -1 
> returns, for example.

Ahh, yes.

> Plus, we're better of not putting today's hardware 
> assumptions into code (suppose int is 64 bits in future machines?).
> 
>> BTW, I had a look at mktime source code, and it uses long, which might 
>> be 32 bits, and then there's a lot of checking for overflow.
> 
> mktime uses long_int, which is not necessarily 'long'. And no matter 
> what type you pick, it could overflow on some platform, even if it's an 
> only-hypothetical platform now.

I think that's not a problem for the following reasons:

- int is unlikely to be >32 bits.  If so, we would miss one of the 
"conventional" types: int8_t, int16_t, int32_t couldn't map to 
fundamental types, unless we add a new type (short short int?), which is 
also unlikely because scanf() already uses %hhi for signed char.  I 
think it's more likely to see something like 'long long long int'.

- The current types can already handle 128-bit archs (just use the same 
mapping as in amd64 and change long long int to be int128_t), so maybe 
we'll need the triple long when we get to 256-bit archs.  Very 
hypothetically, that is.

- Even if int ever happened to be 64 bit, this problem would still be 
something very theoretical, since INT64_MAX is way greater than the age 
of the universe, and many orders of magnitude greater than the expected 
lifespan of the sun, and therefore the concept of leap years, months, 
ydays, wdays, and so on will be meaningless for such values.  How many 
seconds since the Epoch will have happened the 2nd March of the year 
that the Milky Way collides with Andromeda, at 11:30?  I think the 
correct answer to that question should be undefined behavior; or an 
error if you want to be nice.

So I wouldn't care for now, and maybe just add some initial check such as:

if (tm->tm_year > SOME_ARBITRARY_HUGE_VALUE || tm->tm_mon > 
SOME_ARBITRARY_HUGE_VALUE || ...) {
	errno = EOVERFLOW;
	return -1;
}

and then go on.


Thanks,

Alex


-- 
Alejandro Colomar
Linux man-pages comaintainer; https://www.kernel.org/doc/man-pages/
http://www.alejandro-colomar.es/

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 2/2] timegm.3: Remove recommendation against use of timegm()
  2021-10-17 18:02           ` Alejandro Colomar (man-pages)
@ 2021-10-17 22:00             ` Paul Eggert
  2021-11-05  0:47               ` Alejandro Colomar (man-pages)
  0 siblings, 1 reply; 18+ messages in thread
From: Paul Eggert @ 2021-10-17 22:00 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: libc-alpha, mtk.manpages, linux-man

On 10/17/21 11:02, Alejandro Colomar (man-pages) wrote:

> timegm(3) is the only non-error-prone 
> solution in glibc for using UTC times, so it should not be marked as 
> "avoid using".

timegm is not portable, so it's reasonable for the documentation to warn 
against its use. Perhaps the warning could be made clearer.


> - int is unlikely to be >32 bits.

There are a few platforms where int is >32 bits (these are typically 
ILP64). Although glibc doesn't currently support them, let's not place 
unnecessary obstacles in the way.


> - Even if int ever happened to be 64 bit, this problem would still be 
> something very theoretical

The behavior of the 'zdump' command would change. I imagine it'd affect 
other commands as well. Admittedly most people wouldn't notice.


> if (tm->tm_year > SOME_ARBITRARY_HUGE_VALUE 

Let's not impose arbitrary limits.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 2/2] timegm.3: Remove recommendation against use of timegm()
  2021-10-17 22:00             ` Paul Eggert
@ 2021-11-05  0:47               ` Alejandro Colomar (man-pages)
  2021-11-08  8:05                 ` Paul Eggert
  0 siblings, 1 reply; 18+ messages in thread
From: Alejandro Colomar (man-pages) @ 2021-11-05  0:47 UTC (permalink / raw)
  To: Paul Eggert; +Cc: libc-alpha, linux-man, Joseph Myers

Hi Paul,

On 10/18/21 00:00, Paul Eggert wrote:
> On 10/17/21 11:02, Alejandro Colomar (man-pages) wrote:
> 
>> timegm(3) is the only non-error-prone solution in glibc for using UTC 
>> times, so it should not be marked as "avoid using".
> 
> timegm is not portable, so it's reasonable for the documentation to warn 
> against its use. Perhaps the warning could be made clearer.

Well, there's no portable alternative, as we discussed, so if a program 
needs to do what timegm(3) does (handle UTC dates), it needs to call it. 
  The only portable alternatives are using other libraries such as boost 
(C++).  I prefer porting timegm(3) to those platforms instead.  Since 
C2X will add timegm(3), it's good news.

> 
> 
>> - int is unlikely to be >32 bits.
> 
> There are a few platforms where int is >32 bits (these are typically 
> ILP64).

Interesting; I didn't know that.  What a weird thing.

> Although glibc doesn't currently support them, let's not place 
> unnecessary obstacles in the way.
> 
> 
>> - Even if int ever happened to be 64 bit, this problem would still be 
>> something very theoretical
> 
> The behavior of the 'zdump' command would change. I imagine it'd affect 
> other commands as well. Admittedly most people wouldn't notice.
> 
> 
>> if (tm->tm_year > SOME_ARBITRARY_HUGE_VALUE 
> 
> Let's not impose arbitrary limits.
Since glibc doesn't support 64-bit 'int' yet, the following restriction 
could be added without altering the behavior of any existing program:

[
If any of the fields of 'struct tm' would overflow 'int32_t', the 
behavior of timegm(3) is undefined.
]

It is arbitrary, yes, but in reality, the code would handle correctly 
more than INT32_MAX; it would only cause overflow for values close to 
INT64_MAX.

Moreover, it would only affect those weird platforms.

And in those cases:
- If the date is something sane, UB will never be triggered.
- If the date can be compromised (e.g., user input), the only thing that 
the programmer needs to do to avoid UB, is to store the temporary values 
in int32_t variables before storing them in the struct tm.  That will 
handle any overflow by truncating it (if the programmer wants to do 
something else, that's fine, but it's the minimum to avoid UB).  Not 
much of a problem.

It would simplify very much glibc code, by imposing small restrictions 
to the user.

Don't you think?

Thanks,

Alex


-- 
Alejandro Colomar
Linux man-pages comaintainer; https://www.kernel.org/doc/man-pages/
http://www.alejandro-colomar.es/

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 2/2] timegm.3: Remove recommendation against use of timegm()
  2021-11-05  0:47               ` Alejandro Colomar (man-pages)
@ 2021-11-08  8:05                 ` Paul Eggert
  0 siblings, 0 replies; 18+ messages in thread
From: Paul Eggert @ 2021-11-08  8:05 UTC (permalink / raw)
  To: Alejandro Colomar (man-pages); +Cc: libc-alpha, linux-man, Joseph Myers

On 11/4/21 17:47, Alejandro Colomar (man-pages) wrote:
> Since C2X will add timegm(3), it's good news.

Yes at that point we can remove or reword the portability warning.


> the following restriction could be added without altering the behavior of any existing program:
> 
> [
> If any of the fields of 'struct tm' would overflow 'int32_t', the behavior of timegm(3) is undefined.
> ]

No, timegm should not be documented to have a 32-bit limit on struct tm 
components. It should be documented only to have an int limit on struct 
tm components. I.e., the same as mktime, localtime, gmtime, etc.


> It would simplify very much glibc code, by imposing small restrictions to the user. 

The general glibc philosophy is to not impose arbitrary limits on the 
user, even if the limits are "small restrictions". Occasionally limits 
need to be imposed anyway for a good reason, but there's no good reason 
here.

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2021-11-08  8:05 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-10 10:52 [PATCH] ctime.3: mktime() may modify tm_hour due to tm_isdst Alejandro Colomar
2021-10-11 10:27 ` Alejandro Colomar (man-pages)
2021-10-11 11:12   ` [PATCH v2 1/2] " Alejandro Colomar
2021-10-11 15:36     ` Paul Eggert
2021-10-15 21:49       ` Alejandro Colomar (man-pages)
2021-10-11 11:12   ` [PATCH v2 2/2] timegm.3: Remove recommendation against use of timegm() Alejandro Colomar
2021-10-11 15:40     ` Paul Eggert
2021-10-15 22:03       ` Alejandro Colomar (man-pages)
2021-10-16  0:20         ` Paul Eggert
2021-10-17 18:02           ` Alejandro Colomar (man-pages)
2021-10-17 22:00             ` Paul Eggert
2021-11-05  0:47               ` Alejandro Colomar (man-pages)
2021-11-08  8:05                 ` Paul Eggert
2021-10-11 11:54   ` [PATCH v3 1/2] ctime.3: mktime() may modify tm_hour due to tm_isdst Alejandro Colomar
2021-10-11 11:54   ` [PATCH v3 2/2] timegm.3: Remove recommendation against their use Alejandro Colomar
2021-10-11 15:37   ` [PATCH] ctime.3: mktime() may modify tm_hour due to tm_isdst Paul Eggert
2021-10-11 22:05     ` Joseph Myers
2021-10-15 21:55       ` Alejandro Colomar (man-pages)

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.