bug#22033: time-utc format is lossy

bug-guile

From:	Zefram
Subject:	bug#22033: time-utc format is lossy
Date:	Mon, 24 Apr 2017 21:32:14 +0100

I wrote:
>                                   These two seconds are perfectly
>distinct parts of the UTC time scale, and the time-utc format ought to
>preserve their distinction.

This is a problematic goal.  At the time I wrote the bug report I didn't
have a satisfactory idea of how to achieve it, but I think I've come up
with one now.

The essential problem is that the SRFI-19 time structure expects to
encapsulate a scalar value -- as it says, a count of seconds since
some epoch -- but there is no natural scalar representation of a UTC
time.  Because of the irregularity imposed by its leaps, the natural
representation of a UTC time is a two-part structure, consisting of an
integer identifying the day and a fractional count of seconds elapsed
within the day.  Because UTC days contain differing numbers of seconds,
this is a variable-radix system.  SRFI-19 doesn't offer any structure that
has this simple form.  The only structure that it describes as separating
representation of the day from time of day is the date structure, which
splits up the time representation much more and has the complication of
the timezone offset.
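
As a rough illustration of that two-part form, here is a minimal Python
sketch (not SRFI-19 or Guile code).  The names UtcTime, LEAP_DAYS and
day_length are made up for this example, the day numbering from
1970-01-01 is an assumption, and the leap table holds only two real
entries rather than the full list.

```python
from datetime import date
from fractions import Fraction
from typing import NamedTuple

UNIX_DAY_ZERO = date(1970, 1, 1)   # assumed origin for day numbers

# Days that ended with an inserted leap second (23:59:60).
# Deliberately incomplete: two real entries, enough for illustration.
LEAP_DAYS = {
    (date(2015, 6, 30) - UNIX_DAY_ZERO).days,
    (date(2016, 12, 31) - UNIX_DAY_ZERO).days,
}

class UtcTime(NamedTuple):
    day: int           # integer identifying the UTC day
    second: Fraction   # fractional count of seconds elapsed within the day

def day_length(day: int) -> int:
    """Number of seconds in the given UTC day: the variable radix."""
    return 86401 if day in LEAP_DAYS else 86400

def is_valid(t: UtcTime) -> bool:
    return 0 <= t.second < day_length(t.day)

leap_day = (date(2016, 12, 31) - UNIX_DAY_ZERO).days
leap_second = UtcTime(leap_day, Fraction(86400))     # 2016-12-31 23:59:60
next_midnight = UtcTime(leap_day + 1, Fraction(0))   # 2017-01-01 00:00:00
```

In this form the leap second and the following midnight are simply two
different pairs, and a second count of 86400 is valid only on a day
that the table marks as having a leap second.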

The present approach of the library is to squeeze a UTC time into the time
structure by converting the variable-radix value into a scalar by using
a fixed radix of 86400.  This has the advantage of producing a scalar,
and of the scalar behaving continuously on most UTC days, but the major
downside of being lossy, aliasing some UTC times.  The scalar also isn't
really a count of seconds since an epoch, as SRFI-19 expects, which
breaks arithmetic on it.  It looks rather as though this part of SRFI-19
was written expecting this sort of transformation of UTC, but expecting
it, inconsistently, to serve both as an unambiguous encoding and as a
genuine count of seconds since an epoch.
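
The aliasing can be shown with a few lines of Python.  This is only a
sketch of the fixed-radix arithmetic described above, not the
library's actual code; the function name and the day numbering (days
counted from 1970-01-01) are assumptions made for the example.

```python
def to_scalar_fixed_radix(day, second_of_day):
    """Flatten (day, second within day) using a fixed radix of 86400."""
    return day * 86400 + second_of_day

leap_day = 17166   # 2016-12-31, which ended with a leap second

before = to_scalar_fixed_radix(leap_day, 86399)      # 2016-12-31 23:59:59
leap = to_scalar_fixed_radix(leap_day, 86400)        # 2016-12-31 23:59:60
following = to_scalar_fixed_radix(leap_day + 1, 0)   # 2017-01-01 00:00:00

# Two distinct UTC seconds collapse onto one scalar value.
aliased = (leap == following)

# Arithmetic is off as well: two seconds elapse between 23:59:59 and the
# following midnight, but the scalars differ by only one.
apparent_elapsed = following - before
```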

A simple workaround would be to create a scalar in the same kind of
way but using a larger fixed radix: at minimum 86401, or the rounder
131072 (2^17).  This means we have a scalar value that fits easily into
the time structure, and unambiguously encodes all UTC times.  But it's
still not a count of seconds since an epoch, and it's appreciably less
like such a count because it's no longer continuous across (most) UTC
day ends.
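
A minimal Python sketch of the larger-radix encoding follows.  The
function names and the day numbering (days counted from 1970-01-01)
are assumptions for illustration only.

```python
RADIX = 131072   # 2**17; any fixed radix of at least 86401 would work

def encode(day, second_of_day):
    return day * RADIX + second_of_day

def decode(scalar):
    """Recover (day, second within day) from the scalar."""
    return divmod(scalar, RADIX)

leap_day = 17166   # 2016-12-31, which ended with a leap second

leap = encode(leap_day, 86400)        # 2016-12-31 23:59:60
following = encode(leap_day + 1, 0)   # 2017-01-01 00:00:00

# The price: the scalar jumps at the end of every day, leap or not.
ordinary_jump = encode(leap_day - 1, 0) - encode(leap_day - 2, 86399)
```

The leap second now decodes back to itself, but consecutive seconds
on either side of an ordinary midnight are RADIX - 86399 apart rather
than 1 apart.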

Since the time structure has separate fields for seconds and nanoseconds,
it would be possible to borrow a trick sometimes used with the Unix
struct timespec: extending the nanoseconds range to represent leap
seconds.  This would be mostly like the present arrangement, with
the seconds count increasing by 86400 per UTC day, but with a leap
second unambiguously represented by the seconds count of the preceding
second and a nanoseconds count in the range [1000000000, 2000000000).
This fixes the ambiguity, but retains all the other downsides of the
present badly-behaved scalar, and adds the substantial downside of
breaking expectations of normalisation.
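
The nanoseconds trick might look like this in Python.  This is a
sketch under assumed names and an assumed day numbering (days counted
from 1970-01-01); it is not taken from any existing library.

```python
NANOS_PER_SECOND = 1_000_000_000

def encode(day, second_of_day, nanos):
    """Map (day, second within day, nanoseconds) to (seconds, nanoseconds)."""
    if second_of_day == 86400:
        # Leap second: reuse the preceding second's count and push the
        # nanoseconds into the range [1000000000, 2000000000).
        return day * 86400 + 86399, nanos + NANOS_PER_SECOND
    return day * 86400 + second_of_day, nanos

def decode(seconds, nanos):
    day, second_of_day = divmod(seconds, 86400)
    if nanos >= NANOS_PER_SECOND:
        # Extended nanoseconds range marks a leap second.
        return day, second_of_day + 1, nanos - NANOS_PER_SECOND
    return day, second_of_day, nanos

leap_day = 17166   # 2016-12-31, which ended with a leap second

leap = encode(leap_day, 86400, 500)            # 23:59:60.000000500
following = encode(leap_day + 1, 0, 500)       # 00:00:00.000000500
```

The pair for the leap second is distinct from the pair for the
following midnight, but its nanoseconds field is outside the range
that normalised values are expected to occupy.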

The alternative to all of those hacks is to produce a continuous scalar
value that genuinely counts the seconds of UTC.  This is feasible.
It would have a distinct representation for all points on the UTC
time scale.  By being a true scalar value it would fully meet SRFI-19's
description of the time structure, would be represented in normalised
fashion, and would support arithmetic operations on the seconds of UTC
(fixing bug#26164 with no extra effort).
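
To make the idea concrete, here is a minimal Python sketch of such a
linear count.  It is not a proposed patch: the names are invented,
days are counted from 1970-01-01 as an assumption, only positive leap
seconds are handled, and the leap table holds just two real entries
instead of the full list.

```python
from bisect import bisect_left
from datetime import date

UNIX_DAY_ZERO = date(1970, 1, 1)   # assumed origin for day numbers

# Sorted day numbers of days that ended with an inserted leap second.
# Deliberately incomplete: two real entries, enough for illustration.
LEAP_DAYS = [
    (date(2015, 6, 30) - UNIX_DAY_ZERO).days,
    (date(2016, 12, 31) - UNIX_DAY_ZERO).days,
]

def encode(day, second_of_day):
    """Linear count: every UTC second, leap or not, advances it by one."""
    leaps_before = bisect_left(LEAP_DAYS, day)   # leap days strictly earlier
    return day * 86400 + leaps_before + second_of_day

def decode(count):
    """Recover (day, second within day); this direction needs the table."""
    day = count // 86400
    while encode(day, 0) > count:   # step back past accumulated leap seconds
        day -= 1
    return day, count - encode(day, 0)

leap_day = LEAP_DAYS[-1]                  # 2016-12-31
before = encode(leap_day, 86399)          # 23:59:59
leap = encode(leap_day, 86400)            # 23:59:60
following = encode(leap_day + 1, 0)       # 2017-01-01 00:00:00
```

Here the three values are consecutive integers, so subtracting two
counts gives the number of UTC seconds actually elapsed between them.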

The downside is that this is an unusual and somewhat surprising
arrangement.  I've never previously seen a linear count of UTC
seconds brought out as a product of any time library.  It would
mean that a time-utc structure is not an encoding of a UTC time as
normally understood: the date structure would serve that purpose, and
a time-utc would instead have a hybrid meaning halfway between what we
usually think of as UTC and TAI times.  In the leap-seconds era (1972
onwards), the scalar value in a time-utc would be a constant offset
from the scalar value in the corresponding time-tai.  This implies that
conversion operations would be in a different place from where they
are now.  Whereas currently date/time-utc conversions are almost purely
arithmetical and time-utc/time-tai conversions involve the leap second
table, instead date/time-utc conversions would require the leap second
table and time-utc/time-tai conversions would be purely arithmetical
for the leap-seconds era.  (Frequency offsets would come into the
time-utc/time-tai conversions, for times in the rubber-seconds era.)
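
The shift in where the leap second table is needed can be sketched in
Python as follows.  The function names are invented for this example,
both scalars are counted from 1972-01-01 as an assumption (so the
rubber-seconds era is left out entirely), and the table lists the
positive leap seconds through the end of 2016, starting from the
TAI-UTC offset of 10 seconds at the beginning of 1972.

```python
from bisect import bisect_left
from datetime import date

EPOCH = date(1972, 1, 1)   # start of the leap-seconds era
JUN, DEC = (6, 30), (12, 31)
LEAP_DATES = [date(y, *md) for y, md in [
    (1972, JUN), (1972, DEC), (1973, DEC), (1974, DEC), (1975, DEC),
    (1976, DEC), (1977, DEC), (1978, DEC), (1979, DEC), (1981, JUN),
    (1982, JUN), (1983, JUN), (1985, JUN), (1987, DEC), (1989, DEC),
    (1990, DEC), (1992, JUN), (1993, JUN), (1994, JUN), (1995, DEC),
    (1997, JUN), (1998, DEC), (2005, DEC), (2008, DEC), (2012, JUN),
    (2015, JUN), (2016, DEC),
]]
LEAP_DAYS = [(d - EPOCH).days for d in LEAP_DATES]

def leaps_before(day):
    return bisect_left(LEAP_DAYS, day)

def fixed_radix_utc(day, second_of_day):
    """Present-style scalar: purely arithmetical from the date."""
    return day * 86400 + second_of_day

def linear_utc(day, second_of_day):
    """Linear scalar: building it from the date needs the leap table."""
    return day * 86400 + leaps_before(day) + second_of_day

def tai(day, second_of_day):
    """TAI seconds since 1972-01-01T00:00:00 TAI for a UTC (day, second)."""
    return day * 86400 + second_of_day + 10 + leaps_before(day)

start_1972 = 0
start_2017 = (date(2017, 1, 1) - EPOCH).days

# TAI minus the present-style scalar is a step function of the date.
steps = (tai(start_1972, 0) - fixed_radix_utc(start_1972, 0),
         tai(start_2017, 0) - fixed_radix_utc(start_2017, 0))

# TAI minus the linear scalar is the same constant throughout.
constants = (tai(start_1972, 0) - linear_utc(start_1972, 0),
             tai(start_2017, 0) - linear_utc(start_2017, 0))
```

With the present-style scalar the offset to TAI grows from 10 to 37
seconds over this period, so converting needs the table; with the
linear scalar the offset stays at 10 and the table is needed only
when going to or from a date.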

I'm pretty sure that this actually-linear treatment of time-utc is not
what the author of SRFI-19 envisioned.  But it fits the actual words of
the standard better than anything else I can imagine, and would fix a
bunch of problems that otherwise look painful.  I reckon this is the best
way forward.  What do you think?  If you like it, I could work up a patch.

-zefram
