All of lore.kernel.org
 help / color / mirror / Atom feed
From: Amir Goldstein <amir73il@gmail.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>,
	linux-xfs <linux-xfs@vger.kernel.org>,
	Eric Sandeen <sandeen@sandeen.net>
Subject: Re: [PATCH 08/11] xfs: widen ondisk timestamps to deal with y2038 problem
Date: Thu, 20 Aug 2020 08:11:27 +0300	[thread overview]
Message-ID: <CAOQ4uxiinUPDB6K=cZ=4h1hwzOefoLgCR8pF3B+cn3u0HTWj0A@mail.gmail.com> (raw)
In-Reply-To: <20200820000102.GF6096@magnolia>

On Thu, Aug 20, 2020 at 3:03 AM Darrick J. Wong <darrick.wong@oracle.com> wrote:
>
> On Wed, Aug 19, 2020 at 02:43:22PM -0700, Darrick J. Wong wrote:
> > On Wed, Aug 19, 2020 at 09:35:35AM +1000, Dave Chinner wrote:
> > > On Mon, Aug 17, 2020 at 03:57:39PM -0700, Darrick J. Wong wrote:
> > > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > >
> > > > Redesign the ondisk timestamps to be a simple unsigned 64-bit counter of
> > > > nanoseconds since 14 Dec 1901 (i.e. the minimum time in the 32-bit unix
> > > > time epoch).  This enables us to handle dates up to 2486, which solves
> > > > the y2038 problem.
> > > >
> > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > > ---
> > > .....
> > > > +/* Convert an ondisk timestamp into the 64-bit safe incore format. */
> > > >  void
> > > >  xfs_inode_from_disk_timestamp(
> > > > + struct xfs_dinode               *dip,
> > > >   struct timespec64               *tv,
> > > >   const union xfs_timestamp       *ts)
> > > >  {
> > > > + if (dip->di_version >= 3 &&
> > > > +     (dip->di_flags2 & cpu_to_be64(XFS_DIFLAG2_BIGTIME))) {
> > > > +         uint64_t                t = be64_to_cpu(ts->t_bigtime);
> > > > +         uint64_t                s;
> > > > +         uint32_t                n;
> > > > +
> > > > +         s = div_u64_rem(t, NSEC_PER_SEC, &n);
> > > > +         tv->tv_sec = s - XFS_INO_BIGTIME_EPOCH;
> > > > +         tv->tv_nsec = n;
> > > > +         return;
> > > > + }
> > > > +
> > > >   tv->tv_sec = (int)be32_to_cpu(ts->t_sec);
> > > >   tv->tv_nsec = (int)be32_to_cpu(ts->t_nsec);
> > > >  }
> > >
> > > Can't say I'm sold on this union. It seems cleaner to me to just
> > > make the timestamp an opaque 64 bit field on disk and convert it to
> > > the in-memory representation directly in the to/from disk
> > > operations. e.g.:
> > >
> > > void
> > > xfs_inode_from_disk_timestamp(
> > >     struct xfs_dinode               *dip,
> > >     struct timespec64               *tv,
> > >     __be64                          ts)
> > > {
> > >
> > >     uint64_t                t = be64_to_cpu(ts);
> > >     uint64_t                s;
> > >     uint32_t                n;
> > >
> > >     if (xfs_dinode_is_bigtime(dip)) {
> > >             s = div_u64_rem(t, NSEC_PER_SEC, &n) - XFS_INO_BIGTIME_EPOCH;
> > >     } else {
> > >             s = (int)(t >> 32);
> > >             n = (int)(t & 0xffffffff);
> > >     }
> > >     tv->tv_sec = s;
> > >     tv->tv_nsec = n;
> > > }
> >
> > I don't like this open-coded union approach at all because now I have to
> > keep the t_sec and t_nsec bits separate in my head instead of letting
> > the C compiler take care of that detail.  The sample code above doesn't
> > handle that correctly either:
> >
> > Start with an old kernel on a little endian system; each uppercase
> > letter represents a byte (A is the LSB of t_sec, D is the MSB of t_sec,
> > E is the LSB of t_nsec, and H is the MSB of t_nsec):
> >
> >       sec  nsec (incore)
> >       ABCD EFGH
> >
> > That gets written out as:
> >
> >       sec  nsec (ondisk)
> >       DCBA HGFE
> >
> > Now reboot with a new kernel that only knows 64bit timestamps on disk:
> >
> >       64bit (ondisk)
> >       DCBAHGFE
> >
> > Now it does the first be64_to_cpu conversion:
> >       64bit (incore)
> >       EFGHABCD
> >
> > And then masks and shifts:
> >       sec  nsec (incore)
> >       EFGH ABCD
> >
> > Oops, we just switched the values!
> >
> > The correct approach (I think) is to perform the shifting and masking on
> > the raw __be64 value before converting them to incore format via
> > be32_to_cpu, but now I have to work out all four cases by hand instead
> > of letting the compiler do the legwork for me.  I don't remember if it's
> > correct to go around shifting and masking __be64 values.
> >
> > I guess the good news is that at least we have generic/402 to catch
> > these kinds of persistence problems, but ugh.
> >
> > Anyway, what are you afraid of?  The C compiler smoking crack and not
> > actually overlapping the two union elements?  We could control for
> > that...
>
> (Following up on the mailing list with something I pasted into IRC)
>
> Ok, so I temporarily patched up my dev tree with this approximation of
> how that would work, properly done:
>
> diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c
> index c59ddb56bb90..7c71e4440402 100644
> --- a/fs/xfs/libxfs/xfs_inode_buf.c
> +++ b/fs/xfs/libxfs/xfs_inode_buf.c
> @@ -176,8 +176,8 @@ xfs_inode_from_disk_timestamp(
>                 return;
>         }
>
> -       tv->tv_sec = (int)be32_to_cpu(ts->t_sec);
> -       tv->tv_nsec = (int)be32_to_cpu(ts->t_nsec);
> +       tv->tv_sec = (time64_t)be32_to_cpu((__be32)(ts->t_bigtime >> 32));
> +       tv->tv_nsec = be32_to_cpu(ts->t_bigtime & 0xFFFFFFFFU);
>  }
>
>  int
> @@ -294,8 +294,8 @@ xfs_inode_to_disk_timestamp(
>                 return;
>         }
>
> -       ts->t_sec = cpu_to_be32(tv->tv_sec);
> -       ts->t_nsec = cpu_to_be32(tv->tv_nsec);
> +       ts->t_bigtime = (__be64)cpu_to_be32(tv->tv_sec) << 32 |
> +                       cpu_to_be32(tv->tv_nsec);
>  }
>
>  void
> diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c
> index d44e8932979b..5d36d6dea326 100644
> --- a/fs/xfs/xfs_inode_item.c
> +++ b/fs/xfs/xfs_inode_item.c
> @@ -308,8 +308,8 @@ xfs_log_dinode_to_disk_ts(
>                 return;
>         }
>
> -       ts->t_sec = cpu_to_be32(its->t_sec);
> -       ts->t_nsec = cpu_to_be32(its->t_nsec);
> +       ts->t_bigtime = (__be64)cpu_to_be32(its->t_sec) << 32 |
> +                       cpu_to_be32(its->t_nsec);
>  }
>
> And immediately got a ton of smatch warnings:
>
> xfs_inode_buf.c:179:32: warning: restricted __be64 degrades to integer
> xfs_inode_buf.c:179:32: warning: cast to restricted __be32
> xfs_inode_buf.c:179:32: warning: restricted __be64 degrades to integer
> xfs_inode_buf.c:179:32: warning: cast to restricted __be32
> xfs_inode_buf.c:179:32: warning: restricted __be64 degrades to integer
> xfs_inode_buf.c:179:32: warning: cast to restricted __be32
> xfs_inode_buf.c:179:32: warning: restricted __be64 degrades to integer
> xfs_inode_buf.c:179:32: warning: cast to restricted __be32
> xfs_inode_buf.c:179:32: warning: restricted __be64 degrades to integer
> xfs_inode_buf.c:179:32: warning: cast to restricted __be32
> xfs_inode_buf.c:179:32: warning: restricted __be64 degrades to integer
> xfs_inode_buf.c:179:32: warning: cast to restricted __be32
> xfs_inode_buf.c:180:23: warning: restricted __be64 degrades to integer
> xfs_inode_buf.c:180:23: warning: cast to restricted __be32
> xfs_inode_buf.c:180:23: warning: restricted __be64 degrades to integer
> xfs_inode_buf.c:180:23: warning: cast to restricted __be32
> xfs_inode_buf.c:180:23: warning: restricted __be64 degrades to integer
> xfs_inode_buf.c:180:23: warning: cast to restricted __be32
> xfs_inode_buf.c:180:23: warning: restricted __be64 degrades to integer
> xfs_inode_buf.c:180:23: warning: cast to restricted __be32
> xfs_inode_buf.c:180:23: warning: restricted __be64 degrades to integer
> xfs_inode_buf.c:180:23: warning: cast to restricted __be32
> xfs_inode_buf.c:180:23: warning: restricted __be64 degrades to integer
> xfs_inode_buf.c:180:23: warning: cast to restricted __be32
> xfs_inode_buf.c:297:26: warning: cast to restricted __be64
> xfs_inode_buf.c:297:26: warning: cast from restricted __be32
> xfs_inode_buf.c:297:26: warning: restricted __be64 degrades to integer
> xfs_inode_buf.c:298:25: warning: restricted __be32 degrades to integer
> xfs_inode_buf.c:297:23: warning: incorrect type in assignment (different base types)
> xfs_inode_buf.c:297:23:    expected restricted __be64 [usertype] t_bigtime
> xfs_inode_buf.c:297:23:    got unsigned long long
>
> (and even more in xfs_inode_item.c)
>
> So... while we could get rid of the union and hand-decode the timestamp
> from a __be64 on legacy filesystems, I see the static checker complaints
> as a second piece of evidence that this would be unnecessarily risky.
>

And unnecessarily make the code less readable and harder to review.
To what end? Dave writes:
"I just didn't really like the way the code in the encode/decode
helpers turned out..."

Cannot respond to that argument on a technical review.
I can only say that as a reviewer, the posted version was clear and easy
for me to verify and the posted alternative that turned out to have a bug,
I would never have never caught that bug in review and I would not have
felt confident about verifying the code in review either.

Thanks,
Amir.

  parent reply	other threads:[~2020-08-20  5:11 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-17 22:56 [PATCH v2 00/11] xfs: widen timestamps to deal with y2038 Darrick J. Wong
2020-08-17 22:56 ` [PATCH 01/11] xfs: explicitly define inode timestamp range Darrick J. Wong
2020-08-18  6:25   ` Amir Goldstein
2020-08-17 22:57 ` [PATCH 02/11] xfs: refactor quota expiration timer modification Darrick J. Wong
2020-08-18  6:48   ` Amir Goldstein
2020-08-18 15:40     ` Darrick J. Wong
2020-08-17 22:57 ` [PATCH 03/11] xfs: refactor default quota grace period setting code Darrick J. Wong
2020-08-18 10:46   ` Amir Goldstein
2020-08-17 22:57 ` [PATCH 04/11] xfs: remove xfs_timestamp_t Darrick J. Wong
2020-08-18 10:50   ` Amir Goldstein
2020-08-17 22:57 ` [PATCH 05/11] xfs: move xfs_log_dinode_to_disk to the log code Darrick J. Wong
2020-08-18 10:51   ` Amir Goldstein
2020-08-17 22:57 ` [PATCH 06/11] xfs: refactor inode timestamp coding Darrick J. Wong
2020-08-18 11:20   ` Amir Goldstein
2020-08-18 15:42     ` Darrick J. Wong
2020-08-17 22:57 ` [PATCH 07/11] xfs: convert struct xfs_timestamp to union Darrick J. Wong
2020-08-18 11:24   ` Amir Goldstein
2020-08-17 22:57 ` [PATCH 08/11] xfs: widen ondisk timestamps to deal with y2038 problem Darrick J. Wong
2020-08-18 12:00   ` Amir Goldstein
2020-08-18 12:53     ` Amir Goldstein
2020-08-18 15:53       ` Darrick J. Wong
2020-08-18 20:52         ` Darrick J. Wong
2020-08-18 15:44     ` Darrick J. Wong
2020-08-18 23:35   ` Dave Chinner
2020-08-19 21:43     ` Darrick J. Wong
2020-08-19 23:58       ` Dave Chinner
2020-08-20  0:01       ` Darrick J. Wong
2020-08-20  4:42         ` griffin tucker
2020-08-20 16:23           ` Darrick J. Wong
2020-08-21  5:02             ` griffin tucker
2020-08-21 15:31               ` Mike Fleetwood
2020-08-20  5:11         ` Amir Goldstein [this message]
2020-08-20 22:47           ` Dave Chinner
2020-08-17 22:57 ` [PATCH 09/11] xfs: refactor quota timestamp coding Darrick J. Wong
2020-08-18 12:25   ` Amir Goldstein
2020-08-17 22:57 ` [PATCH 10/11] xfs: enable bigtime for quota timers Darrick J. Wong
2020-08-18 13:58   ` Amir Goldstein
2020-08-18 15:59     ` Darrick J. Wong
2020-08-17 22:57 ` [PATCH 11/11] xfs: enable big timestamps Darrick J. Wong
2020-08-18 14:04   ` Amir Goldstein
2020-08-18 23:01 ` [PATCH v2 00/11] xfs: widen timestamps to deal with y2038 Dave Chinner
2020-08-18 23:10   ` Darrick J. Wong
2020-08-18 23:41     ` Dave Chinner
2020-08-21  2:11 [PATCH v3 " Darrick J. Wong
2020-08-21  2:12 ` [PATCH 08/11] xfs: widen ondisk timestamps to deal with y2038 problem Darrick J. Wong
2020-08-22  7:33   ` Christoph Hellwig
2020-08-24  2:43     ` Darrick J. Wong
2020-08-25  0:39       ` Darrick J. Wong
2020-08-24  1:25   ` Dave Chinner
2020-08-24  3:13     ` Darrick J. Wong
2020-08-24  6:15       ` Dave Chinner
2020-08-24 16:24         ` Darrick J. Wong
2020-08-24 21:13           ` Darrick J. Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAOQ4uxiinUPDB6K=cZ=4h1hwzOefoLgCR8pF3B+cn3u0HTWj0A@mail.gmail.com' \
    --to=amir73il@gmail.com \
    --cc=darrick.wong@oracle.com \
    --cc=david@fromorbit.com \
    --cc=linux-xfs@vger.kernel.org \
    --cc=sandeen@sandeen.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.