LKML Archive on lore.kernel.org
* [RFC 00/32] making inode time stamps y2038 ready
@ 2014-05-30 20:01 Arnd Bergmann
  2014-05-30 20:01 ` [RFC 01/32] fs: introduce new 'struct inode_time' Arnd Bergmann
                   ` (34 more replies)
  0 siblings, 35 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-30 20:01 UTC
  To: linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, hpa,
	linux-fsdevel, Arnd Bergmann, ceph-devel, cluster-devel, coda,
	codalist, fuse-devel, linux-afs, linux-btrfs, linux-cifs,
	linux-ext4, linux-f2fs-devel, linux-mtd, linux-nfs,
	linux-ntfs-dev, linux-scsi, logfs, ocfs2-devel, reiserfs-devel,
	samba-technical, xfs

Based on the recent discussion about 64-bit time_t for new
architectures, and for solving the year 2038 problem in general,
I decided to try out what it would take to solve part of the
kernel side of things.

This is a proof-of-concept work to get us to the point where
two system calls (utimes and stat) provide a working interface
to user space to pass 64-bit inode time stamps in and out of
the kernel all the way to the file systems.

I picked this because it is a fairly isolated problem, as the
inode time stamps are rarely assigned to any other time values.
As a byproduct of this work, I documented for each of the file
systems we support how long the on-disk format can work[1].

Obviously we also need to convert all the other syscalls and
have a proper libc implementation using those for this to
be really useful, but it's a start and it can be tested
independently (I have not done so yet, as I want to wait for
initial feedback).

All the interesting stuff is in the first five patches here,
the rest is the straightforward conversion of all file systems
that use 'timespec' values internally.

There are of course a number of open questions:

a) is this the right approach in general? The previous discussion
   pointed this way, but there may be other opinions.
b) what type should we use internally to represent inode time
   stamps? The code contains three different versions that would
   all work; we just have to pick a good tradeoff between
   efficiency and the range of times we want to cover.
c) Should we continue this way for all 32-bit platforms for
   consistency, including future ones, or should we go to
   different 64-bit types right away? My feeling is that the
   second approach would complicate this work.
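
To put rough numbers on question b), here is a minimal user-space
sketch (not part of the series) that estimates the last calendar year
each candidate tv_sec width can represent. The demo_* helper names and
the mean-year constant are assumptions made for illustration; the
34-bit case corresponds to the bit-field variant in patch 01.

```c
#include <stdint.h>

/* Mean Gregorian year in seconds (365.2425 days); an approximation. */
#define DEMO_SECONDS_PER_YEAR 31556952LL

/* Largest value of a signed two's-complement field 'bits' wide (1..64). */
static int64_t demo_max_signed(unsigned int bits)
{
	return (int64_t)((UINT64_C(1) << (bits - 1)) - 1);
}

/* Largest value of an unsigned 32-bit seconds counter. */
static int64_t demo_max_unsigned32(void)
{
	return (int64_t)UINT32_MAX;
}

/* Approximate calendar year for a non-negative seconds-since-1970 value. */
static int64_t demo_approx_year(int64_t sec)
{
	return 1970 + sec / DEMO_SECONDS_PER_YEAR;
}
```

Under these assumptions, a signed 32-bit field ends in 2038, an
unsigned 32-bit field in 2106, and a signed 34-bit field in roughly
2242; a full signed 64-bit field is effectively unbounded.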

	Arnd

[1] http://kernelnewbies.org/y2038

Arnd Bergmann (32):
  fs: introduce new 'struct inode_time'
  uapi: add struct __kernel_timespec{32,64}
  fs: introduce sys_utimens64at
  fs: introduce sys_newfstat64/sys_newfstatat64
  arch: hook up new stat and utimes syscalls
  isofs: fix timestamps beyond 2027
  fs/nfs: convert to struct inode_time
  fs/ceph: convert to 'struct inode_time'
  fs/pstore: convert to struct inode_time
  fs/coda: convert to struct inode_time
  xfs: convert to struct inode_time
  btrfs: convert to struct inode_time
  ext3: convert to struct inode_time
  ext4: convert to struct inode_time
  cifs: convert to struct inode_time
  ntfs: convert to struct inode_time
  ubifs: convert to struct inode_time
  ocfs2: convert to struct inode_time
  fs/fat: convert to struct inode_time
  afs: convert to struct inode_time
  udf: convert to struct inode_time
  fs: convert simple fs to inode_time
  logfs: convert to struct inode_time
  hfs, hfsplus: convert to struct inode_time
  gfs2: convert to struct inode_time
  reiserfs: convert to struct inode_time
  jffs2: convert to struct inode_time
  adfs: convert to struct inode_time
  f2fs: convert to struct inode_time
  fuse: convert to struct inode_time
  scsi: fnic: use current_kernel_time() for timestamp
  fs: use new inode_time definition unconditionally

 arch/alpha/kernel/osf_sys.c        |  2 +-
 arch/arm/include/asm/unistd.h      |  2 +-
 arch/arm/include/uapi/asm/stat.h   | 25 +++++++++++++++++
 arch/arm/include/uapi/asm/unistd.h |  3 +++
 arch/arm/kernel/calls.S            |  3 +++
 arch/arm64/include/asm/unistd32.h  |  5 +++-
 arch/x86/include/uapi/asm/stat.h   | 28 +++++++++++++++++++
 arch/x86/syscalls/syscall_32.tbl   |  3 +++
 drivers/block/rbd.c                |  2 +-
 drivers/firmware/efi/efi-pstore.c  | 28 +++++++++----------
 drivers/scsi/fnic/fnic_trace.c     |  2 +-
 drivers/tty/tty_io.c               |  2 +-
 drivers/usb/gadget/f_fs.c          |  2 +-
 fs/adfs/inode.c                    |  4 +--
 fs/afs/afs.h                       |  6 ++---
 fs/afs/fsclient.c                  |  2 +-
 fs/attr.c                          |  8 +++---
 fs/btrfs/file.c                    |  6 ++---
 fs/btrfs/inode.c                   |  4 +--
 fs/btrfs/ioctl.c                   |  4 +--
 fs/btrfs/root-tree.c               |  2 +-
 fs/btrfs/transaction.c             |  2 +-
 fs/ceph/cache.c                    |  2 +-
 fs/ceph/caps.c                     |  6 ++---
 fs/ceph/file.c                     |  4 +--
 fs/ceph/inode.c                    | 20 +++++++-------
 fs/ceph/super.h                    |  8 +++---
 fs/cifs/cache.c                    |  6 ++---
 fs/cifs/cifsglob.h                 |  6 ++---
 fs/cifs/cifsproto.h                |  6 ++---
 fs/cifs/cifssmb.c                  |  5 ++--
 fs/cifs/inode.c                    |  2 +-
 fs/cifs/netmisc.c                  | 15 ++++++-----
 fs/coda/coda_linux.c               | 18 ++++++++-----
 fs/compat.c                        | 19 ++-----------
 fs/configfs/inode.c                |  6 ++---
 fs/cramfs/inode.c                  |  2 +-
 fs/ext3/inode.c                    |  4 +--
 fs/ext4/ext4.h                     | 10 +++----
 fs/ext4/extents.c                  |  2 +-
 fs/f2fs/file.c                     |  6 ++---
 fs/fat/dir.c                       |  2 +-
 fs/fat/fat.h                       |  6 ++---
 fs/fat/misc.c                      |  4 +--
 fs/fat/namei_msdos.c               |  8 +++---
 fs/fat/namei_vfat.c                | 10 +++----
 fs/fuse/inode.c                    |  6 ++---
 fs/gfs2/dir.c                      |  6 ++---
 fs/gfs2/glops.c                    |  4 +--
 fs/hfs/hfs_fs.h                    |  2 +-
 fs/hfsplus/hfsplus_fs.h            |  2 +-
 fs/inode.c                         | 18 ++++++-------
 fs/isofs/util.c                    |  2 +-
 fs/jffs2/os-linux.h                |  2 +-
 fs/locks.c                         |  4 +--
 fs/logfs/readwrite.c               | 18 ++++++-------
 fs/nfs/callback.h                  |  4 +--
 fs/nfs/callback_xdr.c              |  6 ++---
 fs/nfs/file.c                      |  2 +-
 fs/nfs/fscache-index.c             |  8 +++---
 fs/nfs/inode.c                     | 10 +++----
 fs/nfs/internal.h                  |  4 +--
 fs/nfs/netns.h                     |  2 +-
 fs/nfs/nfs2xdr.c                   |  8 +++---
 fs/nfs/nfs3xdr.c                   | 10 +++----
 fs/nfs/nfs4xdr.c                   | 20 +++++++-------
 fs/nfsd/nfs3xdr.c                  |  6 ++---
 fs/nfsd/nfsfh.h                    |  4 +--
 fs/nfsd/nfsxdr.c                   |  2 +-
 fs/ntfs/inode.c                    | 12 ++++-----
 fs/ntfs/time.h                     |  8 +++---
 fs/ocfs2/dlmglue.c                 | 16 +++++------
 fs/ocfs2/file.c                    |  6 ++---
 fs/ocfs2/ocfs2.h                   |  2 +-
 fs/pstore/inode.c                  |  2 +-
 fs/pstore/internal.h               |  2 +-
 fs/pstore/platform.c               |  2 +-
 fs/pstore/ram.c                    | 18 +++++++------
 fs/reiserfs/namei.c                |  2 +-
 fs/reiserfs/xattr.c                |  4 +--
 fs/stat.c                          | 55 ++++++++++++++++++++++++++++++++++++++
 fs/ubifs/dir.c                     |  2 +-
 fs/ubifs/file.c                    | 16 +++++------
 fs/ubifs/misc.h                    |  2 +-
 fs/udf/udf_i.h                     |  2 +-
 fs/udf/udf_sb.h                    |  2 +-
 fs/udf/udfdecl.h                   |  7 ++---
 fs/udf/udftime.c                   |  7 ++---
 fs/utimes.c                        | 47 +++++++++++++++++++++++++++-----
 fs/xfs/time.h                      |  4 +--
 fs/xfs/xfs_inode.c                 |  2 +-
 fs/xfs/xfs_iops.c                  |  2 +-
 fs/xfs/xfs_trans_inode.c           |  6 ++---
 include/linux/ceph/decode.h        |  8 +++---
 include/linux/ceph/osd_client.h    |  4 +--
 include/linux/compat.h             |  2 +-
 include/linux/fs.h                 | 32 +++++++++++-----------
 include/linux/nfs_fs_sb.h          |  2 +-
 include/linux/nfs_xdr.h            | 14 +++++-----
 include/linux/pstore.h             |  4 +--
 include/linux/stat.h               |  6 ++---
 include/linux/syscalls.h           |  9 ++++++-
 include/linux/time.h               | 44 +++++++++++++++++++++++++++---
 include/uapi/asm-generic/stat.h    | 29 ++++++++++++++++++--
 include/uapi/asm-generic/unistd.h  |  8 +++++-
 include/uapi/linux/coda.h          |  1 +
 include/uapi/linux/time.h          | 40 ++++++++++++++++++++++++++-
 init/initramfs.c                   |  2 +-
 kernel/audit.c                     |  2 +-
 kernel/auditsc.c                   |  2 +-
 kernel/time.c                      | 44 +++++++++++++++++++++++++-----
 kernel/time/timekeeping.c          | 16 +++++++++++
 net/ceph/auth_x.c                  |  2 +-
 net/ceph/osd_client.c              |  4 +--
 114 files changed, 642 insertions(+), 333 deletions(-)

-- 
1.8.3.2

Bcc: "J. Bruce Fields" <bfields@fieldses.org>
Bcc: "Theodore Ts'o" <tytso@mit.edu>
Bcc: Adrian Hunter <adrian.hunter@intel.com>
Bcc: Andreas Dilger <adilger.kernel@dilger.ca>
Bcc: Andrew Morton <akpm@linux-foundation.org>
Bcc: Anton Altaparmakov <anton@tuxera.com>
Bcc: Anton Vorontsov <anton@enomsg.org>
Bcc: Artem Bityutskiy <dedekind1@gmail.com>
Bcc: Brian Uchino <buchino@cisco.com>
Bcc: Chris Mason <clm@fb.com>
Bcc: Colin Cross <ccross@android.com>
Bcc: Dave Chinner <david@fromorbit.com>
Bcc: David Howells <dhowells@redhat.com>
Bcc: David Woodhouse <dwmw2@infradead.org>
Bcc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bcc: Hiral Patel <hiralpat@cisco.com>
Bcc: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Bcc: Jan Harkes <jaharkes@cs.cmu.edu>
Bcc: Jan Kara <jack@suse.cz>
Bcc: Joel Becker <jlbec@evilplan.org>
Bcc: Joern Engel <joern@logfs.org>
Bcc: Josef Bacik <jbacik@fb.com>
Bcc: Kees Cook <keescook@chromium.org>
Bcc: Mark Fasheh <mfasheh@suse.com>
Bcc: Miklos Szeredi <miklos@szeredi.hu>
Bcc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Bcc: Prasad Joshi <prasadjoshi.linux@gmail.com>
Bcc: Sage Weil <sage@inktank.com>
Bcc: Steve French <sfrench@samba.org>
Bcc: Steven Whitehouse <swhiteho@redhat.com>
Bcc: Suma Ramars <sramars@cisco.com>
Bcc: Tony Luck <tony.luck@intel.com>
Cc: ceph-devel@vger.kernel.org
Cc: cluster-devel@redhat.com
Cc: coda@cs.cmu.edu
Cc: codalist@coda.cs.cmu.edu
Cc: fuse-devel@lists.sourceforge.net
Cc: linux-afs@lists.infradead.org
Cc: linux-btrfs@vger.kernel.org
Cc: linux-cifs@vger.kernel.org
Cc: linux-ext4@vger.kernel.org
Cc: linux-f2fs-devel@lists.sourceforge.net
Cc: linux-mtd@lists.infradead.org
Cc: linux-nfs@vger.kernel.org
Cc: linux-ntfs-dev@lists.sourceforge.net
Cc: linux-scsi@vger.kernel.org
Cc: logfs@logfs.org
Cc: ocfs2-devel@oss.oracle.com
Cc: reiserfs-devel@vger.kernel.org
Cc: samba-technical@lists.samba.org
Cc: xfs@oss.sgi.com


* [RFC 01/32] fs: introduce new 'struct inode_time'
  2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
@ 2014-05-30 20:01 ` Arnd Bergmann
  2014-05-31  7:56   ` Geert Uytterhoeven
  2014-05-31  9:03   ` H. Peter Anvin
  2014-05-30 20:01 ` [RFC 02/32] uapi: add struct __kernel_timespec{32,64} Arnd Bergmann
                   ` (33 subsequent siblings)
  34 siblings, 2 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-30 20:01 UTC
  To: linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, hpa,
	linux-fsdevel, Arnd Bergmann

As one part of the puzzle to solve the y2038 problem, this introduces
a new time type to be used for keeping inode timestamps (atime, ctime,
mtime) inside of the kernel.

Initially, this type is defined as 'struct timespec' to allow migrating
all file systems one by one, but the intention is to change the definition
to use either 64-bit signed seconds or 'unsigned long' seconds. On 32-bit
machines, the latter would allow timestamps between 1970 and 2106.
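
The effect of the 'unsigned long' choice on a 32-bit machine can be
sketched in user space roughly as follows. The demo_* names and the
fixed-width types are assumptions for illustration only; they are not
the kernel definitions.

```c
#include <stdint.h>

/* Stand-in for the unsigned variant as it would look with a 32-bit long. */
struct demo_inode_time {
	uint32_t tv_sec;
	int32_t  tv_nsec;
};

/* Same ordering rule as timespec_compare(): <0, 0 or >0. */
static int demo_inode_time_compare(const struct demo_inode_time *lhs,
				   const struct demo_inode_time *rhs)
{
	if (lhs->tv_sec < rhs->tv_sec)
		return -1;
	if (lhs->tv_sec > rhs->tv_sec)
		return 1;
	return lhs->tv_nsec - rhs->tv_nsec;
}

/* The same comparison on signed 32-bit seconds, for contrast. */
static int demo_signed_sec_compare(int32_t lhs, int32_t rhs)
{
	return (lhs > rhs) - (lhs < rhs);
}
```

With unsigned seconds, 0x7fffffff (January 2038) still orders before
0x80000000. If the same two bit patterns are read as signed 32-bit
values, the second one becomes INT32_MIN and sorts before the first.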

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 fs/attr.c                 |  8 +++---
 fs/inode.c                | 18 ++++++-------
 fs/locks.c                |  4 +--
 include/linux/fs.h        | 32 +++++++++++-----------
 include/linux/stat.h      |  6 ++---
 include/linux/time.h      | 69 ++++++++++++++++++++++++++++++++++++++++++++---
 kernel/audit.c            |  2 +-
 kernel/auditsc.c          |  2 +-
 kernel/time.c             | 44 +++++++++++++++++++++++++-----
 kernel/time/timekeeping.c | 16 +++++++++++
 10 files changed, 155 insertions(+), 46 deletions(-)

diff --git a/fs/attr.c b/fs/attr.c
index 5d4e59d..62a9d28 100644
--- a/fs/attr.c
+++ b/fs/attr.c
@@ -148,13 +148,13 @@ void setattr_copy(struct inode *inode, const struct iattr *attr)
 	if (ia_valid & ATTR_GID)
 		inode->i_gid = attr->ia_gid;
 	if (ia_valid & ATTR_ATIME)
-		inode->i_atime = timespec_trunc(attr->ia_atime,
+		inode->i_atime = inode_time_trunc(attr->ia_atime,
 						inode->i_sb->s_time_gran);
 	if (ia_valid & ATTR_MTIME)
-		inode->i_mtime = timespec_trunc(attr->ia_mtime,
+		inode->i_mtime = inode_time_trunc(attr->ia_mtime,
 						inode->i_sb->s_time_gran);
 	if (ia_valid & ATTR_CTIME)
-		inode->i_ctime = timespec_trunc(attr->ia_ctime,
+		inode->i_ctime = inode_time_trunc(attr->ia_ctime,
 						inode->i_sb->s_time_gran);
 	if (ia_valid & ATTR_MODE) {
 		umode_t mode = attr->ia_mode;
@@ -192,7 +192,7 @@ int notify_change(struct dentry * dentry, struct iattr * attr, struct inode **de
 	struct inode *inode = dentry->d_inode;
 	umode_t mode = inode->i_mode;
 	int error;
-	struct timespec now;
+	struct inode_time now;
 	unsigned int ia_valid = attr->ia_valid;
 
 	WARN_ON_ONCE(!mutex_is_locked(&inode->i_mutex));
diff --git a/fs/inode.c b/fs/inode.c
index 2feb9b6..e123f4c 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1464,7 +1464,7 @@ EXPORT_SYMBOL(bmap);
  * passed since the last atime update.
  */
 static int relatime_need_update(struct vfsmount *mnt, struct inode *inode,
-			     struct timespec now)
+				struct inode_time now)
 {
 
 	if (!(mnt->mnt_flags & MNT_RELATIME))
@@ -1472,12 +1472,12 @@ static int relatime_need_update(struct vfsmount *mnt, struct inode *inode,
 	/*
 	 * Is mtime younger than atime? If yes, update atime:
 	 */
-	if (timespec_compare(&inode->i_mtime, &inode->i_atime) >= 0)
+	if (inode_time_compare(&inode->i_mtime, &inode->i_atime) >= 0)
 		return 1;
 	/*
 	 * Is ctime younger than atime? If yes, update atime:
 	 */
-	if (timespec_compare(&inode->i_ctime, &inode->i_atime) >= 0)
+	if (inode_time_compare(&inode->i_ctime, &inode->i_atime) >= 0)
 		return 1;
 
 	/*
@@ -1496,7 +1496,7 @@ static int relatime_need_update(struct vfsmount *mnt, struct inode *inode,
  * This does the actual work of updating an inodes time or version.  Must have
  * had called mnt_want_write() before calling this.
  */
-static int update_time(struct inode *inode, struct timespec *time, int flags)
+static int update_time(struct inode *inode, struct inode_time *time, int flags)
 {
 	if (inode->i_op->update_time)
 		return inode->i_op->update_time(inode, time, flags);
@@ -1525,7 +1525,7 @@ void touch_atime(const struct path *path)
 {
 	struct vfsmount *mnt = path->mnt;
 	struct inode *inode = path->dentry->d_inode;
-	struct timespec now;
+	struct inode_time now;
 
 	if (inode->i_flags & S_NOATIME)
 		return;
@@ -1544,7 +1544,7 @@ void touch_atime(const struct path *path)
 	if (!relatime_need_update(mnt, inode, now))
 		return;
 
-	if (timespec_equal(&inode->i_atime, &now))
+	if (inode_time_equal(&inode->i_atime, &now))
 		return;
 
 	if (!sb_start_write_trylock(inode->i_sb))
@@ -1653,7 +1653,7 @@ EXPORT_SYMBOL(file_remove_suid);
 int file_update_time(struct file *file)
 {
 	struct inode *inode = file_inode(file);
-	struct timespec now;
+	struct inode_time now;
 	int sync_it = 0;
 	int ret;
 
@@ -1662,10 +1662,10 @@ int file_update_time(struct file *file)
 		return 0;
 
 	now = current_fs_time(inode->i_sb);
-	if (!timespec_equal(&inode->i_mtime, &now))
+	if (!inode_time_equal(&inode->i_mtime, &now))
 		sync_it = S_MTIME;
 
-	if (!timespec_equal(&inode->i_ctime, &now))
+	if (!inode_time_equal(&inode->i_ctime, &now))
 		sync_it |= S_CTIME;
 
 	if (IS_I_VERSION(inode))
diff --git a/fs/locks.c b/fs/locks.c
index da57c9b..1d9bb23 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1423,13 +1423,13 @@ EXPORT_SYMBOL(__break_lease);
 /**
  *	lease_get_mtime - get the last modified time of an inode
  *	@inode: the inode
- *      @time:  pointer to a timespec which will contain the last modified time
+ *      @time:  pointer to an inode_time which will contain the last modified time
  *
  * This is to force NFS clients to flush their caches for files with
  * exclusive leases.  The justification is that if someone has an
  * exclusive lease, then they could be modifying it.
  */
-void lease_get_mtime(struct inode *inode, struct timespec *time)
+void lease_get_mtime(struct inode *inode, struct inode_time *time)
 {
 	struct file_lock *flock = inode->i_flock;
 	if (flock && IS_LEASE(flock) && (flock->fl_type == F_WRLCK))
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 1cab2f8..5ee58bf 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -234,21 +234,21 @@ typedef void (dio_iodone_t)(struct kiocb *iocb, loff_t offset,
  * Derek Atkins <warlord@MIT.EDU> 94-10-20
  */
 struct iattr {
-	unsigned int	ia_valid;
-	umode_t		ia_mode;
-	kuid_t		ia_uid;
-	kgid_t		ia_gid;
-	loff_t		ia_size;
-	struct timespec	ia_atime;
-	struct timespec	ia_mtime;
-	struct timespec	ia_ctime;
+	unsigned int		ia_valid;
+	umode_t			ia_mode;
+	kuid_t			ia_uid;
+	kgid_t			ia_gid;
+	loff_t			ia_size;
+	struct inode_time	ia_atime;
+	struct inode_time	ia_mtime;
+	struct inode_time	ia_ctime;
 
 	/*
 	 * Not an attribute, but an auxiliary info for filesystems wanting to
 	 * implement an ftruncate() like method.  NOTE: filesystem should
 	 * check for (ia_valid & ATTR_FILE), and not for (ia_file != NULL).
 	 */
-	struct file	*ia_file;
+	struct file		*ia_file;
 };
 
 /*
@@ -534,9 +534,9 @@ struct inode {
 	};
 	dev_t			i_rdev;
 	loff_t			i_size;
-	struct timespec		i_atime;
-	struct timespec		i_mtime;
-	struct timespec		i_ctime;
+	struct inode_time	i_atime;
+	struct inode_time	i_mtime;
+	struct inode_time	i_ctime;
 	spinlock_t		i_lock;	/* i_blocks, i_bytes, maybe i_size */
 	unsigned short          i_bytes;
 	unsigned int		i_blkbits;
@@ -954,7 +954,7 @@ extern int vfs_lock_file(struct file *, unsigned int, struct file_lock *, struct
 extern int vfs_cancel_lock(struct file *filp, struct file_lock *fl);
 extern int flock_lock_file_wait(struct file *filp, struct file_lock *fl);
 extern int __break_lease(struct inode *inode, unsigned int flags, unsigned int type);
-extern void lease_get_mtime(struct inode *, struct timespec *time);
+extern void lease_get_mtime(struct inode *, struct inode_time *time);
 extern int generic_setlease(struct file *, long, struct file_lock **);
 extern int vfs_setlease(struct file *, long, struct file_lock **);
 extern int lease_modify(struct file_lock **, int);
@@ -1069,7 +1069,7 @@ static inline int __break_lease(struct inode *inode, unsigned int mode, unsigned
 	return 0;
 }
 
-static inline void lease_get_mtime(struct inode *inode, struct timespec *time)
+static inline void lease_get_mtime(struct inode *inode, struct inode_time *time)
 {
 	return;
 }
@@ -1260,7 +1260,7 @@ struct super_block {
 	struct rcu_head		rcu;
 };
 
-extern struct timespec current_fs_time(struct super_block *sb);
+extern struct inode_time current_fs_time(struct super_block *sb);
 
 /*
  * Snapshotting support.
@@ -1514,7 +1514,7 @@ struct inode_operations {
 	int (*removexattr) (struct dentry *, const char *);
 	int (*fiemap)(struct inode *, struct fiemap_extent_info *, u64 start,
 		      u64 len);
-	int (*update_time)(struct inode *, struct timespec *, int);
+	int (*update_time)(struct inode *, struct inode_time *, int);
 	int (*atomic_open)(struct inode *, struct dentry *,
 			   struct file *, unsigned open_flag,
 			   umode_t create_mode, int *opened);
diff --git a/include/linux/stat.h b/include/linux/stat.h
index 075cb0c..c867e29 100644
--- a/include/linux/stat.h
+++ b/include/linux/stat.h
@@ -27,9 +27,9 @@ struct kstat {
 	kgid_t		gid;
 	dev_t		rdev;
 	loff_t		size;
-	struct timespec  atime;
-	struct timespec	mtime;
-	struct timespec	ctime;
+	struct inode_time atime;
+	struct inode_time mtime;
+	struct inode_time ctime;
 	unsigned long	blksize;
 	unsigned long long	blocks;
 };
diff --git a/include/linux/time.h b/include/linux/time.h
index d5d229b..e2d5aa2 100644
--- a/include/linux/time.h
+++ b/include/linux/time.h
@@ -6,6 +6,45 @@
 # include <linux/math64.h>
 #include <uapi/linux/time.h>
 
+#ifdef CONFIG_NEW_INODE_TIME
+/*
+ * This is the type we use internally in the kernel to represent
+ * absolute times in file system metadata.
+ * This structure must not leak out to user space, and new interfaces
+ * should be using 64-bit types right away.
+ */
+
+/*
+ * Variant a) using unsigned seconds lets us extend the life span
+ * for another 68 years beyond 2038.
+ */
+struct inode_time {
+	unsigned long	tv_sec;
+	long		tv_nsec;
+};
+#elif 0
+/*
+ * This variant can represent the widest range of times, but also
+ * bloats 'struct inode' a little more.
+ */
+struct inode_time {
+	long long	tv_sec __attribute__((packed));
+	int		tv_nsec;
+};
+#elif 0
+/*
+ * The variant using bit fields is less efficient to access, but
+ * small and has a wider range than the 32-bit one, plus it keeps
+ * the signedness of the original timespec.
+ */
+struct inode_time {
+	long long	tv_sec	: 34;
+	int		tv_nsec : 30;
+};
+#else
+#define inode_time timespec
+#endif
+
 extern struct timezone sys_tz;
 
 /* Parameters used to convert the timespec values: */
@@ -25,6 +64,12 @@ static inline int timespec_equal(const struct timespec *a,
 	return (a->tv_sec == b->tv_sec) && (a->tv_nsec == b->tv_nsec);
 }
 
+static inline int inode_time_equal(const struct inode_time *a,
+                                 const struct inode_time *b)
+{
+	return (a->tv_sec == b->tv_sec) && (a->tv_nsec == b->tv_nsec);
+}
+
 /*
  * lhs < rhs:  return <0
  * lhs == rhs: return 0
@@ -39,6 +84,15 @@ static inline int timespec_compare(const struct timespec *lhs, const struct time
 	return lhs->tv_nsec - rhs->tv_nsec;
 }
 
+static inline int inode_time_compare(const struct inode_time *lhs, const struct inode_time *rhs)
+{
+	if (lhs->tv_sec < rhs->tv_sec)
+		return -1;
+	if (lhs->tv_sec > rhs->tv_sec)
+		return 1;
+	return lhs->tv_nsec - rhs->tv_nsec;
+}
+
 static inline int timeval_compare(const struct timeval *lhs, const struct timeval *rhs)
 {
 	if (lhs->tv_sec < rhs->tv_sec)
@@ -131,14 +185,15 @@ extern int timekeeping_suspended;
 
 unsigned long get_seconds(void);
 struct timespec current_kernel_time(void);
+struct inode_time current_inode_time(void);
 struct timespec __current_kernel_time(void); /* does not take xtime_lock */
 struct timespec get_monotonic_coarse(void);
 void get_xtime_and_monotonic_and_sleep_offset(struct timespec *xtim,
 				struct timespec *wtom, struct timespec *sleep);
 void timekeeping_inject_sleeptime(struct timespec *delta);
 
-#define CURRENT_TIME		(current_kernel_time())
-#define CURRENT_TIME_SEC	((struct timespec) { get_seconds(), 0 })
+#define CURRENT_TIME		(current_inode_time())
+#define CURRENT_TIME_SEC	((struct inode_time) { get_seconds(), 0 })
 
 /* Some architectures do not supply their own clocksource.
  * This is mainly the case in architectures that get their
@@ -173,7 +228,7 @@ extern void getboottime(struct timespec *ts);
 extern void monotonic_to_bootbased(struct timespec *ts);
 extern void get_monotonic_boottime(struct timespec *ts);
 
-extern struct timespec timespec_trunc(struct timespec t, unsigned gran);
+extern struct inode_time inode_time_trunc(struct inode_time t, unsigned gran);
 extern int timekeeping_valid_for_hres(void);
 extern u64 timekeeping_max_deferment(void);
 extern int timekeeping_inject_offset(struct timespec *ts);
@@ -246,6 +301,14 @@ static inline s64 timeval_to_ns(const struct timeval *tv)
 extern struct timespec ns_to_timespec(const s64 nsec);
 
 /**
+ * ns_to_inode_time - Convert nanoseconds to inode_time
+ * @nsec:	the nanoseconds value to be converted
+ *
+ * Returns the inode_time representation of the nsec parameter.
+ */
+extern struct inode_time ns_to_inode_time(const s64 nsec);
+
+/**
  * ns_to_timeval - Convert nanoseconds to timeval
  * @nsec:	the nanoseconds value to be converted
  *
diff --git a/kernel/audit.c b/kernel/audit.c
index 3ef2e0e..2440add 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -1320,7 +1320,7 @@ static inline void audit_get_stamp(struct audit_context *ctx,
 				   struct timespec *t, unsigned int *serial)
 {
 	if (!ctx || !auditsc_get_stamp(ctx, t, serial)) {
-		*t = CURRENT_TIME;
+		*t = current_kernel_time();
 		*serial = audit_serial();
 	}
 }
diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index b12a712..041ec4e 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -1543,7 +1543,7 @@ void __audit_syscall_entry(int major, unsigned long a1, unsigned long a2,
 		return;
 
 	context->serial     = 0;
-	context->ctime      = CURRENT_TIME;
+	context->ctime      = current_kernel_time();
 	context->in_syscall = 1;
 	context->current_state  = state;
 	context->ppid       = 0;
diff --git a/kernel/time.c b/kernel/time.c
index 7c7964c..40a25a7 100644
--- a/kernel/time.c
+++ b/kernel/time.c
@@ -228,10 +228,15 @@ SYSCALL_DEFINE1(adjtimex, struct timex __user *, txc_p)
  * Return the current time truncated to the time granularity supported by
  * the fs.
  */
-struct timespec current_fs_time(struct super_block *sb)
+struct inode_time current_fs_time(struct super_block *sb)
 {
-	struct timespec now = current_kernel_time();
-	return timespec_trunc(now, sb->s_time_gran);
+	/* FIXME: current_kernel_time may be 32-bit */
+	struct timespec ts = current_kernel_time();
+	struct inode_time now = (struct inode_time) {
+		.tv_sec = ts.tv_sec,
+		.tv_nsec = ts.tv_nsec,
+	};
+	return inode_time_trunc(now, sb->s_time_gran);
 }
 EXPORT_SYMBOL(current_fs_time);
 
@@ -274,8 +279,8 @@ unsigned int jiffies_to_usecs(const unsigned long j)
 EXPORT_SYMBOL(jiffies_to_usecs);
 
 /**
- * timespec_trunc - Truncate timespec to a granularity
- * @t: Timespec
+ * inode_time_trunc - Truncate inode_time to a granularity
+ * @t: inode time
  * @gran: Granularity in ns.
  *
  * Truncate a timespec to a granularity. gran must be smaller than a second.
@@ -285,7 +290,7 @@ EXPORT_SYMBOL(jiffies_to_usecs);
  * current_kernel_time() or CURRENT_TIME, not with do_gettimeofday() because
  * it doesn't handle the better resolution of the latter.
  */
-struct timespec timespec_trunc(struct timespec t, unsigned gran)
+struct inode_time inode_time_trunc(struct inode_time t, unsigned gran)
 {
 	/*
 	 * Division is pretty slow so avoid it for common cases.
@@ -301,7 +306,7 @@ struct timespec timespec_trunc(struct timespec t, unsigned gran)
 	}
 	return t;
 }
-EXPORT_SYMBOL(timespec_trunc);
+EXPORT_SYMBOL(inode_time_trunc);
 
 /* Converts Gregorian date to seconds since 1970-01-01 00:00:00.
  * Assumes input in normal date format, i.e. 1980-12-31 23:59:59
@@ -403,6 +408,31 @@ struct timespec ns_to_timespec(const s64 nsec)
 EXPORT_SYMBOL(ns_to_timespec);
 
 /**
+ * ns_to_inode_time - Convert nanoseconds to inode_time
+ * @nsec:       the nanoseconds value to be converted
+ *
+ * Returns the inode_time representation of the nsec parameter.
+ */
+struct inode_time ns_to_inode_time(const s64 nsec)
+{
+	struct inode_time ts;
+	s32 rem;
+
+	if (!nsec)
+		return (struct inode_time) {0, 0};
+
+	ts.tv_sec = div_s64_rem(nsec, NSEC_PER_SEC, &rem);
+	if (unlikely(rem < 0)) {
+		ts.tv_sec--;
+		rem += NSEC_PER_SEC;
+	}
+	ts.tv_nsec = rem;
+
+	return ts;
+}
+EXPORT_SYMBOL(ns_to_inode_time);
+
+/**
  * ns_to_timeval - Convert nanoseconds to timeval
  * @nsec:       the nanoseconds value to be converted
  *
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 32d8d6a..c0c4a18 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1559,6 +1559,22 @@ struct timespec current_kernel_time(void)
 }
 EXPORT_SYMBOL(current_kernel_time);
 
+struct inode_time current_inode_time(void)
+{
+	struct timekeeper *tk = &timekeeper;
+	struct timespec now;
+	unsigned long seq;
+
+	do {
+		seq = read_seqcount_begin(&timekeeper_seq);
+
+		now = tk_xtime(tk);
+	} while (read_seqcount_retry(&timekeeper_seq, seq));
+
+	return (struct inode_time) { now.tv_sec, now.tv_nsec };
+}
+EXPORT_SYMBOL(current_inode_time);
+
 struct timespec get_monotonic_coarse(void)
 {
 	struct timekeeper *tk = &timekeeper;
-- 
1.8.3.2



* [RFC 02/32] uapi: add struct __kernel_timespec{32,64}
  2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
  2014-05-30 20:01 ` [RFC 01/32] fs: introduce new 'struct inode_time' Arnd Bergmann
@ 2014-05-30 20:01 ` Arnd Bergmann
  2014-05-30 20:18   ` H. Peter Anvin
  2014-05-30 20:01 ` [RFC 03/32] fs: introduce sys_utimens64at Arnd Bergmann
                   ` (32 subsequent siblings)
  34 siblings, 1 reply; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-30 20:01 UTC
  To: linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, hpa,
	linux-fsdevel, Arnd Bergmann

We cannot use time_t or any derived structures beyond the year
2038 in interfaces between kernel and user space, on 32-bit
machines.

This is my suggestion for how to migrate syscall and ioctl
interfaces: We completely phase out time_t, timeval and timespec
from the uapi header files and replace them with types that are
either explicitly safe (__kernel_timespec64), or explicitly
unsafe (e.g. __kernel_timespec32). For each unsafe interface,
there needs to be a safe replacement interface.
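
As a rough illustration of what "safe" and "unsafe" mean here, the
following user-space sketch mirrors the two layouts and converts
between them. The demo_* names are hypothetical, 'int' is assumed to
be 32 bits wide, and the error codes returned on narrowing are an
assumption of this sketch, not something the patch specifies.

```c
#include <errno.h>
#include <stdint.h>

/* User-space mirrors of the two proposed layouts; names are illustrative. */
struct demo_timespec64 {
	long long tv_sec;
	long long tv_nsec;
};

struct demo_timespec32 {
	int tv_sec;
	int tv_nsec;
};

/* Widening from the 32-bit layout is always lossless. */
static struct demo_timespec64 demo_ts32_to_ts64(struct demo_timespec32 in)
{
	struct demo_timespec64 out = { in.tv_sec, in.tv_nsec };

	return out;
}

/*
 * Narrowing can fail once the seconds no longer fit in 32 bits.
 * The choice of -EOVERFLOW and -EINVAL is an assumption of this sketch.
 */
static int demo_ts64_to_ts32(struct demo_timespec64 in,
			     struct demo_timespec32 *out)
{
	if (in.tv_sec > INT32_MAX || in.tv_sec < INT32_MIN)
		return -EOVERFLOW;
	if (in.tv_nsec < 0 || in.tv_nsec > 999999999LL)
		return -EINVAL;
	out->tv_sec = (int)in.tv_sec;
	out->tv_nsec = (int)in.tv_nsec;
	return 0;
}
```

The asymmetry is the point: every legacy value has a representation in
the 64-bit layout, while the reverse direction needs an explicit check.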

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 include/uapi/linux/time.h | 40 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 39 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/time.h b/include/uapi/linux/time.h
index e75e1b6..e2b56a3 100644
--- a/include/uapi/linux/time.h
+++ b/include/uapi/linux/time.h
@@ -3,7 +3,18 @@
 
 #include <linux/types.h>
 
-
+/*
+ * time_t, timespec and timeval are not safe to use beyond
+ * 2038 on 32-bit systems, and should be phased out for
+ * in-kernel uses as well as interfaces to user space.
+ *
+ * Inside of the kernel, we can use e.g. inode_time,
+ * ktime_t or timespec64, as appropriate.
+ *
+ * In the long run, we have to stop making these definitions
+ * visible to user headers, so libc can define its own
+ * 64-bit types.
+ */
 #ifndef _STRUCT_TIMESPEC
 #define _STRUCT_TIMESPEC
 struct timespec {
@@ -17,6 +28,33 @@ struct timeval {
 	__kernel_suseconds_t	tv_usec;	/* microseconds */
 };
 
+/*
+ * __kernel_timespec64 is the general type to be used for
+ * new user space interfaces passing a time argument.
+ * 64-bit nanoseconds is a bit silly, but the advantage is
+ * that it is compatible with the native 'struct timespec'
+ * on 64-bit user space. This simplifies the compat code.
+ */
+struct __kernel_timespec64 {
+	long long tv_sec;
+	long long tv_nsec;
+};
+
+/*
+ * As interfaces get moved over from time_t, timeval and timespec
+ * to __kernel_timespec64, we have to provide backwards compatibility
+ * interfaces. These can use __kernel_timespec32. Other types will
+ * be needed as required.
+ * The compat syscalls and ioctls can also migrate from compat_timespec
+ * to __kernel_timespec32 in order to share the implementation with
+ * the native 32-bit legacy handlers.
+ */
+struct __kernel_timespec32 {
+	int	tv_sec;
+	int	tv_nsec;
+};
+
+/* timezone is safe for use beyond 2038 */
 struct timezone {
 	int	tz_minuteswest;	/* minutes west of Greenwich */
 	int	tz_dsttime;	/* type of dst correction */
-- 
1.8.3.2


^ permalink raw reply	[flat|nested] 124+ messages in thread

* [RFC 03/32] fs: introduce sys_utimens64at
  2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
  2014-05-30 20:01 ` [RFC 01/32] fs: introduce new 'struct inode_time' Arnd Bergmann
  2014-05-30 20:01 ` [RFC 02/32] uapi: add struct __kernel_timespec{32,64} Arnd Bergmann
@ 2014-05-30 20:01 ` Arnd Bergmann
  2014-05-31  9:22   ` Andreas Schwab
  2014-05-30 20:01 ` [RFC 04/32] fs: introduce sys_newfstat64/sys_newfstatat64 Arnd Bergmann
                   ` (31 subsequent siblings)
  34 siblings, 1 reply; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-30 20:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, hpa,
	linux-fsdevel, Arnd Bergmann

This introduces a new variant of the utime/utimes/futimesat/utimensat
system call (used by /usr/bin/touch), which fixes the 32-bit limitation
of time_t at the kernel/user boundary for 32-bit machines.

Each of the variants is a strict superset of the functionality of the
previous ones, so we only need to add one more and let the libc
emulate the other interfaces based on that.

This moves over the existing compat_sys_utimensat implementation
from fs/compat.c into fs/utimes.c and changes the data types so
we use __kernel_timespec64 for the new native code path and use
__kernel_timespec32 for the compatibility with existing 32-bit
code, independent of whether we run on 32 or 64-bit CPUs.
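
The widening step in the legacy path can be sketched in user space as
follows. This is only an illustration: 'struct ts32' and 'struct ts64'
are hypothetical stand-ins for __kernel_timespec32 and
__kernel_timespec64, and ts32_to_ts64() is not a function from this
patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for __kernel_timespec32/__kernel_timespec64. */
struct ts32 { int32_t tv_sec; int32_t tv_nsec; };
struct ts64 { int64_t tv_sec; int64_t tv_nsec; };

/*
 * Widen a legacy 32-bit time stamp. Both fields are signed, so the
 * assignment sign-extends and times before 1970 stay negative.
 */
struct ts64 ts32_to_ts64(struct ts32 in)
{
	struct ts64 out = { .tv_sec = in.tv_sec, .tv_nsec = in.tv_nsec };

	return out;
}

/* Small self-check showing the intended behaviour. */
void ts32_to_ts64_selfcheck(void)
{
	struct ts32 before_epoch = { -1, 5 };

	assert(ts32_to_ts64(before_epoch).tv_sec == -1);
	assert(ts32_to_ts64(before_epoch).tv_nsec == 5);
}
```

Since the special tv_nsec values (UTIME_NOW, UTIME_OMIT) fit into
32 bits, they pass through this conversion unchanged.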

Other patches in this series take care of the in-kernel handling of
inode times, but the full solution will require many other patches
for system calls passing time_t values, and of course a C library with
adaptations to use those.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/alpha/kernel/osf_sys.c |  2 +-
 fs/compat.c                 | 19 ++----------------
 fs/utimes.c                 | 47 ++++++++++++++++++++++++++++++++++++++-------
 include/linux/compat.h      |  2 +-
 include/linux/syscalls.h    |  9 ++++++++-
 include/linux/time.h        |  2 +-
 init/initramfs.c            |  2 +-
 7 files changed, 54 insertions(+), 29 deletions(-)

diff --git a/arch/alpha/kernel/osf_sys.c b/arch/alpha/kernel/osf_sys.c
index 1402fcc..96b4903 100644
--- a/arch/alpha/kernel/osf_sys.c
+++ b/arch/alpha/kernel/osf_sys.c
@@ -1070,7 +1070,7 @@ SYSCALL_DEFINE3(osf_setitimer, int, which, struct itimerval32 __user *, in,
 SYSCALL_DEFINE2(osf_utimes, const char __user *, filename,
 		struct timeval32 __user *, tvs)
 {
-	struct timespec tv[2];
+	struct __kernel_timespec64 tv[2];
 
 	if (tvs) {
 		struct timeval ktvs[2];
diff --git a/fs/compat.c b/fs/compat.c
index 66d3d3c..1e281f3 100644
--- a/fs/compat.c
+++ b/fs/compat.c
@@ -75,7 +75,7 @@ int compat_printk(const char *fmt, ...)
 COMPAT_SYSCALL_DEFINE2(utime, const char __user *, filename,
 		       struct compat_utimbuf __user *, t)
 {
-	struct timespec tv[2];
+	struct __kernel_timespec64 tv[2];
 
 	if (t) {
 		if (get_user(tv[0].tv_sec, &t->actime) ||
@@ -87,24 +87,9 @@ COMPAT_SYSCALL_DEFINE2(utime, const char __user *, filename,
 	return do_utimes(AT_FDCWD, filename, t ? tv : NULL, 0);
 }
 
-COMPAT_SYSCALL_DEFINE4(utimensat, unsigned int, dfd, const char __user *, filename, struct compat_timespec __user *, t, int, flags)
-{
-	struct timespec tv[2];
-
-	if  (t) {
-		if (compat_get_timespec(&tv[0], &t[0]) ||
-		    compat_get_timespec(&tv[1], &t[1]))
-			return -EFAULT;
-
-		if (tv[0].tv_nsec == UTIME_OMIT && tv[1].tv_nsec == UTIME_OMIT)
-			return 0;
-	}
-	return do_utimes(dfd, filename, t ? tv : NULL, flags);
-}
-
 COMPAT_SYSCALL_DEFINE3(futimesat, unsigned int, dfd, const char __user *, filename, struct compat_timeval __user *, t)
 {
-	struct timespec tv[2];
+	struct __kernel_timespec64 tv[2];
 
 	if (t) {
 		if (get_user(tv[0].tv_sec, &t[0].tv_sec) ||
diff --git a/fs/utimes.c b/fs/utimes.c
index aa138d6..89c23ce 100644
--- a/fs/utimes.c
+++ b/fs/utimes.c
@@ -26,7 +26,7 @@
  */
 SYSCALL_DEFINE2(utime, char __user *, filename, struct utimbuf __user *, times)
 {
-	struct timespec tv[2];
+	struct __kernel_timespec64 tv[2];
 
 	if (times) {
 		if (get_user(tv[0].tv_sec, &times->actime) ||
@@ -48,7 +48,7 @@ static bool nsec_valid(long nsec)
 	return nsec >= 0 && nsec <= 999999999;
 }
 
-static int utimes_common(struct path *path, struct timespec *times)
+static int utimes_common(struct path *path, struct __kernel_timespec64 *times)
 {
 	int error;
 	struct iattr newattrs;
@@ -133,8 +133,8 @@ out:
  * must be owner or have write permission.
  * Else, update from *times, must be owner or super user.
  */
-long do_utimes(int dfd, const char __user *filename, struct timespec *times,
-	       int flags)
+long do_utimes(int dfd, const char __user *filename,
+	       struct __kernel_timespec64 *times, int flags)
 {
 	int error = -EINVAL;
 
@@ -182,10 +182,15 @@ out:
 	return error;
 }
 
+#ifdef CONFIG_64BIT
 SYSCALL_DEFINE4(utimensat, int, dfd, const char __user *, filename,
-		struct timespec __user *, utimes, int, flags)
+		struct __kernel_timespec64 __user *, utimes, int, flags)
+#else
+SYSCALL_DEFINE4(utimens64at, int, dfd, const char __user *, filename,
+		struct __kernel_timespec64 __user *, utimes, int, flags)
+#endif
 {
-	struct timespec tstimes[2];
+	struct __kernel_timespec64 tstimes[2];
 
 	if (utimes) {
 		if (copy_from_user(&tstimes, utimes, sizeof(tstimes)))
@@ -200,11 +205,39 @@ SYSCALL_DEFINE4(utimensat, int, dfd, const char __user *, filename,
 	return do_utimes(dfd, filename, utimes ? tstimes : NULL, flags);
 }
 
+#ifdef CONFIG_64BIT
+COMPAT_SYSCALL_DEFINE4(utimensat, unsigned int, dfd, const char __user *, filename,
+		struct __kernel_timespec32 __user *, utimes, int, flags)
+#else
+SYSCALL_DEFINE4(utimensat, int, dfd, const char __user *, filename,
+		struct __kernel_timespec32 __user *, utimes, int, flags)
+#endif
+{
+	struct __kernel_timespec64 tstimes64[2];
+	struct __kernel_timespec32 tstimes[2];
+
+	if (utimes) {
+		if (copy_from_user(&tstimes, utimes, sizeof(tstimes)))
+			return -EFAULT;
+
+		/* Nothing to do, we must not even check the path.  */
+		if (tstimes[0].tv_nsec == UTIME_OMIT &&
+		    tstimes[1].tv_nsec == UTIME_OMIT)
+			return 0;
+		tstimes64[0].tv_sec = tstimes[0].tv_sec;
+		tstimes64[0].tv_nsec = tstimes[0].tv_nsec;
+		tstimes64[1].tv_sec = tstimes[1].tv_sec;
+		tstimes64[1].tv_nsec = tstimes[1].tv_nsec;
+	}
+
+	return do_utimes(dfd, filename, utimes ? tstimes64 : NULL, flags);
+}
+
 SYSCALL_DEFINE3(futimesat, int, dfd, const char __user *, filename,
 		struct timeval __user *, utimes)
 {
 	struct timeval times[2];
-	struct timespec tstimes[2];
+	struct __kernel_timespec64 tstimes[2];
 
 	if (utimes) {
 		if (copy_from_user(&times, utimes, sizeof(times)))
diff --git a/include/linux/compat.h b/include/linux/compat.h
index e649426..7fd34f9 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -453,7 +453,7 @@ asmlinkage long compat_sys_utime(const char __user *filename,
 				 struct compat_utimbuf __user *t);
 asmlinkage long compat_sys_utimensat(unsigned int dfd,
 				     const char __user *filename,
-				     struct compat_timespec __user *t,
+				     struct __kernel_timespec32 __user *t,
 				     int flags);
 
 asmlinkage long compat_sys_time(compat_time_t __user *tloc);
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index b0881a0..2332448 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -770,8 +770,15 @@ asmlinkage long sys_fstatat64(int dfd, const char __user *filename,
 			       struct stat64 __user *statbuf, int flag);
 asmlinkage long sys_readlinkat(int dfd, const char __user *path, char __user *buf,
 			       int bufsiz);
+#ifdef CONFIG_64BIT
 asmlinkage long sys_utimensat(int dfd, const char __user *filename,
-				struct timespec __user *utimes, int flags);
+				struct __kernel_timespec64 __user *utimes, int flags);
+#else
+asmlinkage long sys_utimens64at(int dfd, const char __user *filename,
+				struct __kernel_timespec64 __user *utimes, int flags);
+asmlinkage long sys_utimensat(int dfd, const char __user *filename,
+				struct __kernel_timespec32 __user *utimes, int flags);
+#endif
 asmlinkage long sys_unshare(unsigned long unshare_flags);
 
 asmlinkage long sys_splice(int fd_in, loff_t __user *off_in,
diff --git a/include/linux/time.h b/include/linux/time.h
index e2d5aa2..f431263 100644
--- a/include/linux/time.h
+++ b/include/linux/time.h
@@ -213,7 +213,7 @@ extern int do_settimeofday(const struct timespec *tv);
 extern int do_sys_settimeofday(const struct timespec *tv,
 			       const struct timezone *tz);
 #define do_posix_clock_monotonic_gettime(ts) ktime_get_ts(ts)
-extern long do_utimes(int dfd, const char __user *filename, struct timespec *times, int flags);
+extern long do_utimes(int dfd, const char __user *filename, struct __kernel_timespec64 *times, int flags);
 struct itimerval;
 extern int do_setitimer(int which, struct itimerval *value,
 			struct itimerval *ovalue);
diff --git a/init/initramfs.c b/init/initramfs.c
index a8497fa..5e89fb5 100644
--- a/init/initramfs.c
+++ b/init/initramfs.c
@@ -86,7 +86,7 @@ static void __init free_hash(void)
 
 static long __init do_utime(char *filename, time_t mtime)
 {
-	struct timespec t[2];
+	struct __kernel_timespec64 t[2];
 
 	t[0].tv_sec = mtime;
 	t[0].tv_nsec = 0;
-- 
1.8.3.2


^ permalink raw reply	[flat|nested] 124+ messages in thread

* [RFC 04/32] fs: introduce sys_newfstat64/sys_newfstatat64
  2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
                   ` (2 preceding siblings ...)
  2014-05-30 20:01 ` [RFC 03/32] fs: introduce sys_utimens64at Arnd Bergmann
@ 2014-05-30 20:01 ` Arnd Bergmann
  2014-05-30 20:01 ` [RFC 05/32] arch: hook up new stat and utimes syscalls Arnd Bergmann
                   ` (30 subsequent siblings)
  34 siblings, 0 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-30 20:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, hpa,
	linux-fsdevel, Arnd Bergmann

We want to be able to read file system timestamps beyond year 2038,
which is currently impossible on 32-bit systems, since none of the
various stat syscall interfaces (oldstat, stat, stat64) handles
them correctly.

This introduces a fourth version of the syscalls, called newfstat64
and newfstatat64, which operate on struct newstat64. Each 32-bit
architecture needs to define a version of this structure.
Architectures that have a 64-bit CPU should use the native 64-bit
'struct stat' if possible, so we can avoid adding a compat_newfstatat64
syscall. Note that there is no sys_newlstat64 or sys_newstat64,
as both can be trivially emulated from libc using newfstatat64.
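
As a rough sketch of that emulation, here is the same idea expressed
with the existing POSIX fstatat() in user space. A libc would do the
equivalent on top of newfstatat64; the wrapper names below are
hypothetical and only serve the illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <fcntl.h>	/* AT_FDCWD, AT_SYMLINK_NOFOLLOW */
#include <sys/stat.h>	/* struct stat, fstatat() */

/* stat(): resolve relative to the current directory, follow symlinks */
int stat_via_fstatat(const char *path, struct stat *st)
{
	return fstatat(AT_FDCWD, path, st, 0);
}

/* lstat(): same, but report on a symlink itself */
int lstat_via_fstatat(const char *path, struct stat *st)
{
	return fstatat(AT_FDCWD, path, st, AT_SYMLINK_NOFOLLOW);
}

/* Small self-check: the root directory always exists. */
void stat_via_fstatat_selfcheck(void)
{
	struct stat st;

	assert(stat_via_fstatat("/", &st) == 0);
	assert(S_ISDIR(st.st_mode));
}
```

fstat() on a file descriptor still needs its own entry point, which
is why newfstat64 is added alongside newfstatat64.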

This approach might not be the best solution, as there have been
proposals in the past to add a new 'struct xstat' interface that
would not only solve the y2038 problem but provide a number of
other extensions as well. I have chickened out and avoided reviving
that discussion for now. This new set of syscalls is a much simpler
addition and is hopefully less controversial.
If we end up merging xstat first, we won't need this patch.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 fs/stat.c | 55 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 55 insertions(+)

diff --git a/fs/stat.c b/fs/stat.c
index ae0c3ce..77fd219 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -445,6 +445,61 @@ SYSCALL_DEFINE4(fstatat64, int, dfd, const char __user *, filename,
 }
 #endif /* __ARCH_WANT_STAT64 || __ARCH_WANT_COMPAT_STAT64 */
 
+#ifdef __ARCH_HAS_NEWSTAT64
+/*
+ * we only need this for the native 32-bit path, all architectures should
+ * define the 32-bit newstat64 as compatible with the 64-bit stat or
+ * stat64.
+ */
+static long cp_new_newstat64(struct kstat *stat, struct newstat64 __user *statbuf)
+{
+	struct newstat64 tmp;
+
+	memset(&tmp, 0, sizeof(tmp));
+	tmp.st_dev = huge_encode_dev(stat->dev);
+	tmp.st_rdev = huge_encode_dev(stat->rdev);
+	tmp.st_ino = stat->ino;
+	tmp.st_mode = stat->mode;
+	tmp.st_nlink = stat->nlink;
+	tmp.st_uid = from_kuid_munged(current_user_ns(), stat->uid);
+	tmp.st_gid = from_kgid_munged(current_user_ns(), stat->gid);
+	tmp.st_atime = stat->atime.tv_sec;
+	tmp.st_atime_nsec = stat->atime.tv_nsec;
+	tmp.st_mtime = stat->mtime.tv_sec;
+	tmp.st_mtime_nsec = stat->mtime.tv_nsec;
+	tmp.st_ctime = stat->ctime.tv_sec;
+	tmp.st_ctime_nsec = stat->ctime.tv_nsec;
+	tmp.st_size = stat->size;
+	tmp.st_blocks = stat->blocks;
+	tmp.st_blksize = stat->blksize;
+	return copy_to_user(statbuf,&tmp,sizeof(tmp)) ? -EFAULT : 0;
+}
+
+SYSCALL_DEFINE2(newfstat64, unsigned long, fd, struct newstat64 __user *, statbuf)
+{
+	struct kstat stat;
+	int error;
+
+	error = vfs_fstat(fd, &stat);
+	if (error)
+		return error;
+
+	return cp_new_newstat64(&stat, statbuf);
+}
+
+SYSCALL_DEFINE4(newfstatat64, int, dfd, const char __user *, filename,
+		struct newstat64 __user *, statbuf, int, flag)
+{
+	struct kstat stat;
+	int error;
+
+	error = vfs_fstatat(dfd, filename, &stat, flag);
+	if (error)
+		return error;
+	return cp_new_newstat64(&stat, statbuf);
+}
+#endif
+
 /* Caller is here responsible for sufficient locking (ie. inode->i_lock) */
 void __inode_add_bytes(struct inode *inode, loff_t bytes)
 {
-- 
1.8.3.2


^ permalink raw reply	[flat|nested] 124+ messages in thread

* [RFC 05/32] arch: hook up new stat and utimes syscalls
  2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
                   ` (3 preceding siblings ...)
  2014-05-30 20:01 ` [RFC 04/32] fs: introduce sys_newfstat64/sys_newfstatat64 Arnd Bergmann
@ 2014-05-30 20:01 ` Arnd Bergmann
  2014-05-30 20:01 ` [RFC 06/32] isofs: fix timestamps beyond 2027 Arnd Bergmann
                   ` (29 subsequent siblings)
  34 siblings, 0 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-30 20:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, hpa,
	linux-fsdevel, Arnd Bergmann

This hooks up the newly added system calls for newfstat64, newfstatat64
and utimens64at on x86, arm and all architectures using the generic
syscall ABI.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/arm/include/asm/unistd.h      |  2 +-
 arch/arm/include/uapi/asm/stat.h   | 25 +++++++++++++++++++++++++
 arch/arm/include/uapi/asm/unistd.h |  3 +++
 arch/arm/kernel/calls.S            |  3 +++
 arch/arm64/include/asm/unistd32.h  |  5 ++++-
 arch/x86/include/uapi/asm/stat.h   | 28 ++++++++++++++++++++++++++++
 arch/x86/syscalls/syscall_32.tbl   |  3 +++
 include/uapi/asm-generic/stat.h    | 29 +++++++++++++++++++++++++++--
 include/uapi/asm-generic/unistd.h  |  8 +++++++-
 9 files changed, 101 insertions(+), 5 deletions(-)

diff --git a/arch/arm/include/asm/unistd.h b/arch/arm/include/asm/unistd.h
index 4387624..2ed963a 100644
--- a/arch/arm/include/asm/unistd.h
+++ b/arch/arm/include/asm/unistd.h
@@ -15,7 +15,7 @@
 
 #include <uapi/asm/unistd.h>
 
-#define __NR_syscalls  (384)
+#define __NR_syscalls  (388)
 #define __ARM_NR_cmpxchg		(__ARM_NR_BASE+0x00fff0)
 
 #define __ARCH_WANT_STAT64
diff --git a/arch/arm/include/uapi/asm/stat.h b/arch/arm/include/uapi/asm/stat.h
index 42c0c13..ee2ba23 100644
--- a/arch/arm/include/uapi/asm/stat.h
+++ b/arch/arm/include/uapi/asm/stat.h
@@ -48,6 +48,31 @@ struct stat {
 	unsigned long  __unused5;
 };
 
+/* this matches the arm64 'struct stat' to allow a simpler compat ABI */
+#define __ARCH_HAS_NEWSTAT64
+struct newstat64 {
+	unsigned long long	st_dev;		/* Device.  */
+	unsigned long long	st_ino;		/* File serial number.  */
+	unsigned int		st_mode;	/* File mode.  */
+	unsigned int		st_nlink;	/* Link count.  */
+	unsigned int		st_uid;		/* User ID of the file's owner.  */
+	unsigned int		st_gid;		/* Group ID of the file's group. */
+	unsigned long long	st_rdev;	/* Device number, if device.  */
+	unsigned long long	__pad1;
+	long long		st_size;	/* Size of file, in bytes.  */
+	int			st_blksize;	/* Optimal block size for I/O.  */
+	int			__pad2;
+	long long		st_blocks;	/* Number 512-byte blocks allocated. */
+	long long		st_atime;	/* Time of last access.  */
+	unsigned long long	st_atime_nsec;
+	long long		st_mtime;	/* Time of last modification.  */
+	unsigned long long	st_mtime_nsec;
+	long long		st_ctime;	/* Time of last status change.  */
+	unsigned long long	st_ctime_nsec;
+	unsigned int		__unused4;
+	unsigned int		__unused5;
+};
+
 /* This matches struct stat64 in glibc2.1, hence the absolutely
  * insane amounts of padding around dev_t's.
  * Note: The kernel zero's the padded region because glibc might read them
diff --git a/arch/arm/include/uapi/asm/unistd.h b/arch/arm/include/uapi/asm/unistd.h
index ba94446..371f8d6 100644
--- a/arch/arm/include/uapi/asm/unistd.h
+++ b/arch/arm/include/uapi/asm/unistd.h
@@ -409,6 +409,9 @@
 #define __NR_sched_setattr		(__NR_SYSCALL_BASE+380)
 #define __NR_sched_getattr		(__NR_SYSCALL_BASE+381)
 #define __NR_renameat2			(__NR_SYSCALL_BASE+382)
+#define __NR_newfstat64			(__NR_SYSCALL_BASE+383)
+#define __NR_newfstatat64		(__NR_SYSCALL_BASE+384)
+#define __NR_utimens64at		(__NR_SYSCALL_BASE+385)
 
 /*
  * This may need to be greater than __NR_last_syscall+1 in order to
diff --git a/arch/arm/kernel/calls.S b/arch/arm/kernel/calls.S
index 8f51bdc..a51a418 100644
--- a/arch/arm/kernel/calls.S
+++ b/arch/arm/kernel/calls.S
@@ -392,6 +392,9 @@
 /* 380 */	CALL(sys_sched_setattr)
 		CALL(sys_sched_getattr)
 		CALL(sys_renameat2)
+		CALL(sys_newfstat64)
+		CALL(sys_newfstatat64)
+/* 385 */	CALL(sys_utimens64at)
 #ifndef syscalls_counted
 .equ syscalls_padding, ((NR_syscalls + 3) & ~3) - NR_syscalls
 #define syscalls_counted
diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index c8d8fc1..348c009 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -404,8 +404,11 @@ __SYSCALL(379, sys_finit_module)
 __SYSCALL(380, sys_sched_setattr)
 __SYSCALL(381, sys_sched_getattr)
 __SYSCALL(382, sys_renameat2)
+__SYSCALL(383, sys_newfstat)
+__SYSCALL(384, sys_newfstatat)
+__SYSCALL(385, sys_utimensat)
 
-#define __NR_compat_syscalls		383
+#define __NR_compat_syscalls		386
 
 /*
  * Compat syscall numbers used by the AArch64 kernel.
diff --git a/arch/x86/include/uapi/asm/stat.h b/arch/x86/include/uapi/asm/stat.h
index bc03eb5..43f7c4a 100644
--- a/arch/x86/include/uapi/asm/stat.h
+++ b/arch/x86/include/uapi/asm/stat.h
@@ -71,6 +71,34 @@ struct stat64 {
 	unsigned long long	st_ino;
 };
 
+/*
+ * This matches the native 'struct stat' on x86-64 and
+ * provides 64-bit timestamps on __i386__
+ */
+#define __ARCH_HAS_NEWSTAT64
+struct newstat64 {
+	unsigned long long	st_dev;
+	unsigned long long	st_ino;
+	unsigned long long	st_nlink;
+
+	unsigned int		st_mode;
+	unsigned int		st_uid;
+	unsigned int		st_gid;
+	unsigned int		__pad0;
+	unsigned long long	st_rdev;
+	long long		st_size;
+	long long		st_blksize;
+	long long		st_blocks;	/* Number 512-byte blocks allocated. */
+
+	unsigned long long	st_atime;
+	unsigned long long	st_atime_nsec;
+	unsigned long long	st_mtime;
+	unsigned long long	st_mtime_nsec;
+	unsigned long long	st_ctime;
+	unsigned long long	st_ctime_nsec;
+	long long		__unused[3];
+};
+
 /* We don't need to memset the whole thing just to initialize the padding */
 #define INIT_STRUCT_STAT64_PADDING(st) do {		\
 	memset(&st.__pad0, 0, sizeof(st.__pad0));	\
diff --git a/arch/x86/syscalls/syscall_32.tbl b/arch/x86/syscalls/syscall_32.tbl
index d6b8679..91b6a41 100644
--- a/arch/x86/syscalls/syscall_32.tbl
+++ b/arch/x86/syscalls/syscall_32.tbl
@@ -360,3 +360,6 @@
 351	i386	sched_setattr		sys_sched_setattr
 352	i386	sched_getattr		sys_sched_getattr
 353	i386	renameat2		sys_renameat2
+354	i386	newfstat64		sys_newfstat64			sys_newfstat
+355	i386	newfstatat64		sys_newfstatat64		sys_newfstatat
+356	i386	utimens64at		sys_utimens64at			sys_utimensat
diff --git a/include/uapi/asm-generic/stat.h b/include/uapi/asm-generic/stat.h
index bd8cad2..904e5dc 100644
--- a/include/uapi/asm-generic/stat.h
+++ b/include/uapi/asm-generic/stat.h
@@ -43,8 +43,33 @@ struct stat {
 	unsigned int	__unused5;
 };
 
-/* This matches struct stat64 in glibc2.1. Only used for 32 bit. */
-#if __BITS_PER_LONG != 64 || defined(__ARCH_WANT_STAT64)
+#if __BITS_PER_LONG != 64
+#define __ARCH_HAS_NEWSTAT64
+struct newstat64 {
+	unsigned long long	st_dev;		/* Device.  */
+	unsigned long long	st_ino;		/* File serial number.  */
+	unsigned int		st_mode;	/* File mode.  */
+	unsigned int		st_nlink;	/* Link count.  */
+	unsigned int		st_uid;		/* User ID of the file's owner.  */
+	unsigned int		st_gid;		/* Group ID of the file's group. */
+	unsigned long long	st_rdev;	/* Device number, if device.  */
+	unsigned long long	__pad1;
+	long long		st_size;	/* Size of file, in bytes.  */
+	int			st_blksize;	/* Optimal block size for I/O.  */
+	int			__pad2;
+	long long		st_blocks;	/* Number 512-byte blocks allocated. */
+	long long		st_atime;	/* Time of last access.  */
+	unsigned long long	st_atime_nsec;
+	long long		st_mtime;	/* Time of last modification.  */
+	unsigned long long	st_mtime_nsec;
+	long long		st_ctime;	/* Time of last status change.  */
+	unsigned long long	st_ctime_nsec;
+	unsigned int		__unused4;
+	unsigned int		__unused5;
+};
+#endif
+
+#if __BITS_PER_LONG != 64 ||  defined(__ARCH_WANT_STAT64)
 struct stat64 {
 	unsigned long long st_dev;	/* Device.  */
 	unsigned long long st_ino;	/* File serial number.  */
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index 3336406..ddcbd42 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -699,9 +699,15 @@ __SYSCALL(__NR_sched_setattr, sys_sched_setattr)
 __SYSCALL(__NR_sched_getattr, sys_sched_getattr)
 #define __NR_renameat2 276
 __SYSCALL(__NR_renameat2, sys_renameat2)
+#define __NR_newfstat64 277
+__SC_COMP_3264(__NR_newfstat64, sys_newfstat64, sys_newfstat, sys_newfstat)
+#define __NR_newfstatat64 278
+__SC_COMP_3264(__NR_newfstatat64, sys_newfstatat64, sys_newfstatat, sys_newfstatat)
+#define __NR_utimens64at 279
+__SC_COMP_3264(__NR_utimens64at, sys_utimens64at, sys_utimensat, sys_utimensat)
 
 #undef __NR_syscalls
-#define __NR_syscalls 277
+#define __NR_syscalls 280
 
 /*
  * All syscalls below here should go away really,
-- 
1.8.3.2


^ permalink raw reply	[flat|nested] 124+ messages in thread

* [RFC 06/32] isofs: fix timestamps beyond 2027
  2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
                   ` (4 preceding siblings ...)
  2014-05-30 20:01 ` [RFC 05/32] arch: hook up new stat and utimes syscalls Arnd Bergmann
@ 2014-05-30 20:01 ` Arnd Bergmann
  2014-05-31  7:59   ` Geert Uytterhoeven
  2014-05-30 20:01 ` [RFC 07/32] fs/nfs: convert to struct inode_time Arnd Bergmann
                   ` (28 subsequent siblings)
  34 siblings, 1 reply; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-30 20:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, hpa,
	linux-fsdevel, Arnd Bergmann

isofs uses a 'char' variable to load the number of years since
1900 for an inode timestamp. On architectures that use a signed
char type by default, this results in an invalid date for
anything beyond 2027.

This adds a cast to 'u8' for the year number, which should extend
the shelf life of the file system until 2155.
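
The effect of the cast can be reproduced in user space with the sketch
below. The helper names are hypothetical, and the signed variant
assumes the usual two's complement conversion (as done by gcc and
clang) when a byte above 127 is converted to 'signed char'.

```c
#include <assert.h>
#include <stdint.h>

/* Year as seen when the byte is read through a signed char. */
int iso_year_signed(const unsigned char *p)
{
	return 1900 + (signed char)p[0];
}

/* Year with the byte treated as unsigned, as in the fixed code. */
int iso_year_unsigned(const unsigned char *p)
{
	return 1900 + (int)(uint8_t)p[0];
}

/* Small self-check: 127 is the last value both variants agree on. */
void iso_year_selfcheck(void)
{
	const unsigned char y2027[] = { 127 };
	const unsigned char y2028[] = { 128 };

	assert(iso_year_signed(y2027) == iso_year_unsigned(y2027));
	assert(iso_year_signed(y2028) != iso_year_unsigned(y2028));
}
```

A year byte of 128 yields 2028 with the unsigned read but 1772 with
the signed one, and the largest byte value, 255, corresponds to 2155.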

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 fs/isofs/util.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/isofs/util.c b/fs/isofs/util.c
index 01e1ee7..28e7ff1 100644
--- a/fs/isofs/util.c
+++ b/fs/isofs/util.c
@@ -19,7 +19,7 @@ int iso_date(char * p, int flag)
 	int year, month, day, hour, minute, second, tz;
 	int crtime, days, i;
 
-	year = p[0] - 70;
+	year = (int)(u8)p[0] - 70;
 	month = p[1];
 	day = p[2];
 	hour = p[3];
-- 
1.8.3.2


^ permalink raw reply	[flat|nested] 124+ messages in thread

* [RFC 07/32] fs/nfs: convert to struct inode_time
  2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
                   ` (5 preceding siblings ...)
  2014-05-30 20:01 ` [RFC 06/32] isofs: fix timestamps beyond 2027 Arnd Bergmann
@ 2014-05-30 20:01 ` Arnd Bergmann
  2014-05-30 20:01 ` [RFC 08/32] fs/ceph: convert to 'struct inode_time' Arnd Bergmann
                   ` (27 subsequent siblings)
  34 siblings, 0 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-30 20:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, hpa,
	linux-fsdevel, Arnd Bergmann, J. Bruce Fields, linux-nfs

This makes the nfs client and server code use 'struct inode_time'
instead of 'struct timespec', to lift the time stamp limitation
on 32-bit systems. With NFS version 2 and 3, this means we can
represent years up until 2106 rather than 2038. With NFS version
4, the on-wire representation allows 64-bit seconds.
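
To illustrate where the 2106 limit comes from, the sketch below
decodes a 32-bit on-wire seconds value into a 64-bit field once as
unsigned and once as signed. The function names are hypothetical, and
the signed variant assumes the usual two's complement conversion.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Unsigned 32-bit seconds stored in a 64-bit field: the range ends
 * at 4294967295 seconds after the epoch, which falls in February 2106.
 */
int64_t nfs_secs_unsigned(uint32_t wire)
{
	return (int64_t)wire;
}

/*
 * The same value squeezed through a signed 32-bit type first:
 * anything after January 2038 turns negative.
 */
int64_t nfs_secs_signed(uint32_t wire)
{
	return (int64_t)(int32_t)wire;
}

/* Small self-check: values up to INT32_MAX decode identically. */
void nfs_secs_selfcheck(void)
{
	assert(nfs_secs_unsigned(0x7fffffffu) == nfs_secs_signed(0x7fffffffu));
	assert(nfs_secs_unsigned(0x80000000u) > 0);
	assert(nfs_secs_signed(0x80000000u) < 0);
}
```

The first second after the signed 32-bit range, 0x80000000, decodes
to 2147483648 in the unsigned case and to -2147483648 in the signed
one.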

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: linux-nfs@vger.kernel.org
---
 fs/nfs/callback.h         |  4 ++--
 fs/nfs/callback_xdr.c     |  6 +++---
 fs/nfs/file.c             |  2 +-
 fs/nfs/fscache-index.c    |  8 ++++----
 fs/nfs/inode.c            | 10 +++++-----
 fs/nfs/internal.h         |  4 ++--
 fs/nfs/netns.h            |  2 +-
 fs/nfs/nfs2xdr.c          |  8 ++++----
 fs/nfs/nfs3xdr.c          | 10 +++++-----
 fs/nfs/nfs4xdr.c          | 20 ++++++++++----------
 fs/nfsd/nfs3xdr.c         |  6 +++---
 fs/nfsd/nfsfh.h           |  4 ++--
 fs/nfsd/nfsxdr.c          |  2 +-
 include/linux/nfs_fs_sb.h |  2 +-
 include/linux/nfs_xdr.h   | 14 +++++++-------
 15 files changed, 51 insertions(+), 51 deletions(-)

diff --git a/fs/nfs/callback.h b/fs/nfs/callback.h
index 84326e9..3a3e6b4 100644
--- a/fs/nfs/callback.h
+++ b/fs/nfs/callback.h
@@ -71,8 +71,8 @@ struct cb_getattrres {
 	uint32_t bitmap[2];
 	uint64_t size;
 	uint64_t change_attr;
-	struct timespec ctime;
-	struct timespec mtime;
+	struct inode_time ctime;
+	struct inode_time mtime;
 };
 
 struct cb_recallargs {
diff --git a/fs/nfs/callback_xdr.c b/fs/nfs/callback_xdr.c
index f4ccfe6..177a6f7 100644
--- a/fs/nfs/callback_xdr.c
+++ b/fs/nfs/callback_xdr.c
@@ -596,7 +596,7 @@ static __be32 encode_attr_size(struct xdr_stream *xdr, const uint32_t *bitmap, u
 	return 0;
 }
 
-static __be32 encode_attr_time(struct xdr_stream *xdr, const struct timespec *time)
+static __be32 encode_attr_time(struct xdr_stream *xdr, const struct inode_time *time)
 {
 	__be32 *p;
 
@@ -608,14 +608,14 @@ static __be32 encode_attr_time(struct xdr_stream *xdr, const struct timespec *ti
 	return 0;
 }
 
-static __be32 encode_attr_ctime(struct xdr_stream *xdr, const uint32_t *bitmap, const struct timespec *time)
+static __be32 encode_attr_ctime(struct xdr_stream *xdr, const uint32_t *bitmap, const struct inode_time *time)
 {
 	if (!(bitmap[1] & FATTR4_WORD1_TIME_METADATA))
 		return 0;
 	return encode_attr_time(xdr,time);
 }
 
-static __be32 encode_attr_mtime(struct xdr_stream *xdr, const uint32_t *bitmap, const struct timespec *time)
+static __be32 encode_attr_mtime(struct xdr_stream *xdr, const uint32_t *bitmap, const struct inode_time *time)
 {
 	if (!(bitmap[1] & FATTR4_WORD1_TIME_MODIFY))
 		return 0;
diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index 4042ff5..9bdd210 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -772,7 +772,7 @@ do_unlk(struct file *filp, int cmd, struct file_lock *fl, int is_local)
 }
 
 static int
-is_time_granular(struct timespec *ts) {
+is_time_granular(struct inode_time *ts) {
 	return ((ts->tv_sec == 0) && (ts->tv_nsec <= 1000));
 }
 
diff --git a/fs/nfs/fscache-index.c b/fs/nfs/fscache-index.c
index 7cf2c46..ae75bad 100644
--- a/fs/nfs/fscache-index.c
+++ b/fs/nfs/fscache-index.c
@@ -157,10 +157,10 @@ const struct fscache_cookie_def nfs_fscache_super_index_def = {
  * cache object.
  */
 struct nfs_fscache_inode_auxdata {
-	struct timespec	mtime;
-	struct timespec	ctime;
-	loff_t		size;
-	u64		change_attr;
+	struct inode_time	mtime;
+	struct inode_time	ctime;
+	loff_t			size;
+	u64			change_attr;
 };
 
 /*
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index c496f8a..99c9145 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -1107,14 +1107,14 @@ static unsigned long nfs_wcc_update_inode(struct inode *inode, struct nfs_fattr
 	/* If we have atomic WCC data, we may update some attributes */
 	if ((fattr->valid & NFS_ATTR_FATTR_PRECTIME)
 			&& (fattr->valid & NFS_ATTR_FATTR_CTIME)
-			&& timespec_equal(&inode->i_ctime, &fattr->pre_ctime)) {
+			&& inode_time_equal(&inode->i_ctime, &fattr->pre_ctime)) {
 		memcpy(&inode->i_ctime, &fattr->ctime, sizeof(inode->i_ctime));
 		ret |= NFS_INO_INVALID_ATTR;
 	}
 
 	if ((fattr->valid & NFS_ATTR_FATTR_PREMTIME)
 			&& (fattr->valid & NFS_ATTR_FATTR_MTIME)
-			&& timespec_equal(&inode->i_mtime, &fattr->pre_mtime)) {
+			&& inode_time_equal(&inode->i_mtime, &fattr->pre_mtime)) {
 		memcpy(&inode->i_mtime, &fattr->mtime, sizeof(inode->i_mtime));
 		if (S_ISDIR(inode->i_mode))
 			nfsi->cache_validity |= NFS_INO_INVALID_DATA;
@@ -1163,7 +1163,7 @@ static int nfs_check_inode_attributes(struct inode *inode, struct nfs_fattr *fat
 		invalid |= NFS_INO_INVALID_ATTR|NFS_INO_REVAL_PAGECACHE;
 
 	/* Verify a few of the more important attributes */
-	if ((fattr->valid & NFS_ATTR_FATTR_MTIME) && !timespec_equal(&inode->i_mtime, &fattr->mtime))
+	if ((fattr->valid & NFS_ATTR_FATTR_MTIME) && !inode_time_equal(&inode->i_mtime, &fattr->mtime))
 		invalid |= NFS_INO_INVALID_ATTR;
 
 	if (fattr->valid & NFS_ATTR_FATTR_SIZE) {
@@ -1185,7 +1185,7 @@ static int nfs_check_inode_attributes(struct inode *inode, struct nfs_fattr *fat
 	if ((fattr->valid & NFS_ATTR_FATTR_NLINK) && inode->i_nlink != fattr->nlink)
 		invalid |= NFS_INO_INVALID_ATTR;
 
-	if ((fattr->valid & NFS_ATTR_FATTR_ATIME) && !timespec_equal(&inode->i_atime, &fattr->atime))
+	if ((fattr->valid & NFS_ATTR_FATTR_ATIME) && !inode_time_equal(&inode->i_atime, &fattr->atime))
 		invalid |= NFS_INO_INVALID_ATIME;
 
 	if (invalid != 0)
@@ -1199,7 +1199,7 @@ static int nfs_ctime_need_update(const struct inode *inode, const struct nfs_fat
 {
 	if (!(fattr->valid & NFS_ATTR_FATTR_CTIME))
 		return 0;
-	return timespec_compare(&fattr->ctime, &inode->i_ctime) > 0;
+	return inode_time_compare(&fattr->ctime, &inode->i_ctime) > 0;
 }
 
 static int nfs_size_need_update(const struct inode *inode, const struct nfs_fattr *fattr)
diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
index 0e4e804..97e06f1 100644
--- a/fs/nfs/internal.h
+++ b/fs/nfs/internal.h
@@ -605,14 +605,14 @@ unsigned int nfs_page_array_len(unsigned int base, size_t len)
 }
 
 /*
- * Convert a struct timespec into a 64-bit change attribute
+ * Convert a struct inode_time into a 64-bit change attribute
  *
  * This does approximately the same thing as timespec_to_ns(),
  * but for calculation efficiency, we multiply the seconds by
  * 1024*1024*1024.
  */
 static inline
-u64 nfs_timespec_to_change_attr(const struct timespec *ts)
+u64 nfs_time_to_change_attr(const struct inode_time *ts)
 {
 	return ((u64)ts->tv_sec << 30) + ts->tv_nsec;
 }
diff --git a/fs/nfs/netns.h b/fs/nfs/netns.h
index 8ee1fab..f665fbd 100644
--- a/fs/nfs/netns.h
+++ b/fs/nfs/netns.h
@@ -28,7 +28,7 @@ struct nfs_net {
 	int cb_users[NFS4_MAX_MINOR_VERSION + 1];
 #endif
 	spinlock_t nfs_client_lock;
-	struct timespec boot_time;
+	struct inode_time boot_time;
 };
 
 extern int nfs_net_id;
diff --git a/fs/nfs/nfs2xdr.c b/fs/nfs/nfs2xdr.c
index 62db136..984b7cd 100644
--- a/fs/nfs/nfs2xdr.c
+++ b/fs/nfs/nfs2xdr.c
@@ -222,7 +222,7 @@ out_overflow:
  *		unsigned int useconds;
  *	};
  */
-static __be32 *xdr_encode_time(__be32 *p, const struct timespec *timep)
+static __be32 *xdr_encode_time(__be32 *p, const struct inode_time *timep)
 {
 	*p++ = cpu_to_be32(timep->tv_sec);
 	if (timep->tv_nsec != 0)
@@ -240,14 +240,14 @@ static __be32 *xdr_encode_time(__be32 *p, const struct timespec *timep)
  * Illustrated" by Brent Callaghan, Addison-Wesley, ISBN 0-201-32750-5.
  */
 static __be32 *xdr_encode_current_server_time(__be32 *p,
-					      const struct timespec *timep)
+					      const struct inode_time *timep)
 {
 	*p++ = cpu_to_be32(timep->tv_sec);
 	*p++ = cpu_to_be32(1000000);
 	return p;
 }
 
-static __be32 *xdr_decode_time(__be32 *p, struct timespec *timep)
+static __be32 *xdr_decode_time(__be32 *p, struct inode_time *timep)
 {
 	timep->tv_sec = be32_to_cpup(p++);
 	timep->tv_nsec = be32_to_cpup(p++) * NSEC_PER_USEC;
@@ -315,7 +315,7 @@ static int decode_fattr(struct xdr_stream *xdr, struct nfs_fattr *fattr)
 	p = xdr_decode_time(p, &fattr->atime);
 	p = xdr_decode_time(p, &fattr->mtime);
 	xdr_decode_time(p, &fattr->ctime);
-	fattr->change_attr = nfs_timespec_to_change_attr(&fattr->ctime);
+	fattr->change_attr = nfs_time_to_change_attr(&fattr->ctime);
 
 	return 0;
 out_uid:
diff --git a/fs/nfs/nfs3xdr.c b/fs/nfs/nfs3xdr.c
index fa6d721..09c40f2 100644
--- a/fs/nfs/nfs3xdr.c
+++ b/fs/nfs/nfs3xdr.c
@@ -477,21 +477,21 @@ static void zero_nfs_fh3(struct nfs_fh *fh)
 }
 
 /*
- * nfstime3
+ * nfstime3 
  *
  *	struct nfstime3 {
  *		uint32	seconds;
  *		uint32	nseconds;
  *	};
  */
-static __be32 *xdr_encode_nfstime3(__be32 *p, const struct timespec *timep)
+static __be32 *xdr_encode_nfstime3(__be32 *p, const struct inode_time *timep)
 {
 	*p++ = cpu_to_be32(timep->tv_sec);
 	*p++ = cpu_to_be32(timep->tv_nsec);
 	return p;
 }
 
-static __be32 *xdr_decode_nfstime3(__be32 *p, struct timespec *timep)
+static __be32 *xdr_decode_nfstime3(__be32 *p, struct inode_time *timep)
 {
 	timep->tv_sec = be32_to_cpup(p++);
 	timep->tv_nsec = be32_to_cpup(p++);
@@ -675,7 +675,7 @@ static int decode_fattr3(struct xdr_stream *xdr, struct nfs_fattr *fattr)
 	p = xdr_decode_nfstime3(p, &fattr->atime);
 	p = xdr_decode_nfstime3(p, &fattr->mtime);
 	xdr_decode_nfstime3(p, &fattr->ctime);
-	fattr->change_attr = nfs_timespec_to_change_attr(&fattr->ctime);
+	fattr->change_attr = nfs_time_to_change_attr(&fattr->ctime);
 
 	fattr->valid |= NFS_ATTR_FATTR_V3;
 	return 0;
@@ -739,7 +739,7 @@ static int decode_wcc_attr(struct xdr_stream *xdr, struct nfs_fattr *fattr)
 	p = xdr_decode_size3(p, &fattr->pre_size);
 	p = xdr_decode_nfstime3(p, &fattr->pre_mtime);
 	xdr_decode_nfstime3(p, &fattr->pre_ctime);
-	fattr->pre_change_attr = nfs_timespec_to_change_attr(&fattr->pre_ctime);
+	fattr->pre_change_attr = nfs_time_to_change_attr(&fattr->pre_ctime);
 
 	return 0;
 out_overflow:
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index 73ce8d4..a41265b 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -4073,7 +4073,7 @@ out_overflow:
 	return -EIO;
 }
 
-static int decode_attr_time(struct xdr_stream *xdr, struct timespec *time)
+static int decode_attr_time(struct xdr_stream *xdr, struct inode_time *time)
 {
 	__be32 *p;
 	uint64_t sec;
@@ -4084,7 +4084,7 @@ static int decode_attr_time(struct xdr_stream *xdr, struct timespec *time)
 		goto out_overflow;
 	p = xdr_decode_hyper(p, &sec);
 	nsec = be32_to_cpup(p);
-	time->tv_sec = (time_t)sec;
+	time->tv_sec = sec;
 	time->tv_nsec = (long)nsec;
 	return 0;
 out_overflow:
@@ -4092,7 +4092,7 @@ out_overflow:
 	return -EIO;
 }
 
-static int decode_attr_time_access(struct xdr_stream *xdr, uint32_t *bitmap, struct timespec *time)
+static int decode_attr_time_access(struct xdr_stream *xdr, uint32_t *bitmap, struct inode_time *time)
 {
 	int status = 0;
 
@@ -4106,11 +4106,11 @@ static int decode_attr_time_access(struct xdr_stream *xdr, uint32_t *bitmap, str
 			status = NFS_ATTR_FATTR_ATIME;
 		bitmap[1] &= ~FATTR4_WORD1_TIME_ACCESS;
 	}
-	dprintk("%s: atime=%ld\n", __func__, (long)time->tv_sec);
+	dprintk("%s: atime=%lld\n", __func__, (long long)time->tv_sec);
 	return status;
 }
 
-static int decode_attr_time_metadata(struct xdr_stream *xdr, uint32_t *bitmap, struct timespec *time)
+static int decode_attr_time_metadata(struct xdr_stream *xdr, uint32_t *bitmap, struct inode_time *time)
 {
 	int status = 0;
 
@@ -4124,12 +4124,12 @@ static int decode_attr_time_metadata(struct xdr_stream *xdr, uint32_t *bitmap, s
 			status = NFS_ATTR_FATTR_CTIME;
 		bitmap[1] &= ~FATTR4_WORD1_TIME_METADATA;
 	}
-	dprintk("%s: ctime=%ld\n", __func__, (long)time->tv_sec);
+	dprintk("%s: ctime=%lld\n", __func__, (long long)time->tv_sec);
 	return status;
 }
 
 static int decode_attr_time_delta(struct xdr_stream *xdr, uint32_t *bitmap,
-				  struct timespec *time)
+				  struct inode_time *time)
 {
 	int status = 0;
 
@@ -4141,7 +4141,7 @@ static int decode_attr_time_delta(struct xdr_stream *xdr, uint32_t *bitmap,
 		status = decode_attr_time(xdr, time);
 		bitmap[1] &= ~FATTR4_WORD1_TIME_DELTA;
 	}
-	dprintk("%s: time_delta=%ld %ld\n", __func__, (long)time->tv_sec,
+	dprintk("%s: time_delta=%lld %ld\n", __func__, (long long)time->tv_sec,
 		(long)time->tv_nsec);
 	return status;
 }
@@ -4196,7 +4196,7 @@ out_overflow:
 	return -EIO;
 }
 
-static int decode_attr_time_modify(struct xdr_stream *xdr, uint32_t *bitmap, struct timespec *time)
+static int decode_attr_time_modify(struct xdr_stream *xdr, uint32_t *bitmap, struct inode_time *time)
 {
 	int status = 0;
 
@@ -4210,7 +4210,7 @@ static int decode_attr_time_modify(struct xdr_stream *xdr, uint32_t *bitmap, str
 			status = NFS_ATTR_FATTR_MTIME;
 		bitmap[1] &= ~FATTR4_WORD1_TIME_MODIFY;
 	}
-	dprintk("%s: mtime=%ld\n", __func__, (long)time->tv_sec);
+	dprintk("%s: mtime=%lld\n", __func__, (long long)time->tv_sec);
 	return status;
 }
 
diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c
index de6e39e..46d2eb1 100644
--- a/fs/nfsd/nfs3xdr.c
+++ b/fs/nfsd/nfs3xdr.c
@@ -30,14 +30,14 @@ static u32	nfs3_ftypes[] = {
  * XDR functions for basic NFS types
  */
 static __be32 *
-encode_time3(__be32 *p, struct timespec *time)
+encode_time3(__be32 *p, struct inode_time *time)
 {
 	*p++ = htonl((u32) time->tv_sec); *p++ = htonl(time->tv_nsec);
 	return p;
 }
 
 static __be32 *
-decode_time3(__be32 *p, struct timespec *time)
+decode_time3(__be32 *p, struct inode_time *time)
 {
 	time->tv_sec = ntohl(*p++);
 	time->tv_nsec = ntohl(*p++);
@@ -292,7 +292,7 @@ nfs3svc_decode_sattrargs(struct svc_rqst *rqstp, __be32 *p,
 	p = decode_sattr3(p, &args->attrs);
 
 	if ((args->check_guard = ntohl(*p++)) != 0) { 
-		struct timespec time; 
+		struct inode_time time; 
 		p = decode_time3(p, &time);
 		args->guardtime = time.tv_sec;
 	}
diff --git a/fs/nfsd/nfsfh.h b/fs/nfsd/nfsfh.h
index 2e89e70..a7eb5af 100644
--- a/fs/nfsd/nfsfh.h
+++ b/fs/nfsd/nfsfh.h
@@ -39,8 +39,8 @@ typedef struct svc_fh {
 
 	/* Pre-op attributes saved during fh_lock */
 	__u64			fh_pre_size;	/* size before operation */
-	struct timespec		fh_pre_mtime;	/* mtime before oper */
-	struct timespec		fh_pre_ctime;	/* ctime before oper */
+	struct inode_time	fh_pre_mtime;	/* mtime before oper */
+	struct inode_time	fh_pre_ctime;	/* ctime before oper */
 	/*
 	 * pre-op nfsv4 change attr: note must check IS_I_VERSION(inode)
 	 *  to find out if it is valid.
diff --git a/fs/nfsd/nfsxdr.c b/fs/nfsd/nfsxdr.c
index 9c769a4..cfac45c 100644
--- a/fs/nfsd/nfsxdr.c
+++ b/fs/nfsd/nfsxdr.c
@@ -146,7 +146,7 @@ encode_fattr(struct svc_rqst *rqstp, __be32 *p, struct svc_fh *fhp,
 {
 	struct dentry	*dentry = fhp->fh_dentry;
 	int type;
-	struct timespec time;
+	struct inode_time time;
 	u32 f;
 
 	type = (stat->mode & S_IFMT);
diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h
index 1150ea4..2370468 100644
--- a/include/linux/nfs_fs_sb.h
+++ b/include/linux/nfs_fs_sb.h
@@ -147,7 +147,7 @@ struct nfs_server {
 
 	struct nfs_fsid		fsid;
 	__u64			maxfilesize;	/* maximum file size */
-	struct timespec		time_delta;	/* smallest time granularity */
+	struct inode_time	time_delta;	/* smallest time granularity */
 	unsigned long		mount_time;	/* when this fs was mounted */
 	struct super_block	*super;		/* VFS super block */
 	dev_t			s_dev;		/* superblock dev numbers */
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index 6fb5b23..ae27bf4 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -61,14 +61,14 @@ struct nfs_fattr {
 	struct nfs_fsid		fsid;
 	__u64			fileid;
 	__u64			mounted_on_fileid;
-	struct timespec		atime;
-	struct timespec		mtime;
-	struct timespec		ctime;
+	struct inode_time	atime;
+	struct inode_time	mtime;
+	struct inode_time	ctime;
 	__u64			change_attr;	/* NFSv4 change attribute */
 	__u64			pre_change_attr;/* pre-op NFSv4 change attribute */
 	__u64			pre_size;	/* pre_op_attr.size	  */
-	struct timespec		pre_mtime;	/* pre_op_attr.mtime	  */
-	struct timespec		pre_ctime;	/* pre_op_attr.ctime	  */
+	struct inode_time	pre_mtime;	/* pre_op_attr.mtime	  */
+	struct inode_time	pre_ctime;	/* pre_op_attr.ctime	  */
 	unsigned long		time_start;
 	unsigned long		gencount;
 	struct nfs4_string	*owner_name;
@@ -137,7 +137,7 @@ struct nfs_fsinfo {
 	__u32			wtmult;	/* writes should be multiple of this */
 	__u32			dtpref;	/* pref. readdir transfer size */
 	__u64			maxfilesize;
-	struct timespec		time_delta; /* server time granularity */
+	struct inode_time	time_delta; /* server time granularity */
 	__u32			lease_time; /* in seconds */
 	__u32			layouttype; /* supported pnfs layout driver */
 	__u32			blksize; /* preferred pnfs io block size */
@@ -745,7 +745,7 @@ struct nfs3_sattrargs {
 	struct nfs_fh *		fh;
 	struct iattr *		sattr;
 	unsigned int		guard;
-	struct timespec		guardtime;
+	struct inode_time	guardtime;
 };
 
 struct nfs3_diropargs {
-- 
1.8.3.2


* [RFC 08/32] fs/ceph: convert to 'struct inode_time'
  2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
                   ` (6 preceding siblings ...)
  2014-05-30 20:01 ` [RFC 07/32] fs/nfs: convert to struct inode_time Arnd Bergmann
@ 2014-05-30 20:01 ` Arnd Bergmann
  2014-05-30 20:01 ` [RFC 09/32] fs/pstore: convert to struct inode_time Arnd Bergmann
                   ` (26 subsequent siblings)
  34 siblings, 0 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-30 20:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, hpa,
	linux-fsdevel, Arnd Bergmann, Sage Weil, ceph-devel

Ceph supports timestamps until year 2106 using u32 seconds on the
wire, but the kernel internally limits this to a signed value
that only works until 2038 on 32-bit CPUs.

This changes the type used in the ceph code to struct inode_time
to lift that limitation.
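
To illustrate the limitation described above, here is a minimal user-space
sketch, not the kernel code itself. The type inode_time_sketch and both helper
names are hypothetical, the seconds field is assumed to be a signed 64-bit
value, and the little-endian wire conversion is left out for brevity.

```c
#include <stdint.h>

/* Hypothetical stand-in for the proposed struct inode_time; the field
 * widths here are an assumption made for illustration only. */
struct inode_time_sketch {
	int64_t tv_sec;
	long tv_nsec;
};

/* Models a 32-bit time_t: the unsigned wire value is squeezed into a
 * signed 32-bit field. The conversion is implementation-defined in C;
 * on common compilers values past 2^31 - 1 come out negative. */
static int32_t decode_sec_signed32(uint32_t wire_sec)
{
	return (int32_t)wire_sec;
}

/* Models the converted code: the unsigned wire value is widened to
 * 64 bits first, so the full u32 range stays positive. */
static void decode_time_wide(struct inode_time_sketch *ts,
			     uint32_t wire_sec, uint32_t wire_nsec)
{
	ts->tv_sec = (int64_t)wire_sec;
	ts->tv_nsec = (long)wire_nsec;
}
```

The first helper shows why a signed 32-bit seconds field stops working in
2038, while the second keeps every value the u32 wire format can carry.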

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Sage Weil <sage@inktank.com>
Cc: ceph-devel@vger.kernel.org
---
 drivers/block/rbd.c             |  2 +-
 fs/ceph/cache.c                 |  2 +-
 fs/ceph/caps.c                  |  6 +++---
 fs/ceph/file.c                  |  4 ++--
 fs/ceph/inode.c                 | 20 ++++++++++----------
 fs/ceph/super.h                 |  8 ++++----
 include/linux/ceph/decode.h     |  8 ++++----
 include/linux/ceph/osd_client.h |  4 ++--
 net/ceph/auth_x.c               |  2 +-
 net/ceph/osd_client.c           |  4 ++--
 10 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 4c95b50..d4a7404 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -1710,7 +1710,7 @@ static void rbd_osd_req_format_write(struct rbd_obj_request *obj_request)
 	struct rbd_img_request *img_request = obj_request->img_request;
 	struct ceph_osd_request *osd_req = obj_request->osd_req;
 	struct ceph_snap_context *snapc;
-	struct timespec mtime = CURRENT_TIME;
+	struct inode_time mtime = CURRENT_TIME;
 
 	rbd_assert(osd_req != NULL);
 
diff --git a/fs/ceph/cache.c b/fs/ceph/cache.c
index 834f9f3..cf48f4b 100644
--- a/fs/ceph/cache.c
+++ b/fs/ceph/cache.c
@@ -25,7 +25,7 @@
 #include "cache.h"
 
 struct ceph_aux_inode {
-	struct timespec	mtime;
+	struct inode_time mtime;
 	loff_t          size;
 };
 
diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index c561b62..263eecd 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -988,7 +988,7 @@ static int send_cap_msg(struct ceph_mds_session *session,
 			int caps, int wanted, int dirty,
 			u32 seq, u64 flush_tid, u32 issue_seq, u32 mseq,
 			u64 size, u64 max_size,
-			struct timespec *mtime, struct timespec *atime,
+			struct inode_time *mtime, struct inode_time *atime,
 			u64 time_warp_seq,
 			kuid_t uid, kgid_t gid, umode_t mode,
 			u64 xattr_version,
@@ -1132,7 +1132,7 @@ static int __send_cap(struct ceph_mds_client *mdsc, struct ceph_cap *cap,
 	int held, revoking, dropping, keep;
 	u64 seq, issue_seq, mseq, time_warp_seq, follows;
 	u64 size, max_size;
-	struct timespec mtime, atime;
+	struct inode_time mtime, atime;
 	int wake = 0;
 	umode_t mode;
 	kuid_t uid;
@@ -2416,7 +2416,7 @@ static void handle_cap_grant(struct inode *inode, struct ceph_mds_caps *grant,
 	int issued, implemented, used, wanted, dirty;
 	u64 size = le64_to_cpu(grant->size);
 	u64 max_size = le64_to_cpu(grant->max_size);
-	struct timespec mtime, atime, ctime;
+	struct inode_time mtime, atime, ctime;
 	int check_caps = 0;
 	int wake = 0;
 	int writeback = 0;
diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 3020851..0ebb709 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -546,7 +546,7 @@ ceph_sync_direct_write(struct kiocb *iocb, struct iov_iter *from)
 	int flags;
 	int check_caps = 0;
 	int ret;
-	struct timespec mtime = CURRENT_TIME;
+	struct inode_time mtime = CURRENT_TIME;
 	loff_t pos = iocb->ki_pos;
 	size_t count = iov_iter_count(from);
 
@@ -662,7 +662,7 @@ static ssize_t ceph_sync_write(struct kiocb *iocb, struct iov_iter *from)
 	int flags;
 	int check_caps = 0;
 	int ret;
-	struct timespec mtime = CURRENT_TIME;
+	struct inode_time mtime = CURRENT_TIME;
 	loff_t pos = iocb->ki_pos;
 	size_t count = iov_iter_count(from);
 
diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index e4fff9f..ba18c9d 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -506,8 +506,8 @@ int ceph_fill_file_size(struct inode *inode, int issued,
 }
 
 void ceph_fill_file_time(struct inode *inode, int issued,
-			 u64 time_warp_seq, struct timespec *ctime,
-			 struct timespec *mtime, struct timespec *atime)
+			 u64 time_warp_seq, struct inode_time *ctime,
+			 struct inode_time *mtime, struct inode_time *atime)
 {
 	struct ceph_inode_info *ci = ceph_inode(inode);
 	int warn = 0;
@@ -517,7 +517,7 @@ void ceph_fill_file_time(struct inode *inode, int issued,
 		      CEPH_CAP_FILE_BUFFER|
 		      CEPH_CAP_AUTH_EXCL|
 		      CEPH_CAP_XATTR_EXCL)) {
-		if (timespec_compare(ctime, &inode->i_ctime) > 0) {
+		if (inode_time_compare(ctime, &inode->i_ctime) > 0) {
 			dout("ctime %ld.%09ld -> %ld.%09ld inc w/ cap\n",
 			     inode->i_ctime.tv_sec, inode->i_ctime.tv_nsec,
 			     ctime->tv_sec, ctime->tv_nsec);
@@ -536,14 +536,14 @@ void ceph_fill_file_time(struct inode *inode, int issued,
 			ci->i_time_warp_seq = time_warp_seq;
 		} else if (time_warp_seq == ci->i_time_warp_seq) {
 			/* nobody did utimes(); take the max */
-			if (timespec_compare(mtime, &inode->i_mtime) > 0) {
+			if (inode_time_compare(mtime, &inode->i_mtime) > 0) {
 				dout("mtime %ld.%09ld -> %ld.%09ld inc\n",
 				     inode->i_mtime.tv_sec,
 				     inode->i_mtime.tv_nsec,
 				     mtime->tv_sec, mtime->tv_nsec);
 				inode->i_mtime = *mtime;
 			}
-			if (timespec_compare(atime, &inode->i_atime) > 0) {
+			if (inode_time_compare(atime, &inode->i_atime) > 0) {
 				dout("atime %ld.%09ld -> %ld.%09ld inc\n",
 				     inode->i_atime.tv_sec,
 				     inode->i_atime.tv_nsec,
@@ -586,7 +586,7 @@ static int fill_inode(struct inode *inode,
 	struct ceph_inode_info *ci = ceph_inode(inode);
 	int i;
 	int issued = 0, implemented;
-	struct timespec mtime, atime, ctime;
+	struct inode_time mtime, atime, ctime;
 	u32 nsplits;
 	struct ceph_inode_frag *frag;
 	struct rb_node *rb_node;
@@ -1714,12 +1714,12 @@ int ceph_setattr(struct dentry *dentry, struct iattr *attr)
 			inode->i_atime = attr->ia_atime;
 			dirtied |= CEPH_CAP_FILE_EXCL;
 		} else if ((issued & CEPH_CAP_FILE_WR) &&
-			   timespec_compare(&inode->i_atime,
+			   inode_time_compare(&inode->i_atime,
 					    &attr->ia_atime) < 0) {
 			inode->i_atime = attr->ia_atime;
 			dirtied |= CEPH_CAP_FILE_WR;
 		} else if ((issued & CEPH_CAP_FILE_SHARED) == 0 ||
-			   !timespec_equal(&inode->i_atime, &attr->ia_atime)) {
+			   !inode_time_equal(&inode->i_atime, &attr->ia_atime)) {
 			ceph_encode_timespec(&req->r_args.setattr.atime,
 					     &attr->ia_atime);
 			mask |= CEPH_SETATTR_ATIME;
@@ -1736,12 +1736,12 @@ int ceph_setattr(struct dentry *dentry, struct iattr *attr)
 			inode->i_mtime = attr->ia_mtime;
 			dirtied |= CEPH_CAP_FILE_EXCL;
 		} else if ((issued & CEPH_CAP_FILE_WR) &&
-			   timespec_compare(&inode->i_mtime,
+			   inode_time_compare(&inode->i_mtime,
 					    &attr->ia_mtime) < 0) {
 			inode->i_mtime = attr->ia_mtime;
 			dirtied |= CEPH_CAP_FILE_WR;
 		} else if ((issued & CEPH_CAP_FILE_SHARED) == 0 ||
-			   !timespec_equal(&inode->i_mtime, &attr->ia_mtime)) {
+			   !inode_time_equal(&inode->i_mtime, &attr->ia_mtime)) {
 			ceph_encode_timespec(&req->r_args.setattr.mtime,
 					     &attr->ia_mtime);
 			mask |= CEPH_SETATTR_MTIME;
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index ead05cc..15dc11a 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -156,7 +156,7 @@ struct ceph_cap_snap {
 	u64 xattr_version;
 
 	u64 size;
-	struct timespec mtime, atime, ctime;
+	struct inode_time mtime, atime, ctime;
 	u64 time_warp_seq;
 	int writing;   /* a sync write is still in progress */
 	int dirty_pages;     /* dirty pages awaiting writeback */
@@ -263,7 +263,7 @@ struct ceph_inode_info {
 	char *i_symlink;
 
 	/* for dirs */
-	struct timespec i_rctime;
+	struct inode_time i_rctime;
 	u64 i_rbytes, i_rfiles, i_rsubdirs;
 	u64 i_files, i_subdirs;
 
@@ -698,8 +698,8 @@ extern struct inode *ceph_get_snapdir(struct inode *parent);
 extern int ceph_fill_file_size(struct inode *inode, int issued,
 			       u32 truncate_seq, u64 truncate_size, u64 size);
 extern void ceph_fill_file_time(struct inode *inode, int issued,
-				u64 time_warp_seq, struct timespec *ctime,
-				struct timespec *mtime, struct timespec *atime);
+				u64 time_warp_seq, struct inode_time *ctime,
+				struct inode_time *mtime, struct inode_time *atime);
 extern int ceph_fill_trace(struct super_block *sb,
 			   struct ceph_mds_request *req,
 			   struct ceph_mds_session *session);
diff --git a/include/linux/ceph/decode.h b/include/linux/ceph/decode.h
index a6ef9cc..f1bb277 100644
--- a/include/linux/ceph/decode.h
+++ b/include/linux/ceph/decode.h
@@ -132,16 +132,16 @@ bad:
 }
 
 /*
- * struct ceph_timespec <-> struct timespec
+ * struct ceph_timespec <-> struct inode_time
  */
-static inline void ceph_decode_timespec(struct timespec *ts,
+static inline void ceph_decode_timespec(struct inode_time *ts,
 					const struct ceph_timespec *tv)
 {
-	ts->tv_sec = (__kernel_time_t)le32_to_cpu(tv->tv_sec);
+	ts->tv_sec = (u64)le32_to_cpu(tv->tv_sec);
 	ts->tv_nsec = (long)le32_to_cpu(tv->tv_nsec);
 }
 static inline void ceph_encode_timespec(struct ceph_timespec *tv,
-					const struct timespec *ts)
+					const struct inode_time *ts)
 {
 	tv->tv_sec = cpu_to_le32((u32)ts->tv_sec);
 	tv->tv_nsec = cpu_to_le32((u32)ts->tv_nsec);
diff --git a/include/linux/ceph/osd_client.h b/include/linux/ceph/osd_client.h
index 94ec696..1617d31 100644
--- a/include/linux/ceph/osd_client.h
+++ b/include/linux/ceph/osd_client.h
@@ -312,7 +312,7 @@ extern struct ceph_osd_request *ceph_osdc_alloc_request(struct ceph_osd_client *
 extern void ceph_osdc_build_request(struct ceph_osd_request *req, u64 off,
 				    struct ceph_snap_context *snapc,
 				    u64 snap_id,
-				    struct timespec *mtime);
+				    struct inode_time *mtime);
 
 extern struct ceph_osd_request *ceph_osdc_new_request(struct ceph_osd_client *,
 				      struct ceph_file_layout *layout,
@@ -361,7 +361,7 @@ extern int ceph_osdc_writepages(struct ceph_osd_client *osdc,
 				struct ceph_snap_context *sc,
 				u64 off, u64 len,
 				u32 truncate_seq, u64 truncate_size,
-				struct timespec *mtime,
+				struct inode_time *mtime,
 				struct page **pages, int nr_pages);
 
 /* watch/notify events */
diff --git a/net/ceph/auth_x.c b/net/ceph/auth_x.c
index 96238ba..14f0e8a 100644
--- a/net/ceph/auth_x.c
+++ b/net/ceph/auth_x.c
@@ -163,7 +163,7 @@ static int ceph_x_proc_ticket_reply(struct ceph_auth_client *ac,
 		void *dp, *dend;
 		int dlen;
 		char is_enc;
-		struct timespec validity;
+		struct inode_time validity;
 		struct ceph_crypto_key old_key;
 		void *tp, *tpend;
 		struct ceph_timespec new_validity;
diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index b0dfce7..1433798 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -2315,7 +2315,7 @@ bad:
  */
 void ceph_osdc_build_request(struct ceph_osd_request *req, u64 off,
 				struct ceph_snap_context *snapc, u64 snap_id,
-				struct timespec *mtime)
+				struct inode_time *mtime)
 {
 	struct ceph_msg *msg = req->r_request;
 	void *p;
@@ -2630,7 +2630,7 @@ int ceph_osdc_writepages(struct ceph_osd_client *osdc, struct ceph_vino vino,
 			 struct ceph_snap_context *snapc,
 			 u64 off, u64 len,
 			 u32 truncate_seq, u64 truncate_size,
-			 struct timespec *mtime,
+			 struct inode_time *mtime,
 			 struct page **pages, int num_pages)
 {
 	struct ceph_osd_request *req;
-- 
1.8.3.2


* [RFC 09/32] fs/pstore: convert to struct inode_time
  2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
                   ` (7 preceding siblings ...)
  2014-05-30 20:01 ` [RFC 08/32] fs/ceph: convert to 'struct inode_time' Arnd Bergmann
@ 2014-05-30 20:01 ` Arnd Bergmann
  2014-05-30 21:14   ` Kees Cook
  2014-05-30 20:01 ` [RFC 10/32] fs/coda: " Arnd Bergmann
                   ` (25 subsequent siblings)
  34 siblings, 1 reply; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-30 20:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, hpa,
	linux-fsdevel, Arnd Bergmann, Anton Vorontsov, Colin Cross,
	Kees Cook, Tony Luck

pstore uses timestamps encoded in a string as seconds, but on 32-bit systems
these cannot go beyond year 2038 because of the limits of struct timespec.

This converts the pstore code to use the new struct inode_time for timestamps.
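
The parsing side of this can be sketched in user space as follows. This is an
illustration under stated assumptions rather than the pstore code: the "===="
prefix, the ts_sketch type and parse_hdr() are all hypothetical names chosen
for the example. The point is that the seconds are scanned into a 64-bit
temporary and only then stored in the wider field.

```c
#include <stdint.h>
#include <stdio.h>

#define HDR_PREFIX "===="	/* assumed header prefix, illustration only */

/* Hypothetical stand-in for a time stamp with 64-bit seconds. */
struct ts_sketch {
	int64_t tv_sec;
	long tv_nsec;
};

/* Parse "<prefix><seconds>.<fraction>" into ts.
 * Returns 1 on success; on failure zeroes ts and returns 0. */
static int parse_hdr(const char *buf, struct ts_sketch *ts)
{
	unsigned long long sec;	/* 64-bit temporary matching %llu */
	unsigned long frac;

	if (sscanf(buf, HDR_PREFIX "%llu.%lu", &sec, &frac) != 2) {
		ts->tv_sec = 0;
		ts->tv_nsec = 0;
		return 0;
	}
	ts->tv_sec = (int64_t)sec;
	ts->tv_nsec = (long)frac;
	return 1;
}
```

Scanning into a local of exactly the type the format specifier expects avoids
depending on the width of the destination field.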

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Anton Vorontsov <anton@enomsg.org>
Cc: Colin Cross <ccross@android.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Tony Luck <tony.luck@intel.com>
---
 drivers/firmware/efi/efi-pstore.c | 28 ++++++++++++++--------------
 fs/pstore/inode.c                 |  2 +-
 fs/pstore/internal.h              |  2 +-
 fs/pstore/platform.c              |  2 +-
 fs/pstore/ram.c                   | 18 ++++++++++--------
 include/linux/pstore.h            |  4 ++--
 6 files changed, 29 insertions(+), 27 deletions(-)

diff --git a/drivers/firmware/efi/efi-pstore.c b/drivers/firmware/efi/efi-pstore.c
index 4b9dc83..a1e4153 100644
--- a/drivers/firmware/efi/efi-pstore.c
+++ b/drivers/firmware/efi/efi-pstore.c
@@ -32,7 +32,7 @@ struct pstore_read_data {
 	u64 *id;
 	enum pstore_type_id *type;
 	int *count;
-	struct timespec *timespec;
+	struct inode_time *inode_time;
 	bool *compressed;
 	char **buf;
 };
@@ -63,8 +63,8 @@ static int efi_pstore_read_func(struct efivar_entry *entry, void *data)
 		   cb_data->type, &part, &cnt, &time, &data_type) == 5) {
 		*cb_data->id = generic_id(time, part, cnt);
 		*cb_data->count = cnt;
-		cb_data->timespec->tv_sec = time;
-		cb_data->timespec->tv_nsec = 0;
+		cb_data->inode_time->tv_sec = time;
+		cb_data->inode_time->tv_nsec = 0;
 		if (data_type == 'C')
 			*cb_data->compressed = true;
 		else
@@ -73,8 +73,8 @@ static int efi_pstore_read_func(struct efivar_entry *entry, void *data)
 		   cb_data->type, &part, &cnt, &time) == 4) {
 		*cb_data->id = generic_id(time, part, cnt);
 		*cb_data->count = cnt;
-		cb_data->timespec->tv_sec = time;
-		cb_data->timespec->tv_nsec = 0;
+		cb_data->inode_time->tv_sec = time;
+		cb_data->inode_time->tv_nsec = 0;
 		*cb_data->compressed = false;
 	} else if (sscanf(name, "dump-type%u-%u-%lu",
 			  cb_data->type, &part, &time) == 3) {
@@ -85,8 +85,8 @@ static int efi_pstore_read_func(struct efivar_entry *entry, void *data)
 		 */
 		*cb_data->id = generic_id(time, part, 0);
 		*cb_data->count = 0;
-		cb_data->timespec->tv_sec = time;
-		cb_data->timespec->tv_nsec = 0;
+		cb_data->inode_time->tv_sec = time;
+		cb_data->inode_time->tv_nsec = 0;
 		*cb_data->compressed = false;
 	} else
 		return 0;
@@ -208,7 +208,7 @@ static int efi_pstore_sysfs_entry_iter(void *data, struct efivar_entry **pos)
  *           and pstore will stop reading entry.
  */
 static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type,
-			       int *count, struct timespec *timespec,
+			       int *count, struct inode_time *inode_time,
 			       char **buf, bool *compressed,
 			       struct pstore_info *psi)
 {
@@ -218,7 +218,7 @@ static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type,
 	data.id = id;
 	data.type = type;
 	data.count = count;
-	data.timespec = timespec;
+	data.inode_time = inode_time;
 	data.compressed = compressed;
 	data.buf = buf;
 
@@ -266,7 +266,7 @@ struct pstore_erase_data {
 	u64 id;
 	enum pstore_type_id type;
 	int count;
-	struct timespec time;
+	struct inode_time time;
 	efi_char16_t *name;
 };
 
@@ -292,8 +292,8 @@ static int efi_pstore_erase_func(struct efivar_entry *entry, void *data)
 		 * Check if an old format, which doesn't support
 		 * holding multiple logs, remains.
 		 */
-		sprintf(name_old, "dump-type%u-%u-%lu", ed->type,
-			(unsigned int)ed->id, ed->time.tv_sec);
+		sprintf(name_old, "dump-type%u-%u-%llu", ed->type,
+			(unsigned int)ed->id, (u64)ed->time.tv_sec);
 
 		for (i = 0; i < DUMP_NAME_LEN; i++)
 			efi_name_old[i] = name_old[i];
@@ -319,7 +319,7 @@ static int efi_pstore_erase_func(struct efivar_entry *entry, void *data)
 }
 
 static int efi_pstore_erase(enum pstore_type_id type, u64 id, int count,
-			    struct timespec time, struct pstore_info *psi)
+			    struct inode_time time, struct pstore_info *psi)
 {
 	struct pstore_erase_data edata;
 	struct efivar_entry *entry = NULL;
@@ -330,7 +330,7 @@ static int efi_pstore_erase(enum pstore_type_id type, u64 id, int count,
 
 	do_div(id, 1000);
 	part = do_div(id, 100);
-	sprintf(name, "dump-type%u-%u-%d-%lu", type, part, count, time.tv_sec);
+	sprintf(name, "dump-type%u-%u-%d-%llu", type, part, count, (u64)time.tv_sec);
 
 	for (i = 0; i < DUMP_NAME_LEN; i++)
 		efi_name[i] = name[i];
diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c
index 192297b..6f3925f 100644
--- a/fs/pstore/inode.c
+++ b/fs/pstore/inode.c
@@ -277,7 +277,7 @@ int pstore_is_mounted(void)
  */
 int pstore_mkfile(enum pstore_type_id type, char *psname, u64 id, int count,
 		  char *data, bool compressed, size_t size,
-		  struct timespec time, struct pstore_info *psi)
+		  struct inode_time time, struct pstore_info *psi)
 {
 	struct dentry		*root = pstore_sb->s_root;
 	struct dentry		*dentry;
diff --git a/fs/pstore/internal.h b/fs/pstore/internal.h
index 3b3d305..eb9c4eb 100644
--- a/fs/pstore/internal.h
+++ b/fs/pstore/internal.h
@@ -51,7 +51,7 @@ extern void	pstore_set_kmsg_bytes(int);
 extern void	pstore_get_records(int);
 extern int	pstore_mkfile(enum pstore_type_id, char *psname, u64 id,
 			      int count, char *data, bool compressed,
-			      size_t size, struct timespec time,
+			      size_t size, struct inode_time time,
 			      struct pstore_info *psi);
 extern int	pstore_is_mounted(void);
 
diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
index 0a9b72c..06f2628 100644
--- a/fs/pstore/platform.c
+++ b/fs/pstore/platform.c
@@ -475,7 +475,7 @@ void pstore_get_records(int quiet)
 	u64			id;
 	int			count;
 	enum pstore_type_id	type;
-	struct timespec		time;
+	struct inode_time	time;
 	int			failed = 0, rc;
 	bool			compressed;
 	int			unzipped_len = -1;
diff --git a/fs/pstore/ram.c b/fs/pstore/ram.c
index 3b57443..50d7298 100644
--- a/fs/pstore/ram.c
+++ b/fs/pstore/ram.c
@@ -135,29 +135,31 @@ ramoops_get_next_prz(struct persistent_ram_zone *przs[], uint *c, uint max,
 	return prz;
 }
 
-static void ramoops_read_kmsg_hdr(char *buffer, struct timespec *time,
+static void ramoops_read_kmsg_hdr(char *buffer, struct inode_time *time,
 				  bool *compressed)
 {
 	char data_type;
+	u64 seconds;
 
-	if (sscanf(buffer, RAMOOPS_KERNMSG_HDR "%lu.%lu-%c\n",
-			&time->tv_sec, &time->tv_nsec, &data_type) == 3) {
+	if (sscanf(buffer, RAMOOPS_KERNMSG_HDR "%llu.%lu-%c\n",
+			&seconds, &time->tv_nsec, &data_type) == 3) {
 		if (data_type == 'C')
 			*compressed = true;
 		else
 			*compressed = false;
-	} else if (sscanf(buffer, RAMOOPS_KERNMSG_HDR "%lu.%lu\n",
-			&time->tv_sec, &time->tv_nsec) == 2) {
+	} else if (sscanf(buffer, RAMOOPS_KERNMSG_HDR "%llu.%lu\n",
+			&seconds, &time->tv_nsec) == 2) {
 			*compressed = false;
 	} else {
-		time->tv_sec = 0;
+		seconds = 0;
 		time->tv_nsec = 0;
 		*compressed = false;
 	}
+	time->tv_sec = seconds;
 }
 
 static ssize_t ramoops_pstore_read(u64 *id, enum pstore_type_id *type,
-				   int *count, struct timespec *time,
+				   int *count, struct inode_time *time,
 				   char **buf, bool *compressed,
 				   struct pstore_info *psi)
 {
@@ -278,7 +280,7 @@ static int notrace ramoops_pstore_write_buf(enum pstore_type_id type,
 }
 
 static int ramoops_pstore_erase(enum pstore_type_id type, u64 id, int count,
-				struct timespec time, struct pstore_info *psi)
+				struct inode_time time, struct pstore_info *psi)
 {
 	struct ramoops_context *cxt = psi->data;
 	struct persistent_ram_zone *prz;
diff --git a/include/linux/pstore.h b/include/linux/pstore.h
index ece0c6b..f293905 100644
--- a/include/linux/pstore.h
+++ b/include/linux/pstore.h
@@ -55,7 +55,7 @@ struct pstore_info {
 	int		(*open)(struct pstore_info *psi);
 	int		(*close)(struct pstore_info *psi);
 	ssize_t		(*read)(u64 *id, enum pstore_type_id *type,
-			int *count, struct timespec *time, char **buf,
+			int *count, struct inode_time *time, char **buf,
 			bool *compressed, struct pstore_info *psi);
 	int		(*write)(enum pstore_type_id type,
 			enum kmsg_dump_reason reason, u64 *id,
@@ -66,7 +66,7 @@ struct pstore_info {
 			unsigned int part, const char *buf, bool compressed,
 			size_t size, struct pstore_info *psi);
 	int		(*erase)(enum pstore_type_id type, u64 id,
-			int count, struct timespec time,
+			int count, struct inode_time time,
 			struct pstore_info *psi);
 	void		*data;
 };
-- 
1.8.3.2


* [RFC 10/32] fs/coda: convert to struct inode_time
  2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
                   ` (8 preceding siblings ...)
  2014-05-30 20:01 ` [RFC 09/32] fs/pstore: convert to struct inode_time Arnd Bergmann
@ 2014-05-30 20:01 ` " Arnd Bergmann
  2014-05-30 20:01 ` [RFC 11/32] xfs: " Arnd Bergmann
                   ` (24 subsequent siblings)
  34 siblings, 0 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-30 20:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, hpa,
	linux-fsdevel, Arnd Bergmann, Jan Harkes, coda, codalist

This converts the coda file system to use inode_time, which we will
need to fix the y2038 limit. However, inode time stamps in coda
are communicated to user space through coda_pioctl() as a 'struct
timespec', so this cannot be fixed for coda without changing the
user space interface.
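
A rough user-space sketch of the conversion at that boundary, and of why the
limit remains, could look like this. All type and function names are
hypothetical, and user_timespec32 is an assumed model of a 'struct timespec'
as seen by 32-bit user space.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel-internal 64-bit time stamp. */
struct inode_time_sketch {
	int64_t tv_sec;
	long tv_nsec;
};

/* Assumed model of the user-visible layout on a 32-bit ABI. */
struct user_timespec32 {
	int32_t tv_sec;
	int32_t tv_nsec;
};

/* User space to kernel: widening is always safe, and a compound
 * literal copies the fields one by one. */
static struct inode_time_sketch from_user_time(struct user_timespec32 u)
{
	return (struct inode_time_sketch){ u.tv_sec, u.tv_nsec };
}

/* Kernel to user space: only values inside the 32-bit range can be
 * represented, which is the limit that remains for this interface. */
static int fits_user_time(struct inode_time_sketch t)
{
	return t.tv_sec >= INT32_MIN && t.tv_sec <= INT32_MAX;
}
```

A field-by-field copy is needed because the two structures no longer have
the same type, so plain assignment between them does not compile.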

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Jan Harkes <jaharkes@cs.cmu.edu>
Cc: coda@cs.cmu.edu
Cc: codalist@coda.cs.cmu.edu
---
 fs/coda/coda_linux.c      | 18 ++++++++++++------
 include/uapi/linux/coda.h |  1 +
 2 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/fs/coda/coda_linux.c b/fs/coda/coda_linux.c
index 2849f41..f2fcec5 100644
--- a/fs/coda/coda_linux.c
+++ b/fs/coda/coda_linux.c
@@ -110,11 +110,14 @@ void coda_vattr_to_iattr(struct inode *inode, struct coda_vattr *attr)
 	if (attr->va_size != -1)
 		inode->i_blocks = (attr->va_size + 511) >> 9;
 	if (attr->va_atime.tv_sec != -1) 
-	        inode->i_atime = attr->va_atime;
+	        inode->i_atime = (struct inode_time)
+			{ attr->va_atime.tv_sec, attr->va_atime.tv_nsec };
 	if (attr->va_mtime.tv_sec != -1)
-	        inode->i_mtime = attr->va_mtime;
+	        inode->i_mtime = (struct inode_time)
+			{ attr->va_mtime.tv_sec, attr->va_mtime.tv_nsec };
         if (attr->va_ctime.tv_sec != -1)
-	        inode->i_ctime = attr->va_ctime;
+	        inode->i_ctime = (struct inode_time)
+			{ attr->va_ctime.tv_sec, attr->va_ctime.tv_nsec };
 }
 
 
@@ -180,13 +183,16 @@ void coda_iattr_to_vattr(struct iattr *iattr, struct coda_vattr *vattr)
                 vattr->va_size = iattr->ia_size;
 	}
         if ( valid & ATTR_ATIME ) {
-                vattr->va_atime = iattr->ia_atime;
+                vattr->va_atime = (struct timespec)
+			{ iattr->ia_atime.tv_sec, iattr->ia_atime.tv_nsec };
 	}
         if ( valid & ATTR_MTIME ) {
-                vattr->va_mtime = iattr->ia_mtime;
+                vattr->va_mtime = (struct timespec)
+			{ iattr->ia_mtime.tv_sec, iattr->ia_mtime.tv_nsec };
 	}
         if ( valid & ATTR_CTIME ) {
-                vattr->va_ctime = iattr->ia_ctime;
+                vattr->va_ctime = (struct timespec)
+			{ iattr->ia_ctime.tv_sec, iattr->ia_ctime.tv_nsec };
 	}
 }
 
diff --git a/include/uapi/linux/coda.h b/include/uapi/linux/coda.h
index 695fade..e7258f7 100644
--- a/include/uapi/linux/coda.h
+++ b/include/uapi/linux/coda.h
@@ -220,6 +220,7 @@ struct coda_vattr {
 	long		va_fileid;	/* file id */
 	u_quad_t	va_size;	/* file size in bytes */
 	long		va_blocksize;	/* blocksize preferred for i/o */
+	/* FIXME: timespec in user API */
 	struct timespec	va_atime;	/* time of last access */
 	struct timespec	va_mtime;	/* time of last modification */
 	struct timespec	va_ctime;	/* time file changed */
-- 
1.8.3.2


* [RFC 11/32] xfs: convert to struct inode_time
  2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
                   ` (9 preceding siblings ...)
  2014-05-30 20:01 ` [RFC 10/32] fs/coda: " Arnd Bergmann
@ 2014-05-30 20:01 ` " Arnd Bergmann
  2014-05-31  0:37   ` Dave Chinner
  2014-05-30 20:01 ` [RFC 12/32] btrfs: " Arnd Bergmann
                   ` (23 subsequent siblings)
  34 siblings, 1 reply; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-30 20:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, hpa,
	linux-fsdevel, Arnd Bergmann, Dave Chinner, xfs

xfs uses unsigned 32-bit seconds for inode timestamps, which will work
for the next 92 years, but the VFS uses struct timespec for timestamps,
which is only good until 2038 on 32-bit CPUs.

This gets us one small step closer to lifting the VFS limit by using
struct inode_time in XFS.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: xfs@oss.sgi.com
---
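The year limits quoted in this series (2038, 2106, 2514) follow from the
width of the seconds counter. A rough sketch of that arithmetic, assuming
an epoch of 1970 and an average Gregorian year of 365.2425 days; the helper
name is hypothetical and the result is only accurate to the year:

```c
#include <assert.h>

/* Average Gregorian year: 365.2425 days * 86400 seconds. */
#define SKETCH_SECS_PER_YEAR 31556952ULL

/*
 * Approximate calendar year in which a seconds counter of 'bits' bits,
 * counting from 1970, runs out. A signed counter loses one bit to the sign.
 */
static unsigned int sketch_overflow_year(unsigned int bits, int is_signed)
{
	unsigned int value_bits = is_signed ? bits - 1 : bits;

	assert(value_bits > 0 && value_bits < 64);
	return 1970 + (unsigned int)((1ULL << value_bits) / SKETCH_SECS_PER_YEAR);
}
```

With these assumptions, a signed 32-bit counter ends in 2038, an unsigned
32-bit counter in 2106, and an unsigned 34-bit counter in 2514.
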
 fs/xfs/time.h            | 4 ++--
 fs/xfs/xfs_inode.c       | 2 +-
 fs/xfs/xfs_iops.c        | 2 +-
 fs/xfs/xfs_trans_inode.c | 6 +++---
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/fs/xfs/time.h b/fs/xfs/time.h
index 387e695..a490f1b 100644
--- a/fs/xfs/time.h
+++ b/fs/xfs/time.h
@@ -21,14 +21,14 @@
 #include <linux/sched.h>
 #include <linux/time.h>
 
-typedef struct timespec timespec_t;
+typedef struct inode_time timespec_t;
 
 static inline void delay(long ticks)
 {
 	schedule_timeout_uninterruptible(ticks);
 }
 
-static inline void nanotime(struct timespec *tvp)
+static inline void nanotime(struct inode_time *tvp)
 {
 	*tvp = CURRENT_TIME;
 }
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index a6115fe..16d5392 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -654,7 +654,7 @@ xfs_ialloc(
 	xfs_inode_t	*ip;
 	uint		flags;
 	int		error;
-	timespec_t	tv;
+	struct inode_time tv;
 
 	/*
 	 * Call the space management code to pick
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 205613a..092ee7c 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -956,7 +956,7 @@ xfs_vn_setattr(
 STATIC int
 xfs_vn_update_time(
 	struct inode		*inode,
-	struct timespec		*now,
+	struct inode_time	*now,
 	int			flags)
 {
 	struct xfs_inode	*ip = XFS_I(inode);
diff --git a/fs/xfs/xfs_trans_inode.c b/fs/xfs/xfs_trans_inode.c
index 50c3f56..bae2520 100644
--- a/fs/xfs/xfs_trans_inode.c
+++ b/fs/xfs/xfs_trans_inode.c
@@ -70,7 +70,7 @@ xfs_trans_ichgtime(
 	int			flags)
 {
 	struct inode		*inode = VFS_I(ip);
-	timespec_t		tv;
+	struct inode_time	tv;
 
 	ASSERT(tp);
 	ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL));
@@ -78,13 +78,13 @@ xfs_trans_ichgtime(
 	tv = current_fs_time(inode->i_sb);
 
 	if ((flags & XFS_ICHGTIME_MOD) &&
-	    !timespec_equal(&inode->i_mtime, &tv)) {
+	    !inode_time_equal(&inode->i_mtime, &tv)) {
 		inode->i_mtime = tv;
 		ip->i_d.di_mtime.t_sec = tv.tv_sec;
 		ip->i_d.di_mtime.t_nsec = tv.tv_nsec;
 	}
 	if ((flags & XFS_ICHGTIME_CHG) &&
-	    !timespec_equal(&inode->i_ctime, &tv)) {
+	    !inode_time_equal(&inode->i_ctime, &tv)) {
 		inode->i_ctime = tv;
 		ip->i_d.di_ctime.t_sec = tv.tv_sec;
 		ip->i_d.di_ctime.t_nsec = tv.tv_nsec;
-- 
1.8.3.2


* [RFC 12/32] btrfs: convert to struct inode_time
  2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
                   ` (10 preceding siblings ...)
  2014-05-30 20:01 ` [RFC 11/32] xfs: " Arnd Bergmann
@ 2014-05-30 20:01 ` " Arnd Bergmann
  2014-05-30 20:01 ` [RFC 13/32] ext3: " Arnd Bergmann
                   ` (22 subsequent siblings)
  34 siblings, 0 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-30 20:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, hpa,
	linux-fsdevel, Arnd Bergmann, Chris Mason, Josef Bacik,
	linux-btrfs

btrfs uses unsigned 64-bit seconds for inode timestamps, which will work
basically forever, but the VFS uses struct timespec for timestamps,
which is only good until 2038 on 32-bit CPUs.

This gets us one small step closer to lifting the VFS limit by using
struct inode_time in btrfs.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <jbacik@fb.com>
Cc: linux-btrfs@vger.kernel.org
---
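The compare-then-store pattern that update_time_for_write() uses can be
sketched as below. This assumes inode_time_equal() compares seconds and
nanoseconds the same way timespec_equal() does; the struct and both helper
names are hypothetical stand-ins, since the real definitions live in
patch 01.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for 'struct inode_time'. */
struct sketch_inode_time {
	int64_t tv_sec;
	int32_t tv_nsec;
};

/* Equal only if both the seconds and the nanoseconds match. */
static int sketch_inode_time_equal(const struct sketch_inode_time *a,
				   const struct sketch_inode_time *b)
{
	assert(a != NULL && b != NULL);
	return a->tv_sec == b->tv_sec && a->tv_nsec == b->tv_nsec;
}

/* Store 'now' only when it differs; returns 1 if the field was written. */
static int sketch_update_if_changed(struct sketch_inode_time *field,
				    const struct sketch_inode_time *now)
{
	if (sketch_inode_time_equal(field, now))
		return 0;
	*field = *now;
	return 1;
}
```
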
 fs/btrfs/file.c        | 6 +++---
 fs/btrfs/inode.c       | 4 ++--
 fs/btrfs/ioctl.c       | 4 ++--
 fs/btrfs/root-tree.c   | 2 +-
 fs/btrfs/transaction.c | 2 +-
 5 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index a58df83..3e16a4e 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1693,16 +1693,16 @@ out:
 
 static void update_time_for_write(struct inode *inode)
 {
-	struct timespec now;
+	struct inode_time now;
 
 	if (IS_NOCMTIME(inode))
 		return;
 
 	now = current_fs_time(inode->i_sb);
-	if (!timespec_equal(&inode->i_mtime, &now))
+	if (!inode_time_equal(&inode->i_mtime, &now))
 		inode->i_mtime = now;
 
-	if (!timespec_equal(&inode->i_ctime, &now))
+	if (!inode_time_equal(&inode->i_ctime, &now))
 		inode->i_ctime = now;
 
 	if (IS_I_VERSION(inode))
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 2ac3036..d825387 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -5440,7 +5440,7 @@ static int btrfs_dirty_inode(struct inode *inode)
  * This is a copy of file_update_time.  We need this so we can return error on
  * ENOSPC for updating the inode in the case of file write and mmap writes.
  */
-static int btrfs_update_time(struct inode *inode, struct timespec *now,
+static int btrfs_update_time(struct inode *inode, struct inode_time *now,
 			     int flags)
 {
 	struct btrfs_root *root = BTRFS_I(inode)->root;
@@ -8223,7 +8223,7 @@ static int btrfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 	struct btrfs_root *dest = BTRFS_I(new_dir)->root;
 	struct inode *new_inode = new_dentry->d_inode;
 	struct inode *old_inode = old_dentry->d_inode;
-	struct timespec ctime = CURRENT_TIME;
+	struct inode_time ctime = CURRENT_TIME;
 	u64 index = 0;
 	u64 root_objectid;
 	int ret;
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index a313ab0..2de5f86 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -435,7 +435,7 @@ static noinline int create_subvol(struct inode *dir,
 	struct btrfs_root *root = BTRFS_I(dir)->root;
 	struct btrfs_root *new_root;
 	struct btrfs_block_rsv block_rsv;
-	struct timespec cur_time = CURRENT_TIME;
+	struct inode_time cur_time = CURRENT_TIME;
 	struct inode *inode;
 	int ret;
 	int err;
@@ -4456,7 +4456,7 @@ static long _btrfs_ioctl_set_received_subvol(struct file *file,
 	struct btrfs_root *root = BTRFS_I(inode)->root;
 	struct btrfs_root_item *root_item = &root->root_item;
 	struct btrfs_trans_handle *trans;
-	struct timespec ct = CURRENT_TIME;
+	struct inode_time ct = CURRENT_TIME;
 	int ret = 0;
 	int received_uuid_changed;
 
diff --git a/fs/btrfs/root-tree.c b/fs/btrfs/root-tree.c
index 38bb47e..344e89f 100644
--- a/fs/btrfs/root-tree.c
+++ b/fs/btrfs/root-tree.c
@@ -487,7 +487,7 @@ void btrfs_update_root_times(struct btrfs_trans_handle *trans,
 			     struct btrfs_root *root)
 {
 	struct btrfs_root_item *item = &root->root_item;
-	struct timespec ct = CURRENT_TIME;
+	struct inode_time ct = CURRENT_TIME;
 
 	spin_lock(&root->root_item_lock);
 	btrfs_set_root_ctransid(item, trans->transid);
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 7579f6d..09dcc8a 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -1133,7 +1133,7 @@ static noinline int create_pending_snapshot(struct btrfs_trans_handle *trans,
 	struct dentry *dentry;
 	struct extent_buffer *tmp;
 	struct extent_buffer *old;
-	struct timespec cur_time = CURRENT_TIME;
+	struct inode_time cur_time = CURRENT_TIME;
 	int ret = 0;
 	u64 to_reserve = 0;
 	u64 index = 0;
-- 
1.8.3.2


* [RFC 13/32] ext3: convert to struct inode_time
  2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
                   ` (11 preceding siblings ...)
  2014-05-30 20:01 ` [RFC 12/32] btrfs: " Arnd Bergmann
@ 2014-05-30 20:01 ` " Arnd Bergmann
  2014-05-31  9:10   ` H. Peter Anvin
  2014-05-30 20:01 ` [RFC 14/32] ext4: " Arnd Bergmann
                   ` (21 subsequent siblings)
  34 siblings, 1 reply; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-30 20:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, hpa,
	linux-fsdevel, Arnd Bergmann, Jan Kara, Andrew Morton,
	Andreas Dilger, linux-ext4

ext3fs uses unsigned 32-bit seconds for inode timestamps, which will work
for the next 92 years, but the VFS uses struct timespec for timestamps,
which is only good until 2038 on 32-bit CPUs.

This gets us one small step closer to lifting the VFS limit by using
struct inode_time in ext3. The on-disk format limit is lifted in ext4,
which will work until 2514.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Jan Kara <jack@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: linux-ext4@vger.kernel.org
---
 fs/ext3/inode.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c
index 4d32133..8b76f80 100644
--- a/fs/ext3/inode.c
+++ b/fs/ext3/inode.c
@@ -752,7 +752,7 @@ static int ext3_splice_branch(handle_t *handle, struct inode *inode,
 	struct ext3_block_alloc_info *block_i;
 	ext3_fsblk_t current_block;
 	struct ext3_inode_info *ei = EXT3_I(inode);
-	struct timespec now;
+	struct inode_time now;
 
 	block_i = ei->i_block_alloc_info;
 	/*
@@ -793,7 +793,7 @@ static int ext3_splice_branch(handle_t *handle, struct inode *inode,
 
 	/* We are done with atomic stuff, now do the rest of housekeeping */
 	now = CURRENT_TIME_SEC;
-	if (!timespec_equal(&inode->i_ctime, &now) || !where->bh) {
+	if (!inode_time_equal(&inode->i_ctime, &now) || !where->bh) {
 		inode->i_ctime = now;
 		ext3_mark_inode_dirty(handle, inode);
 	}
-- 
1.8.3.2


* [RFC 14/32] ext4: convert to struct inode_time
  2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
                   ` (12 preceding siblings ...)
  2014-05-30 20:01 ` [RFC 13/32] ext3: " Arnd Bergmann
@ 2014-05-30 20:01 ` " Arnd Bergmann
  2014-05-30 20:01 ` [RFC 15/32] cifs: " Arnd Bergmann
                   ` (20 subsequent siblings)
  34 siblings, 0 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-30 20:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, hpa,
	linux-fsdevel, Arnd Bergmann, Theodore Ts'o, Andreas Dilger,
	linux-ext4

ext4fs uses unsigned 34-bit seconds for inode timestamps, which will work
for the next 500 years, but the VFS uses struct timespec for timestamps,
which is only good until 2038 on 32-bit CPUs.

This gets us one small step closer to lifting the VFS limit by using
struct inode_time in ext4.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: linux-ext4@vger.kernel.org
---
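A user-space sketch of how the 34-bit seconds value is split between the
32-bit seconds field and the "extra" field. The constants assume two epoch
bits with the nanoseconds in the remaining 30 bits, which is how
EXT4_EPOCH_BITS and the related masks are used in the helpers touched here;
their definitions are not part of this patch, so treat the values as
assumptions. Endianness conversion is left out, and the low 32 bits are
treated as unsigned to match the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_EPOCH_BITS 2
#define SKETCH_EPOCH_MASK ((1u << SKETCH_EPOCH_BITS) - 1)
#define SKETCH_NSEC_MASK  (~0u << SKETCH_EPOCH_BITS)

/* Hypothetical stand-in for 'struct inode_time'. */
struct sketch_inode_time {
	int64_t tv_sec;
	int32_t tv_nsec;
};

/* Pack seconds bits 32..33 and the nanoseconds into one 32-bit word. */
static uint32_t sketch_encode_extra(const struct sketch_inode_time *t)
{
	assert(t != NULL);
	return (uint32_t)(((uint64_t)t->tv_sec >> 32) & SKETCH_EPOCH_MASK) |
	       (((uint32_t)t->tv_nsec << SKETCH_EPOCH_BITS) & SKETCH_NSEC_MASK);
}

/* Rebuild the time from the low 32 seconds bits and the extra word. */
static void sketch_decode(struct sketch_inode_time *t, uint32_t lo,
			  uint32_t extra)
{
	assert(t != NULL);
	t->tv_sec = (int64_t)(((uint64_t)(extra & SKETCH_EPOCH_MASK) << 32) | lo);
	t->tv_nsec = (int32_t)((extra & SKETCH_NSEC_MASK) >> SKETCH_EPOCH_BITS);
}
```
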
 fs/ext4/ext4.h    | 10 +++++-----
 fs/ext4/extents.c |  2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 92d9f1a..b60adc9 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -726,14 +726,14 @@ struct move_extent {
 	<= (EXT4_GOOD_OLD_INODE_SIZE +			\
 	    (einode)->i_extra_isize))			\
 
-static inline __le32 ext4_encode_extra_time(struct timespec *time)
+static inline __le32 ext4_encode_extra_time(struct inode_time *time)
 {
        return cpu_to_le32((sizeof(time->tv_sec) > 4 ?
 			   (time->tv_sec >> 32) & EXT4_EPOCH_MASK : 0) |
                           ((time->tv_nsec << EXT4_EPOCH_BITS) & EXT4_NSEC_MASK));
 }
 
-static inline void ext4_decode_extra_time(struct timespec *time, __le32 extra)
+static inline void ext4_decode_extra_time(struct inode_time *time, __le32 extra)
 {
        if (sizeof(time->tv_sec) > 4)
 	       time->tv_sec |= (__u64)(le32_to_cpu(extra) & EXT4_EPOCH_MASK)
@@ -879,9 +879,9 @@ struct ext4_inode_info {
 
 	/*
 	 * File creation time. Its function is same as that of
-	 * struct timespec i_{a,c,m}time in the generic inode.
+	 * struct inode_time i_{a,c,m}time in the generic inode.
 	 */
-	struct timespec i_crtime;
+	struct inode_time i_crtime;
 
 	/* mballoc */
 	struct list_head i_prealloc_list;
@@ -1354,7 +1354,7 @@ static inline struct ext4_inode_info *EXT4_I(struct inode *inode)
 	return container_of(inode, struct ext4_inode_info, vfs_inode);
 }
 
-static inline struct timespec ext4_current_time(struct inode *inode)
+static inline struct inode_time ext4_current_time(struct inode *inode)
 {
 	return (inode->i_sb->s_time_gran < NSEC_PER_SEC) ?
 		current_fs_time(inode->i_sb) : CURRENT_TIME_SEC;
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index ee14768..ed11d79 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4894,7 +4894,7 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
 	int ret = 0;
 	int flags;
 	ext4_lblk_t lblk;
-	struct timespec tv;
+	struct inode_time tv;
 	unsigned int blkbits = inode->i_blkbits;
 
 	/* Return error if mode is not supported */
-- 
1.8.3.2


* [RFC 15/32] cifs: convert to struct inode_time
  2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
                   ` (13 preceding siblings ...)
  2014-05-30 20:01 ` [RFC 14/32] ext4: " Arnd Bergmann
@ 2014-05-30 20:01 ` " Arnd Bergmann
  2014-05-30 20:01 ` [RFC 16/32] ntfs: " Arnd Bergmann
                   ` (19 subsequent siblings)
  34 siblings, 0 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-30 20:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, hpa,
	linux-fsdevel, Arnd Bergmann, Steve French, linux-cifs,
	samba-technical

cifs uses multiple time formats for inode timestamps, which will work for
at least another 92 years, but the VFS uses struct timespec for timestamps,
which is only good until 2038 on 32-bit CPUs.

This gets us one small step closer to lifting the VFS limit by using
struct inode_time in cifs. After 2106, users of the old protocol versions
will have to move to the latest version.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Steve French <sfrench@samba.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
---
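For reference, the NT time conversion that cifs_UnixTimeToNT() and
cifs_NTtimeToUnix() perform can be sketched in user space like this. The
offset constant is an assumption derived from the 134774 days between
1601-01-01 and 1970-01-01 (NTFS_TIME_OFFSET is not defined in this patch),
the struct is a hypothetical stand-in for 'struct inode_time', and the
sketch only handles times at or after 1970.

```c
#include <assert.h>
#include <stdint.h>

/* 100ns intervals between 1601-01-01 and 1970-01-01 (assumed value). */
#define SKETCH_NT_OFFSET ((uint64_t)(369 * 365 + 89) * 24 * 3600 * 10000000)

/* Hypothetical stand-in for 'struct inode_time'. */
struct sketch_inode_time {
	int64_t tv_sec;
	int32_t tv_nsec;
};

/* Convert to 100ns intervals, then add the 1601 offset. */
static uint64_t sketch_unix_to_nt(struct sketch_inode_time t)
{
	assert(t.tv_sec >= 0 && t.tv_nsec >= 0 && t.tv_nsec < 1000000000);
	return (uint64_t)t.tv_sec * 10000000 + (uint64_t)t.tv_nsec / 100 +
	       SKETCH_NT_OFFSET;
}

/* Subtract the 1601 offset, then split into seconds and nanoseconds. */
static struct sketch_inode_time sketch_nt_to_unix(uint64_t nt)
{
	struct sketch_inode_time ts;
	uint64_t t;

	assert(nt >= SKETCH_NT_OFFSET);
	t = nt - SKETCH_NT_OFFSET;
	ts.tv_sec = (int64_t)(t / 10000000);
	ts.tv_nsec = (int32_t)(t % 10000000) * 100;
	return ts;
}
```

With 64-bit seconds on both sides, this format is not the limiting factor;
the 2106 limit mentioned above comes from the older 32-bit protocol fields.
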
 fs/cifs/cache.c     |  6 +++---
 fs/cifs/cifsglob.h  |  6 +++---
 fs/cifs/cifsproto.h |  6 +++---
 fs/cifs/cifssmb.c   |  5 +++--
 fs/cifs/inode.c     |  2 +-
 fs/cifs/netmisc.c   | 15 ++++++++-------
 6 files changed, 21 insertions(+), 19 deletions(-)

diff --git a/fs/cifs/cache.c b/fs/cifs/cache.c
index 6c665bf..5343f38 100644
--- a/fs/cifs/cache.c
+++ b/fs/cifs/cache.c
@@ -221,9 +221,9 @@ const struct fscache_cookie_def cifs_fscache_super_index_def = {
  * Auxiliary data attached to CIFS inode within the cache
  */
 struct cifs_fscache_inode_auxdata {
-	struct timespec	last_write_time;
-	struct timespec	last_change_time;
-	u64		eof;
+	struct inode_time	last_write_time;
+	struct inode_time	last_change_time;
+	u64			eof;
 };
 
 static uint16_t cifs_fscache_inode_get_key(const void *cookie_netfs_data,
diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
index de6aed8..f944c44 100644
--- a/fs/cifs/cifsglob.h
+++ b/fs/cifs/cifsglob.h
@@ -1344,9 +1344,9 @@ struct cifs_fattr {
 	dev_t		cf_rdev;
 	unsigned int	cf_nlink;
 	unsigned int	cf_dtype;
-	struct timespec	cf_atime;
-	struct timespec	cf_mtime;
-	struct timespec	cf_ctime;
+	struct inode_time cf_atime;
+	struct inode_time cf_mtime;
+	struct inode_time cf_ctime;
 };
 
 static inline void free_dfs_info_param(struct dfs_info3_param *param)
diff --git a/fs/cifs/cifsproto.h b/fs/cifs/cifsproto.h
index ca7980a..ad476c6 100644
--- a/fs/cifs/cifsproto.h
+++ b/fs/cifs/cifsproto.h
@@ -122,9 +122,9 @@ extern enum securityEnum select_sectype(struct TCP_Server_Info *server,
 				enum securityEnum requested);
 extern int CIFS_SessSetup(const unsigned int xid, struct cifs_ses *ses,
 			  const struct nls_table *nls_cp);
-extern struct timespec cifs_NTtimeToUnix(__le64 utc_nanoseconds_since_1601);
-extern u64 cifs_UnixTimeToNT(struct timespec);
-extern struct timespec cnvrtDosUnixTm(__le16 le_date, __le16 le_time,
+extern struct inode_time cifs_NTtimeToUnix(__le64 utc_nanoseconds_since_1601);
+extern u64 cifs_UnixTimeToNT(struct inode_time);
+extern struct inode_time cnvrtDosUnixTm(__le16 le_date, __le16 le_time,
 				      int offset);
 extern void cifs_set_oplock_level(struct cifsInodeInfo *cinode, __u32 oplock);
 extern int cifs_get_writer(struct cifsInodeInfo *cinode);
diff --git a/fs/cifs/cifssmb.c b/fs/cifs/cifssmb.c
index c3dc52e..4452be7 100644
--- a/fs/cifs/cifssmb.c
+++ b/fs/cifs/cifssmb.c
@@ -482,7 +482,7 @@ decode_lanman_negprot_rsp(struct TCP_Server_Info *server, NEGOTIATE_RSP *pSMBr)
 		 * this requirement.
 		 */
 		int val, seconds, remain, result;
-		struct timespec ts, utc;
+		struct inode_time ts, utc;
 		utc = CURRENT_TIME;
 		ts = cnvrtDosUnixTm(rsp->SrvTime.Date,
 				    rsp->SrvTime.Time, 0);
@@ -3952,7 +3952,8 @@ QInfRetry:
 	if (rc) {
 		cifs_dbg(FYI, "Send error in QueryInfo = %d\n", rc);
 	} else if (data) {
-		struct timespec ts;
+		struct inode_time ts;
+		/* FIXME: 32-bit time? */
 		__u32 time = le32_to_cpu(pSMBr->last_write_time);
 
 		/* decode response */
diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c
index 9ff8df8..30ff02f 100644
--- a/fs/cifs/inode.c
+++ b/fs/cifs/inode.c
@@ -109,7 +109,7 @@ cifs_revalidate_cache(struct inode *inode, struct cifs_fattr *fattr)
 	}
 
 	 /* revalidate if mtime or size have changed */
-	if (timespec_equal(&inode->i_mtime, &fattr->cf_mtime) &&
+	if (inode_time_equal(&inode->i_mtime, &fattr->cf_mtime) &&
 	    cifs_i->server_eof == fattr->cf_eof) {
 		cifs_dbg(FYI, "%s: inode %llu is unchanged\n",
 			 __func__, cifs_i->uniqueid);
diff --git a/fs/cifs/netmisc.c b/fs/cifs/netmisc.c
index 6834b9c..40bcbcb 100644
--- a/fs/cifs/netmisc.c
+++ b/fs/cifs/netmisc.c
@@ -918,10 +918,10 @@ smbCalcSize(void *buf)
  * Convert the NT UTC (based 1601-01-01, in hundred nanosecond units)
  * into Unix UTC (based 1970-01-01, in seconds).
  */
-struct timespec
+struct inode_time
 cifs_NTtimeToUnix(__le64 ntutc)
 {
-	struct timespec ts;
+	struct inode_time ts;
 	/* BB what about the timezone? BB */
 
 	/* Subtract the NTFS time offset, then convert to 1s intervals. */
@@ -935,7 +935,7 @@ cifs_NTtimeToUnix(__le64 ntutc)
 
 /* Convert the Unix UTC into NT UTC. */
 u64
-cifs_UnixTimeToNT(struct timespec t)
+cifs_UnixTimeToNT(struct inode_time t)
 {
 	/* Convert to 100ns intervals and then add the NTFS time offset. */
 	return (u64) t.tv_sec * 10000000 + t.tv_nsec/100 + NTFS_TIME_OFFSET;
@@ -945,10 +945,11 @@ static const int total_days_of_prev_months[] = {
 	0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334
 };
 
-struct timespec cnvrtDosUnixTm(__le16 le_date, __le16 le_time, int offset)
+struct inode_time cnvrtDosUnixTm(__le16 le_date, __le16 le_time, int offset)
 {
-	struct timespec ts;
-	int sec, min, days, month, year;
+	struct inode_time ts;
+	long long sec;
+	int min, days, month, year;
 	u16 date = le16_to_cpu(le_date);
 	u16 time = le16_to_cpu(le_time);
 	SMB_TIME *st = (SMB_TIME *)&time;
@@ -959,7 +960,7 @@ struct timespec cnvrtDosUnixTm(__le16 le_date, __le16 le_time, int offset)
 	sec = 2 * st->TwoSeconds;
 	min = st->Minutes;
 	if ((sec > 59) || (min > 59))
-		cifs_dbg(VFS, "illegal time min %d sec %d\n", min, sec);
+		cifs_dbg(VFS, "illegal time min %d sec %lld\n", min, sec);
 	sec += (min * 60);
 	sec += 60 * 60 * st->Hours;
 	if (st->Hours > 24)
-- 
1.8.3.2


* [RFC 16/32] ntfs: convert to struct inode_time
  2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
                   ` (14 preceding siblings ...)
  2014-05-30 20:01 ` [RFC 15/32] cifs: " Arnd Bergmann
@ 2014-05-30 20:01 ` " Arnd Bergmann
  2014-05-30 20:01 ` [RFC 17/32] ubifs: " Arnd Bergmann
                   ` (18 subsequent siblings)
  34 siblings, 0 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-30 20:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, hpa,
	linux-fsdevel, Arnd Bergmann, Anton Altaparmakov, linux-ntfs-dev

ntfs uses 64-bit integers for inode timestamps, which will work for
thousands of years, but the VFS uses struct timespec for timestamps,
which is only good until 2038 on 32-bit CPUs.

This gets us one small step closer to lifting the VFS limit by using
struct inode_time in ntfs.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Anton Altaparmakov <anton@tuxera.com>
Cc: linux-ntfs-dev@lists.sourceforge.net
---
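A simplified sketch of what truncating to the superblock's time granularity
does, assuming inode_time_trunc() behaves like timespec_trunc() with only
the type changed. The struct and function names are hypothetical, and the
kernel helper's special case for very fine granularities is reduced here to
"gran <= 1 means no truncation".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for 'struct inode_time'. */
struct sketch_inode_time {
	int64_t tv_sec;
	int32_t tv_nsec;
};

/* Round the nanoseconds down to a multiple of 'gran' (in nanoseconds). */
static struct sketch_inode_time
sketch_inode_time_trunc(struct sketch_inode_time t, unsigned int gran)
{
	assert(gran <= 1000000000u);
	if (gran <= 1) {
		/* full nanosecond resolution, nothing to do */
	} else if (gran == 1000000000u) {
		t.tv_nsec = 0;
	} else {
		t.tv_nsec -= t.tv_nsec % (int32_t)gran;
	}
	return t;
}
```

For a file system storing 100ns units, a granularity of 100 drops the last
two decimal digits of the nanoseconds.
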
 fs/ntfs/inode.c | 12 ++++++------
 fs/ntfs/time.h  |  8 ++++----
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/fs/ntfs/inode.c b/fs/ntfs/inode.c
index f47af5e..8f7cba5 100644
--- a/fs/ntfs/inode.c
+++ b/fs/ntfs/inode.c
@@ -2811,11 +2811,11 @@ done:
 	 * for real.
 	 */
 	if (!IS_NOCMTIME(VFS_I(base_ni)) && !IS_RDONLY(VFS_I(base_ni))) {
-		struct timespec now = current_fs_time(VFS_I(base_ni)->i_sb);
+		struct inode_time now = current_fs_time(VFS_I(base_ni)->i_sb);
 		int sync_it = 0;
 
-		if (!timespec_equal(&VFS_I(base_ni)->i_mtime, &now) ||
-		    !timespec_equal(&VFS_I(base_ni)->i_ctime, &now))
+		if (!inode_time_equal(&VFS_I(base_ni)->i_mtime, &now) ||
+		    !inode_time_equal(&VFS_I(base_ni)->i_ctime, &now))
 			sync_it = 1;
 		VFS_I(base_ni)->i_mtime = now;
 		VFS_I(base_ni)->i_ctime = now;
@@ -2930,13 +2930,13 @@ int ntfs_setattr(struct dentry *dentry, struct iattr *attr)
 		}
 	}
 	if (ia_valid & ATTR_ATIME)
-		vi->i_atime = timespec_trunc(attr->ia_atime,
+		vi->i_atime = inode_time_trunc(attr->ia_atime,
 				vi->i_sb->s_time_gran);
 	if (ia_valid & ATTR_MTIME)
-		vi->i_mtime = timespec_trunc(attr->ia_mtime,
+		vi->i_mtime = inode_time_trunc(attr->ia_mtime,
 				vi->i_sb->s_time_gran);
 	if (ia_valid & ATTR_CTIME)
-		vi->i_ctime = timespec_trunc(attr->ia_ctime,
+		vi->i_ctime = inode_time_trunc(attr->ia_ctime,
 				vi->i_sb->s_time_gran);
 	mark_inode_dirty(vi);
 out:
diff --git a/fs/ntfs/time.h b/fs/ntfs/time.h
index 0123398..2c8d325 100644
--- a/fs/ntfs/time.h
+++ b/fs/ntfs/time.h
@@ -45,7 +45,7 @@
  * measured as the number of 100-nano-second intervals since 1st January 1601,
  * 00:00:00 UTC.
  */
-static inline sle64 utc2ntfs(const struct timespec ts)
+static inline sle64 utc2ntfs(const struct inode_time ts)
 {
 	/*
 	 * Convert the seconds to 100ns intervals, add the nano-seconds
@@ -63,7 +63,7 @@ static inline sle64 utc2ntfs(const struct timespec ts)
  */
 static inline sle64 get_current_ntfs_time(void)
 {
-	return utc2ntfs(current_kernel_time());
+	return utc2ntfs(CURRENT_TIME);
 }
 
 /**
@@ -82,9 +82,9 @@ static inline sle64 get_current_ntfs_time(void)
  * measured as the number of 100 nano-second intervals since 1st January 1601,
  * 00:00:00 UTC.
  */
-static inline struct timespec ntfs2utc(const sle64 time)
+static inline struct inode_time ntfs2utc(const sle64 time)
 {
-	struct timespec ts;
+	struct inode_time ts;
 
 	/* Subtract the NTFS time offset. */
 	u64 t = (u64)(sle64_to_cpu(time) - NTFS_TIME_OFFSET);
-- 
1.8.3.2


* [RFC 17/32] ubifs: convert to struct inode_time
  2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
                   ` (15 preceding siblings ...)
  2014-05-30 20:01 ` [RFC 16/32] ntfs: " Arnd Bergmann
@ 2014-05-30 20:01 ` " Arnd Bergmann
  2014-06-02  7:54   ` Artem Bityutskiy
  2014-05-30 20:01 ` [RFC 18/32] ocfs2: " Arnd Bergmann
                   ` (17 subsequent siblings)
  34 siblings, 1 reply; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-30 20:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, hpa,
	linux-fsdevel, Arnd Bergmann, Artem Bityutskiy, Adrian Hunter,
	linux-mtd

ubifs uses 64-bit integers for inode timestamps, which will work
practically forever, but the VFS uses struct timespec for timestamps,
which is only good until 2038 on 32-bit CPUs.

This gets us one small step closer to lifting the VFS limit by using
struct inode_time in ubifs.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Artem Bityutskiy <dedekind1@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: linux-mtd@lists.infradead.org
---

 fs/ubifs/dir.c  |  2 +-
 fs/ubifs/file.c | 16 ++++++++--------
 fs/ubifs/misc.h |  2 +-
 3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c
index ea41649..a551ecc 100644
--- a/fs/ubifs/dir.c
+++ b/fs/ubifs/dir.c
@@ -965,7 +965,7 @@ static int ubifs_rename(struct inode *old_dir, struct dentry *old_dentry,
 					.dirtied_ino = 3 };
 	struct ubifs_budget_req ino_req = { .dirtied_ino = 1,
 			.dirtied_ino_d = ALIGN(old_inode_ui->data_len, 8) };
-	struct timespec time;
+	struct inode_time time;
 	unsigned int uninitialized_var(saved_nlink);
 
 	/*
diff --git a/fs/ubifs/file.c b/fs/ubifs/file.c
index ebcf15f..55cd034 100644
--- a/fs/ubifs/file.c
+++ b/fs/ubifs/file.c
@@ -1073,13 +1073,13 @@ static void do_attr_changes(struct inode *inode, const struct iattr *attr)
 	if (attr->ia_valid & ATTR_GID)
 		inode->i_gid = attr->ia_gid;
 	if (attr->ia_valid & ATTR_ATIME)
-		inode->i_atime = timespec_trunc(attr->ia_atime,
+		inode->i_atime = inode_time_trunc(attr->ia_atime,
 						inode->i_sb->s_time_gran);
 	if (attr->ia_valid & ATTR_MTIME)
-		inode->i_mtime = timespec_trunc(attr->ia_mtime,
+		inode->i_mtime = inode_time_trunc(attr->ia_mtime,
 						inode->i_sb->s_time_gran);
 	if (attr->ia_valid & ATTR_CTIME)
-		inode->i_ctime = timespec_trunc(attr->ia_ctime,
+		inode->i_ctime = inode_time_trunc(attr->ia_ctime,
 						inode->i_sb->s_time_gran);
 	if (attr->ia_valid & ATTR_MODE) {
 		umode_t mode = attr->ia_mode;
@@ -1353,10 +1353,10 @@ out:
  * granularity, they are not updated. This is an optimization.
  */
 static inline int mctime_update_needed(const struct inode *inode,
-				       const struct timespec *now)
+				       const struct inode_time *now)
 {
-	if (!timespec_equal(&inode->i_mtime, now) ||
-	    !timespec_equal(&inode->i_ctime, now))
+	if (!inode_time_equal(&inode->i_mtime, now) ||
+	    !inode_time_equal(&inode->i_ctime, now))
 		return 1;
 	return 0;
 }
@@ -1371,7 +1371,7 @@ static inline int mctime_update_needed(const struct inode *inode,
  */
 static int update_mctime(struct inode *inode)
 {
-	struct timespec now = ubifs_current_time(inode);
+	struct inode_time now = ubifs_current_time(inode);
 	struct ubifs_inode *ui = ubifs_inode(inode);
 	struct ubifs_info *c = inode->i_sb->s_fs_info;
 
@@ -1443,7 +1443,7 @@ static int ubifs_vm_page_mkwrite(struct vm_area_struct *vma,
 	struct page *page = vmf->page;
 	struct inode *inode = file_inode(vma->vm_file);
 	struct ubifs_info *c = inode->i_sb->s_fs_info;
-	struct timespec now = ubifs_current_time(inode);
+	struct inode_time now = ubifs_current_time(inode);
 	struct ubifs_budget_req req = { .new_page = 1 };
 	int err, update_time;
 
diff --git a/fs/ubifs/misc.h b/fs/ubifs/misc.h
index ee7cb5e..ca0fcac 100644
--- a/fs/ubifs/misc.h
+++ b/fs/ubifs/misc.h
@@ -233,7 +233,7 @@ static inline void *ubifs_idx_key(const struct ubifs_info *c,
  * ubifs_current_time - round current time to time granularity.
  * @inode: inode
  */
-static inline struct timespec ubifs_current_time(struct inode *inode)
+static inline struct inode_time ubifs_current_time(struct inode *inode)
 {
 	return (inode->i_sb->s_time_gran < NSEC_PER_SEC) ?
 		current_fs_time(inode->i_sb) : CURRENT_TIME_SEC;
-- 
1.8.3.2


* [RFC 18/32] ocfs2: convert to struct inode_time
  2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
                   ` (16 preceding siblings ...)
  2014-05-30 20:01 ` [RFC 17/32] ubifs: " Arnd Bergmann
@ 2014-05-30 20:01 ` " Arnd Bergmann
  2014-05-30 20:01 ` [RFC 19/32] fs/fat: " Arnd Bergmann
                   ` (16 subsequent siblings)
  34 siblings, 0 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-30 20:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, hpa,
	linux-fsdevel, Arnd Bergmann, Mark Fasheh, Joel Becker,
	ocfs2-devel

ocfs2 uses unsigned 34-bit seconds for inode timestamps, which will work
for the next 500 years, but the VFS uses struct timespec for timestamps,
which is only good until 2038 on 32-bit CPUs.

This gets us one small step closer to lifting the VFS limit by using
struct inode_time in ocfs2.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: ocfs2-devel@oss.oracle.com
---
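The packing of a time stamp into the 64-bit LVB field can be sketched as
follows. The split of 34 bits of seconds and 30 bits of nanoseconds is an
assumption taken from the description above; the constant names and the
struct are hypothetical stand-ins for OCFS2_SEC_SHIFT and friends, whose
definitions are not part of this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SEC_BITS  34
#define SKETCH_SEC_SHIFT (64 - SKETCH_SEC_BITS)
#define SKETCH_NSEC_MASK ((1ULL << SKETCH_SEC_SHIFT) - 1)

/* Hypothetical stand-in for 'struct inode_time'. */
struct sketch_inode_time {
	int64_t tv_sec;
	int32_t tv_nsec;
};

/* Seconds go in the upper 34 bits, nanoseconds in the lower 30. */
static uint64_t sketch_pack_inode_time(const struct sketch_inode_time *spec)
{
	uint64_t sec, nsec;

	assert(spec != NULL);
	sec = (uint64_t)spec->tv_sec;
	nsec = (uint32_t)spec->tv_nsec;
	return (sec << SKETCH_SEC_SHIFT) | (nsec & SKETCH_NSEC_MASK);
}

static void sketch_unpack_inode_time(struct sketch_inode_time *spec,
				     uint64_t packed)
{
	assert(spec != NULL);
	spec->tv_sec = (int64_t)(packed >> SKETCH_SEC_SHIFT);
	spec->tv_nsec = (int32_t)(packed & SKETCH_NSEC_MASK);
}
```

Seconds values that need more than 34 bits are silently cut off by the
shift, which is where the roughly 500-year limit comes from.
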
 fs/ocfs2/dlmglue.c | 16 ++++++++--------
 fs/ocfs2/file.c    |  6 +++---
 fs/ocfs2/ocfs2.h   |  2 +-
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/fs/ocfs2/dlmglue.c b/fs/ocfs2/dlmglue.c
index 6bd690b..26913ae 100644
--- a/fs/ocfs2/dlmglue.c
+++ b/fs/ocfs2/dlmglue.c
@@ -2010,7 +2010,7 @@ static void ocfs2_downconvert_on_unlock(struct ocfs2_super *osb,
 
 /* LVB only has room for 64 bits of time here so we pack it for
  * now. */
-static u64 ocfs2_pack_timespec(struct timespec *spec)
+static u64 ocfs2_pack_inode_time(struct inode_time *spec)
 {
 	u64 res;
 	u64 sec = spec->tv_sec;
@@ -2050,11 +2050,11 @@ static void __ocfs2_stuff_meta_lvb(struct inode *inode)
 	lvb->lvb_imode     = cpu_to_be16(inode->i_mode);
 	lvb->lvb_inlink    = cpu_to_be16(inode->i_nlink);
 	lvb->lvb_iatime_packed  =
-		cpu_to_be64(ocfs2_pack_timespec(&inode->i_atime));
+		cpu_to_be64(ocfs2_pack_inode_time(&inode->i_atime));
 	lvb->lvb_ictime_packed =
-		cpu_to_be64(ocfs2_pack_timespec(&inode->i_ctime));
+		cpu_to_be64(ocfs2_pack_inode_time(&inode->i_ctime));
 	lvb->lvb_imtime_packed =
-		cpu_to_be64(ocfs2_pack_timespec(&inode->i_mtime));
+		cpu_to_be64(ocfs2_pack_inode_time(&inode->i_mtime));
 	lvb->lvb_iattr    = cpu_to_be32(oi->ip_attr);
 	lvb->lvb_idynfeatures = cpu_to_be16(oi->ip_dyn_features);
 	lvb->lvb_igeneration = cpu_to_be32(inode->i_generation);
@@ -2063,7 +2063,7 @@ out:
 	mlog_meta_lvb(0, lockres);
 }
 
-static void ocfs2_unpack_timespec(struct timespec *spec,
+static void ocfs2_unpack_inode_time(struct inode_time *spec,
 				  u64 packed_time)
 {
 	spec->tv_sec = packed_time >> OCFS2_SEC_SHIFT;
@@ -2099,11 +2099,11 @@ static void ocfs2_refresh_inode_from_lvb(struct inode *inode)
 	i_gid_write(inode, be32_to_cpu(lvb->lvb_igid));
 	inode->i_mode    = be16_to_cpu(lvb->lvb_imode);
 	set_nlink(inode, be16_to_cpu(lvb->lvb_inlink));
-	ocfs2_unpack_timespec(&inode->i_atime,
+	ocfs2_unpack_inode_time(&inode->i_atime,
 			      be64_to_cpu(lvb->lvb_iatime_packed));
-	ocfs2_unpack_timespec(&inode->i_mtime,
+	ocfs2_unpack_inode_time(&inode->i_mtime,
 			      be64_to_cpu(lvb->lvb_imtime_packed));
-	ocfs2_unpack_timespec(&inode->i_ctime,
+	ocfs2_unpack_inode_time(&inode->i_ctime,
 			      be64_to_cpu(lvb->lvb_ictime_packed));
 	spin_unlock(&oi->ip_lock);
 }
diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
index 2930e23..88deaa6 100644
--- a/fs/ocfs2/file.c
+++ b/fs/ocfs2/file.c
@@ -216,7 +216,7 @@ static int ocfs2_sync_file(struct file *file, loff_t start, loff_t end,
 int ocfs2_should_update_atime(struct inode *inode,
 			      struct vfsmount *vfsmnt)
 {
-	struct timespec now;
+	struct inode_time now;
 	struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
 
 	if (ocfs2_is_hard_readonly(osb) || ocfs2_is_soft_readonly(osb))
@@ -242,8 +242,8 @@ int ocfs2_should_update_atime(struct inode *inode,
 		return 0;
 
 	if (vfsmnt->mnt_flags & MNT_RELATIME) {
-		if ((timespec_compare(&inode->i_atime, &inode->i_mtime) <= 0) ||
-		    (timespec_compare(&inode->i_atime, &inode->i_ctime) <= 0))
+		if ((inode_time_compare(&inode->i_atime, &inode->i_mtime) <= 0) ||
+		    (inode_time_compare(&inode->i_atime, &inode->i_ctime) <= 0))
 			return 1;
 
 		return 0;
diff --git a/fs/ocfs2/ocfs2.h b/fs/ocfs2/ocfs2.h
index bbec539..11a06e7 100644
--- a/fs/ocfs2/ocfs2.h
+++ b/fs/ocfs2/ocfs2.h
@@ -213,7 +213,7 @@ struct ocfs2_orphan_scan {
 	struct ocfs2_super 	*os_osb;
 	struct ocfs2_lock_res 	os_lockres;     /* lock to synchronize scans */
 	struct delayed_work 	os_orphan_scan_work;
-	struct timespec		os_scantime;  /* time this node ran the scan */
+	struct inode_time	os_scantime;  /* time this node ran the scan */
 	u32			os_count;      /* tracks node specific scans */
 	u32  			os_seqno;       /* tracks cluster wide scans */
 	atomic_t		os_state;              /* ACTIVE or INACTIVE */
-- 
1.8.3.2



* [RFC 19/32] fs/fat: convert to struct inode_time
  2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
                   ` (17 preceding siblings ...)
  2014-05-30 20:01 ` [RFC 18/32] ocfs2: " Arnd Bergmann
@ 2014-05-30 20:01 ` " Arnd Bergmann
  2014-05-30 20:01 ` [RFC 20/32] afs: " Arnd Bergmann
                   ` (15 subsequent siblings)
  34 siblings, 0 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-30 20:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, hpa,
	linux-fsdevel, Arnd Bergmann, OGAWA Hirofumi

fat uses 7-bit year numbers for inode timestamps, which will work
for the next 93 years, but the VFS uses struct timespec for timestamps,
which is only good until 2038 on 32-bit CPUs.

This gets us one small step closer to lifting the VFS limit by using
struct inode_time in fat.
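
For reference, a small user-space sketch (not part of this patch) of the
7-bit year field, assuming the usual FAT date layout of year since 1980
in bits 15-9, month in bits 8-5 and day in bits 4-0. The sketch_* names
are invented for the example:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical broken-down date, calendar year included. */
struct sketch_fat_date {
	unsigned year;
	unsigned month;
	unsigned day;
};

/* Decode a 16-bit FAT date word (already in CPU byte order). */
struct sketch_fat_date sketch_fat_decode_date(uint16_t date)
{
	struct sketch_fat_date d;

	d.year = 1980 + (date >> 9);	/* 7 bits: 1980..2107 */
	d.month = (date >> 5) & 0x0f;	/* 4 bits */
	d.day = date & 0x1f;		/* 5 bits */
	return d;
}

/* Encode a calendar date; years outside 1980..2107 cannot be stored. */
uint16_t sketch_fat_encode_date(unsigned year, unsigned month, unsigned day)
{
	assert(year >= 1980 && year <= 1980 + 127);
	assert(month >= 1 && month <= 12 && day >= 1 && day <= 31);
	return (uint16_t)(((year - 1980) << 9) | (month << 5) | day);
}
```

The largest year the field can hold is 1980 + 127 = 2107, i.e. 93 years
after this series was posted.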

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
---
 fs/fat/dir.c         |  2 +-
 fs/fat/fat.h         |  6 +++---
 fs/fat/misc.c        |  4 ++--
 fs/fat/namei_msdos.c |  8 ++++----
 fs/fat/namei_vfat.c  | 10 +++++-----
 5 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/fs/fat/dir.c b/fs/fat/dir.c
index 3963ede..9a9a12d 100644
--- a/fs/fat/dir.c
+++ b/fs/fat/dir.c
@@ -1127,7 +1127,7 @@ error:
 	return err;
 }
 
-int fat_alloc_new_dir(struct inode *dir, struct timespec *ts)
+int fat_alloc_new_dir(struct inode *dir, struct inode_time *ts)
 {
 	struct super_block *sb = dir->i_sb;
 	struct msdos_sb_info *sbi = MSDOS_SB(sb);
diff --git a/fs/fat/fat.h b/fs/fat/fat.h
index 13b7202..70a32b4 100644
--- a/fs/fat/fat.h
+++ b/fs/fat/fat.h
@@ -307,7 +307,7 @@ extern int fat_scan_logstart(struct inode *dir, int i_logstart,
 			     struct fat_slot_info *sinfo);
 extern int fat_get_dotdot_entry(struct inode *dir, struct buffer_head **bh,
 				struct msdos_dir_entry **de);
-extern int fat_alloc_new_dir(struct inode *dir, struct timespec *ts);
+extern int fat_alloc_new_dir(struct inode *dir, struct inode_time *ts);
 extern int fat_add_entries(struct inode *dir, void *slots, int nr_slots,
 			   struct fat_slot_info *sinfo);
 extern int fat_remove_entries(struct inode *dir, struct fat_slot_info *sinfo);
@@ -407,9 +407,9 @@ void fat_msg(struct super_block *sb, const char *level, const char *fmt, ...);
 	 } while (0)
 extern int fat_clusters_flush(struct super_block *sb);
 extern int fat_chain_add(struct inode *inode, int new_dclus, int nr_cluster);
-extern void fat_time_fat2unix(struct msdos_sb_info *sbi, struct timespec *ts,
+extern void fat_time_fat2unix(struct msdos_sb_info *sbi, struct inode_time *ts,
 			      __le16 __time, __le16 __date, u8 time_cs);
-extern void fat_time_unix2fat(struct msdos_sb_info *sbi, struct timespec *ts,
+extern void fat_time_unix2fat(struct msdos_sb_info *sbi, struct inode_time *ts,
 			      __le16 *time, __le16 *date, u8 *time_cs);
 extern int fat_sync_bhs(struct buffer_head **bhs, int nr_bhs);
 
diff --git a/fs/fat/misc.c b/fs/fat/misc.c
index 628e22a..8710e24 100644
--- a/fs/fat/misc.c
+++ b/fs/fat/misc.c
@@ -192,7 +192,7 @@ static time_t days_in_year[] = {
 };
 
 /* Convert a FAT time/date pair to a UNIX date (seconds since 1 1 70). */
-void fat_time_fat2unix(struct msdos_sb_info *sbi, struct timespec *ts,
+void fat_time_fat2unix(struct msdos_sb_info *sbi, struct inode_time *ts,
 		       __le16 __time, __le16 __date, u8 time_cs)
 {
 	u16 time = le16_to_cpu(__time), date = le16_to_cpu(__date);
@@ -230,7 +230,7 @@ void fat_time_fat2unix(struct msdos_sb_info *sbi, struct timespec *ts,
 }
 
 /* Convert linear UNIX date to a FAT time/date pair. */
-void fat_time_unix2fat(struct msdos_sb_info *sbi, struct timespec *ts,
+void fat_time_unix2fat(struct msdos_sb_info *sbi, struct inode_time *ts,
 		       __le16 *time, __le16 *date, u8 *time_cs)
 {
 	struct tm tm;
diff --git a/fs/fat/namei_msdos.c b/fs/fat/namei_msdos.c
index a783b0e..699aa3e 100644
--- a/fs/fat/namei_msdos.c
+++ b/fs/fat/namei_msdos.c
@@ -226,7 +226,7 @@ static struct dentry *msdos_lookup(struct inode *dir, struct dentry *dentry,
 /***** Creates a directory entry (name is already formatted). */
 static int msdos_add_entry(struct inode *dir, const unsigned char *name,
 			   int is_dir, int is_hid, int cluster,
-			   struct timespec *ts, struct fat_slot_info *sinfo)
+			   struct inode_time *ts, struct fat_slot_info *sinfo)
 {
 	struct msdos_sb_info *sbi = MSDOS_SB(dir->i_sb);
 	struct msdos_dir_entry de;
@@ -267,7 +267,7 @@ static int msdos_create(struct inode *dir, struct dentry *dentry, umode_t mode,
 	struct super_block *sb = dir->i_sb;
 	struct inode *inode = NULL;
 	struct fat_slot_info sinfo;
-	struct timespec ts;
+	struct inode_time ts;
 	unsigned char msdos_name[MSDOS_NAME];
 	int err, is_hid;
 
@@ -349,7 +349,7 @@ static int msdos_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode)
 	struct fat_slot_info sinfo;
 	struct inode *inode;
 	unsigned char msdos_name[MSDOS_NAME];
-	struct timespec ts;
+	struct inode_time ts;
 	int err, is_hid, cluster;
 
 	mutex_lock(&MSDOS_SB(sb)->s_lock);
@@ -437,7 +437,7 @@ static int do_msdos_rename(struct inode *old_dir, unsigned char *old_name,
 	struct msdos_dir_entry *dotdot_de;
 	struct inode *old_inode, *new_inode;
 	struct fat_slot_info old_sinfo, sinfo;
-	struct timespec ts;
+	struct inode_time ts;
 	loff_t new_i_pos;
 	int err, old_attrs, is_dir, update_dotdot, corrupt = 0;
 
diff --git a/fs/fat/namei_vfat.c b/fs/fat/namei_vfat.c
index 6df8d3d..f31b0af 100644
--- a/fs/fat/namei_vfat.c
+++ b/fs/fat/namei_vfat.c
@@ -579,7 +579,7 @@ xlate_to_uni(const unsigned char *name, int len, unsigned char *outname,
 
 static int vfat_build_slots(struct inode *dir, const unsigned char *name,
 			    int len, int is_dir, int cluster,
-			    struct timespec *ts,
+			    struct inode_time *ts,
 			    struct msdos_dir_slot *slots, int *nr_slots)
 {
 	struct msdos_sb_info *sbi = MSDOS_SB(dir->i_sb);
@@ -655,7 +655,7 @@ out_free:
 }
 
 static int vfat_add_entry(struct inode *dir, struct qstr *qname, int is_dir,
-			  int cluster, struct timespec *ts,
+			  int cluster, struct inode_time *ts,
 			  struct fat_slot_info *sinfo)
 {
 	struct msdos_dir_slot *slots;
@@ -772,7 +772,7 @@ static int vfat_create(struct inode *dir, struct dentry *dentry, umode_t mode,
 	struct super_block *sb = dir->i_sb;
 	struct inode *inode;
 	struct fat_slot_info sinfo;
-	struct timespec ts;
+	struct inode_time ts;
 	int err;
 
 	mutex_lock(&MSDOS_SB(sb)->s_lock);
@@ -860,7 +860,7 @@ static int vfat_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode)
 	struct super_block *sb = dir->i_sb;
 	struct inode *inode;
 	struct fat_slot_info sinfo;
-	struct timespec ts;
+	struct inode_time ts;
 	int err, cluster;
 
 	mutex_lock(&MSDOS_SB(sb)->s_lock);
@@ -909,7 +909,7 @@ static int vfat_rename(struct inode *old_dir, struct dentry *old_dentry,
 	struct msdos_dir_entry *dotdot_de;
 	struct inode *old_inode, *new_inode;
 	struct fat_slot_info old_sinfo, sinfo;
-	struct timespec ts;
+	struct inode_time ts;
 	loff_t new_i_pos;
 	int err, is_dir, update_dotdot, corrupt = 0;
 	struct super_block *sb = old_dir->i_sb;
-- 
1.8.3.2



* [RFC 20/32] afs: convert to struct inode_time
  2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
                   ` (18 preceding siblings ...)
  2014-05-30 20:01 ` [RFC 19/32] fs/fat: " Arnd Bergmann
@ 2014-05-30 20:01 ` " Arnd Bergmann
  2014-05-30 20:01 ` [RFC 21/32] udf: " Arnd Bergmann
                   ` (14 subsequent siblings)
  34 siblings, 0 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-30 20:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, hpa,
	linux-fsdevel, Arnd Bergmann, David Howells, linux-afs

afs uses unsigned 32-bit seconds numbers for inode timestamps on the
wire, which will work for the next 92 years, but the code internally
uses a signed time_t, which is only good until 2038 on 32-bit CPUs.

This gets us one small step closer to lifting the VFS limit by using
struct inode_time in afs, and changes the afs code to use an unsigned
number internally.
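
The difference between the two interpretations can be sketched in
user-space C (not part of this patch; the sketch_* helpers are
hypothetical and only model the integer conversions, not the afs code):

```c
#include <assert.h>
#include <stdint.h>

/* Old behaviour on 32-bit CPUs: the wire value ends up in a signed 32-bit
 * time_t. Converting an out-of-range value to int32_t is
 * implementation-defined in C; this sketch assumes the usual wrap-around. */
int64_t sketch_afs_sec_signed(uint32_t wire)
{
	return (int64_t)(int32_t)wire;
}

/* New behaviour: keep the value unsigned, then widen to 64 bits. */
int64_t sketch_afs_sec_unsigned(uint32_t wire)
{
	return (int64_t)wire;
}

/* Going back to the wire only works for times between 1970 and 2106. */
uint32_t sketch_afs_sec_to_wire(int64_t sec)
{
	assert(sec >= 0 && sec <= UINT32_MAX);
	return (uint32_t)sec;
}
```

A wire value of 0x80000000 is read as a negative number (a date in 1901)
by the signed helper and as a date in 2038 by the unsigned one.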

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: David Howells <dhowells@redhat.com>
Cc: linux-afs@lists.infradead.org
---
 fs/afs/afs.h      | 6 +++---
 fs/afs/fsclient.c | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/afs/afs.h b/fs/afs/afs.h
index 3c462ff..53113a8 100644
--- a/fs/afs/afs.h
+++ b/fs/afs/afs.h
@@ -125,8 +125,8 @@ struct afs_file_status {
 	afs_access_t		anon_access;	/* access rights for unauthenticated caller */
 	umode_t			mode;		/* UNIX mode */
 	struct afs_fid		parent;		/* parent dir ID for non-dirs only */
-	time_t			mtime_client;	/* last time client changed data */
-	time_t			mtime_server;	/* last time server changed data */
+	u32			mtime_client;	/* last time client changed data */
+	u32			mtime_server;	/* last time server changed data */
 	s32			lock_count;	/* file lock count (0=UNLK -1=WRLCK +ve=#RDLCK */
 };
 
@@ -144,7 +144,7 @@ struct afs_file_status {
  * AFS volume synchronisation information
  */
 struct afs_volsync {
-	time_t			creation;	/* volume creation time */
+	u32			creation;	/* volume creation time */
 };
 
 /*
diff --git a/fs/afs/fsclient.c b/fs/afs/fsclient.c
index c2e930e..7c0f4a5 100644
--- a/fs/afs/fsclient.c
+++ b/fs/afs/fsclient.c
@@ -85,7 +85,7 @@ static void xdr_decode_AFSFetchStatus(const __be32 **_bp,
 	}
 	status->mode &= S_IALLUGO;
 
-	_debug("vnode time %lx, %lx",
+	_debug("vnode time %x, %x",
 	       status->mtime_client, status->mtime_server);
 
 	if (vnode) {
-- 
1.8.3.2



* [RFC 21/32] udf: convert to struct inode_time
  2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
                   ` (19 preceding siblings ...)
  2014-05-30 20:01 ` [RFC 20/32] afs: " Arnd Bergmann
@ 2014-05-30 20:01 ` " Arnd Bergmann
  2014-05-30 20:01 ` [RFC 22/32] fs: convert simple fs to inode_time Arnd Bergmann
                   ` (13 subsequent siblings)
  34 siblings, 0 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-30 20:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, hpa,
	linux-fsdevel, Arnd Bergmann, Jan Kara

udf uses 16-bit year numbers for inode timestamps, which will work
for the next 15000 years, but the VFS uses struct timespec for timestamps,
and the Linux udf implementation internally uses a time_t, both of
which are only good until 2038 on 32-bit CPUs.

This gets us one small step closer to lifting the VFS limit by using
struct inode_time in udf, but the implementation still depends on
the 'year_seconds' array that needs to be extended or replaced by
an algorithm that calculates the correct time for every year.
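
One possible shape for such an algorithm, as an untested user-space sketch
(not part of this patch, and not necessarily what udf should end up
using): count the Gregorian leap days since 1970 instead of looking them
up. sketch_year_start_seconds() is a made-up name:

```c
#include <assert.h>
#include <stdint.h>

/* Seconds from 1970-01-01 00:00:00 UTC to the start of 'year', for
 * year >= 1970, ignoring leap seconds. */
int64_t sketch_year_start_seconds(int year)
{
	int64_t prev = year - 1;
	int64_t leaps, days;

	assert(year >= 1970);
	/* Leap years in 1..prev, minus the 477 leap years in 1..1969. */
	leaps = (prev / 4 - prev / 100 + prev / 400) - 477;
	days = 365 * (int64_t)(year - 1970) + leaps;
	return days * 86400;
}
```

For 1973 this gives three years plus one leap day, matching the
SPY(3, 1, 0) entry in the existing table, and it keeps working past 2038
as long as the result is stored in a 64-bit type.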

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Jan Kara <jack@suse.cz>
---
 fs/udf/udf_i.h   | 2 +-
 fs/udf/udf_sb.h  | 2 +-
 fs/udf/udfdecl.h | 7 ++++---
 fs/udf/udftime.c | 7 ++++---
 4 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/fs/udf/udf_i.h b/fs/udf/udf_i.h
index b5cd8ed..ba12313 100644
--- a/fs/udf/udf_i.h
+++ b/fs/udf/udf_i.h
@@ -27,7 +27,7 @@ struct udf_ext_cache {
  */
 
 struct udf_inode_info {
-	struct timespec		i_crtime;
+	struct inode_time	i_crtime;
 	/* Physical address of inode */
 	struct kernel_lb_addr		i_location;
 	__u64			i_unique;
diff --git a/fs/udf/udf_sb.h b/fs/udf/udf_sb.h
index 1f32c7b..20ab0ca 100644
--- a/fs/udf/udf_sb.h
+++ b/fs/udf/udf_sb.h
@@ -135,7 +135,7 @@ struct udf_sb_info {
 	rwlock_t		s_cred_lock;
 
 	/* Root Info */
-	struct timespec		s_record_time;
+	struct inode_time	s_record_time;
 
 	/* Fileset Info */
 	__u16			s_serial_number;
diff --git a/fs/udf/udfdecl.h b/fs/udf/udfdecl.h
index be7dabb..88320da 100644
--- a/fs/udf/udfdecl.h
+++ b/fs/udf/udfdecl.h
@@ -237,8 +237,9 @@ extern struct long_ad *udf_get_filelongad(uint8_t *, int, uint32_t *, int);
 extern struct short_ad *udf_get_fileshortad(uint8_t *, int, uint32_t *, int);
 
 /* udftime.c */
-extern struct timespec *udf_disk_stamp_to_time(struct timespec *dest,
-						struct timestamp src);
-extern struct timestamp *udf_time_to_disk_stamp(struct timestamp *dest, struct timespec src);
+extern struct inode_time *udf_disk_stamp_to_time(struct inode_time *dest,
+						 struct timestamp src);
+extern struct timestamp *udf_time_to_disk_stamp(struct timestamp *dest,
+						struct inode_time src);
 
 #endif				/* __UDF_DECL_H */
diff --git a/fs/udf/udftime.c b/fs/udf/udftime.c
index 1f11483..1ab6fe7 100644
--- a/fs/udf/udftime.c
+++ b/fs/udf/udftime.c
@@ -60,6 +60,7 @@ static const unsigned short int __mon_yday[2][13] = {
 #define SPD			0x15180	/*3600*24 */
 #define SPY(y, l, s)		(SPD * (365 * y + l) + s)
 
+/* FIXME: convert this to a 64-bit type */
 static time_t year_seconds[MAX_YEAR_SECONDS] = {
 /*1970*/ SPY(0,   0, 0), SPY(1,   0, 0), SPY(2,   0, 0), SPY(3,   1, 0),
 /*1974*/ SPY(4,   1, 0), SPY(5,   1, 0), SPY(6,   1, 0), SPY(7,   2, 0),
@@ -86,8 +87,8 @@ extern struct timezone sys_tz;
 #define SECS_PER_HOUR	(60 * 60)
 #define SECS_PER_DAY	(SECS_PER_HOUR * 24)
 
-struct timespec *
-udf_disk_stamp_to_time(struct timespec *dest, struct timestamp src)
+struct inode_time *
+udf_disk_stamp_to_time(struct inode_time *dest, struct timestamp src)
 {
 	int yday;
 	u16 typeAndTimezone = le16_to_cpu(src.typeAndTimezone);
@@ -119,7 +120,7 @@ udf_disk_stamp_to_time(struct timespec *dest, struct timestamp src)
 }
 
 struct timestamp *
-udf_time_to_disk_stamp(struct timestamp *dest, struct timespec ts)
+udf_time_to_disk_stamp(struct timestamp *dest, struct inode_time ts)
 {
 	long int days, rem, y;
 	const unsigned short int *ip;
-- 
1.8.3.2



* [RFC 22/32] fs: convert simple fs to inode_time
  2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
                   ` (20 preceding siblings ...)
  2014-05-30 20:01 ` [RFC 21/32] udf: " Arnd Bergmann
@ 2014-05-30 20:01 ` Arnd Bergmann
  2014-05-30 23:06   ` Greg Kroah-Hartman
  2014-05-30 20:01 ` [RFC 23/32] logfs: convert to struct inode_time Arnd Bergmann
                   ` (12 subsequent siblings)
  34 siblings, 1 reply; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-30 20:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, hpa,
	linux-fsdevel, Arnd Bergmann, Greg Kroah-Hartman, Joel Becker

tty, usbgadgetfs, configfs and cramfs do not store inode timestamps
permanently, but they use code that interacts with the VFS inode
times. In order to switch the VFS over to struct inode_time, we
have to make trivial changes to these file systems as well.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Joel Becker <jlbec@evilplan.org>
---
 drivers/tty/tty_io.c      | 2 +-
 drivers/usb/gadget/f_fs.c | 2 +-
 fs/configfs/inode.c       | 6 +++---
 fs/cramfs/inode.c         | 2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index 3411071..c2c63e5 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -994,7 +994,7 @@ void start_tty(struct tty_struct *tty)
 EXPORT_SYMBOL(start_tty);
 
 /* We limit tty time update visibility to every 8 seconds or so. */
-static void tty_update_time(struct timespec *time)
+static void tty_update_time(struct inode_time *time)
 {
 	unsigned long sec = get_seconds() & ~7;
 	if ((long)(sec - time->tv_sec) > 0)
diff --git a/drivers/usb/gadget/f_fs.c b/drivers/usb/gadget/f_fs.c
index 74202d6..3947655 100644
--- a/drivers/usb/gadget/f_fs.c
+++ b/drivers/usb/gadget/f_fs.c
@@ -1069,7 +1069,7 @@ ffs_sb_make_inode(struct super_block *sb, void *data,
 	inode = new_inode(sb);
 
 	if (likely(inode)) {
-		struct timespec current_time = CURRENT_TIME;
+		struct inode_time current_time = CURRENT_TIME;
 
 		inode->i_ino	 = get_next_ino();
 		inode->i_mode    = perms->mode;
diff --git a/fs/configfs/inode.c b/fs/configfs/inode.c
index 5946ad9..62f10c0 100644
--- a/fs/configfs/inode.c
+++ b/fs/configfs/inode.c
@@ -95,13 +95,13 @@ int configfs_setattr(struct dentry * dentry, struct iattr * iattr)
 	if (ia_valid & ATTR_GID)
 		sd_iattr->ia_gid = iattr->ia_gid;
 	if (ia_valid & ATTR_ATIME)
-		sd_iattr->ia_atime = timespec_trunc(iattr->ia_atime,
+		sd_iattr->ia_atime = inode_time_trunc(iattr->ia_atime,
 						inode->i_sb->s_time_gran);
 	if (ia_valid & ATTR_MTIME)
-		sd_iattr->ia_mtime = timespec_trunc(iattr->ia_mtime,
+		sd_iattr->ia_mtime = inode_time_trunc(iattr->ia_mtime,
 						inode->i_sb->s_time_gran);
 	if (ia_valid & ATTR_CTIME)
-		sd_iattr->ia_ctime = timespec_trunc(iattr->ia_ctime,
+		sd_iattr->ia_ctime = inode_time_trunc(iattr->ia_ctime,
 						inode->i_sb->s_time_gran);
 	if (ia_valid & ATTR_MODE) {
 		umode_t mode = iattr->ia_mode;
diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
index ddcfe59..1b9ed3b 100644
--- a/fs/cramfs/inode.c
+++ b/fs/cramfs/inode.c
@@ -79,7 +79,7 @@ static struct inode *get_cramfs_inode(struct super_block *sb,
 	const struct cramfs_inode *cramfs_inode, unsigned int offset)
 {
 	struct inode *inode;
-	static struct timespec zerotime;
+	static struct inode_time zerotime;
 
 	inode = iget_locked(sb, cramino(cramfs_inode, offset));
 	if (!inode)
-- 
1.8.3.2



* [RFC 23/32] logfs: convert to struct inode_time
  2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
                   ` (21 preceding siblings ...)
  2014-05-30 20:01 ` [RFC 22/32] fs: convert simple fs to inode_time Arnd Bergmann
@ 2014-05-30 20:01 ` Arnd Bergmann
  2014-05-30 20:01 ` [RFC 24/32] hfs, hfsplus: " Arnd Bergmann
                   ` (11 subsequent siblings)
  34 siblings, 0 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-30 20:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, hpa,
	linux-fsdevel, Arnd Bergmann, Joern Engel, Prasad Joshi, logfs

logfs uses 64-bit integers for inode timestamps, which will work
for the next 550 years, but the VFS uses struct timespec for timestamps,
which is only good until 2038 on 32-bit CPUs.

This gets us one small step closer to lifting the VFS limit by using
struct inode_time in logfs.
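
A user-space sketch of the on-disk representation (not part of this
patch): one unsigned 64-bit count of nanoseconds since the epoch. The
sketch_* names and the struct are invented for the example and only
mirror the arithmetic used by the helpers in the diff:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NSEC_PER_SEC	UINT64_C(1000000000)

/* Hypothetical stand-in for struct inode_time with a 64-bit tv_sec. */
struct sketch_inode_time {
	int64_t tv_sec;
	int32_t tv_nsec;
};

/* Seconds and nanoseconds to a single nanosecond count (CPU byte order). */
uint64_t sketch_time_to_ns(struct sketch_inode_time t)
{
	assert(t.tv_sec >= 0);
	assert(t.tv_nsec >= 0 && t.tv_nsec < 1000000000);
	return (uint64_t)t.tv_sec * SKETCH_NSEC_PER_SEC + (uint64_t)t.tv_nsec;
}

/* Nanosecond count back to seconds and nanoseconds. */
struct sketch_inode_time sketch_ns_to_time(uint64_t ns)
{
	struct sketch_inode_time t;

	t.tv_sec = (int64_t)(ns / SKETCH_NSEC_PER_SEC);
	t.tv_nsec = (int32_t)(ns % SKETCH_NSEC_PER_SEC);
	return t;
}
```

The largest representable value is 18446744073 seconds after the epoch,
which is what bounds the lifetime of the format to a few hundred years.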

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Joern Engel <joern@logfs.org>
Cc: Prasad Joshi <prasadjoshi.linux@gmail.com>
Cc: logfs@logfs.org
---
 fs/logfs/readwrite.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/fs/logfs/readwrite.c b/fs/logfs/readwrite.c
index 4814031..df44698 100644
--- a/fs/logfs/readwrite.c
+++ b/fs/logfs/readwrite.c
@@ -101,12 +101,12 @@ void logfs_unpack_index(pgoff_t index, u64 *bix, level_t *level)
 /*
  * Time is stored as nanoseconds since the epoch.
  */
-static struct timespec be64_to_timespec(__be64 betime)
+static struct inode_time be64_to_inode_time(__be64 betime)
 {
-	return ns_to_timespec(be64_to_cpu(betime));
+	return ns_to_inode_time(be64_to_cpu(betime));
 }
 
-static __be64 timespec_to_be64(struct timespec tsp)
+static __be64 inode_time_to_be64(struct inode_time tsp)
 {
 	return cpu_to_be64((u64)tsp.tv_sec * NSEC_PER_SEC + tsp.tv_nsec);
 }
@@ -123,9 +123,9 @@ static void logfs_disk_to_inode(struct logfs_disk_inode *di, struct inode*inode)
 	i_gid_write(inode, be32_to_cpu(di->di_gid));
 	inode->i_size	= be64_to_cpu(di->di_size);
 	logfs_set_blocks(inode, be64_to_cpu(di->di_used_bytes));
-	inode->i_atime	= be64_to_timespec(di->di_atime);
-	inode->i_ctime	= be64_to_timespec(di->di_ctime);
-	inode->i_mtime	= be64_to_timespec(di->di_mtime);
+	inode->i_atime	= be64_to_inode_time(di->di_atime);
+	inode->i_ctime	= be64_to_inode_time(di->di_ctime);
+	inode->i_mtime	= be64_to_inode_time(di->di_mtime);
 	set_nlink(inode, be32_to_cpu(di->di_refcount));
 	inode->i_generation = be32_to_cpu(di->di_generation);
 
@@ -160,9 +160,9 @@ static void logfs_inode_to_disk(struct inode *inode, struct logfs_disk_inode*di)
 	di->di_gid	= cpu_to_be32(i_gid_read(inode));
 	di->di_size	= cpu_to_be64(i_size_read(inode));
 	di->di_used_bytes = cpu_to_be64(li->li_used_bytes);
-	di->di_atime	= timespec_to_be64(inode->i_atime);
-	di->di_ctime	= timespec_to_be64(inode->i_ctime);
-	di->di_mtime	= timespec_to_be64(inode->i_mtime);
+	di->di_atime	= inode_time_to_be64(inode->i_atime);
+	di->di_ctime	= inode_time_to_be64(inode->i_ctime);
+	di->di_mtime	= inode_time_to_be64(inode->i_mtime);
 	di->di_refcount	= cpu_to_be32(inode->i_nlink);
 	di->di_generation = cpu_to_be32(inode->i_generation);
 
-- 
1.8.3.2



* [RFC 24/32] hfs, hfsplus: convert to struct inode_time
  2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
                   ` (22 preceding siblings ...)
  2014-05-30 20:01 ` [RFC 23/32] logfs: convert to struct inode_time Arnd Bergmann
@ 2014-05-30 20:01 ` " Arnd Bergmann
  2014-05-31 14:23   ` Vyacheslav Dubeyko
  2014-05-30 20:01 ` [RFC 25/32] gfs2: " Arnd Bergmann
                   ` (10 subsequent siblings)
  34 siblings, 1 reply; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-30 20:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, hpa,
	linux-fsdevel, Arnd Bergmann

hfs uses 32-bit integers based at 1904 for inode timestamps, which will
only work until 2040, but the VFS uses struct timespec for timestamps,
which expires even earlier in 2038 on 32-bit CPUs.

This gets us one small step closer to lifting the VFS limit by using
struct inode_time in hfs and hfsplus.
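
To show where the 2040 limit comes from, here is a user-space sketch (not
part of this patch) of the epoch conversion. The offset is the same
2082844800 that appears in the __hfsp_ut2mt() macro; the sketch_* names
are made up, and time zone adjustments are ignored:

```c
#include <assert.h>
#include <stdint.h>

/* Seconds between 1904-01-01 and 1970-01-01. */
#define SKETCH_HFS_EPOCH_OFFSET	INT64_C(2082844800)

/* 32-bit Mac time (seconds since 1904) to 64-bit Unix seconds. */
int64_t sketch_mac_to_unix(uint32_t mac)
{
	return (int64_t)mac - SKETCH_HFS_EPOCH_OFFSET;
}

/* 64-bit Unix seconds to Mac time; only 1904 to early 2040 can be stored. */
uint32_t sketch_unix_to_mac(int64_t sec)
{
	int64_t mac = sec + SKETCH_HFS_EPOCH_OFFSET;

	assert(mac >= 0 && mac <= UINT32_MAX);
	return (uint32_t)mac;
}
```

The largest Mac time, 0xffffffff, maps to Unix second 2212122495, which
falls in early 2040, only slightly later than the signed 32-bit time_t
limit.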

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 fs/hfs/hfs_fs.h         | 2 +-
 fs/hfsplus/hfsplus_fs.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/hfs/hfs_fs.h b/fs/hfs/hfs_fs.h
index 0524cda..1a449b6 100644
--- a/fs/hfs/hfs_fs.h
+++ b/fs/hfs/hfs_fs.h
@@ -257,7 +257,7 @@ extern struct timezone sys_tz;
 #define HFS_I(inode)	(list_entry(inode, struct hfs_inode_info, vfs_inode))
 #define HFS_SB(sb)	((struct hfs_sb_info *)(sb)->s_fs_info)
 
-#define hfs_m_to_utime(time)	(struct timespec){ .tv_sec = __hfs_m_to_utime(time) }
+#define hfs_m_to_utime(time)	(struct inode_time){ .tv_sec = __hfs_m_to_utime(time) }
 #define hfs_u_to_mtime(time)	__hfs_u_to_mtime((time).tv_sec)
 #define hfs_mtime()		__hfs_u_to_mtime(get_seconds())
 
diff --git a/fs/hfsplus/hfsplus_fs.h b/fs/hfsplus/hfsplus_fs.h
index 8b35648..dd9e642 100644
--- a/fs/hfsplus/hfsplus_fs.h
+++ b/fs/hfsplus/hfsplus_fs.h
@@ -522,7 +522,7 @@ int hfsplus_submit_bio(struct super_block *sb, sector_t sector,
 #define __hfsp_ut2mt(t)		(cpu_to_be32(t + 2082844800U))
 
 /* compatibility */
-#define hfsp_mt2ut(t)		(struct timespec){ .tv_sec = __hfsp_mt2ut(t) }
+#define hfsp_mt2ut(t)		(struct inode_time){ .tv_sec = __hfsp_mt2ut(t) }
 #define hfsp_ut2mt(t)		__hfsp_ut2mt((t).tv_sec)
 #define hfsp_now2mt()		__hfsp_ut2mt(get_seconds())
 
-- 
1.8.3.2



* [RFC 25/32] gfs2: convert to struct inode_time
  2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
                   ` (23 preceding siblings ...)
  2014-05-30 20:01 ` [RFC 24/32] hfs, hfsplus: " Arnd Bergmann
@ 2014-05-30 20:01 ` " Arnd Bergmann
  2014-06-02  9:52   ` Steven Whitehouse
  2014-05-30 20:01 ` [RFC 26/32] reiserfs: " Arnd Bergmann
                   ` (9 subsequent siblings)
  34 siblings, 1 reply; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-30 20:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, hpa,
	linux-fsdevel, Arnd Bergmann, Steven Whitehouse, cluster-devel

gfs2 uses 64-bit integers for inode timestamps, which will work
basically forever, but the VFS uses struct timespec for timestamps,
which is only good until 2038 on 32-bit CPUs.

This gets us one small step closer to lifting the VFS limit by using
struct inode_time in gfs2.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: cluster-devel@redhat.com
---
 fs/gfs2/dir.c   | 6 +++---
 fs/gfs2/glops.c | 4 ++--
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/gfs2/dir.c b/fs/gfs2/dir.c
index 1a349f9..ec57538 100644
--- a/fs/gfs2/dir.c
+++ b/fs/gfs2/dir.c
@@ -835,7 +835,7 @@ static struct gfs2_leaf *new_leaf(struct inode *inode, struct buffer_head **pbh,
 	struct gfs2_leaf *leaf;
 	struct gfs2_dirent *dent;
 	struct qstr name = { .name = "" };
-	struct timespec tv = CURRENT_TIME;
+	struct inode_time tv = CURRENT_TIME;
 
 	error = gfs2_alloc_blocks(ip, &bn, &n, 0, NULL);
 	if (error)
@@ -1716,7 +1716,7 @@ int gfs2_dir_add(struct inode *inode, const struct qstr *name,
 	struct gfs2_inode *ip = GFS2_I(inode);
 	struct buffer_head *bh = da->bh;
 	struct gfs2_dirent *dent = da->dent;
-	struct timespec tv;
+	struct inode_time tv;
 	struct gfs2_leaf *leaf;
 	int error;
 
@@ -1794,7 +1794,7 @@ int gfs2_dir_del(struct gfs2_inode *dip, const struct dentry *dentry)
 	const struct qstr *name = &dentry->d_name;
 	struct gfs2_dirent *dent, *prev = NULL;
 	struct buffer_head *bh;
-	struct timespec tv = CURRENT_TIME;
+	struct inode_time tv = CURRENT_TIME;
 
 	/* Returns _either_ the entry (if its first in block) or the
 	   previous entry otherwise */
diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c
index fc11007..b55308f 100644
--- a/fs/gfs2/glops.c
+++ b/fs/gfs2/glops.c
@@ -318,7 +318,7 @@ static void gfs2_set_nlink(struct inode *inode, u32 nlink)
 static int gfs2_dinode_in(struct gfs2_inode *ip, const void *buf)
 {
 	const struct gfs2_dinode *str = buf;
-	struct timespec atime;
+	struct inode_time atime;
 	u16 height, depth;
 
 	if (unlikely(ip->i_no_addr != be64_to_cpu(str->di_num.no_addr)))
@@ -341,7 +341,7 @@ static int gfs2_dinode_in(struct gfs2_inode *ip, const void *buf)
 	gfs2_set_inode_blocks(&ip->i_inode, be64_to_cpu(str->di_blocks));
 	atime.tv_sec = be64_to_cpu(str->di_atime);
 	atime.tv_nsec = be32_to_cpu(str->di_atime_nsec);
-	if (timespec_compare(&ip->i_inode.i_atime, &atime) < 0)
+	if (inode_time_compare(&ip->i_inode.i_atime, &atime) < 0)
 		ip->i_inode.i_atime = atime;
 	ip->i_inode.i_mtime.tv_sec = be64_to_cpu(str->di_mtime);
 	ip->i_inode.i_mtime.tv_nsec = be32_to_cpu(str->di_mtime_nsec);
-- 
1.8.3.2



* [RFC 26/32] reiserfs: convert to struct inode_time
  2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
                   ` (24 preceding siblings ...)
  2014-05-30 20:01 ` [RFC 25/32] gfs2: " Arnd Bergmann
@ 2014-05-30 20:01 ` " Arnd Bergmann
  2014-05-30 20:01 ` [RFC 27/32] jffs2: " Arnd Bergmann
                   ` (8 subsequent siblings)
  34 siblings, 0 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-30 20:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, hpa,
	linux-fsdevel, Arnd Bergmann, reiserfs-devel

reiserfs uses unsigned 32-bit seconds for inode timestamps, which will work
for the next 92 years, but the VFS uses struct timespec for timestamps,
which is only good until 2038 on 32-bit CPUs.

This gets us one small step closer to lifting the VFS limit by using
struct inode_time in reiserfs.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: reiserfs-devel@vger.kernel.org
---
 fs/reiserfs/namei.c | 2 +-
 fs/reiserfs/xattr.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/reiserfs/namei.c b/fs/reiserfs/namei.c
index e825f8b..4b81c84 100644
--- a/fs/reiserfs/namei.c
+++ b/fs/reiserfs/namei.c
@@ -1215,7 +1215,7 @@ static int reiserfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 	int jbegin_count;
 	umode_t old_inode_mode;
 	unsigned long savelink = 1;
-	struct timespec ctime;
+	struct inode_time ctime;
 
 	/* three balancings: (1) old name removal, (2) new name insertion
 	   and (3) maybe "save" link insertion
diff --git a/fs/reiserfs/xattr.c b/fs/reiserfs/xattr.c
index 5cdfbd6..13367ca 100644
--- a/fs/reiserfs/xattr.c
+++ b/fs/reiserfs/xattr.c
@@ -426,9 +426,9 @@ int reiserfs_commit_write(struct file *f, struct page *page,
 
 static void update_ctime(struct inode *inode)
 {
-	struct timespec now = current_fs_time(inode->i_sb);
+	struct inode_time now = current_fs_time(inode->i_sb);
 	if (inode_unhashed(inode) || !inode->i_nlink ||
-	    timespec_equal(&inode->i_ctime, &now))
+	    inode_time_equal(&inode->i_ctime, &now))
 		return;
 
 	inode->i_ctime = CURRENT_TIME_SEC;
-- 
1.8.3.2


^ permalink raw reply	[flat|nested] 124+ messages in thread

* [RFC 27/32] jffs2: convert to struct inode_time
  2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
                   ` (25 preceding siblings ...)
  2014-05-30 20:01 ` [RFC 26/32] reiserfs: " Arnd Bergmann
@ 2014-05-30 20:01 ` " Arnd Bergmann
  2014-05-30 20:01 ` [RFC 28/32] adfs: " Arnd Bergmann
                   ` (7 subsequent siblings)
  34 siblings, 0 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-30 20:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, hpa,
	linux-fsdevel, Arnd Bergmann, David Woodhouse, linux-mtd

jffs2 uses unsigned 32-bit seconds for inode timestamps, which will work
for the next 92 years, but the VFS uses struct timespec for timestamps,
which is only good until 2038 on 32-bit CPUs.

This gets us one small step closer to lifting the VFS limit by using
struct inode_time in jffs2.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: linux-mtd@lists.infradead.org
---
 fs/jffs2/os-linux.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/jffs2/os-linux.h b/fs/jffs2/os-linux.h
index d200a9b..64c2dfc 100644
--- a/fs/jffs2/os-linux.h
+++ b/fs/jffs2/os-linux.h
@@ -31,7 +31,7 @@ struct kvec;
 #define JFFS2_F_I_GID(f) (i_gid_read(OFNI_EDONI_2SFFJ(f)))
 #define JFFS2_F_I_RDEV(f) (OFNI_EDONI_2SFFJ(f)->i_rdev)
 
-#define ITIME(sec) ((struct timespec){sec, 0})
+#define ITIME(sec) ((struct inode_time){sec, 0})
 #define I_SEC(tv) ((tv).tv_sec)
 #define JFFS2_F_I_CTIME(f) (OFNI_EDONI_2SFFJ(f)->i_ctime.tv_sec)
 #define JFFS2_F_I_MTIME(f) (OFNI_EDONI_2SFFJ(f)->i_mtime.tv_sec)
-- 
1.8.3.2


^ permalink raw reply	[flat|nested] 124+ messages in thread

* [RFC 28/32] adfs: convert to struct inode_time
  2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
                   ` (26 preceding siblings ...)
  2014-05-30 20:01 ` [RFC 27/32] jffs2: " Arnd Bergmann
@ 2014-05-30 20:01 ` " Arnd Bergmann
  2014-05-30 20:01 ` [RFC 29/32] f2fs: " Arnd Bergmann
                   ` (6 subsequent siblings)
  34 siblings, 0 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-30 20:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, hpa,
	linux-fsdevel, Arnd Bergmann

adfs uses unsigned 40-bit seconds for inode timestamps, which will work
for the next 234 years, but the VFS uses struct timespec for timestamps,
which is only good until 2038 on 32-bit CPUs.

This gets us one small step closer to lifting the VFS limit by using
struct inode_time in adfs.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
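The 234-year figure is consistent with a 40-bit count of centiseconds
starting in 1900, which is how RISC OS is generally documented to
encode file time stamps. That layout is an assumption in the sketch
below, not something this patch states, and riscos_cs_to_unix_secs()
is a made-up user-space helper rather than the adfs code.

```c
#include <stdint.h>

/* Seconds from 1900-01-01 to 1970-01-01: 25567 days of 86400 s. */
#define RISCOS_TO_UNIX_EPOCH_SECS 2208988800LL

/*
 * Hypothetical helper: convert a 40-bit count of centiseconds since
 * 1900 into seconds since the Unix epoch. Sub-second precision is
 * dropped to keep the sketch short.
 */
static int64_t riscos_cs_to_unix_secs(uint64_t cs)
{
	cs &= (1ULL << 40) - 1;		/* only 40 bits are stored */
	return (int64_t)(cs / 100) - RISCOS_TO_UNIX_EPOCH_SECS;
}
```

The largest 40-bit value maps to 8786127477 seconds after the Unix
epoch, which falls in the year 2248, i.e. 234 years after 2014.
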
 fs/adfs/inode.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/adfs/inode.c b/fs/adfs/inode.c
index b9acada..5698b9e 100644
--- a/fs/adfs/inode.c
+++ b/fs/adfs/inode.c
@@ -167,7 +167,7 @@ adfs_mode2atts(struct super_block *sb, struct inode *inode)
  * of time to convert from RISC OS epoch to Unix epoch.
  */
 static void
-adfs_adfs2unix_time(struct timespec *tv, struct inode *inode)
+adfs_adfs2unix_time(struct inode_time *tv, struct inode *inode)
 {
 	unsigned int high, low;
 	/* 01 Jan 1970 00:00:00 (Unix epoch) as nanoseconds since
@@ -195,7 +195,7 @@ adfs_adfs2unix_time(struct timespec *tv, struct inode *inode)
 	/* convert from RISC OS to Unix epoch */
 	nsec -= nsec_unix_epoch_diff_risc_os_epoch;
 
-	*tv = ns_to_timespec(nsec);
+	*tv = ns_to_inode_time(nsec);
 	return;
 
  cur_time:
-- 
1.8.3.2


^ permalink raw reply	[flat|nested] 124+ messages in thread

* [RFC 29/32] f2fs: convert to struct inode_time
  2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
                   ` (27 preceding siblings ...)
  2014-05-30 20:01 ` [RFC 28/32] adfs: " Arnd Bergmann
@ 2014-05-30 20:01 ` " Arnd Bergmann
  2014-05-30 20:01 ` [RFC 30/32] fuse: " Arnd Bergmann
                   ` (5 subsequent siblings)
  34 siblings, 0 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-30 20:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, hpa,
	linux-fsdevel, Arnd Bergmann, Jaegeuk Kim, linux-f2fs-devel

f2fs uses 64-bit seconds for inode timestamps, which will work
basically forever, but the VFS uses struct timespec for timestamps,
which is only good until 2038 on 32-bit CPUs.

This gets us one small step closer to lifting the VFS limit by using
struct inode_time in f2fs.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Cc: linux-f2fs-devel@lists.sourceforge.net
---
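As an illustration of what the truncation in __setattr_copy() does,
here is a user-space model. It assumes inode_time_trunc() behaves the
same way as timespec_trunc(); the real helper is introduced earlier in
the series and may differ. The struct is a stand-in that copies the
first inode_time variant.

```c
#define NSEC_PER_SEC 1000000000L

/* Stand-in for the kernel type (first variant from this series). */
struct inode_time {
	unsigned long	tv_sec;
	long		tv_nsec;
};

/*
 * Hypothetical model of inode_time_trunc(): round the time stamp
 * down to a multiple of 'gran' nanoseconds.
 */
static struct inode_time inode_time_trunc(struct inode_time t, long gran)
{
	if (gran == NSEC_PER_SEC)
		t.tv_nsec = 0;			/* whole seconds only */
	else if (gran > 1 && gran < NSEC_PER_SEC)
		t.tv_nsec -= t.tv_nsec % gran;	/* e.g. microseconds */
	/* gran <= 1: nanosecond granularity, nothing to do */
	return t;
}
```

With a granularity of 1000, a tv_nsec of 123456789 becomes 123456000;
tv_sec is never touched, so the wider seconds field passes through
unchanged.
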
 fs/f2fs/file.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 37d0e1f..6ff6e5b 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -474,13 +474,13 @@ static void __setattr_copy(struct inode *inode, const struct iattr *attr)
 	if (ia_valid & ATTR_GID)
 		inode->i_gid = attr->ia_gid;
 	if (ia_valid & ATTR_ATIME)
-		inode->i_atime = timespec_trunc(attr->ia_atime,
+		inode->i_atime = inode_time_trunc(attr->ia_atime,
 						inode->i_sb->s_time_gran);
 	if (ia_valid & ATTR_MTIME)
-		inode->i_mtime = timespec_trunc(attr->ia_mtime,
+		inode->i_mtime = inode_time_trunc(attr->ia_mtime,
 						inode->i_sb->s_time_gran);
 	if (ia_valid & ATTR_CTIME)
-		inode->i_ctime = timespec_trunc(attr->ia_ctime,
+		inode->i_ctime = inode_time_trunc(attr->ia_ctime,
 						inode->i_sb->s_time_gran);
 	if (ia_valid & ATTR_MODE) {
 		umode_t mode = attr->ia_mode;
-- 
1.8.3.2


^ permalink raw reply	[flat|nested] 124+ messages in thread

* [RFC 30/32] fuse: convert to struct inode_time
  2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
                   ` (28 preceding siblings ...)
  2014-05-30 20:01 ` [RFC 29/32] f2fs: " Arnd Bergmann
@ 2014-05-30 20:01 ` " Arnd Bergmann
  2014-05-30 20:01 ` [RFC 31/32] scsi: fnic: use current_kernel_time() for timestamp Arnd Bergmann
                   ` (4 subsequent siblings)
  34 siblings, 0 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-30 20:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, hpa,
	linux-fsdevel, Arnd Bergmann, Miklos Szeredi, fuse-devel

fuse uses 64-bit seconds for inode timestamps in the user interface,
which will work basically forever, but the VFS uses struct timespec
for timestamps, which is only good until 2038 on 32-bit CPUs.

This gets us one small step closer to lifting the VFS limit by using
struct inode_time in fuse.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: fuse-devel@lists.sourceforge.net
---
 fs/fuse/inode.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index 754dcf2..58f138e 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -203,7 +203,7 @@ void fuse_change_attributes(struct inode *inode, struct fuse_attr *attr,
 	struct fuse_inode *fi = get_fuse_inode(inode);
 	bool is_wb = fc->writeback_cache;
 	loff_t oldsize;
-	struct timespec old_mtime;
+	struct inode_time old_mtime;
 
 	spin_lock(&fc->lock);
 	if ((attr_version != 0 && fi->attr_version > attr_version) ||
@@ -232,7 +232,7 @@ void fuse_change_attributes(struct inode *inode, struct fuse_attr *attr,
 			truncate_pagecache(inode, attr->size);
 			inval = true;
 		} else if (fc->auto_inval_data) {
-			struct timespec new_mtime = {
+			struct inode_time new_mtime = {
 				.tv_sec = attr->mtime,
 				.tv_nsec = attr->mtimensec,
 			};
@@ -241,7 +241,7 @@ void fuse_change_attributes(struct inode *inode, struct fuse_attr *attr,
 			 * Auto inval mode also checks and invalidates if mtime
 			 * has changed.
 			 */
-			if (!timespec_equal(&old_mtime, &new_mtime))
+			if (!inode_time_equal(&old_mtime, &new_mtime))
 				inval = true;
 		}
 
-- 
1.8.3.2


^ permalink raw reply	[flat|nested] 124+ messages in thread

* [RFC 31/32] scsi: fnic: use current_kernel_time() for timestamp
  2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
                   ` (29 preceding siblings ...)
  2014-05-30 20:01 ` [RFC 30/32] fuse: " Arnd Bergmann
@ 2014-05-30 20:01 ` Arnd Bergmann
  2014-05-30 20:01 ` [RFC 32/32] fs: use new inode_time definition unconditionally Arnd Bergmann
                   ` (3 subsequent siblings)
  34 siblings, 0 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-30 20:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, hpa,
	linux-fsdevel, Arnd Bergmann, Hiral Patel, Suma Ramars,
	Brian Uchino, linux-scsi

The fnic driver currently uses the CURRENT_TIME macro to
generate a timestamp. Since this is otherwise used only in
file system code and we want to change the type, it's
better for this driver to use the equivalent function that
continues to return a struct timespec.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Hiral Patel <hiralpat@cisco.com>
Cc: Suma Ramars <sramars@cisco.com>
Cc: Brian Uchino <buchino@cisco.com>
Cc: linux-scsi@vger.kernel.org
---
 drivers/scsi/fnic/fnic_trace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/fnic/fnic_trace.c b/drivers/scsi/fnic/fnic_trace.c
index c772859..2659538 100644
--- a/drivers/scsi/fnic/fnic_trace.c
+++ b/drivers/scsi/fnic/fnic_trace.c
@@ -612,7 +612,7 @@ int fnic_fc_trace_set_data(u32 host_no, u8 frame_type,
 			fc_trace_entries.rd_idx = 0;
 	}
 
-	fc_buf->time_stamp = CURRENT_TIME;
+	fc_buf->time_stamp = current_kernel_time();
 	fc_buf->host_no = host_no;
 	fc_buf->frame_type = frame_type;
 
-- 
1.8.3.2


^ permalink raw reply	[flat|nested] 124+ messages in thread

* [RFC 32/32] fs: use new inode_time definition unconditionally
  2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
                   ` (30 preceding siblings ...)
  2014-05-30 20:01 ` [RFC 31/32] scsi: fnic: use current_kernel_time() for timestamp Arnd Bergmann
@ 2014-05-30 20:01 ` Arnd Bergmann
  2014-05-31 14:30 ` [RFC 00/32] making inode time stamps y2038 ready Vyacheslav Dubeyko
                   ` (2 subsequent siblings)
  34 siblings, 0 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-30 20:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, hpa,
	linux-fsdevel, Arnd Bergmann

After all file systems have been converted to use 'struct inode_time'
for timestamps, we can remove the compatibility definition for this
structure.

This patch picks the first of the three variants I defined, but we
could pick one of the others as well.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 include/linux/time.h | 31 ++-----------------------------
 1 file changed, 2 insertions(+), 29 deletions(-)

diff --git a/include/linux/time.h b/include/linux/time.h
index f431263..00d2f14 100644
--- a/include/linux/time.h
+++ b/include/linux/time.h
@@ -6,44 +6,17 @@
 # include <linux/math64.h>
 #include <uapi/linux/time.h>
 
-#ifdef CONFIG_NEW_INODE_TIME
 /*
  * This is the type we use internally in the kernel to represent
- * absolute times in file system metadata.
+ * absolute times in file system metadata. Using unsigned seconds
+ * lets us extend the life span for another 69 years beyond 2038.
  * This structure must not leak out to user space, and new interfaces
  * should be using 64-bit types right away.
  */
-
-/*
- * Variant a) using unsigned seconds lets us extend the life span
- * for another 69 years beyond 2038.
- */
 struct inode_time {
 	unsigned long	tv_sec;
 	long		tv_nsec;
 };
-#elif 0
-/*
- * This variant can represent the widest range of times, but also
- * bloats 'struct inode' a little more.
- */
-struct inode_time {
-	long long	tv_sec __attribute__((packed));
-	int		tv_nsec;
-};
-#elif 0
-/*
- * The variant using bit fields is less efficient to access, but
- * small and has a wider range as the 32-bit one, plus it keeps
- * the signedness of the original timespec.
- */
-struct inode_time {
-	long long	tv_sec	: 34;
-	int		tv_nsec : 30;
-};
-#else
-#define inode_time timespec
-#endif
 
 extern struct timezone sys_tz;
 
-- 
1.8.3.2


^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [RFC 02/32] uapi: add struct __kernel_timespec{32,64}
  2014-05-30 20:01 ` [RFC 02/32] uapi: add struct __kernel_timespec{32,64} Arnd Bergmann
@ 2014-05-30 20:18   ` H. Peter Anvin
  2014-05-31 15:09     ` Arnd Bergmann
  0 siblings, 1 reply; 124+ messages in thread
From: H. Peter Anvin @ 2014-05-30 20:18 UTC (permalink / raw)
  To: Arnd Bergmann, linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, linux-fsdevel

On 05/30/2014 01:01 PM, Arnd Bergmann wrote:
> We cannot use time_t or any derived structures beyond the year
> 2038 in interfaces between kernel and user space, on 32-bit
> machines.
> 
> This is my suggestion for how to migrate syscall and ioctl
> interfaces: We completely phase out time_t, timeval and timespec
> from the uapi header files and replace them with types that are
> either explicitly safe (__kernel_timespec64), or explicitly
> unsafe (e.g. __kernel_timespec32). For each unsafe interface,
> there needs to be a safe replacement interface.
> 

This gets really messy for structures where this is ABI-dependent.  I'm
not sure this is a net win.

> +/*
> + * __kernel_timespec64 is the general type to be used for
> + * new user space interfaces passing a time argument.
> + * 64-bit nanoseconds is a bit silly, but the advantage is
> + * that it is compatible with the native 'struct timespec'
> + * on 64-bit user space. This simplifies the compat code.
> + */
> +struct __kernel_timespec64 {
> +	long long tv_sec;
> +	long long tv_nsec;
> +};

So it seems that it is not just POSIX that is drain bramaged with this,
but the "long" type for tv_nsec idiocy has made it into the C11
standard.  This unfortunately means that now there are two standards
bodies involved, at least one of which moves very slowly.

This makes me wonder if we don't need to deal with the problem in the
case of 32-bit ABIs with 64-bit time_t.  The logical thing seems to be
to EITHER:

a. ALWAYS ignore the upper 32 bits of tv_nsec when read from user space,
   but always set them to zero, or
b. Only ignore the upper 32 bits of tv_nsec when we are known to come
   from a 32-bit ABI context, but still always return zero.  These bits
   are already only used for validity checking.

   This most likely introduces a whole lot of new tests in deep paths,
   although we probably can centralize this in a single function, which
   otherwise ends up looking a lot like compat_get_timespec().
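A rough sketch of what such a centralized function could look like for
option b. The names ktimespec64 and sanitize_timespec64() are invented
for illustration, and the struct is only a user-space stand-in with the
same layout as the proposed __kernel_timespec64. Passing true
unconditionally would give option a.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Stand-in with the same layout as the proposed struct. */
struct ktimespec64 {
	long long tv_sec;
	long long tv_nsec;
};

/*
 * Hypothetical helper: drop the upper 32 bits of tv_nsec for callers
 * using a 32-bit ABI, then apply the usual range check.
 */
static int sanitize_timespec64(struct ktimespec64 *ts, bool from_32bit_abi)
{
	if (from_32bit_abi) {
		uint64_t raw = (uint64_t)ts->tv_nsec;

		ts->tv_nsec = (long long)(raw & 0xffffffffULL);
	}

	if (ts->tv_nsec < 0 || ts->tv_nsec >= NSEC_PER_SEC)
		return -EINVAL;

	return 0;
}
```

A tv_nsec with stray upper bits is rejected for a native 64-bit caller
but accepted, with those bits cleared, for a 32-bit one.
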

Getting rid of struct timespec on the kernel/user boundary is probably
not really feasible.

	-hpa



^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [RFC 09/32] fs/pstore: convert to struct inode_time
  2014-05-30 20:01 ` [RFC 09/32] fs/pstore: convert to struct inode_time Arnd Bergmann
@ 2014-05-30 21:14   ` Kees Cook
  0 siblings, 0 replies; 124+ messages in thread
From: Kees Cook @ 2014-05-30 21:14 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: LKML, linux-arch, Joseph S. Myers, John Stultz,
	Christoph Hellwig, Thomas Gleixner, Geert Uytterhoeven, lftan,
	H. Peter Anvin, linux-fsdevel, Anton Vorontsov, Colin Cross,
	Tony Luck

On Fri, May 30, 2014 at 1:01 PM, Arnd Bergmann <arnd@arndb.de> wrote:
> pstore uses timestamps encoded in a string as seconds, but on 32-bit systems
> cannot go beyond year 2038 because of the limits of struct timespec.
>
> This converts the pstore code to use the new struct inode_time for timestamps.
>
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> Cc: Anton Vorontsov <anton@enomsg.org>
> Cc: Colin Cross <ccross@android.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Tony Luck <tony.luck@intel.com>

I don't see why you don't want to support making Linux work as a time
machine to visit the 70s! ;)

Acked-by: Kees Cook <keescook@chromium.org>

-Kees

> ---
>  drivers/firmware/efi/efi-pstore.c | 28 ++++++++++++++--------------
>  fs/pstore/inode.c                 |  2 +-
>  fs/pstore/internal.h              |  2 +-
>  fs/pstore/platform.c              |  2 +-
>  fs/pstore/ram.c                   | 18 ++++++++++--------
>  include/linux/pstore.h            |  4 ++--
>  6 files changed, 29 insertions(+), 27 deletions(-)
>
> diff --git a/drivers/firmware/efi/efi-pstore.c b/drivers/firmware/efi/efi-pstore.c
> index 4b9dc83..a1e4153 100644
> --- a/drivers/firmware/efi/efi-pstore.c
> +++ b/drivers/firmware/efi/efi-pstore.c
> @@ -32,7 +32,7 @@ struct pstore_read_data {
>         u64 *id;
>         enum pstore_type_id *type;
>         int *count;
> -       struct timespec *timespec;
> +       struct inode_time *inode_time;
>         bool *compressed;
>         char **buf;
>  };
> @@ -63,8 +63,8 @@ static int efi_pstore_read_func(struct efivar_entry *entry, void *data)
>                    cb_data->type, &part, &cnt, &time, &data_type) == 5) {
>                 *cb_data->id = generic_id(time, part, cnt);
>                 *cb_data->count = cnt;
> -               cb_data->timespec->tv_sec = time;
> -               cb_data->timespec->tv_nsec = 0;
> +               cb_data->inode_time->tv_sec = time;
> +               cb_data->inode_time->tv_nsec = 0;
>                 if (data_type == 'C')
>                         *cb_data->compressed = true;
>                 else
> @@ -73,8 +73,8 @@ static int efi_pstore_read_func(struct efivar_entry *entry, void *data)
>                    cb_data->type, &part, &cnt, &time) == 4) {
>                 *cb_data->id = generic_id(time, part, cnt);
>                 *cb_data->count = cnt;
> -               cb_data->timespec->tv_sec = time;
> -               cb_data->timespec->tv_nsec = 0;
> +               cb_data->inode_time->tv_sec = time;
> +               cb_data->inode_time->tv_nsec = 0;
>                 *cb_data->compressed = false;
>         } else if (sscanf(name, "dump-type%u-%u-%lu",
>                           cb_data->type, &part, &time) == 3) {
> @@ -85,8 +85,8 @@ static int efi_pstore_read_func(struct efivar_entry *entry, void *data)
>                  */
>                 *cb_data->id = generic_id(time, part, 0);
>                 *cb_data->count = 0;
> -               cb_data->timespec->tv_sec = time;
> -               cb_data->timespec->tv_nsec = 0;
> +               cb_data->inode_time->tv_sec = time;
> +               cb_data->inode_time->tv_nsec = 0;
>                 *cb_data->compressed = false;
>         } else
>                 return 0;
> @@ -208,7 +208,7 @@ static int efi_pstore_sysfs_entry_iter(void *data, struct efivar_entry **pos)
>   *           and pstore will stop reading entry.
>   */
>  static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type,
> -                              int *count, struct timespec *timespec,
> +                              int *count, struct inode_time *inode_time,
>                                char **buf, bool *compressed,
>                                struct pstore_info *psi)
>  {
> @@ -218,7 +218,7 @@ static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type,
>         data.id = id;
>         data.type = type;
>         data.count = count;
> -       data.timespec = timespec;
> +       data.inode_time = inode_time;
>         data.compressed = compressed;
>         data.buf = buf;
>
> @@ -266,7 +266,7 @@ struct pstore_erase_data {
>         u64 id;
>         enum pstore_type_id type;
>         int count;
> -       struct timespec time;
> +       struct inode_time time;
>         efi_char16_t *name;
>  };
>
> @@ -292,8 +292,8 @@ static int efi_pstore_erase_func(struct efivar_entry *entry, void *data)
>                  * Check if an old format, which doesn't support
>                  * holding multiple logs, remains.
>                  */
> -               sprintf(name_old, "dump-type%u-%u-%lu", ed->type,
> -                       (unsigned int)ed->id, ed->time.tv_sec);
> +               sprintf(name_old, "dump-type%u-%u-%llu", ed->type,
> +                       (unsigned int)ed->id, (u64)ed->time.tv_sec);
>
>                 for (i = 0; i < DUMP_NAME_LEN; i++)
>                         efi_name_old[i] = name_old[i];
> @@ -319,7 +319,7 @@ static int efi_pstore_erase_func(struct efivar_entry *entry, void *data)
>  }
>
>  static int efi_pstore_erase(enum pstore_type_id type, u64 id, int count,
> -                           struct timespec time, struct pstore_info *psi)
> +                           struct inode_time time, struct pstore_info *psi)
>  {
>         struct pstore_erase_data edata;
>         struct efivar_entry *entry = NULL;
> @@ -330,7 +330,7 @@ static int efi_pstore_erase(enum pstore_type_id type, u64 id, int count,
>
>         do_div(id, 1000);
>         part = do_div(id, 100);
> -       sprintf(name, "dump-type%u-%u-%d-%lu", type, part, count, time.tv_sec);
> +       sprintf(name, "dump-type%u-%u-%d-%llu", type, part, count, (u64)time.tv_sec);
>
>         for (i = 0; i < DUMP_NAME_LEN; i++)
>                 efi_name[i] = name[i];
> diff --git a/fs/pstore/inode.c b/fs/pstore/inode.c
> index 192297b..6f3925f 100644
> --- a/fs/pstore/inode.c
> +++ b/fs/pstore/inode.c
> @@ -277,7 +277,7 @@ int pstore_is_mounted(void)
>   */
>  int pstore_mkfile(enum pstore_type_id type, char *psname, u64 id, int count,
>                   char *data, bool compressed, size_t size,
> -                 struct timespec time, struct pstore_info *psi)
> +                 struct inode_time time, struct pstore_info *psi)
>  {
>         struct dentry           *root = pstore_sb->s_root;
>         struct dentry           *dentry;
> diff --git a/fs/pstore/internal.h b/fs/pstore/internal.h
> index 3b3d305..eb9c4eb 100644
> --- a/fs/pstore/internal.h
> +++ b/fs/pstore/internal.h
> @@ -51,7 +51,7 @@ extern void   pstore_set_kmsg_bytes(int);
>  extern void    pstore_get_records(int);
>  extern int     pstore_mkfile(enum pstore_type_id, char *psname, u64 id,
>                               int count, char *data, bool compressed,
> -                             size_t size, struct timespec time,
> +                             size_t size, struct inode_time time,
>                               struct pstore_info *psi);
>  extern int     pstore_is_mounted(void);
>
> diff --git a/fs/pstore/platform.c b/fs/pstore/platform.c
> index 0a9b72c..06f2628 100644
> --- a/fs/pstore/platform.c
> +++ b/fs/pstore/platform.c
> @@ -475,7 +475,7 @@ void pstore_get_records(int quiet)
>         u64                     id;
>         int                     count;
>         enum pstore_type_id     type;
> -       struct timespec         time;
> +       struct inode_time       time;
>         int                     failed = 0, rc;
>         bool                    compressed;
>         int                     unzipped_len = -1;
> diff --git a/fs/pstore/ram.c b/fs/pstore/ram.c
> index 3b57443..50d7298 100644
> --- a/fs/pstore/ram.c
> +++ b/fs/pstore/ram.c
> @@ -135,29 +135,31 @@ ramoops_get_next_prz(struct persistent_ram_zone *przs[], uint *c, uint max,
>         return prz;
>  }
>
> -static void ramoops_read_kmsg_hdr(char *buffer, struct timespec *time,
> +static void ramoops_read_kmsg_hdr(char *buffer, struct inode_time *time,
>                                   bool *compressed)
>  {
>         char data_type;
> +       u64 seconds;
>
> -       if (sscanf(buffer, RAMOOPS_KERNMSG_HDR "%lu.%lu-%c\n",
> -                       &time->tv_sec, &time->tv_nsec, &data_type) == 3) {
> +       if (sscanf(buffer, RAMOOPS_KERNMSG_HDR "%llu.%lu-%c\n",
> +                       &seconds, &time->tv_nsec, &data_type) == 3) {
>                 if (data_type == 'C')
>                         *compressed = true;
>                 else
>                         *compressed = false;
> -       } else if (sscanf(buffer, RAMOOPS_KERNMSG_HDR "%lu.%lu\n",
> -                       &time->tv_sec, &time->tv_nsec) == 2) {
> +       } else if (sscanf(buffer, RAMOOPS_KERNMSG_HDR "%llu.%lu\n",
> +                       &seconds, &time->tv_nsec) == 2) {
>                         *compressed = false;
>         } else {
> -               time->tv_sec = 0;
> +               seconds = 0;
>                 time->tv_nsec = 0;
>                 *compressed = false;
>         }
> +       time->tv_sec = seconds;
>  }
>
>  static ssize_t ramoops_pstore_read(u64 *id, enum pstore_type_id *type,
> -                                  int *count, struct timespec *time,
> +                                  int *count, struct inode_time *time,
>                                    char **buf, bool *compressed,
>                                    struct pstore_info *psi)
>  {
> @@ -278,7 +280,7 @@ static int notrace ramoops_pstore_write_buf(enum pstore_type_id type,
>  }
>
>  static int ramoops_pstore_erase(enum pstore_type_id type, u64 id, int count,
> -                               struct timespec time, struct pstore_info *psi)
> +                               struct inode_time time, struct pstore_info *psi)
>  {
>         struct ramoops_context *cxt = psi->data;
>         struct persistent_ram_zone *prz;
> diff --git a/include/linux/pstore.h b/include/linux/pstore.h
> index ece0c6b..f293905 100644
> --- a/include/linux/pstore.h
> +++ b/include/linux/pstore.h
> @@ -55,7 +55,7 @@ struct pstore_info {
>         int             (*open)(struct pstore_info *psi);
>         int             (*close)(struct pstore_info *psi);
>         ssize_t         (*read)(u64 *id, enum pstore_type_id *type,
> -                       int *count, struct timespec *time, char **buf,
> +                       int *count, struct inode_time *time, char **buf,
>                         bool *compressed, struct pstore_info *psi);
>         int             (*write)(enum pstore_type_id type,
>                         enum kmsg_dump_reason reason, u64 *id,
> @@ -66,7 +66,7 @@ struct pstore_info {
>                         unsigned int part, const char *buf, bool compressed,
>                         size_t size, struct pstore_info *psi);
>         int             (*erase)(enum pstore_type_id type, u64 id,
> -                       int count, struct timespec time,
> +                       int count, struct inode_time time,
>                         struct pstore_info *psi);
>         void            *data;
>  };
> --
> 1.8.3.2
>



-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [RFC 22/32] fs: convert simple fs to inode_time
  2014-05-30 20:01 ` [RFC 22/32] fs: convert simple fs to inode_time Arnd Bergmann
@ 2014-05-30 23:06   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 124+ messages in thread
From: Greg Kroah-Hartman @ 2014-05-30 23:06 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-kernel, linux-arch, joseph, john.stultz, hch, tglx, geert,
	lftan, hpa, linux-fsdevel, Joel Becker

On Fri, May 30, 2014 at 10:01:46PM +0200, Arnd Bergmann wrote:
> tty, usbgadgetfs, configfs and cramfs do not store inode timestamps
> permanently, but they use code that interacts with the VFS inode
> times. In order to change over VFS to a struct inode_time, we
> have to make trivial changes to these file systems as well.
> 
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Joel Becker <jlbec@evilplan.org>

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-05-30 20:01 ` [RFC 11/32] xfs: " Arnd Bergmann
@ 2014-05-31  0:37   ` Dave Chinner
  2014-05-31  0:41     ` H. Peter Anvin
  0 siblings, 1 reply; 124+ messages in thread
From: Dave Chinner @ 2014-05-31  0:37 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-kernel, linux-arch, joseph, john.stultz, hch, tglx, geert,
	lftan, hpa, linux-fsdevel, xfs

On Fri, May 30, 2014 at 10:01:35PM +0200, Arnd Bergmann wrote:
> xfs uses unsigned 32-bit seconds for inode timestamps, which will work
> for the next 92 years, but the VFS uses struct timespec for timestamps,
> which is only good until 2038 on 32-bit CPUs.
> 
> This gets us one small step closer to lifting the VFS limit by using
> struct inode_time in XFS.
> 
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> Cc: Dave Chinner <david@fromorbit.com>
> Cc: xfs@oss.sgi.com
> ---
>  fs/xfs/time.h            | 4 ++--
>  fs/xfs/xfs_inode.c       | 2 +-
>  fs/xfs/xfs_iops.c        | 2 +-
>  fs/xfs/xfs_trans_inode.c | 6 +++---
>  4 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/xfs/time.h b/fs/xfs/time.h
> index 387e695..a490f1b 100644
> --- a/fs/xfs/time.h
> +++ b/fs/xfs/time.h
> @@ -21,14 +21,14 @@
>  #include <linux/sched.h>
>  #include <linux/time.h>
>  
> -typedef struct timespec timespec_t;
> +typedef struct inode_time timespec_t;
>  
>  static inline void delay(long ticks)
>  {
>  	schedule_timeout_uninterruptible(ticks);
>  }
>  
> -static inline void nanotime(struct timespec *tvp)
> +static inline void nanotime(struct inode_time *tvp)
>  {
>  	*tvp = CURRENT_TIME;
>  }
> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index a6115fe..16d5392 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
> @@ -654,7 +654,7 @@ xfs_ialloc(
>  	xfs_inode_t	*ip;
>  	uint		flags;
>  	int		error;
> -	timespec_t	tv;
> +	struct inode_time tv;
>  
>  	/*
>  	 * Call the space management code to pick
> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> index 205613a..092ee7c 100644
> --- a/fs/xfs/xfs_iops.c
> +++ b/fs/xfs/xfs_iops.c
> @@ -956,7 +956,7 @@ xfs_vn_setattr(
>  STATIC int
>  xfs_vn_update_time(
>  	struct inode		*inode,
> -	struct timespec		*now,
> +	struct inode_time	*now,
>  	int			flags)
>  {
>  	struct xfs_inode	*ip = XFS_I(inode);
> diff --git a/fs/xfs/xfs_trans_inode.c b/fs/xfs/xfs_trans_inode.c
> index 50c3f56..bae2520 100644
> --- a/fs/xfs/xfs_trans_inode.c
> +++ b/fs/xfs/xfs_trans_inode.c
> @@ -70,7 +70,7 @@ xfs_trans_ichgtime(
>  	int			flags)
>  {
>  	struct inode		*inode = VFS_I(ip);
> -	timespec_t		tv;
> +	struct inode_time	tv;
>  
>  	ASSERT(tp);
>  	ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL));
> @@ -78,13 +78,13 @@ xfs_trans_ichgtime(
>  	tv = current_fs_time(inode->i_sb);
>  
>  	if ((flags & XFS_ICHGTIME_MOD) &&
> -	    !timespec_equal(&inode->i_mtime, &tv)) {
> +	    !inode_time_equal(&inode->i_mtime, &tv)) {
>  		inode->i_mtime = tv;
>  		ip->i_d.di_mtime.t_sec = tv.tv_sec;
>  		ip->i_d.di_mtime.t_nsec = tv.tv_nsec;
>  	}

The problem I see here is that the code is now potentially stuffing
a variable that is larger than 32 bits into an on-disk structure
that is only 32 bits in size.  You can't just change the in-memory
representation of inode timestamps and expect the problem to be
fixed - this just pushes the problem down a layer without any
infrastructure allowing filesystems to handle storage of the new
timestamp format sanely.
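To make the failure mode concrete, here is a minimal user-space sketch
of the truncation. store_sec32() is a made-up helper that stands in for
an assignment into a 32-bit on-disk seconds field; it is not XFS code.

```c
#include <stdint.h>

/* Hypothetical on-disk field: 32 bits of seconds. */
static uint32_t store_sec32(uint64_t vfs_sec)
{
	return (uint32_t)vfs_sec;	/* upper bits are silently dropped */
}
```

A time stamp of 2^32 + 5 seconds would read back as 5, i.e. a few
seconds into 1970, with no error reported anywhere.
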

IOWs, the filesystem has to be able to reject any attempt to set a
timestamp that it can't represent on disk otherwise Bad Stuff will
happen, and filesystems have to be able to specify in their on
disk format what timestamp encoding is being used. The solution will
be different for every filesystem that needs to support time beyond
2038.

Hence I think you are going to need superblock flags and/or
variables to indicate the epoch range the filesystem can support.
Then the filesystems need conversion functions from whatever the
internal VFS timestamp representation is to whatever their on-disk
format is, and only then can we switch the VFS to using a new
timestamp format.

At that point, filesystem developers can make the changes they need
to the on-disk format to support timestamps beyond 2038, and all
they need to do at the VFS layer is set the "supported range" fields
appropriately in the VFS superblock...
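One possible shape for the "supported range" idea, as a user-space
sketch. The struct, its field names and fs_time_check() are all
hypothetical; nothing like this exists in the patch series, and the
choice of -ERANGE as the error is likewise just an assumption.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical per-superblock limits; the field names are made up. */
struct sb_time_range {
	int64_t	min_sec;	/* earliest time the on-disk format holds */
	int64_t	max_sec;	/* latest time the on-disk format holds */
};

/* Reject a time stamp the filesystem cannot store. */
static int fs_time_check(const struct sb_time_range *r, int64_t sec)
{
	if (sec < r->min_sec || sec > r->max_sec)
		return -ERANGE;
	return 0;
}
```

A filesystem with unsigned 32-bit on-disk seconds would set the range
to 0..UINT32_MAX, and the check would run before any conversion to the
on-disk format.
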

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-05-31  0:37   ` Dave Chinner
@ 2014-05-31  0:41     ` H. Peter Anvin
  2014-05-31  1:14       ` Dave Chinner
  0 siblings, 1 reply; 124+ messages in thread
From: H. Peter Anvin @ 2014-05-31  0:41 UTC (permalink / raw)
  To: Dave Chinner, Arnd Bergmann
  Cc: linux-kernel, linux-arch, joseph, john.stultz, hch, tglx, geert,
	lftan, linux-fsdevel, xfs

On 05/30/2014 05:37 PM, Dave Chinner wrote:
> 
> IOWs, the filesystem has to be able to reject any attempt to set a
> timestamp that is can't represent on disk otherwise Bad Stuff will
> happen,

Actually it is questionable if it is worse to reject a timestamp or just
let it wrap.  Rejecting a valid timestamp is a bit like "You don't
exist, go away."

> and filesystems have to be able to specify in their on
> disk format what timestamp encoding is being used. The solution will
> be different for every filesystem that needs to support time beyond
> 2038.

Actually the cutoff can be really different for each filesystem, not
necessarily 2038.  However, I maintain the above still holds.

Consider a filesystem that kept timestamps in YYMMDDHHMMSS format.  What
would you have expected such a filesystem to do on Jan 1, 2000?

	-hpa




* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-05-31  0:41     ` H. Peter Anvin
@ 2014-05-31  1:14       ` Dave Chinner
  2014-05-31  1:22         ` H. Peter Anvin
  2014-05-31 15:37         ` Arnd Bergmann
  0 siblings, 2 replies; 124+ messages in thread
From: Dave Chinner @ 2014-05-31  1:14 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Arnd Bergmann, linux-kernel, linux-arch, joseph, john.stultz,
	hch, tglx, geert, lftan, linux-fsdevel, xfs

On Fri, May 30, 2014 at 05:41:14PM -0700, H. Peter Anvin wrote:
> On 05/30/2014 05:37 PM, Dave Chinner wrote:
> > 
> > IOWs, the filesystem has to be able to reject any attempt to set a
> > timestamp that is can't represent on disk otherwise Bad Stuff will
> > happen,
> 
> Actually it is questionable if it is worse to reject a timestamp or just
> let it wrap.  Rejecting a valid timestamp is a bit like "You don't
> exist, go away."

I think having the new system calls be able to
return EINVAL if the value cannot be stored permanently on disk
correctly is the right thing to do. Having it silently mangled
by the filesystem and returning "everything is just fine, trust me"
is close to the worst solution I can think of. That's exactly what
leads to overflow bugs occurring....

> > and filesystems have to be able to specify in their on
> > disk format what timestamp encoding is being used. The solution will
> > be different for every filesystem that needs to support time beyond
> > 2038.
> 
> Actually the cutoff can be really different for each filesystem, not
> necessarily 2038.  However, I maintain the above still holds.

Sure, but all filesystems are supposed to handle at least the
current unix epoch.

> Consider a filesystem that kept timestamps in YYMMDDHHMMSS format.  What
> would you have expected such a filesystem to do on Jan 1, 2000?

Strawman.

We don't need to cater for fundamentally broken designs that can't
even handle the current unix epoch correctly. If such filesystems
exist, then they can simply say "original unix epoch support only"
and do whatever crap they are doing right now.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-05-31  1:14       ` Dave Chinner
@ 2014-05-31  1:22         ` H. Peter Anvin
  2014-05-31  5:54           ` Dave Chinner
  2014-05-31 15:37         ` Arnd Bergmann
  1 sibling, 1 reply; 124+ messages in thread
From: H. Peter Anvin @ 2014-05-31  1:22 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Arnd Bergmann, linux-kernel, linux-arch, joseph, john.stultz,
	hch, tglx, geert, lftan, linux-fsdevel, xfs

No, not a strawman.  Replace with Jan 19, 2038 and you have the same situation.

On May 30, 2014 6:14:50 PM PDT, Dave Chinner <david@fromorbit.com> wrote:
>On Fri, May 30, 2014 at 05:41:14PM -0700, H. Peter Anvin wrote:
>> On 05/30/2014 05:37 PM, Dave Chinner wrote:
>> > 
>> > IOWs, the filesystem has to be able to reject any attempt to set a
>> > timestamp that is can't represent on disk otherwise Bad Stuff will
>> > happen,
>> 
>> Actually it is questionable if it is worse to reject a timestamp or
>just
>> let it wrap.  Rejecting a valid timestamp is a bit like "You don't
>> exist, go away."
>
>I think having the new systems calls being able to
>return EINVAL if the value cannot be stored permanently on disk
>correctly is the right thing to do. Having it silently mangled
>by the filesystem and returning "everything is just fine, trust me"
>is close to the worst solution I can think of. That's exactly what
>leads to overflow bugs occurring....
>
>> > and filesystems have to be able to specify in their on
>> > disk format what timestamp encoding is being used. The solution
>will
>> > be different for every filesystem that needs to support time beyond
>> > 2038.
>> 
>> Actually the cutoff can be really different for each filesystem, not
>> necessarily 2038.  However, I maintain the above still holds.
>
>Sure, but all filesystems are supposed to handle at least the
>current unix epoch.
>
>> Consider a filesystem that kept timestamps in YYMMDDHHMMSS format. 
>What
>> would you have expected such a filesystem to do on Jan 1, 2000?
>
>Strawman.
>
>We don't need to cater for fundamentally broken designs that can't
>even handle the current unix epoch correctly. If such filesystems
>exist, then they can simple say "original unix epoch support only"
>and do whatever crap they are doing right now.
>
>Cheers,
>
>Dave.

-- 
Sent from my mobile phone.  Please pardon brevity and lack of formatting.


* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-05-31  1:22         ` H. Peter Anvin
@ 2014-05-31  5:54           ` Dave Chinner
  2014-05-31  8:41             ` H. Peter Anvin
  2014-06-02 14:00             ` Joseph S. Myers
  0 siblings, 2 replies; 124+ messages in thread
From: Dave Chinner @ 2014-05-31  5:54 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Arnd Bergmann, linux-kernel, linux-arch, joseph, john.stultz,
	hch, tglx, geert, lftan, linux-fsdevel, xfs


[ Please don't top post. ]

On Fri, May 30, 2014 at 06:22:55PM -0700, H. Peter Anvin wrote:
> On May 30, 2014 6:14:50 PM PDT, Dave Chinner <david@fromorbit.com> wrote:
> >On Fri, May 30, 2014 at 05:41:14PM -0700, H. Peter Anvin wrote:
> >> On 05/30/2014 05:37 PM, Dave Chinner wrote:
> >> > 
> >> > IOWs, the filesystem has to be able to reject any attempt to
> >> > set a timestamp that is can't represent on disk otherwise Bad
> >> > Stuff will happen,
> >> 
> >> Actually it is questionable if it is worse to reject a
> >> timestamp or
> >just
> >> let it wrap.  Rejecting a valid timestamp is a bit like "You
> >> don't exist, go away."
> >
> >I think having the new systems calls being able to return EINVAL
> >if the value cannot be stored permanently on disk correctly is
> >the right thing to do. Having it silently mangled by the
> >filesystem and returning "everything is just fine, trust me" is
> >close to the worst solution I can think of. That's exactly what
> >leads to overflow bugs occurring....
> >
> >> > and filesystems have to be able to specify in their on disk
> >> > format what timestamp encoding is being used. The solution
> >will
> >> > be different for every filesystem that needs to support time
> >> > beyond 2038.
> >> 
> >> Actually the cutoff can be really different for each
> >> filesystem, not necessarily 2038.  However, I maintain the
> >> above still holds.
> >
> >Sure, but all filesystems are supposed to handle at least the
> >current unix epoch.
> >
> >> Consider a filesystem that kept timestamps in YYMMDDHHMMSS
> >> format. 
> >What
> >> would you have expected such a filesystem to do on Jan 1, 2000?
> >
> >Strawman.
> >
> >We don't need to cater for fundamentally broken designs that
> >can't even handle the current unix epoch correctly. If such
> >filesystems exist, then they can simple say "original unix epoch
> >support only" and do whatever crap they are doing right now.
>
> No, not a strawman.  Replace with Jan 19, 2038 and you have the
> same situation.

But that's not the problem I'm talking about.  The problem isn't the
roll-over date of the epoch - the problem is that we're changing the
in-memory meaning of time without changing what the filesystems
store on disk or how they translate them.

To use your example, what I'm actually talking about is the kernel
switching to CCYYMMDDHHMMSS while the filesystem has YYMMDDHHMMSS on
disk. The filesystem doesn't know the timestamp is now a different
format, so it could mangle it writing it to disk, or it could mangle
existing timestamps in the YY.. format reading them from disk and
putting them into CC.. format structures. IOWs, it will
incorrectly translate YY format dates to CC format, or translate
something in the CC format as though it was in YY format. And it
wouldn't even know what was the correct format because there's
nothing telling it on disk whether the date is in CC or YY format.

Either way, you get mangled timestamps, the filesystem doesn't know
about it because it's just storing what the kernel gives it, the
kernel thinks they are fine because they are just opaque when read
back, but the user says "what the fuck did a reboot do to all these
timestamps?".

Hence your example of roll-over dates is a strawman - you've
constructed a problem that is irrelevant to the issue being pointed
out.

FWIW, we already have code in the superblock and VFS to avoid such
problems on filesystems with limited timestamp resolution (i.e.
s_time_gran and current_fs_time()) so that what the VFS hands the
filesystem is exactly what the VFS expects to get back from disk
when comparing timestamps.
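
As a rough illustration of that granularity handling, the following
userspace sketch rounds a timestamp down to a multiple of the
filesystem's granularity. struct ts64 and trunc_to_gran are hypothetical
names chosen for this example; this is not the kernel's actual code,
only the general idea:

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000L

/* Hypothetical stand-in for the in-memory VFS timestamp. */
struct ts64 {
	int64_t	tv_sec;
	long	tv_nsec;	/* 0 .. NSEC_PER_SEC - 1 */
};

/*
 * Round t down to a multiple of gran nanoseconds, where gran is the
 * resolution the filesystem can store (1 <= gran <= NSEC_PER_SEC).
 * gran == NSEC_PER_SEC means "whole seconds only".
 */
static struct ts64 trunc_to_gran(struct ts64 t, long gran)
{
	assert(gran >= 1 && gran <= NSEC_PER_SEC);
	t.tv_nsec -= t.tv_nsec % gran;
	return t;
}
```

Because the value is truncated before it reaches the filesystem, what is
read back from disk later compares equal to what the VFS handed over.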

If we are changing the in-kernel timestamp to have a greater dynamic
range than anything we currently support on disk, then we need support
for all filesystems for similar translation and constraint. The
filesystems need to be able to tell the kernel what timestamp
range they support, and then the kernel needs to follow those
guidelines. And if the filesystem is mounted on a kernel that
doesn't support the current filesystem's timestamp format, then at
minimum that filesystem cannot do anything that writes a
timestamp....

Put simply: the filesystem defines the timestamp range that can be
used safely, not the userspace API. If the filesystem can't support
the date it is handed then that is an out-of-range error. Since
when have we accepted that it's OK to handle out-of-range data with
silent overflows or corruption of the data that we are attempting to
store? We're defining a new API to support a wider date range -
there is nothing that prevents us from saying ERANGE can be returned
for a timestamp that the filesystem cannot store correctly....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [RFC 01/32] fs: introduce new 'struct inode_time'
  2014-05-30 20:01 ` [RFC 01/32] fs: introduce new 'struct inode_time' Arnd Bergmann
@ 2014-05-31  7:56   ` Geert Uytterhoeven
  2014-05-31  8:39     ` Andreas Schwab
  2014-05-31  9:03   ` H. Peter Anvin
  1 sibling, 1 reply; 124+ messages in thread
From: Geert Uytterhoeven @ 2014-05-31  7:56 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-kernel, Linux-Arch, Joseph S. Myers, John Stultz,
	Christoph Hellwig, Thomas Gleixner, Ley Foon Tan, H. Peter Anvin,
	Linux FS Devel

Hi Arnd,

On Fri, May 30, 2014 at 10:01 PM, Arnd Bergmann <arnd@arndb.de> wrote:
> + * The variant using bit fields is less efficient to access, but
> + * small and has a wider range as the 32-bit one, plus it keeps
> + * the signedness of the original timespec.
> + */
> +struct inode_time {
> +       long long       tv_sec  : 34;
> +       int             tv_nsec : 30;
> +};

Don't you need 31 bits for tv_nsec, to accommodate the sign bit?
I know you won't really store negative numbers there, but storing a large
positive number will become negative on read out, won't it?

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* Re: [RFC 06/32] isofs: fix timestamps beyond 2027
  2014-05-30 20:01 ` [RFC 06/32] isofs: fix timestamps beyond 2027 Arnd Bergmann
@ 2014-05-31  7:59   ` Geert Uytterhoeven
  2014-05-31  8:47     ` H. Peter Anvin
  0 siblings, 1 reply; 124+ messages in thread
From: Geert Uytterhoeven @ 2014-05-31  7:59 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-kernel, Linux-Arch, Joseph S. Myers, John Stultz,
	Christoph Hellwig, Thomas Gleixner, Ley Foon Tan, H. Peter Anvin,
	Linux FS Devel

On Fri, May 30, 2014 at 10:01 PM, Arnd Bergmann <arnd@arndb.de> wrote:
> isofs uses a 'char' variable to load the number of years since
> 1900 for an inode timestamp. On architectures that use a signed
> char type by default, this results in an invalid date for
> anything beyond 2027.
>
> This adds a cast to 'u8' for the year number, which should extend
> the shelf life of the file system until 2155.

Oops, the CD archive of my scanned Napoleon manuscripts no longer has the
right file date? ;-)

Are there any practical uses of representing dates between 1772 and 1900
on CD/DVD?
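
For reference, the 1772 and 2027 figures follow from reading the year
byte through a signed char. A small sketch, with hypothetical helper
names, and assuming the usual compiler behaviour where converting
0x80..0xff to signed char yields -128..-1:

```c
#include <assert.h>

/* Year as decoded when the byte goes through a signed char. */
static int iso_year_signed(unsigned char raw)
{
	signed char y = (signed char)raw;	/* 0x80..0xff turn negative */

	return 1900 + y;
}

/* Year as decoded with the byte treated as unsigned (the 'u8' cast). */
static int iso_year_unsigned(unsigned char raw)
{
	return 1900 + raw;
}
```

With the signed variant, byte value 127 is the last good year (2027) and
128 lands in 1772; the unsigned variant runs from 1900 up to 2155.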

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* Re: [RFC 01/32] fs: introduce new 'struct inode_time'
  2014-05-31  7:56   ` Geert Uytterhoeven
@ 2014-05-31  8:39     ` Andreas Schwab
  2014-05-31 13:19       ` Geert Uytterhoeven
  2014-05-31 14:54       ` Arnd Bergmann
  0 siblings, 2 replies; 124+ messages in thread
From: Andreas Schwab @ 2014-05-31  8:39 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Arnd Bergmann, linux-kernel,
	Linux-Arch, Joseph S. Myers, John Stultz, Christoph Hellwig,
	Thomas Gleixner, Ley Foon Tan, H. Peter Anvin, Linux FS Devel

Geert Uytterhoeven <geert@linux-m68k.org> writes:

> Hi Arnd,
>
> On Fri, May 30, 2014 at 10:01 PM, Arnd Bergmann <arnd@arndb.de> wrote:
>> + * The variant using bit fields is less efficient to access, but
>> + * small and has a wider range as the 32-bit one, plus it keeps
>> + * the signedness of the original timespec.
>> + */
>> +struct inode_time {
>> +       long long       tv_sec  : 34;
>> +       int             tv_nsec : 30;
>> +};
>
> Don't you need 31 bits for tv_nsec, to accommodate for the sign bit?
> I know you won't really store negative numbers there, but storing a large
> positive number will become negative on read out, won't it?

Only if the int bitfield is signed.  Bitfields are weird, aren't they? :-)

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-05-31  5:54           ` Dave Chinner
@ 2014-05-31  8:41             ` H. Peter Anvin
  2014-05-31 15:46               ` Nicolas Pitre
  2014-06-01  0:39               ` Dave Chinner
  2014-06-02 14:00             ` Joseph S. Myers
  1 sibling, 2 replies; 124+ messages in thread
From: H. Peter Anvin @ 2014-05-31  8:41 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Arnd Bergmann, linux-kernel, linux-arch, joseph, john.stultz,
	hch, tglx, geert, lftan, linux-fsdevel, xfs

On 05/30/2014 10:54 PM, Dave Chinner wrote:
> 
> If we are changing the in-kernel timestamp to have a greater dynamic
> range that anything we current support on disk, then we need support
> for all filesystems for similar translation and constraint. The
> filesystems need to be able to tell the kernel what they timestamp
> range they support, and then the kernel needs to follow those
> guidelines. And if the filesystem is mounted on a kernel that
> doesn't support the current filesystem's timestamp format, then at
> minimum that filesystem cannot do anything that writes a
> timestamp....
> 
> Put simply: the filesystem defines the timestamp range that can be
> used safely, not the userspace API. If the filesystem can't support
> the date it is handed then that is an out-of-range error. Since
> when have we accepted that it's OK to handle out-of-range data with
> silent overflows or corruption of the data that we are attempting to
> store? We're defining a new API to support a wider date range -
> there is nothing that prevents us from saying ERANGE can be returned
> to a timestamp that the file cannot store correctly....
> 

I'm still puzzled.

Are you saying that you want a program that does:

	/* Deliberately simplified */
	gettimeofdayns(&now ...);
	utimensat(... now);

... to suddenly start failing on Jan 19, 2038 (for a filesystem with
32-bit timestamps), or would you propose some ways for the filesystems
in question to extend the range of the timestamps?

What you seem to propose also seems to imply that on Jan 19, 2038
anything that writes a timestamp with the current date (which logically
ends up being almost every write operation) would be dead and frozen on
such a filesystem -- pretty much meaning the filesystem would become
readonly, if not in reality then in practice.

I strongly suspect that that would be a more catastrophic failure than
incorrect timestamps, as you suddenly have all kinds of machines
embedded in $DEITY knows what places just stop and refuse to run.

If that is not what you mean, I would genuinely like to understand the
situation better.

	-hpa



* Re: [RFC 06/32] isofs: fix timestamps beyond 2027
  2014-05-31  7:59   ` Geert Uytterhoeven
@ 2014-05-31  8:47     ` H. Peter Anvin
  0 siblings, 0 replies; 124+ messages in thread
From: H. Peter Anvin @ 2014-05-31  8:47 UTC (permalink / raw)
  To: Geert Uytterhoeven, Arnd Bergmann
  Cc: linux-kernel, Linux-Arch, Joseph S. Myers, John Stultz,
	Christoph Hellwig, Thomas Gleixner, Ley Foon Tan, Linux FS Devel

On 05/31/2014 12:59 AM, Geert Uytterhoeven wrote:
> On Fri, May 30, 2014 at 10:01 PM, Arnd Bergmann <arnd@arndb.de> wrote:
>> isofs uses a 'char' variable to load the number of years since
>> 1900 for an inode timestamp. On architectures that use a signed
>> char type by default, this results in an invalid date for
>> anything beyond 2027.
>>
>> This adds a cast to 'u8' for the year number, which should extend
>> the shelf life of the file system until 2155.
> 
> Oops, the CD archive of my scanned Napoleon manuscripts no longer has the
> right file date? ;-)
> 
> Are there any practical uses of representating dates between 1772 and 1900
> on CD/DVD?
> 

Unlikely.  Furthermore, the spec explicitly states that the number is
unsigned (ref: ECMA-119, 2nd ed, 9.1.5 which specifies that the numbers
are "recorded according to 7.1.1"; 7.1.1 specifies "8-bit unsigned
numerical values").

	-hpa




* Re: [RFC 01/32] fs: introduce new 'struct inode_time'
  2014-05-30 20:01 ` [RFC 01/32] fs: introduce new 'struct inode_time' Arnd Bergmann
  2014-05-31  7:56   ` Geert Uytterhoeven
@ 2014-05-31  9:03   ` H. Peter Anvin
  2014-05-31 14:53     ` Arnd Bergmann
  1 sibling, 1 reply; 124+ messages in thread
From: H. Peter Anvin @ 2014-05-31  9:03 UTC (permalink / raw)
  To: Arnd Bergmann, linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, linux-fsdevel

On 05/30/2014 01:01 PM, Arnd Bergmann wrote:
> +#ifdef CONFIG_NEW_INODE_TIME
> +/*
> + * This is the type we use internally in the kernel to represent
> + * absolute times in file system metadata.
> + * This structure must not leak out to user space, and new interfaces
> + * should be using 64-bit types right away.
> + */
> +
> +/*
> + * Variant a) using unsigned seconds lets us extend the life span
> + * for another 69 years beyond 2038.
> + */
> +struct inode_time {
> +	unsigned long	tv_sec;
> +	long		tv_nsec;
> +};

This now differs between 32- and 64-bit systems, and on 32-bit systems
some timestamps well within the range of representation of current
systems just became unrepresentable, which is something that I thought
people were objecting very strongly to.

> +#elif 0
> +/*
> + * This variant can represent the widest range of times, but also
> + * bloats 'struct inode' a little more.
> + */
> +struct inode_time {
> +	long long	tv_sec __attribute__((packed));
> +	int		tv_nsec;
> +};

Seriously, though, can we really impose constraints stricter than what
the filesystems themselves do?  It seems we ought to be able to
represent whatever time the filesystem can represent... (modulo some
kind of window control as Y2038 or any other break point approaches.)

	-hpa






* Re: [RFC 13/32] ext3: convert to struct inode_time
  2014-05-30 20:01 ` [RFC 13/32] ext3: " Arnd Bergmann
@ 2014-05-31  9:10   ` H. Peter Anvin
  2014-05-31 14:32     ` Arnd Bergmann
  0 siblings, 1 reply; 124+ messages in thread
From: H. Peter Anvin @ 2014-05-31  9:10 UTC (permalink / raw)
  To: Arnd Bergmann, linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan,
	linux-fsdevel, Jan Kara, Andrew Morton, Andreas Dilger,
	linux-ext4

On 05/30/2014 01:01 PM, Arnd Bergmann wrote:
> ext3fs uses unsigned 32-bit seconds for inode timestamps, which will work
> for the next 92 years, but the VFS uses struct timespec for timestamps,
> which is only good until 2038 on 32-bit CPUs.
> 
> This gets us one small step closer to lifting the VFS limit by using
> struct inode_time in ext3. The on-disk format limit is lifted in ext4,
> which will work until 2514.
> 

This may be what the spec says, but when I experimented with this just
now it does seem that both ext2 and ext3 actually interpret timestamps
as *signed* 32-bit seconds.

	-hpa



* Re: [RFC 03/32] fs: introduce sys_utimens64at
  2014-05-30 20:01 ` [RFC 03/32] fs: introduce sys_utimens64at Arnd Bergmann
@ 2014-05-31  9:22   ` Andreas Schwab
  2014-05-31 14:55     ` Arnd Bergmann
  0 siblings, 1 reply; 124+ messages in thread
From: Andreas Schwab @ 2014-05-31  9:22 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-kernel, linux-arch, joseph, john.stultz, hch, tglx, geert,
	lftan, hpa, linux-fsdevel

Arnd Bergmann <arnd@arndb.de> writes:

> +asmlinkage long sys_utimens64at(int dfd, const char __user *filename,

All existing syscall names have the 64 suffix last, including the *at
variants, so sys_utimensat64 would be more in line.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


* Re: [RFC 01/32] fs: introduce new 'struct inode_time'
  2014-05-31  8:39     ` Andreas Schwab
@ 2014-05-31 13:19       ` Geert Uytterhoeven
  2014-05-31 13:46         ` Andreas Schwab
  2014-05-31 14:54       ` Arnd Bergmann
  1 sibling, 1 reply; 124+ messages in thread
From: Geert Uytterhoeven @ 2014-05-31 13:19 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Arnd Bergmann, linux-kernel, Linux-Arch, Joseph S. Myers,
	John Stultz, Christoph Hellwig, Thomas Gleixner, Ley Foon Tan,
	H. Peter Anvin, Linux FS Devel

Hi Andreas,

On Sat, May 31, 2014 at 10:39 AM, Andreas Schwab <schwab@linux-m68k.org> wrote:
> Geert Uytterhoeven <geert@linux-m68k.org> writes:
>
>> Hi Arnd,
>>
>> On Fri, May 30, 2014 at 10:01 PM, Arnd Bergmann <arnd@arndb.de> wrote:
>>> + * The variant using bit fields is less efficient to access, but
>>> + * small and has a wider range as the 32-bit one, plus it keeps
>>> + * the signedness of the original timespec.
>>> + */
>>> +struct inode_time {
>>> +       long long       tv_sec  : 34;
>>> +       int             tv_nsec : 30;
>>> +};
>>
>> Don't you need 31 bits for tv_nsec, to accommodate for the sign bit?
>> I know you won't really store negative numbers there, but storing a large
>> positive number will become negative on read out, won't it?
>
> Only if the int bitfield is signed.  Bitfields are weird, aren't they? :-)

"int" is signed, right? Or do you mean a bitfield needs an explicit "signed"
keyword to be signed?

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* Re: [RFC 01/32] fs: introduce new 'struct inode_time'
  2014-05-31 13:19       ` Geert Uytterhoeven
@ 2014-05-31 13:46         ` Andreas Schwab
  0 siblings, 0 replies; 124+ messages in thread
From: Andreas Schwab @ 2014-05-31 13:46 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Arnd Bergmann, linux-kernel,
	Linux-Arch, Joseph S. Myers, John Stultz, Christoph Hellwig,
	Thomas Gleixner, Ley Foon Tan, H. Peter Anvin, Linux FS Devel

Geert Uytterhoeven <geert@linux-m68k.org> writes:

> "int" is signed, right? Or do you mean a bitfield needs an explicit "signed"
> keyword to be signed?

Yes, see 6.7.2#5.
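
That is the paragraph which makes it implementation-defined whether a
plain "int" bit-field is signed or unsigned. A small sketch of why this
matters for a 30-bit tv_nsec, using explicit signedness so the result
does not depend on the compiler default (the struct and function names
are made up for the example; the out-of-range store in the signed case
is itself implementation-defined):

```c
#include <assert.h>

struct nsec_signed   { signed int   tv_nsec : 30; };	/* max 2^29 - 1 */
struct nsec_unsigned { unsigned int tv_nsec : 30; };	/* max 2^30 - 1 */

/* Store nsec in a signed 30-bit field and read it back. */
static long roundtrip_signed(long nsec)
{
	struct nsec_signed s;

	s.tv_nsec = nsec;	/* values above 536870911 do not fit */
	return s.tv_nsec;
}

/* Store nsec in an unsigned 30-bit field and read it back. */
static long roundtrip_unsigned(long nsec)
{
	struct nsec_unsigned u;

	u.tv_nsec = nsec;	/* 999999999 < 2^30, so it fits */
	return u.tv_nsec;
}
```

The largest valid tv_nsec, 999999999, survives the unsigned field but
not the signed one, which tops out at 536870911.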

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


* Re: [RFC 24/32] hfs, hfsplus: convert to struct inode_time
  2014-05-30 20:01 ` [RFC 24/32] hfs, hfsplus: " Arnd Bergmann
@ 2014-05-31 14:23   ` Vyacheslav Dubeyko
  0 siblings, 0 replies; 124+ messages in thread
From: Vyacheslav Dubeyko @ 2014-05-31 14:23 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-kernel, linux-arch, joseph, john.stultz, hch, tglx, geert,
	lftan, hpa, linux-fsdevel

Hi Arnd,

On Fri, 2014-05-30 at 22:01 +0200, Arnd Bergmann wrote:
> hfs uses 32-bit integers based at 1904 for inode timestamps, which will
> only work until 2040, but the VFS uses struct timespec for timestamps,
> which expires even earlier in 2038 on 32-bit CPUs.
> 
> This gets us one small step closer to lifting the VFS limit by using
> struct inode_time in logfs.
> 

I think you meant hfs/hfsplus here. I suppose the mention of logfs
is simply a mistake.

Thanks,
Vyacheslav Dubeyko.




* Re: [RFC 00/32] making inode time stamps y2038 ready
  2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
                   ` (31 preceding siblings ...)
  2014-05-30 20:01 ` [RFC 32/32] fs: use new inode_time definition unconditionally Arnd Bergmann
@ 2014-05-31 14:30 ` Vyacheslav Dubeyko
  2014-06-03 12:21   ` Arnd Bergmann
  2014-05-31 14:51 ` Richard Cochran
  2014-06-02 13:52 ` Joseph S. Myers
  34 siblings, 1 reply; 124+ messages in thread
From: Vyacheslav Dubeyko @ 2014-05-31 14:30 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-kernel, linux-arch, joseph, john.stultz, hch, tglx, geert,
	lftan, hpa, linux-fsdevel, ceph-devel, cluster-devel, coda,
	codalist, fuse-devel, linux-afs, linux-btrfs, linux-cifs,
	linux-ext4, linux-f2fs-devel, linux-mtd, linux-nfs,
	linux-ntfs-dev, linux-scsi, logfs, ocfs2-devel, reiserfs-devel,
	samba-technical, xfs

Hi Arnd,

On Fri, 2014-05-30 at 22:01 +0200, Arnd Bergmann wrote:

[snip]
> 
> Arnd Bergmann (32):
>   fs: introduce new 'struct inode_time'
>   uapi: add struct __kernel_timespec{32,64}
>   fs: introduce sys_utimens64at
>   fs: introduce sys_newfstat64/sys_newfstatat64
>   arch: hook up new stat and utimes syscalls
>   isofs: fix timestamps beyond 2027
>   fs/nfs: convert to struct inode_time
>   fs/ceph: convert to 'struct inode_time'
>   fs/pstore: convert to struct inode_time
>   fs/coda: convert to struct inode_time
>   xfs: convert to struct inode_time
>   btrfs: convert to struct inode_time
>   ext3: convert to struct inode_time
>   ext4: convert to struct inode_time
>   cifs: convert to struct inode_time
>   ntfs: convert to struct inode_time
>   ubifs: convert to struct inode_time
>   ocfs2: convert to struct inode_time
>   fs/fat: convert to struct inode_time
>   afs: convert to struct inode_time
>   udf: convert to struct inode_time
>   fs: convert simple fs to inode_time
>   logfs: convert to struct inode_time
>   hfs, hfsplus: convert to struct inode_time
>   gfs2: convert to struct inode_time
>   reiserfs: convert to struct inode_time
>   jffs2: convert to struct inode_time
>   adfs: convert to struct inode_time
>   f2fs: convert to struct inode_time
>   fuse: convert to struct inode_time
>   scsi: fnic: use current_kernel_time() for timestamp
>   fs: use new inode_time definition unconditionally
> 

By the way, what about NILFS2? Is NILFS2 ready for the suggested approach
without any changes?

Thanks,
Vyacheslav Dubeyko.




* Re: [RFC 13/32] ext3: convert to struct inode_time
  2014-05-31  9:10   ` H. Peter Anvin
@ 2014-05-31 14:32     ` Arnd Bergmann
  0 siblings, 0 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-31 14:32 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: linux-kernel, linux-arch, joseph, john.stultz, hch, tglx, geert,
	lftan, linux-fsdevel, Jan Kara, Andrew Morton, Andreas Dilger,
	linux-ext4

On Saturday 31 May 2014 02:10:45 H. Peter Anvin wrote:
> On 05/30/2014 01:01 PM, Arnd Bergmann wrote:
> > ext3fs uses unsigned 32-bit seconds for inode timestamps, which will work
> > for the next 92 years, but the VFS uses struct timespec for timestamps,
> > which is only good until 2038 on 32-bit CPUs.
> > 
> > This gets us one small step closer to lifting the VFS limit by using
> > struct inode_time in ext3. The on-disk format limit is lifted in ext4,
> > which will work until 2514.
> > 
> 
> This may be what the spec says, but when I experimented with this just
> now it does seem that both ext2 and ext3 actually interpret timestamps
> as *signed* 32-bit seconds.

Right, I can see that in ext3_iget() now:

        inode->i_atime.tv_sec = (signed)le32_to_cpu(raw_inode->i_atime);

I may have just looked at ext3_do_update_inode(), which uses this
unsigned conversion:

	raw_inode->i_ctime = cpu_to_le32(inode->i_ctime.tv_sec);

and didn't realize that this is only half of the story, and since it
converts from (potentially 64-bit) long to u32, it doesn't matter
whether that is signed or unsigned.
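
The asymmetry can be reproduced in userspace with a sketch like the one
below. The helper names are invented, byte-order conversion is left out,
and the signed read relies on the usual two's-complement behaviour of
the uint32_t to int32_t conversion:

```c
#include <assert.h>
#include <stdint.h>

/* Write side: keeping only the low 32 bits, whatever the sign. */
static uint32_t sec_to_disk(int64_t sec)
{
	return (uint32_t)sec;
}

/* Read side as in ext3_iget(): the on-disk value is sign-extended. */
static int64_t disk_to_sec_signed(uint32_t raw)
{
	return (int32_t)raw;
}

/* Read side if the on-disk value were treated as unsigned. */
static int64_t disk_to_sec_unsigned(uint32_t raw)
{
	return raw;
}
```

The write side produces the same bits either way, so only the read side
decides whether 0x80000000 means 1901 or 2038.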

I may have to go through all of them again to see if I made the same
mistake in other file systems as well.

	Arnd


* Re: [RFC 00/32] making inode time stamps y2038 ready
  2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
                   ` (32 preceding siblings ...)
  2014-05-31 14:30 ` [RFC 00/32] making inode time stamps y2038 ready Vyacheslav Dubeyko
@ 2014-05-31 14:51 ` Richard Cochran
  2014-05-31 15:23   ` Arnd Bergmann
  2014-06-02 13:52 ` Joseph S. Myers
  34 siblings, 1 reply; 124+ messages in thread
From: Richard Cochran @ 2014-05-31 14:51 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-kernel, hch, linux-mtd, hpa, logfs, linux-afs, joseph,
	linux-arch, linux-cifs, linux-scsi, ceph-devel, codalist,
	cluster-devel, coda, geert, linux-ext4, fuse-devel,
	reiserfs-devel, xfs, john.stultz, tglx, linux-nfs,
	linux-ntfs-dev, samba-technical, linux-f2fs-devel, ocfs2-devel,
	linux-fsdevel, lftan, linux-btrfs

On Fri, May 30, 2014 at 10:01:24PM +0200, Arnd Bergmann wrote:
> 
> I picked this because it is a fairly isolated problem, as the
> inode time stamps are rarely assigned to any other time values.
> As a byproduct of this work, I documented for each of the file
> systems we support how long the on-disk format can work[1].

Why are some of the time stamp expiration dates marked as "never"?

Thanks,
Richard

* Re: [RFC 01/32] fs: introduce new 'struct inode_time'
  2014-05-31  9:03   ` H. Peter Anvin
@ 2014-05-31 14:53     ` Arnd Bergmann
  2014-05-31 14:55       ` H. Peter Anvin
  0 siblings, 1 reply; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-31 14:53 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: linux-kernel, linux-arch, joseph, john.stultz, hch, tglx, geert,
	lftan, linux-fsdevel

On Saturday 31 May 2014 02:03:38 H. Peter Anvin wrote:
> On 05/30/2014 01:01 PM, Arnd Bergmann wrote:
> > +#ifdef CONFIG_NEW_INODE_TIME
> > +/*
> > + * This is the type we use internally in the kernel to represent
> > + * absolute times in file system metadata.
> > + * This structure must not leak out to user space, and new interfaces
> > + * should be using 64-bit types right away.
> > + */
> > +
> > +/*
> > + * Variant a) using unsigned seconds lets us extend the life span
> > + * for another 69 years beyond 2038.
> > + */
> > +struct inode_time {
> > +     unsigned long   tv_sec;
> > +     long            tv_nsec;
> > +};
> 
> This now differs between 32- and 64-bit systems, and on 32-bit systems
> some timestamps well within the range of representation of current
> systems just became unrepresentable, which is something that I thought
> people were objecting very strongly to.

It really depends on the file system. As you pointed out, I was reading
the ext2/ext3 and xfs code incorrectly, so my assumption when I wrote this
was that they already used the same type, with a 1970-2106 window, rather
than the regular signed Unix epoch.

> > +#elif 0
> > +/*
> > + * This variant can represent the widest range of times, but also
> > + * bloats 'struct inode' a little more.
> > + */
> > +struct inode_time {
> > +     long long       tv_sec __attribute__((packed));
> > +     int             tv_nsec;
> > +};
> 
> Seriously, though, can we really impose constraints stricter than what
> the filesystems themselves do?  It seems we ought to be able to
> represent whatever time the filesystem can represent... (modulo some
> kind of window control as Y2038 or any other break point approaches.)

Just to make sure, do you say we should be using the 'long long/int'
struct, or something else?

	Arnd

* Re: [RFC 01/32] fs: introduce new 'struct inode_time'
  2014-05-31  8:39     ` Andreas Schwab
  2014-05-31 13:19       ` Geert Uytterhoeven
@ 2014-05-31 14:54       ` Arnd Bergmann
  2014-05-31 16:15         ` Geert Uytterhoeven
  1 sibling, 1 reply; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-31 14:54 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Geert Uytterhoeven, linux-kernel, Linux-Arch, Joseph S. Myers,
	John Stultz, Christoph Hellwig, Thomas Gleixner, Ley Foon Tan,
	H. Peter Anvin, Linux FS Devel

On Saturday 31 May 2014 10:39:02 Andreas Schwab wrote:
> Geert Uytterhoeven <geert@linux-m68k.org> writes:
> 
> > Hi Arnd,
> >
> > On Fri, May 30, 2014 at 10:01 PM, Arnd Bergmann <arnd@arndb.de> wrote:
> >> + * The variant using bit fields is less efficient to access, but
> >> + * small and has a wider range as the 32-bit one, plus it keeps
> >> + * the signedness of the original timespec.
> >> + */
> >> +struct inode_time {
> >> +       long long       tv_sec  : 34;
> >> +       int             tv_nsec : 30;
> >> +};
> >
> > Don't you need 31 bits for tv_nsec, to accommodate for the sign bit?
> > I know you won't really store negative numbers there, but storing a large
> > positive number will become negative on read out, won't it?
> 
> Only if the int bitfield is signed.  Bitfields are weird, aren't they? 

It was a mistake on my side, as I didn't know about that rule and
meant to write 'unsigned int' really. Also, I always have a bad feeling
about using bitfields in general.
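
For reference, a sketch of the bit-field variant with the signedness
spelled out. A signed 30-bit field tops out at 2^29 - 1 = 536870911,
below the largest valid nanosecond value of 999999999, while an
unsigned 30-bit field reaches 1073741823. The struct and helper names
are made up, and bit-fields wider than int are a compiler extension
that GCC and clang accept.

```c
#include <assert.h>

/* Sketch only: same field widths as the RFC variant, explicit signs. */
struct inode_time_bf {
	signed long long	tv_sec  : 34;
	unsigned int		tv_nsec : 30;
};

/* Store and reload a nanosecond value through the 30-bit field. */
static long nsec_roundtrip(long nsec)
{
	struct inode_time_bf t = { .tv_sec = 0, .tv_nsec = (unsigned int)nsec };

	return t.tv_nsec;
}

/* Store and reload a seconds value through the signed 34-bit field. */
static long long sec_roundtrip(long long sec)
{
	struct inode_time_bf t = { .tv_sec = sec, .tv_nsec = 0 };

	return t.tv_sec;
}
```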

	Arnd

* Re: [RFC 01/32] fs: introduce new 'struct inode_time'
  2014-05-31 14:53     ` Arnd Bergmann
@ 2014-05-31 14:55       ` H. Peter Anvin
  0 siblings, 0 replies; 124+ messages in thread
From: H. Peter Anvin @ 2014-05-31 14:55 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-kernel, linux-arch, joseph, john.stultz, hch, tglx, geert,
	lftan, linux-fsdevel

Yes, s64/u32 or s64/s32.
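
A sketch of what the s64/u32 layout could look like, with and without
the packed attribute from the RFC's 'long long/int' variant. The struct
and helper names are placeholders. On ABIs that align 64-bit integers
to 8 bytes the natural layout occupies 16 bytes, while packing tv_sec
is expected to bring it down to 12 with GCC or clang.

```c
#include <assert.h>
#include <stdint.h>

/* s64/u32 layout with natural alignment. */
struct inode_time_s64 {
	int64_t		tv_sec;
	uint32_t	tv_nsec;
};

/* Same fields, with the packed attribute on tv_sec as in the RFC's
 * 'long long/int' variant (GCC/clang extension). */
struct inode_time_s64_packed {
	int64_t		tv_sec __attribute__((packed));
	uint32_t	tv_nsec;
};

/* Build a timestamp; tv_nsec is expected to be below one billion. */
static struct inode_time_s64 make_inode_time(int64_t sec, uint32_t nsec)
{
	struct inode_time_s64 t = { .tv_sec = sec, .tv_nsec = nsec };

	return t;
}
```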

On May 31, 2014 7:53:01 AM PDT, Arnd Bergmann <arnd@arndb.de> wrote:
>On Saturday 31 May 2014 02:03:38 H. Peter Anvin wrote:
>> On 05/30/2014 01:01 PM, Arnd Bergmann wrote:
>> > +#ifdef CONFIG_NEW_INODE_TIME
>> > +/*
>> > + * This is the type we use internally in the kernel to represent
>> > + * absolute times in file system metadata.
>> > + * This structure must not leak out to user space, and new
>interfaces
>> > + * should be using 64-bit types right away.
>> > + */
>> > +
>> > +/*
>> > + * Variant a) using unsigned seconds lets us extend the life span
>> > + * for another 69 years beyond 2038.
>> > + */
>> > +struct inode_time {
>> > +     unsigned long   tv_sec;
>> > +     long            tv_nsec;
>> > +};
>> 
>> This now differs between 32- and 64-bit systems, and on 32-bit
>systems
>> some timestamps well within the range of representation of current
>> systems just became unrepresentable, which is something that I
>thought
>> people were objecting very strongly to.
>
>It really depends on the file system. As you pointed out, I was reading
>the ext2/ext3 and xfs code incorrectly, so my assumption when I wrote
>this
>was that they already used the same type, with a 1970-2106 window,
>rather
>than the regular signed Unix epoch.
>
>> > +#elif 0
>> > +/*
>> > + * This variant can represent the widest range of times, but also
>> > + * bloats 'struct inode' a little more.
>> > + */
>> > +struct inode_time {
>> > +     long long       tv_sec __attribute__((packed));
>> > +     int             tv_nsec;
>> > +};
>> 
>> Seriously, though, can we really impose constraints stricter than
>what
>> the filesystems themselves do?  It seems we ought to be able to
>> represent whatever time the filesystem can represent... (modulo some
>> kind of window control as Y2038 or any other break point approaches.)
>
>Just to make sure, do you say we should be using the 'long long/int'
>struct, or something else?
>
>	Arnd

-- 
Sent from my mobile phone.  Please pardon brevity and lack of formatting.

* Re: [RFC 03/32] fs: introduce sys_utimens64at
  2014-05-31  9:22   ` Andreas Schwab
@ 2014-05-31 14:55     ` Arnd Bergmann
  0 siblings, 0 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-31 14:55 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: linux-kernel, linux-arch, joseph, john.stultz, hch, tglx, geert,
	lftan, hpa, linux-fsdevel

On Saturday 31 May 2014 11:22:38 Andreas Schwab wrote:
> Arnd Bergmann <arnd@arndb.de> writes:
> 
> > +asmlinkage long sys_utimens64at(int dfd, const char __user *filename,
> 
> All existing syscall names have the 64 suffix last, including the *at
> variants, so sys_utimensat64 would be more in line.

Ok, makes sense. I actually typed utimensat64 a few times myself and
couldn't really decide which one to use.

	Arnd

* Re: [RFC 02/32] uapi: add struct __kernel_timespec{32,64}
  2014-05-30 20:18   ` H. Peter Anvin
@ 2014-05-31 15:09     ` Arnd Bergmann
  0 siblings, 0 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-31 15:09 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: linux-kernel, linux-arch, joseph, john.stultz, hch, tglx, geert,
	lftan, linux-fsdevel

On Friday 30 May 2014 13:18:45 H. Peter Anvin wrote:
> On 05/30/2014 01:01 PM, Arnd Bergmann wrote:
> > We cannot use time_t or any derived structures beyond the year
> > 2038 in interfaces between kernel and user space, on 32-bit
> > machines.
> > 
> > This is my suggestion for how to migrate syscall and ioctl
> > interfaces: We completely phase out time_t, timeval and timespec
> > from the uapi header files and replace them with types that are
> > either explicitly safe (__kernel_timespec64), or explicitly
> > unsafe (e.g. __kernel_timespec32). For each unsafe interface,
> > there needs to be a safe replacement interface.
> > 
> 
> This gets really messy for structures where this is ABI-dependent.  I'm
> not sure this is a net win.

We could have an extra '__kernel_oldtimespec' type that we can
use for all ABIs that are today defined in terms of timespec.

What I was mostly trying to avoid here is leaving any 'struct timespec'
in header files, because glibc may define that type differently
depending on a __TIME_BITS macro. This is more of a problem for
ioctls than for system calls.

> > +/*
> > + * __kernel_timespec64 is the general type to be used for
> > + * new user space interfaces passing a time argument.
> > + * 64-bit nanoseconds is a bit silly, but the advantage is
> > + * that it is compatible with the native 'struct timespec'
> > + * on 64-bit user space. This simplifies the compat code.
> > + */
> > +struct __kernel_timespec64 {
> > +	long long tv_sec;
> > +	long long tv_nsec;
> > +};
> 
> So it seems that it is not just POSIX that is drain bramaged with this,
> but the "long" type for tv_nsec idiocy has made it into the C11
> standard.  This unfortunately means that now there are two standards
> bodies involved, at least one of which moves very slowly.

My feeling is that our best hope is to completely isolate the kernel
interfaces from what user space wants to have as time_t. glibc for
instance may have a different idea about standards compliance than
android or klibc.

> This makes me wonder if we don't need to deal with the problem in the
> case of 32-bit ABIs with 64-bit time_t.  The logical thing seems to be
> to EITHER:
> 
> a. ALWAYS ignore the upper 32 bits of tv_nsec when read from user space,
>    but always set them to zero, or
> b. Only ignore the upper 32 bits of tv_nsec when we are known to come
>    from a 32-bit ABI context, but still always return zero.  These bits
>    are already only used for validity checking.
> 
>    This most likely introduces a whole lot of new tests in deep paths,
>    although we probably can centralize this in a single function, which
>    otherwise ends up looking a lot like compat_get_timespec().
> 
> Getting rid of struct timespec on the kernel/user boundary is probably
> not really feasible.

My approach was based on the discussion with Joseph, who would like glibc
to support both 32 and 64 bit time_t using the same libc binary and 
versioned symbols. I don't see how that could work when you build a
user space program that sees a timespec in kernel headers and tries
to pass that into a non-translated kernel interface (e.g. ioctl) but
use the same timespec for a glibc-provided function like gettimeofday().
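
For option (a) in the quoted list, a rough user-space sketch of
ignoring the upper half of tv_nsec on input might look like the
following. The struct and function names are hypothetical; the idea is
only that a 32-bit ABI with 64-bit time_t may leave bits 32..63 of the
nanosecond slot as uninitialized padding.

```c
#include <assert.h>
#include <stdint.h>

/* Layout of the proposed uapi type: two 64-bit fields. */
struct kernel_timespec64_sketch {
	int64_t tv_sec;
	int64_t tv_nsec;
};

/* Hypothetical helper: keep only the low 32 bits of tv_nsec as read
 * from user space, then validate what is left.
 * Returns 0 on success, -1 for an invalid nanosecond value. */
static int sanitize_timespec64(struct kernel_timespec64_sketch *ts)
{
	ts->tv_nsec = (uint32_t)ts->tv_nsec;	/* drop bits 32..63 */
	if (ts->tv_nsec >= 1000000000)
		return -1;
	return 0;
}
```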

	Arnd

* Re: [RFC 00/32] making inode time stamps y2038 ready
  2014-05-31 14:51 ` Richard Cochran
@ 2014-05-31 15:23   ` Arnd Bergmann
  2014-05-31 18:22     ` Richard Cochran
  2014-06-01  4:44     ` Richard Cochran
  0 siblings, 2 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-31 15:23 UTC (permalink / raw)
  To: Richard Cochran
  Cc: linux-kernel, hch, linux-mtd, hpa, logfs, linux-afs, joseph,
	linux-arch, linux-cifs, linux-scsi, ceph-devel, codalist,
	cluster-devel, coda, geert, linux-ext4, fuse-devel,
	reiserfs-devel, xfs, john.stultz, tglx, linux-nfs,
	linux-ntfs-dev, samba-technical, linux-f2fs-devel, ocfs2-devel,
	linux-fsdevel, lftan, linux-btrfs

On Saturday 31 May 2014 16:51:15 Richard Cochran wrote:
> On Fri, May 30, 2014 at 10:01:24PM +0200, Arnd Bergmann wrote:
> > 
> > I picked this because it is a fairly isolated problem, as the
> > inode time stamps are rarely assigned to any other time values.
> > As a byproduct of this work, I documented for each of the file
> > systems we support how long the on-disk format can work[1].
> 
> Why are some of the time stamp expiration dates marked as "never"?

It's an approximation:
with 64-bit timestamps, you can represent close to 300 billion
years, which is way past the time that our planet can sustain
life of any form[1].
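
The figure follows from dividing the largest signed 64-bit second
count by the length of a year. A minimal sketch, assuming 365-day
years and made-up function names:

```c
#include <assert.h>
#include <stdint.h>

#define SECONDS_PER_YEAR (365LL * 24 * 3600)	/* ignoring leap years */

/* Years after 1970 reachable with a signed 64-bit seconds counter. */
static int64_t years_signed64(void)
{
	return INT64_MAX / SECONDS_PER_YEAR;
}

/* For comparison: signed 32-bit seconds reach 68 years, i.e. 2038. */
static int64_t years_signed32(void)
{
	return INT32_MAX / SECONDS_PER_YEAR;
}
```

With these assumptions the signed 64-bit case comes out at roughly
292 billion years.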

	Arnd

[1] http://en.wikipedia.org/wiki/Timeline_of_the_far_future

* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-05-31  1:14       ` Dave Chinner
  2014-05-31  1:22         ` H. Peter Anvin
@ 2014-05-31 15:37         ` Arnd Bergmann
  2014-06-01  0:24           ` Dave Chinner
  1 sibling, 1 reply; 124+ messages in thread
From: Arnd Bergmann @ 2014-05-31 15:37 UTC (permalink / raw)
  To: Dave Chinner
  Cc: H. Peter Anvin, linux-kernel, linux-arch, joseph, john.stultz,
	hch, tglx, geert, lftan, linux-fsdevel, xfs

On Saturday 31 May 2014 11:14:50 Dave Chinner wrote:
> On Fri, May 30, 2014 at 05:41:14PM -0700, H. Peter Anvin wrote:
> > On 05/30/2014 05:37 PM, Dave Chinner wrote:
> > > 
> > > IOWs, the filesystem has to be able to reject any attempt to set a
> > > timestamp that it can't represent on disk otherwise Bad Stuff will
> > > happen,
> > 
> > Actually it is questionable if it is worse to reject a timestamp or just
> > let it wrap.  Rejecting a valid timestamp is a bit like "You don't
> > exist, go away."
> 
> I think having the new system calls being able to
> return EINVAL if the value cannot be stored permanently on disk
> correctly is the right thing to do. Having it silently mangled
> by the filesystem and returning "everything is just fine, trust me"
> is close to the worst solution I can think of. That's exactly what
> leads to overflow bugs occurring....

While going through the file systems, I was wondering whether
we should have the times stop at the end of each file system's
epoch rather than wrap around.

> > > and filesystems have to be able to specify in their on
> > > disk format what timestamp encoding is being used. The solution will
> > > be different for every filesystem that needs to support time beyond
> > > 2038.
> > 
> > Actually the cutoff can be really different for each filesystem, not
> > necessarily 2038.  However, I maintain the above still holds.
> 
> Sure, but all filesystems are supposed to handle at least the
> current unix epoch.

In my list at http://kernelnewbies.org/y2038, I found that almost
all file systems support times until at least 2106, because they treat
the on-disk value as unsigned on 64-bit systems, or they use
a completely different representation. My guess is that somebody
earlier spent a lot of work on making that happen.

The exceptions are:

* exofs uses signed values, which can probably be changed to be
  consistent with the others.
* isofs has a bug that limits it until 2027 on architectures with
  a signed 'char' type (otherwise it's 2155).
* udf can represent times for many thousands of years through a
  16-bit year representation, but the code to convert to epoch
  uses a const array that ends at 2038.
* afs uses signed seconds and can probably be fixed
* coda relies on user space time representation getting passed
  through an ioctl.
* I miscategorized xfs/ext2/ext3 as having unsigned 32-bit seconds,
  where they really use signed.
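
The isofs entry above comes down to the on-disk year being a single
byte counted from 1900, so the result depends on whether that byte is
read through a signed or an unsigned char. A small sketch, with helper
names that are not the isofs functions:

```c
#include <assert.h>

/* Decode the year byte through a signed char: tops out at 1900 + 127. */
static int iso_year_signed(unsigned char raw)
{
	return 1900 + (signed char)raw;
}

/* Decode the year byte through an unsigned char: reaches 1900 + 255. */
static int iso_year_unsigned(unsigned char raw)
{
	return 1900 + raw;
}
```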

I was confused about XFS since I didn't notice that there are
separate xfs_ictimestamp_t and xfs_timestamp_t types, so I expected
XFS to also use the 1970-2106 time range on 64-bit systems today.

If we are using the variant of my patch that extends
inode_time->tv_sec to s64, nothing should change for XFS
at all; the main difference is that if it gets extended
to wider on-disk timestamps, they will work the same way on
32-bit and 64-bit kernels.

	Arnd

* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-05-31  8:41             ` H. Peter Anvin
@ 2014-05-31 15:46               ` Nicolas Pitre
  2014-06-01 19:56                 ` Arnd Bergmann
  2014-06-01  0:39               ` Dave Chinner
  1 sibling, 1 reply; 124+ messages in thread
From: Nicolas Pitre @ 2014-05-31 15:46 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Dave Chinner, Arnd Bergmann, linux-kernel, linux-arch, joseph,
	john.stultz, hch, tglx, geert, lftan, linux-fsdevel, xfs

On Sat, 31 May 2014, H. Peter Anvin wrote:

> On 05/30/2014 10:54 PM, Dave Chinner wrote:
> > 
> > If we are changing the in-kernel timestamp to have a greater dynamic
> > range than anything we currently support on disk, then we need support
> > for all filesystems for similar translation and constraint. The
> > filesystems need to be able to tell the kernel what timestamp
> > range they support, and then the kernel needs to follow those
> > guidelines. And if the filesystem is mounted on a kernel that
> > doesn't support the current filesystem's timestamp format, then at
> > minimum that filesystem cannot do anything that writes a
> > timestamp....
> > 
> > Put simply: the filesystem defines the timestamp range that can be
> > used safely, not the userspace API. If the filesystem can't support
> > the date it is handed then that is an out-of-range error. Since
> > when have we accepted that it's OK to handle out-of-range data with
> > silent overflows or corruption of the data that we are attempting to
> > store? We're defining a new API to support a wider date range -
> > there is nothing that prevents us from saying ERANGE can be returned
> > to a timestamp that the file cannot store correctly....
> > 
> 
> I'm still puzzled.
> 
> Are you saying that you want a program that does:
> 
> 	/* Deliberately simplified */
> 	gettimeofdayns(&now ...);
> 	utimensat(... now);
> 
> ... to suddenly start failing on Jan 19, 2038 (for a filesystem with
> 32-bit timestamps), or would you propose some ways for the filesystems
> in question to extend the range of the timestamps?
> 
> What you seem to propose also seems to imply that on Jan 19, 2038
> anything that writes a timestamp with the current date (which logically
> ends up being almost every write operation) would be dead and frozen on
> such a filesystem -- pretty much meaning the filesystem would become
> readonly if not in reality then in practice.

For those (legacy) filesystems with signed 32-bit timestamps, any 
attempt to create a timestamp past Jan 19 03:14:06 2038 UTC should be 
(silently) clamped to 0x7fffffff and that value (the last representable 
time) used as an overflow indicator.  The filesystem driver should 
convert that value into a corresponding overflow value for whatever 
kernel internal time representation being used when read back, and this 
should be propagated up to user space.  It should not be a hard error 
otherwise, as you rightfully stated, everything non read-only would come 
to a halt on that day.

Inside the kernel, the overflow indicator could be as simple as 
dedicating one of the top bits in a 64-bit time_t value in order to still 
transmit the overflow limit.  For example, in the above case, we could 
use 0x40000000-7fffffff to indicate the actual time is unavailable due 
to the filesystem's time representation being overflowed from 
0x7fffffff.

If for example a filesystem cannot represent timestamps from Jan  1 
00:00:00 2100 UTC then the overflow representation for this particular 
filesystem would be 0x40000000-f48656ff.

Those syscalls with a 32-bit time_t would be returned 0x7fffffff 
whenever there is an overflow being signaled.  Whether 64-bit 
overflow-marked time_t values, when passed to user space, should clear 
the overflow bit, or use a unique time_t overflow value, could be 
decided and even changed later after discussion with glibc people for 
example.

Hard errors should be signaled to user space, and the actual operation 
aborted, only with the presence of a new flag passed to the kernel.  
However, by default, things should "just work" albeit with the "wrong" 
i.e. clamped time being saved on disk as much as possible otherwise.
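
A user-space sketch of the clamping and overflow marking described
above. All names are hypothetical, and the marker is assumed to be
bit 62 of the in-kernel seconds value, which shows up as 0x40000000 in
the upper 32-bit half.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical marker: bit 62 of the 64-bit seconds value. */
#define FS_TIME_OVERFLOW	(1LL << 62)

/* Write side: clamp to the last value the on-disk format can hold. */
static int64_t fs_time_clamp(int64_t sec, int64_t fs_max)
{
	return sec > fs_max ? fs_max : sec;
}

/* Read side: the limit value doubles as the overflow indicator, so
 * flag it while keeping the limit itself in the low bits. */
static int64_t fs_time_from_disk(int64_t disk_sec, int64_t fs_max)
{
	if (disk_sec == fs_max)
		return FS_TIME_OVERFLOW | fs_max;
	return disk_sec;
}

/* Syscalls with a 32-bit time_t report 0x7fffffff on overflow.
 * Values below the 32-bit range are not handled in this sketch. */
static int32_t fs_time_to_time32(int64_t sec)
{
	if (sec >= FS_TIME_OVERFLOW || sec > INT32_MAX)
		return INT32_MAX;
	return (int32_t)sec;
}
```

Testing for the marker with a plain comparison rather than a bit mask
keeps negative (pre-1970) times from being mistaken for overflows.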


Nicolas

* Re: [RFC 01/32] fs: introduce new 'struct inode_time'
  2014-05-31 14:54       ` Arnd Bergmann
@ 2014-05-31 16:15         ` Geert Uytterhoeven
  0 siblings, 0 replies; 124+ messages in thread
From: Geert Uytterhoeven @ 2014-05-31 16:15 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Andreas Schwab, linux-kernel, Linux-Arch, Joseph S. Myers,
	John Stultz, Christoph Hellwig, Thomas Gleixner, Ley Foon Tan,
	H. Peter Anvin, Linux FS Devel

On Sat, May 31, 2014 at 4:54 PM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Saturday 31 May 2014 10:39:02 Andreas Schwab wrote:
>> Geert Uytterhoeven <geert@linux-m68k.org> writes:
>>
>> > Hi Arnd,
>> >
>> > On Fri, May 30, 2014 at 10:01 PM, Arnd Bergmann <arnd@arndb.de> wrote:
>> >> + * The variant using bit fields is less efficient to access, but
>> >> + * small and has a wider range as the 32-bit one, plus it keeps
>> >> + * the signedness of the original timespec.
>> >> + */
>> >> +struct inode_time {
>> >> +       long long       tv_sec  : 34;
>> >> +       int             tv_nsec : 30;
>> >> +};
>> >
>> > Don't you need 31 bits for tv_nsec, to accommodate for the sign bit?
>> > I know you won't really store negative numbers there, but storing a large
>> > positive number will become negative on read out, won't it?
>>
>> Only if the int bitfield is signed.  Bitfields are weird, aren't they?

According to 6.7.2#5 (thanks for the reference), this is implementation defined.

> It was a mistake on my side, as I didn't know about that rule and
> meant to write 'unsigned int' really. Also, I always have a bad feeling

IC, but the comment said "plus it keeps the signedness".
So it doesn't keep the signedness for the tv_nsec field.

> about using bitfields in general.

Hehe...

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

* Re: [RFC 00/32] making inode time stamps y2038 ready
  2014-05-31 15:23   ` Arnd Bergmann
@ 2014-05-31 18:22     ` Richard Cochran
  2014-05-31 19:34       ` H. Peter Anvin
  2014-06-01  4:44     ` Richard Cochran
  1 sibling, 1 reply; 124+ messages in thread
From: Richard Cochran @ 2014-05-31 18:22 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-kernel, hch, linux-mtd, hpa, logfs, linux-afs, joseph,
	linux-arch, linux-cifs, linux-scsi, ceph-devel, codalist,
	cluster-devel, coda, geert, linux-ext4, fuse-devel,
	reiserfs-devel, xfs, john.stultz, tglx, linux-nfs,
	linux-ntfs-dev, samba-technical, linux-f2fs-devel, ocfs2-devel,
	linux-fsdevel, lftan, linux-btrfs

On Sat, May 31, 2014 at 05:23:02PM +0200, Arnd Bergmann wrote:
> 
> It's an approximation:

(Approximately never ;)

> with 64-bit timestamps, you can represent close to 300 billion
> years, which is way past the time that our planet can sustain
> life of any form[1].

Did you mean 64 bits worth of seconds?

  2^64 / (3600*24*365) = 584,942,417,355

That is more than 300 billion years, and still, it is not quite the
same as "never".

In any case, that term is not too helpful in the comparison table,
IMHO. One could think that some sort of clever running count relative
to the last mount time was implied.

Thanks,
Richard

[1] You are forgetting the immortal robotic overlords.

* Re: [RFC 00/32] making inode time stamps y2038 ready
  2014-05-31 18:22     ` Richard Cochran
@ 2014-05-31 19:34       ` H. Peter Anvin
  2014-06-01  4:46         ` Richard Cochran
  0 siblings, 1 reply; 124+ messages in thread
From: H. Peter Anvin @ 2014-05-31 19:34 UTC (permalink / raw)
  To: Richard Cochran, Arnd Bergmann
  Cc: linux-kernel, hch, linux-mtd, logfs, linux-afs, joseph,
	linux-arch, linux-cifs, linux-scsi, ceph-devel, codalist,
	cluster-devel, coda, geert, linux-ext4, fuse-devel,
	reiserfs-devel, xfs, john.stultz, tglx, linux-nfs,
	linux-ntfs-dev, samba-technical, linux-f2fs-devel, ocfs2-devel,
	linux-fsdevel, lftan, linux-btrfs

Typically they are using 64-bit signed seconds.

On May 31, 2014 11:22:37 AM PDT, Richard Cochran <richardcochran@gmail.com> wrote:
>On Sat, May 31, 2014 at 05:23:02PM +0200, Arnd Bergmann wrote:
>> 
>> It's an approximation:
>
>(Approximately never ;)
>
>> with 64-bit timestamps, you can represent close to 300 billion
>> years, which is way past the time that our planet can sustain
>> life of any form[1].
>
>Did you mean 64 bits worth of seconds?
>
>  2^64 / (3600*24*365) = 584,942,417,355
>
>That is more than 300 billion years, and still, it is not quite the
>same as "never".
>
>In any case, that term is not too helpful in the comparison table,
>IMHO. One could think that some sort of clever running count relative
>to the last mount time was implied.
>
>Thanks,
>Richard
>
>[1] You are forgetting the immortal robotic overlords.

-- 
Sent from my mobile phone.  Please pardon brevity and lack of formatting.

* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-05-31 15:37         ` Arnd Bergmann
@ 2014-06-01  0:24           ` Dave Chinner
  2014-06-02  0:28             ` Dave Chinner
  0 siblings, 1 reply; 124+ messages in thread
From: Dave Chinner @ 2014-06-01  0:24 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: H. Peter Anvin, linux-kernel, linux-arch, joseph, john.stultz,
	hch, tglx, geert, lftan, linux-fsdevel, xfs

On Sat, May 31, 2014 at 05:37:52PM +0200, Arnd Bergmann wrote:
> On Saturday 31 May 2014 11:14:50 Dave Chinner wrote:
> > On Fri, May 30, 2014 at 05:41:14PM -0700, H. Peter Anvin wrote:
> > > On 05/30/2014 05:37 PM, Dave Chinner wrote:
> > > > 
> > > > IOWs, the filesystem has to be able to reject any attempt to set a
> > > > > timestamp that it can't represent on disk otherwise Bad Stuff will
> > > > happen,
> > > 
> > > Actually it is questionable if it is worse to reject a timestamp or just
> > > let it wrap.  Rejecting a valid timestamp is a bit like "You don't
> > > exist, go away."
> > 
> > I think having the new system calls being able to
> > return EINVAL if the value cannot be stored permanently on disk
> > correctly is the right thing to do. Having it silently mangled
> > by the filesystem and returning "everything is just fine, trust me"
> > is close to the worst solution I can think of. That's exactly what
> > leads to overflow bugs occurring....
> 
> While going through the file systems, I was wondering whether
> we should have the times stop at the end of each file system's
> epoch rather than wrap around.
> 
> > > > and filesystems have to be able to specify in their on
> > > > disk format what timestamp encoding is being used. The solution will
> > > > be different for every filesystem that needs to support time beyond
> > > > 2038.
> > > 
> > > Actually the cutoff can be really different for each filesystem, not
> > > necessarily 2038.  However, I maintain the above still holds.
> > 
> > Sure, but all filesystems are supposed to handle at least the
> > current unix epoch.
> 
> In my list at http://kernelnewbies.org/y2038, I found that almost
> all file systems support times until at least 2106, because they treat
> the on-disk value as unsigned on 64-bit systems, or they use
> a completely different representation. My guess is that somebody
> earlier spent a lot of work on making that happen.
> 
> The exceptions are:
> 
> * exofs uses signed values, which can probably be changed to be
>   consistent with the others.
> * isofs has a bug that limits it until 2027 on architectures with
>   a signed 'char' type (otherwise it's 2155).
> * udf can represent times for many thousands of years through a
>   16-bit year representation, but the code to convert to epoch
>   uses a const array that ends at 2038.
> * afs uses signed seconds and can probably be fixed
> * coda relies on user space time representation getting passed
>   through an ioctl.
> * I miscategorized xfs/ext2/ext3 as having unsigned 32-bit seconds,
>   where they really use signed.
> 
> I was confused about XFS since I didn't notice that there are
> separate xfs_ictimestamp_t and xfs_timestamp_t types, so I expected
> XFS to also use the 1970-2106 time range on 64-bit systems today.

You've missed an awful lot more than just the implications for the
core kernel code.

There's a good chance such changes propagate to APIs elsewhere in
the filesystems, because something you haven't realised is that XFS
effectively exposes the on-disk timestamp format directly to
userspace via the bulkstat interface (see struct xfs_bstat). It also
affects the XFS open-by-handle ioctl and the swap extent ioctl used
by the online defragmenter.

IOWs, if we are changing the on-disk timestamp format then this
affects several ioctl()s and hence quite a few of the XFS userspace
utilities. The hardest to fix will be xfsdump which would need a new
dump format to store the extended timestamp ranges, and then
xfs_restore will need to be able to handle restoring such timestamps
on filesystems that don't have extended timestamp support...

Put simply, changing the structure of system time isn't as
straightforward as changing the kernel structures. System time gets
stored permanently, and that has a cascade effect through the kernel
all the way to all of the filesystem utilities that know about that
permanent storage in some way....

So yes, you can change the kernel definition, but until the
permanent storage of system time can be extended to support the same
range as the kernel the *system* will still have nasty, silent epoch
overflow, truncation or corruption issues.

> If we are using the variant of my patch that extends
> inode_time->tv_sec to s64, nothing should change for XFS
> at all; the main difference is that if it gets extended
> to wider on-disk timestamps, they will work the same way on
> 32-bit and 64-bit kernels.

"nothing should change" except for the fact that a 64 bit timestamp
gets silently truncated to 32 bits and the timestamp is not what the
user expects it to be. The user does not find out until the inode
passes out of cache and is re-read from disk, and then it's wrong.

To put it politely: that is broken, obnoxious behaviour and we don't
design new interfaces with such ugly warts anymore. Define an
EOVERFLOW, EINVAL or ERANGE error in the new syscalls to handle this
case and *hard fail* if the storage cannot support the extended
timestamp being passed in. There is no excuse for silently mangling
out-of-range data, especially as we have plenty of time to add
support to the filesystems so that such errors don't occur. It might
take us a year to implement, but it will be done long before the
epoch overflows.
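
As a rough illustration of the hard-fail behaviour, a range check of
this kind could sit in front of any code that stores a timestamp. The
struct and function below are hypothetical and written as user-space C;
they are not an existing VFS interface.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical per-filesystem limits on storable timestamps. */
struct fs_time_range {
	int64_t min_sec;
	int64_t max_sec;
};

/* Return 0 if the filesystem can store 'sec', -ERANGE otherwise. */
static int fs_check_timestamp(const struct fs_time_range *r, int64_t sec)
{
	if (sec < r->min_sec || sec > r->max_sec)
		return -ERANGE;
	return 0;
}
```

A filesystem with signed 32-bit on-disk seconds would advertise
INT32_MIN and INT32_MAX as its limits, and the new syscalls would pass
the error straight back to the caller.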

And, FWIW, this patchset needs a set of regression tests that ensure
timestamps beyond 2038 and 2106 don't change across unmount/mount.
Written for xfstests, preferably, so that it's run as part of every
filesystem developer's daily workflow. This is the only way we are
going to ensure that the filesystem and VFS code works correctly and
continues to work correctly up to the end of the current epoch....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-05-31  8:41             ` H. Peter Anvin
  2014-05-31 15:46               ` Nicolas Pitre
@ 2014-06-01  0:39               ` Dave Chinner
  1 sibling, 0 replies; 124+ messages in thread
From: Dave Chinner @ 2014-06-01  0:39 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Arnd Bergmann, linux-kernel, linux-arch, joseph, john.stultz,
	hch, tglx, geert, lftan, linux-fsdevel, xfs

On Sat, May 31, 2014 at 01:41:56AM -0700, H. Peter Anvin wrote:
> On 05/30/2014 10:54 PM, Dave Chinner wrote:
> > 
> > If we are changing the in-kernel timestamp to have a greater dynamic
> > range than anything we currently support on disk, then we need support
> > for all filesystems for similar translation and constraint. The
> > filesystems need to be able to tell the kernel what timestamp
> > range they support, and then the kernel needs to follow those
> > guidelines. And if the filesystem is mounted on a kernel that
> > doesn't support the current filesystem's timestamp format, then at
> > minimum that filesystem cannot do anything that writes a
> > timestamp....
> > 
> > Put simply: the filesystem defines the timestamp range that can be
> > used safely, not the userspace API. If the filesystem can't support
> > the date it is handed then that is an out-of-range error. Since
> > when have we accepted that it's OK to handle out-of-range data with
> > silent overflows or corruption of the data that we are attempting to
> > store? We're defining a new API to support a wider date range -
> > there is nothing that prevents us from saying ERANGE can be returned
> > to a timestamp that the file cannot store correctly....
> > 
> 
> I'm still puzzled.
> 
> Are you saying that you want a program that does:
> 
> 	/* Deliberately simplified */
> 	gettimeofdayns(&now ...);
> 	utimensat(... now);
> 
> ... to suddenly start failing on Jan 19, 2038 (for a filesystem with
> 32-bit timestamps),

Yes. Hard fail so overflows are in your face and we know exactly
what is going to cause silent timestamp screwups when the epoch
overflows.

> or would you propose some ways for the filesystems
> in question to extend the range of the timestamps?

Filesystems are going to have to change their on-disk formats, so
we'd do that just like we do every other on-disk format change. With
feature bits and translation layers, new ioctl structures, etc.
Depending on the amount of work necessary, some filesystems could do
this in 3.16, others it might be 3.20 before everything is sorted
out across the kernel and userspace code...

Either way, the hard fail problem goes away as each filesystem is
converted. Further, if we have regression tests then new filesystems
are guaranteed to be designed to handle 2038 epoch rollover, and so
in a year or two this "hard fail" is effectively a non-problem. If
someone breaks something in future, then we'll know about it pretty
quickly.

> What you seem to propose also seems to imply that on Jan 19, 2038
> anything that writes a timestamp with the current date (which logically
> ends up being almost every write operation) would be dead and frozen on
> such a filesystem -- pretty much meaning the filesystem would become
> readonly if not in reality then in practice.

Yup. If we can't do what the user wants without the user thinking
corruption has occurred, then the only thing we are left with is
"shut down the filesystem" error handling. Kind of like using BUG()
rather than returning an error. That's why we need to be able to
hard fail and return an error.

However, we've got 20+ years to fix our current filesystems and all
their support code to ensure this doesn't happen. In the mean time,
having stuff hard fail is a great way to ensure that filesystems get
fixed sooner rather than later...

> I strongly suspect that that would be a more catastrophic failure than
> incorrect timestamps, as you suddenly have all kinds of machines
> embedded in $DEITY knows what places just stop and refuse to run.

Yup, that's a great way of flushing out problems 20 years before
they really matter.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [RFC 00/32] making inode time stamps y2038 ready
  2014-05-31 15:23   ` Arnd Bergmann
  2014-05-31 18:22     ` Richard Cochran
@ 2014-06-01  4:44     ` Richard Cochran
  1 sibling, 0 replies; 124+ messages in thread
From: Richard Cochran @ 2014-06-01  4:44 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-kernel, hch, linux-mtd, hpa, logfs, linux-afs, joseph,
	linux-arch, linux-cifs, linux-scsi, ceph-devel, codalist,
	cluster-devel, coda, geert, linux-ext4, fuse-devel,
	reiserfs-devel, xfs, john.stultz, tglx, linux-nfs,
	linux-ntfs-dev, samba-technical, linux-f2fs-devel, ocfs2-devel,
	linux-fsdevel, lftan, linux-btrfs

On Sat, May 31, 2014 at 05:23:02PM +0200, Arnd Bergmann wrote:
> On Saturday 31 May 2014 16:51:15 Richard Cochran wrote:
> > 
> > Why are some of the time stamp expiration dates marked as "never"?
> 
> It's an approximation:

Also, the term "never" might mean using arbitrarily long integers
as in ASN.1.

Thanks,
Richard


* Re: [RFC 00/32] making inode time stamps y2038 ready
  2014-05-31 19:34       ` H. Peter Anvin
@ 2014-06-01  4:46         ` Richard Cochran
  0 siblings, 0 replies; 124+ messages in thread
From: Richard Cochran @ 2014-06-01  4:46 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Arnd Bergmann, linux-kernel, hch, linux-mtd, logfs, linux-afs,
	joseph, linux-arch, linux-cifs, linux-scsi, ceph-devel, codalist,
	cluster-devel, coda, geert, linux-ext4, fuse-devel,
	reiserfs-devel, xfs, john.stultz, tglx, linux-nfs,
	linux-ntfs-dev, samba-technical, linux-f2fs-devel, ocfs2-devel,
	linux-fsdevel, lftan, linux-btrfs

On Sat, May 31, 2014 at 12:34:12PM -0700, H. Peter Anvin wrote:
> Typically they are using 64-bit signed seconds.

Okay, that is what I wanted to know.

Thanks,
Richard


* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-05-31 15:46               ` Nicolas Pitre
@ 2014-06-01 19:56                 ` Arnd Bergmann
  2014-06-01 20:26                   ` H. Peter Anvin
  2014-06-02  1:36                   ` Nicolas Pitre
  0 siblings, 2 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-06-01 19:56 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: H. Peter Anvin, Dave Chinner, linux-kernel, linux-arch, joseph,
	john.stultz, hch, tglx, geert, lftan, linux-fsdevel, xfs

On Saturday 31 May 2014 11:46:16 Nicolas Pitre wrote:
> > readonly if not in reality then in practice.
> 
> For those (legacy) filesystems with signed 32-bit timestamps, any
> attempt to create a timestamp past Jan 19 03:14:07 2038 UTC should be
> (silently) clamped to 0x7fffffff and that value (the last representable 
> time) used as an overflow indicator.  The filesystem driver should 
> convert that value into a corresponding overflow value for whatever 
> kernel internal time representation being used when read back, and this 
> should be propagated up to user space.  It should not be a hard error 
> otherwise, as you rightfully stated, everything non read-only would come 
> to a halt on that day.

I don't think there is much of a difference between not being able to
write at all and all newly written files having the same timestamp,
causing random things to break differently.

The clamp to the maximum supported time stamp sounds like a reasonable
choice for 'utimens' and related syscalls for the case of someone
setting an arbitrary future date beyond what the file system can
represent. Then again, I don't see a reason why that shouldn't just
cause an error to be returned.
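
To make the clamping variant concrete, here is a minimal sketch of the
behaviour described in the quoted proposal, assuming a format with
signed 32-bit seconds. disk_time_encode() and disk_time_decode() are
hypothetical helpers, and clamping at the lower bound is an extra
assumption that the proposal does not mention:

```c
#include <stdbool.h>
#include <stdint.h>

#define DISK_TIME_MAX	INT32_MAX	/* 0x7fffffff, the last storable second */

/* Writing: anything later than the last storable second is clamped. */
static int32_t disk_time_encode(int64_t sec)
{
	if (sec > DISK_TIME_MAX)
		return DISK_TIME_MAX;
	if (sec < INT32_MIN)		/* assumption: clamp the low end too */
		return INT32_MIN;
	return (int32_t)sec;
}

/*
 * Reading: the maximum value doubles as the overflow indicator, so a
 * genuine timestamp of exactly 0x7fffffff is also reported as overflowed.
 */
static int64_t disk_time_decode(int32_t disk, bool *overflowed)
{
	*overflowed = (disk == DISK_TIME_MAX);
	return disk;
}
```

The sketch also shows the cost: every time past the limit reads back as
the same value, which is the "all newly written files having the same
timestamp" case mentioned above.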

For actually running kernels beyond 2038, the best idea I've seen so
far is to disallow all broken code at compile time. I don't see
a choice but to audit the entire kernel for invalid uses on both
32 and 64 bit in the next few years. A lot of code will get changed
in the process so we can actually keep running 32-bit kernels and
file systems, but other code will likely go away:

* any system calls that pass a time_t, timeval or timespec on
  32-bit systems return -ENOSYS, to ensure all user land uses
  the replacements we will put into place
* The definition of 'time_t', 'timeval' and 'timespec' can be hidden
  from the kernel, and all code using it left out.
* ext2 and ext3 file system code will have to be disabled, but that's
  fine since ext4 can mount old file systems.
* until xfs gets extended, we can also disable it at build time.

For most users, we probably want to leave all that enabled by
default until we get much closer to 2038, but a compile time
option should allow us to test what works or doesn't, and it
can be set by embedded developers that want to ensure their
code keeps running for the next few decades.

	Arnd




* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-06-01 19:56                 ` Arnd Bergmann
@ 2014-06-01 20:26                   ` H. Peter Anvin
  2014-06-02 11:02                     ` Arnd Bergmann
  2014-06-02  1:36                   ` Nicolas Pitre
  1 sibling, 1 reply; 124+ messages in thread
From: H. Peter Anvin @ 2014-06-01 20:26 UTC (permalink / raw)
  To: Arnd Bergmann, Nicolas Pitre
  Cc: Dave Chinner, linux-kernel, linux-arch, joseph, john.stultz, hch,
	tglx, geert, lftan, linux-fsdevel, xfs

Perhaps we should make this a kernel command line option instead, with
the settings: error out on dates outside the standard window, or a date
indicating the earliest date that should be recognized and do windowing
(0 for no windowing, 1970 for retconning the Unix epoch as unsigned...)
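
One way to read the windowing idea, as a sketch under assumptions: the
on-disk field stays 32 bits wide, and the option supplies the first
second of a 2^32-second window into which stored values are mapped.
window_decode() is a hypothetical helper, and the window start is given
here in seconds since 1970 for simplicity, whereas the option described
above takes a date:

```c
#include <stdint.h>

/*
 * Map a 32-bit on-disk seconds value to the unique 64-bit time in
 * [window_start, window_start + 2^32) that has the same low 32 bits.
 */
static int64_t window_decode(uint32_t disk, int64_t window_start)
{
	/* distance from the window start, computed modulo 2^32 */
	uint32_t offset = disk - (uint32_t)window_start;

	return window_start + (int64_t)offset;
}
```

With a window start of 0 the stored values read as unsigned (1970 to
2106); with a window start of INT32_MIN they read as the traditional
signed range (1901 to 2038).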

But again, the kernel is probably the least problem here...

On June 1, 2014 12:56:52 PM PDT, Arnd Bergmann <arnd@arndb.de> wrote:
>On Saturday 31 May 2014 11:46:16 Nicolas Pitre wrote:
>> > readonly if not in reality then in practice.
>>
>> For those (legacy) filesystems with signed 32-bit timestamps, any
>> attempt to create a timestamp past Jan 19 03:14:07 2038 UTC should be
>> (silently) clamped to 0x7fffffff and that value (the last representable
>> time) used as an overflow indicator.  The filesystem driver should
>> convert that value into a corresponding overflow value for whatever
>> kernel internal time representation being used when read back, and this
>> should be propagated up to user space.  It should not be a hard error
>> otherwise, as you rightfully stated, everything non read-only would come
>> to a halt on that day.
>
>I don't think there is much of a difference between not being able to
>write at all and all newly written files having the same timestamp,
>causing random things to break differently.
>
>The clamp to the maximum supported time stamp sounds like a reasonable
>choice for 'utimens' and related syscalls for the case of someone
>setting an arbitrary future date beyond what the file system can
>represent. Then again, I don't see a reason why that shouldn't just
>cause an error to be returned.
>
>For actually running kernels beyond 2038, the best idea I've seen so
>far is to disallow all broken code at compile time. I don't see
>a choice but to audit the entire kernel for invalid uses on both
>32 and 64 bit in the next few years. A lot of code will get changed
>in the process so we can actually keep running 32-bit kernels and
>file systems, but other code will likely go away:
>
>* any system calls that pass a time_t, timeval or timespec on
>  32-bit systems return -ENOSYS, to ensure all user land uses
>  the replacements we will put into place
>* The definition of 'time_t', 'timeval' and 'timespec' can be hidden
>  from the kernel, and all code using it left out.
>* ext2 and ext3 file system code will have to be disabled, but that's
>  fine since ext4 can mount old file systems.
>* until xfs gets extended, we can also disable it at build time.
>
>For most users, we probably want to leave all that enabled by
>default until we get much closer to 2038, but a compile time
>option should allow us to test what works or doesn't, and it
>can be set by embedded developers that want to ensure their
>code keeps running for the next few decades.
>
>	Arnd

-- 
Sent from my mobile phone.  Please pardon brevity and lack of formatting.


* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-06-01  0:24           ` Dave Chinner
@ 2014-06-02  0:28             ` Dave Chinner
  2014-06-02 11:35               ` Roger Willcocks
  2014-06-02 11:43               ` Arnd Bergmann
  0 siblings, 2 replies; 124+ messages in thread
From: Dave Chinner @ 2014-06-02  0:28 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: H. Peter Anvin, linux-kernel, linux-arch, joseph, john.stultz,
	hch, tglx, geert, lftan, linux-fsdevel, xfs

On Sun, Jun 01, 2014 at 10:24:37AM +1000, Dave Chinner wrote:
> On Sat, May 31, 2014 at 05:37:52PM +0200, Arnd Bergmann wrote:
> > In my list at http://kernelnewbies.org/y2038, I found that almost
> > all file systems at least handle times until 2106, because they treat
> > the on-disk value as unsigned on 64-bit systems, or they use
> > a completely different representation. My guess is that somebody
> > earlier spent a lot of work on making that happen.
> > 
> > The exceptions are:
> > 
> > * exofs uses signed values, which can probably be changed to be
> >   consistent with the others.
> > * isofs has a bug that limits it until 2027 on architectures with
> >   a signed 'char' type (otherwise it's 2155).
> > * udf can represent times for many thousands of years through a
> >   16-bit year representation, but the code to convert to epoch
> >   uses a const array that ends at 2038.
> > * afs uses signed seconds and can probably be fixed
> > * coda relies on user space time representation getting passed
> >   through an ioctl.
> > * I miscategorized xfs/ext2/ext3 as having unsigned 32-bit seconds,
> >   where they really use signed.
> > 
> > I was confused about XFS since I didn't notice that there are
> > separate xfs_ictimestamp_t and xfs_timestamp_t types, so I expected
> > XFS to also use the 1970-2106 time range on 64-bit systems today.
> 
> You've missed an awful lot more than just the implications for the
> core kernel code.
> 
> There's a good chance such changes propagate to APIs elsewhere in
> the filesystems, because something you haven't realised is that XFS
> effectively exposes the on-disk timestamp format directly to
> userspace via the bulkstat interface (see struct xfs_bstat). It also
> affects the XFS open-by-handle ioctl and the swap extent ioctl used
> by the online defragmenter.
> 
> IOWs, if we are changing the on-disk timestamp format then this
> affects several ioctl()s and hence quite a few of the XFS userspace
> utilities. The hardest to fix will be xfsdump which would need a new
> dump format to store the extended timestamp ranges, and then
> xfs_restore will need to be able to handle restoring such timestamps
> on filesystems that don't have extended timestamp support...
> 
> Put simply, changing the structure of system time isn't as straight
> forward as changing the kernel structures. System time gets stored
> > permanently, and that has a cascade effect through the kernel all the
> > way to all of the filesystem utilities that know about that permanent
> storage in some way....
> 
> So yes, you can change the kernel definition, but until the
> permanent storage of system time can be extended to support the same
> range as the kernel the *system* will still have nasty, silent epoch
> overflow, truncation or corruption issues.

Just to put that in context, here's the kernel patch to add extended
epoch support to XFS. It's completely untested as I haven't done any
userspace code changes to enable the feature. However, it should
give you an indication of how far the simple act of changing the
kernel time representation spread through the filesystem. This does
not include any of the VFS infrastructure for specifying the range of
supported timestamps.  It survives some smoke testing, but dies when
the online defragmenter starts using the bulkstat and swap extent
ioctls (the assert in xfs_inode_time_from_epoch() fires), so I
probably don't have that all sorted correctly yet...
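
To make the idea concrete, here is a sketch of what a non-zero epoch
conversion could look like. This is an assumption, not what the patch
below implements (the patch still asserts epoch == 0): it treats each
epoch as 2^31 seconds, which is what the "255 epochs, around 19,000
A.D." estimate suggests, and it keeps epoch 0 identical to the existing
signed 32-bit format. struct disk_time, time_to_disk() and
time_from_disk() are made-up names for the example:

```c
#include <stdint.h>

#define EPOCH_SHIFT	31	/* assumed: one epoch spans 2^31 seconds */

/* Hypothetical split representation: 8-bit epoch plus 32-bit seconds. */
struct disk_time {
	uint8_t	epoch;
	int32_t	sec;
};

/* Returns 0 on success, -1 if 't' is outside the representable range. */
static int time_to_disk(int64_t t, struct disk_time *out)
{
	if (t < INT32_MIN || t >= ((int64_t)256 << EPOCH_SHIFT))
		return -1;
	if (t < 0) {
		/* pre-1970 times only exist in epoch 0, as today */
		out->epoch = 0;
		out->sec = (int32_t)t;
		return 0;
	}
	out->epoch = (uint8_t)(t >> EPOCH_SHIFT);
	out->sec = (int32_t)(t & INT32_MAX);
	return 0;
}

static int64_t time_from_disk(struct disk_time d)
{
	return ((int64_t)d.epoch << EPOCH_SHIFT) + d.sec;
}
```

Under this mapping every timestamp that is valid today has epoch 0 and
an unchanged seconds field, so inodes written without the feature bit
decode to the same value as before.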

To test extended epoch support, however, I need some fstests that
define and validate the behaviour of the new syscalls - until we get
those we can't validate that the filesystem follows the spec
properly. I also suspect we are going to need an interface to query
the supported range of timestamps from a filesystem so that we can
test boundary conditions in an automated fashion....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

xfs: support timestamps beyond Unix epochs

From: Dave Chinner <dchinner@redhat.com>

The 32 bit second counters in timestamps are too small to represent
time beyond the unix epoch (jan 2038) correctly. Extend the on-disk
format for a timestamp to include an 8-bit epoch counter so that we
can extend time for up to 255 Unix epochs. This should be good for
representing timestamps from 1970 to somewhere around 19,000 A.D....

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/time.h            |  7 ------
 fs/xfs/xfs_bmap_util.c   | 35 +++++++++++++++++-----------
 fs/xfs/xfs_dinode.h      | 48 ++++++++++++++++++++++++++++++++++++++-
 fs/xfs/xfs_fs.h          |  9 +++++++-
 fs/xfs/xfs_fsops.c       |  5 +++-
 fs/xfs/xfs_inode.c       | 16 ++++++++++---
 fs/xfs/xfs_inode_buf.c   |  8 +++++++
 fs/xfs/xfs_ioctl32.c     |  3 +++
 fs/xfs/xfs_ioctl32.h     |  5 +++-
 fs/xfs/xfs_iops.c        | 59 +++++++++++++++++++++++++++++++-----------------
 fs/xfs/xfs_itable.c      | 12 ++++++++++
 fs/xfs/xfs_log_format.h  |  4 ++++
 fs/xfs/xfs_sb.h          | 12 +++++++++-
 fs/xfs/xfs_trans_inode.c |  2 +-
 14 files changed, 175 insertions(+), 50 deletions(-)

diff --git a/fs/xfs/time.h b/fs/xfs/time.h
index 387e695..9f38d60 100644
--- a/fs/xfs/time.h
+++ b/fs/xfs/time.h
@@ -21,16 +21,9 @@
 #include <linux/sched.h>
 #include <linux/time.h>
 
-typedef struct timespec timespec_t;
-
 static inline void delay(long ticks)
 {
 	schedule_timeout_uninterruptible(ticks);
 }
 
-static inline void nanotime(struct timespec *tvp)
-{
-	*tvp = CURRENT_TIME;
-}
-
 #endif /* __XFS_SUPPORT_TIME_H__ */
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 703b3ec..dbc9a74 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -1686,6 +1686,7 @@ xfs_swap_extents(
 	int		aforkblks = 0;
 	int		taforkblks = 0;
 	__uint64_t	tmp;
+	struct timespec tv;
 
 	tempifp = kmem_alloc(sizeof(xfs_ifork_t), KM_MAYFAIL);
 	if (!tempifp) {
@@ -1746,25 +1747,33 @@ xfs_swap_extents(
 	}
 
 	/*
-	 * Compare the current change & modify times with that
-	 * passed in.  If they differ, we abort this swap.
-	 * This is the mechanism used to ensure the calling
-	 * process that the file was not changed out from
+	 * Compare the current change & modify times with that passed in.  If
+	 * they differ, we abort this swap.  This is the mechanism used to
+	 * ensure the calling process that the file was not changed out from
 	 * under it.
 	 */
-	if ((sbp->bs_ctime.tv_sec != VFS_I(ip)->i_ctime.tv_sec) ||
-	    (sbp->bs_ctime.tv_nsec != VFS_I(ip)->i_ctime.tv_nsec) ||
-	    (sbp->bs_mtime.tv_sec != VFS_I(ip)->i_mtime.tv_sec) ||
-	    (sbp->bs_mtime.tv_nsec != VFS_I(ip)->i_mtime.tv_nsec)) {
+	tv.tv_sec = xfs_inode_time_from_epoch(sbp->bs_ctime.tv_sec,
+						sbp->bs_ctime_epoch);
+	tv.tv_nsec = sbp->bs_ctime.tv_nsec;
+	if (timespec_compare(&tv, &VFS_I(ip)->i_ctime)) {
 		error = XFS_ERROR(EBUSY);
 		goto out_unlock;
 	}
 
-	/* We need to fail if the file is memory mapped.  Once we have tossed
-	 * all existing pages, the page fault will have no option
-	 * but to go to the filesystem for pages. By making the page fault call
-	 * vop_read (or write in the case of autogrow) they block on the iolock
-	 * until we have switched the extents.
+	tv.tv_sec = xfs_inode_time_from_epoch(sbp->bs_mtime.tv_sec,
+						sbp->bs_mtime_epoch);
+	tv.tv_nsec = sbp->bs_mtime.tv_nsec;
+	if (timespec_compare(&tv, &VFS_I(ip)->i_mtime)) {
+		error = XFS_ERROR(EBUSY);
+		goto out_unlock;
+	}
+
+	/*
+	 * We need to fail if the file is memory mapped.  Once we have tossed
+	 * all existing pages, the page fault will have no option but to go to
+	 * the filesystem for pages. By making the page fault call vop_read (or
+	 * write in the case of autogrow) they block on the iolock until we have
+	 * switched the extents.
 	 */
 	if (VN_MAPPED(VFS_I(ip))) {
 		error = XFS_ERROR(EBUSY);
diff --git a/fs/xfs/xfs_dinode.h b/fs/xfs/xfs_dinode.h
index 623bbe8..79f94722 100644
--- a/fs/xfs/xfs_dinode.h
+++ b/fs/xfs/xfs_dinode.h
@@ -21,11 +21,53 @@
 #define	XFS_DINODE_MAGIC		0x494e	/* 'IN' */
 #define XFS_DINODE_GOOD_VERSION(v)	((v) >= 1 && (v) <= 3)
 
+/*
+ * Inode timestamps get more complex when we consider supporting times beyond
+ * the standard unix epoch of Jan 2038. The struct xfs_timestamp cannot support
+ * more than a single extension by playing sign games, and that is still not
+ * reliable. We also can't extend the timestamp structure because there is no
+ * free space around them in the on-disk inode.
+ *
+ * Hence the simplest thing to do is to add an epoch counter for each timestamp
+ * in the inode. This can be a single byte for each timestamp and make use of
+ * a hole we currently pad. This gives us another 255 epochs range for the
+ * timestamps, but requires a superblock feature bit to indicate that these
+ * fields have meaning and can be non-zero.
+ *
+ * Provide wrapper functions for converting the kernel inode time format to
+ * the on-disk fields. The nanosecond counter is unlikely to change in future,
+ * so it's mostly just for the second+epoch counter conversion.
+ */
 typedef struct xfs_timestamp {
 	__be32		t_sec;		/* timestamp seconds */
 	__be32		t_nsec;		/* timestamp nanoseconds */
 } xfs_timestamp_t;
 
+static inline __uint8_t
+xfs_timestamp_epoch(
+	struct timespec		*time)
+{
+	/* will be zero until the extended struct inode_time is introduced */
+	return 0;
+}
+
+static inline __int32_t
+xfs_timestamp_sec(
+	struct timespec		*time)
+{
+	return time->tv_sec;
+}
+
+static inline __kernel_time_t
+xfs_inode_time_from_epoch(
+	__uint8_t	epoch,
+	__int32_t	seconds)
+{
+	/* need to handle non-zero epoch when struct inode_time is introduced */
+	ASSERT(epoch == 0);
+	return seconds;
+}
+
 /*
  * On-disk inode structure.
  *
@@ -54,7 +96,11 @@ typedef struct xfs_dinode {
 	__be32		di_nlink;	/* number of links to file */
 	__be16		di_projid_lo;	/* lower part of owner's project id */
 	__be16		di_projid_hi;	/* higher part owner's project id */
-	__u8		di_pad[6];	/* unused, zeroed space */
+	__u8		di_atime_epoch;	/* access time epoch */
+	__u8		di_mtime_epoch;	/* modify time epoch */
+	__u8		di_ctime_epoch;	/* change time epoch */
+	__u8		di_crtime_epoch;/* create time epoch */
+	__u8		di_pad[2];	/* unused, zeroed space */
 	__be16		di_flushiter;	/* incremented on flush */
 	xfs_timestamp_t	di_atime;	/* time last accessed */
 	xfs_timestamp_t	di_mtime;	/* time last modified */
diff --git a/fs/xfs/xfs_fs.h b/fs/xfs/xfs_fs.h
index d34703d..fb0a0ea 100644
--- a/fs/xfs/xfs_fs.h
+++ b/fs/xfs/xfs_fs.h
@@ -239,6 +239,7 @@ typedef struct xfs_fsop_resblks {
 #define XFS_FSOP_GEOM_FLAGS_V5SB	0x8000	/* version 5 superblock */
 #define XFS_FSOP_GEOM_FLAGS_FTYPE	0x10000	/* inode directory types */
 #define XFS_FSOP_GEOM_FLAGS_FINOBT	0x20000	/* free inode btree */
+#define XFS_FSOP_GEOM_FLAGS_EPOCH	0x40000	/* timestamp epochs */
 
 /*
  * Minimum and maximum sizes need for growth checks.
@@ -280,6 +281,9 @@ typedef struct xfs_growfs_rt {
 
 /*
  * Structures returned from ioctl XFS_IOC_FSBULKSTAT & XFS_IOC_FSBULKSTAT_SINGLE
+ *
+ * Time epoch structures are only used if the XFS_FSOP_GEOM_FLAGS_EPOCH flag is
+ * asserted in the geometry output.
  */
 typedef struct xfs_bstime {
 	time_t		tv_sec;		/* seconds		*/
@@ -307,7 +311,10 @@ typedef struct xfs_bstat {
 #define	bs_projid	bs_projid_lo	/* (previously just bs_projid)	*/
 	__u16		bs_forkoff;	/* inode fork offset in bytes	*/
 	__u16		bs_projid_hi;	/* higher part of project id	*/
-	unsigned char	bs_pad[10];	/* pad space, unused		*/
+	__u8		bs_atime_epoch;	/* access time epoch */
+	__u8		bs_mtime_epoch;	/* modify time epoch */
+	__u8		bs_ctime_epoch;	/* change time epoch */
+	unsigned char	bs_pad[7];	/* pad space, unused		*/
 	__u32		bs_dmevmask;	/* DMIG event mask		*/
 	__u16		bs_dmstate;	/* DMIG state info		*/
 	__u16		bs_aextents;	/* attribute number of extents	*/
diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
index d229556..7b8db57 100644
--- a/fs/xfs/xfs_fsops.c
+++ b/fs/xfs/xfs_fsops.c
@@ -103,7 +103,10 @@ xfs_fs_geometry(
 			(xfs_sb_version_hasftype(&mp->m_sb) ?
 				XFS_FSOP_GEOM_FLAGS_FTYPE : 0) |
 			(xfs_sb_version_hasfinobt(&mp->m_sb) ?
-				XFS_FSOP_GEOM_FLAGS_FINOBT : 0);
+				XFS_FSOP_GEOM_FLAGS_FINOBT : 0) |
+			(xfs_sb_version_hasepoch(&mp->m_sb) ?
+				XFS_FSOP_GEOM_FLAGS_EPOCH : 0);
+
 		geo->logsectsize = xfs_sb_version_hassector(&mp->m_sb) ?
 				mp->m_sb.sb_logsectsize : BBSIZE;
 		geo->rtsectsize = mp->m_sb.sb_blocksize;
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index a6115fe..eecae93 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -654,7 +654,8 @@ xfs_ialloc(
 	xfs_inode_t	*ip;
 	uint		flags;
 	int		error;
-	timespec_t	tv;
+	struct timespec	tv;
+	bool		has_epoch;
 
 	/*
 	 * Call the space management code to pick
@@ -720,12 +721,19 @@ xfs_ialloc(
 	ip->i_d.di_nextents = 0;
 	ASSERT(ip->i_d.di_nblocks == 0);
 
-	nanotime(&tv);
-	ip->i_d.di_mtime.t_sec = (__int32_t)tv.tv_sec;
+	has_epoch = xfs_sb_version_hasepoch(&mp->m_sb);
+	tv = current_fs_time(mp->m_super);
+	ip->i_d.di_mtime.t_sec = xfs_timestamp_sec(&tv);
 	ip->i_d.di_mtime.t_nsec = (__int32_t)tv.tv_nsec;
 	ip->i_d.di_atime = ip->i_d.di_mtime;
 	ip->i_d.di_ctime = ip->i_d.di_mtime;
 
+	if (has_epoch) {
+		ip->i_d.di_mtime_epoch = xfs_timestamp_epoch(&tv);
+		ip->i_d.di_atime_epoch = ip->i_d.di_mtime_epoch;
+		ip->i_d.di_ctime_epoch = ip->i_d.di_mtime_epoch;
+	}
+
 	/*
 	 * di_gen will have been taken care of in xfs_iread.
 	 */
@@ -743,6 +751,8 @@ xfs_ialloc(
 		ip->i_d.di_flags2 = 0;
 		memset(&(ip->i_d.di_pad2[0]), 0, sizeof(ip->i_d.di_pad2));
 		ip->i_d.di_crtime = ip->i_d.di_mtime;
+		if (has_epoch)
+			ip->i_d.di_crtime_epoch = ip->i_d.di_mtime_epoch;
 	}
 
 
diff --git a/fs/xfs/xfs_inode_buf.c b/fs/xfs/xfs_inode_buf.c
index cb35ae4..0459e3d 100644
--- a/fs/xfs/xfs_inode_buf.c
+++ b/fs/xfs/xfs_inode_buf.c
@@ -208,6 +208,10 @@ xfs_dinode_from_disk(
 	to->di_nlink = be32_to_cpu(from->di_nlink);
 	to->di_projid_lo = be16_to_cpu(from->di_projid_lo);
 	to->di_projid_hi = be16_to_cpu(from->di_projid_hi);
+	to->di_atime_epoch = from->di_atime_epoch;
+	to->di_mtime_epoch = from->di_mtime_epoch;
+	to->di_ctime_epoch = from->di_ctime_epoch;
+	to->di_crtime_epoch = from->di_crtime_epoch;
 	memcpy(to->di_pad, from->di_pad, sizeof(to->di_pad));
 	to->di_flushiter = be16_to_cpu(from->di_flushiter);
 	to->di_atime.t_sec = be32_to_cpu(from->di_atime.t_sec);
@@ -255,6 +259,10 @@ xfs_dinode_to_disk(
 	to->di_nlink = cpu_to_be32(from->di_nlink);
 	to->di_projid_lo = cpu_to_be16(from->di_projid_lo);
 	to->di_projid_hi = cpu_to_be16(from->di_projid_hi);
+	to->di_atime_epoch = from->di_atime_epoch;
+	to->di_mtime_epoch = from->di_mtime_epoch;
+	to->di_ctime_epoch = from->di_ctime_epoch;
+	to->di_crtime_epoch = from->di_crtime_epoch;
 	memcpy(to->di_pad, from->di_pad, sizeof(to->di_pad));
 	to->di_atime.t_sec = cpu_to_be32(from->di_atime.t_sec);
 	to->di_atime.t_nsec = cpu_to_be32(from->di_atime.t_nsec);
diff --git a/fs/xfs/xfs_ioctl32.c b/fs/xfs/xfs_ioctl32.c
index 944d5ba..215324f 100644
--- a/fs/xfs/xfs_ioctl32.c
+++ b/fs/xfs/xfs_ioctl32.c
@@ -161,6 +161,9 @@ xfs_ioctl32_bstat_copyin(
 	    get_user(bstat->bs_gen,	&bstat32->bs_gen)	||
 	    get_user(bstat->bs_projid_lo, &bstat32->bs_projid_lo) ||
 	    get_user(bstat->bs_projid_hi, &bstat32->bs_projid_hi) ||
+	    get_user(bstat->bs_atime_epoch, &bstat32->bs_atime_epoch) ||
+	    get_user(bstat->bs_mtime_epoch, &bstat32->bs_mtime_epoch) ||
+	    get_user(bstat->bs_ctime_epoch, &bstat32->bs_ctime_epoch) ||
 	    get_user(bstat->bs_dmevmask, &bstat32->bs_dmevmask)	||
 	    get_user(bstat->bs_dmstate,	&bstat32->bs_dmstate)	||
 	    get_user(bstat->bs_aextents, &bstat32->bs_aextents))
diff --git a/fs/xfs/xfs_ioctl32.h b/fs/xfs/xfs_ioctl32.h
index 80f4060..2a35c62 100644
--- a/fs/xfs/xfs_ioctl32.h
+++ b/fs/xfs/xfs_ioctl32.h
@@ -68,7 +68,10 @@ typedef struct compat_xfs_bstat {
 	__u16		bs_projid_lo;	/* lower part of project id	*/
 #define	bs_projid	bs_projid_lo	/* (previously just bs_projid)	*/
 	__u16		bs_projid_hi;	/* high part of project id	*/
-	unsigned char	bs_pad[12];	/* pad space, unused		*/
+	__u8		bs_atime_epoch;	/* access time epoch */
+	__u8		bs_mtime_epoch;	/* modify time epoch */
+	__u8		bs_ctime_epoch;	/* change time epoch */
+	unsigned char	bs_pad[9];	/* pad space, unused		*/
 	__u32		bs_dmevmask;	/* DMIG event mask		*/
 	__u16		bs_dmstate;	/* DMIG state info		*/
 	__u16		bs_aextents;	/* attribute number of extents	*/
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 205613a..0588381 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -505,23 +505,34 @@ xfs_setattr_time(
 	struct iattr		*iattr)
 {
 	struct inode		*inode = VFS_I(ip);
+	bool			has_epoch;
 
 	ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL));
 
+	has_epoch = xfs_sb_version_hasepoch(&ip->i_mount->m_sb);
 	if (iattr->ia_valid & ATTR_ATIME) {
 		inode->i_atime = iattr->ia_atime;
-		ip->i_d.di_atime.t_sec = iattr->ia_atime.tv_sec;
-		ip->i_d.di_atime.t_nsec = iattr->ia_atime.tv_nsec;
+		ip->i_d.di_atime.t_sec = xfs_timestamp_sec(&iattr->ia_atime);
+		ip->i_d.di_atime.t_nsec = (__int32_t)iattr->ia_atime.tv_nsec;
+		if (has_epoch)
+			ip->i_d.di_atime_epoch =
+					xfs_timestamp_epoch(&iattr->ia_atime);
 	}
 	if (iattr->ia_valid & ATTR_CTIME) {
 		inode->i_ctime = iattr->ia_ctime;
-		ip->i_d.di_ctime.t_sec = iattr->ia_ctime.tv_sec;
-		ip->i_d.di_ctime.t_nsec = iattr->ia_ctime.tv_nsec;
+		ip->i_d.di_ctime.t_sec = xfs_timestamp_sec(&iattr->ia_ctime);
+		ip->i_d.di_ctime.t_nsec = (__int32_t)iattr->ia_ctime.tv_nsec;
+		if (has_epoch)
+			ip->i_d.di_ctime_epoch =
+					xfs_timestamp_epoch(&iattr->ia_ctime);
 	}
 	if (iattr->ia_valid & ATTR_MTIME) {
 		inode->i_mtime = iattr->ia_mtime;
-		ip->i_d.di_mtime.t_sec = iattr->ia_mtime.tv_sec;
-		ip->i_d.di_mtime.t_nsec = iattr->ia_mtime.tv_nsec;
+		ip->i_d.di_mtime.t_sec = xfs_timestamp_sec(&iattr->ia_mtime);
+		ip->i_d.di_mtime.t_nsec = (__int32_t)iattr->ia_mtime.tv_nsec;
+		if (has_epoch)
+			ip->i_d.di_mtime_epoch =
+					xfs_timestamp_epoch(&iattr->ia_mtime);
 	}
 }
 
@@ -963,6 +974,7 @@ xfs_vn_update_time(
 	struct xfs_mount	*mp = ip->i_mount;
 	struct xfs_trans	*tp;
 	int			error;
+	struct iattr		iattr = {0};
 
 	trace_xfs_update_time(ip);
 
@@ -975,20 +987,19 @@ xfs_vn_update_time(
 
 	xfs_ilock(ip, XFS_ILOCK_EXCL);
 	if (flags & S_CTIME) {
-		inode->i_ctime = *now;
-		ip->i_d.di_ctime.t_sec = (__int32_t)now->tv_sec;
-		ip->i_d.di_ctime.t_nsec = (__int32_t)now->tv_nsec;
+		iattr.ia_valid |= ATTR_CTIME;
+		iattr.ia_ctime = *now;
 	}
 	if (flags & S_MTIME) {
-		inode->i_mtime = *now;
-		ip->i_d.di_mtime.t_sec = (__int32_t)now->tv_sec;
-		ip->i_d.di_mtime.t_nsec = (__int32_t)now->tv_nsec;
+		iattr.ia_valid |= ATTR_MTIME;
+		iattr.ia_mtime = *now;
 	}
 	if (flags & S_ATIME) {
-		inode->i_atime = *now;
-		ip->i_d.di_atime.t_sec = (__int32_t)now->tv_sec;
-		ip->i_d.di_atime.t_nsec = (__int32_t)now->tv_nsec;
+		iattr.ia_valid |= ATTR_ATIME;
+		iattr.ia_atime = *now;
 	}
+	xfs_setattr_time(ip, &iattr);
+
 	xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL);
 	xfs_trans_log_inode(tp, ip, XFS_ILOG_TIMESTAMP);
 	return -xfs_trans_commit(tp, 0);
@@ -1239,12 +1250,18 @@ xfs_setup_inode(
 
 	inode->i_generation = ip->i_d.di_gen;
 	i_size_write(inode, ip->i_d.di_size);
-	inode->i_atime.tv_sec	= ip->i_d.di_atime.t_sec;
-	inode->i_atime.tv_nsec	= ip->i_d.di_atime.t_nsec;
-	inode->i_mtime.tv_sec	= ip->i_d.di_mtime.t_sec;
-	inode->i_mtime.tv_nsec	= ip->i_d.di_mtime.t_nsec;
-	inode->i_ctime.tv_sec	= ip->i_d.di_ctime.t_sec;
-	inode->i_ctime.tv_nsec	= ip->i_d.di_ctime.t_nsec;
+	inode->i_atime.tv_sec = xfs_inode_time_from_epoch(
+						ip->i_d.di_atime_epoch,
+						ip->i_d.di_atime.t_sec);
+	inode->i_atime.tv_nsec = ip->i_d.di_atime.t_nsec;
+	inode->i_mtime.tv_sec = xfs_inode_time_from_epoch(
+						ip->i_d.di_mtime_epoch,
+						ip->i_d.di_mtime.t_sec);
+	inode->i_mtime.tv_nsec = ip->i_d.di_mtime.t_nsec;
+	inode->i_ctime.tv_sec = xfs_inode_time_from_epoch(
+						ip->i_d.di_ctime_epoch,
+						ip->i_d.di_ctime.t_sec);
+	inode->i_ctime.tv_nsec = ip->i_d.di_ctime.t_nsec;
 	xfs_diflags_to_iflags(inode, ip);
 
 	ip->d_ops = ip->i_mount->m_nondir_inode_ops;
diff --git a/fs/xfs/xfs_itable.c b/fs/xfs/xfs_itable.c
index cb64f22..e902418 100644
--- a/fs/xfs/xfs_itable.c
+++ b/fs/xfs/xfs_itable.c
@@ -97,12 +97,24 @@ xfs_bulkstat_one_int(
 	buf->bs_uid = dic->di_uid;
 	buf->bs_gid = dic->di_gid;
 	buf->bs_size = dic->di_size;
+
+	/* timestamp epochs are emitted only when configured */
 	buf->bs_atime.tv_sec = dic->di_atime.t_sec;
 	buf->bs_atime.tv_nsec = dic->di_atime.t_nsec;
 	buf->bs_mtime.tv_sec = dic->di_mtime.t_sec;
 	buf->bs_mtime.tv_nsec = dic->di_mtime.t_nsec;
 	buf->bs_ctime.tv_sec = dic->di_ctime.t_sec;
 	buf->bs_ctime.tv_nsec = dic->di_ctime.t_nsec;
+	if (xfs_sb_version_hasepoch(&mp->m_sb)) {
+		buf->bs_atime_epoch = dic->di_atime_epoch;
+		buf->bs_mtime_epoch = dic->di_mtime_epoch;
+		buf->bs_ctime_epoch = dic->di_ctime_epoch;
+	} else {
+		buf->bs_atime_epoch = 0;
+		buf->bs_mtime_epoch = 0;
+		buf->bs_ctime_epoch = 0;
+	}
+
 	buf->bs_xflags = xfs_ip2xflags(ip);
 	buf->bs_extsize = dic->di_extsize << mp->m_sb.sb_blocklog;
 	buf->bs_extents = dic->di_nextents;
diff --git a/fs/xfs/xfs_log_format.h b/fs/xfs/xfs_log_format.h
index f0969c7..abac6ad 100644
--- a/fs/xfs/xfs_log_format.h
+++ b/fs/xfs/xfs_log_format.h
@@ -374,6 +374,10 @@ typedef struct xfs_icdinode {
 	__uint32_t	di_nlink;	/* number of links to file */
 	__uint16_t	di_projid_lo;	/* lower part of owner's project id */
 	__uint16_t	di_projid_hi;	/* higher part of owner's project id */
+	__uint8_t	di_atime_epoch;	/* access time epoch */
+	__uint8_t	di_mtime_epoch;	/* modify time epoch */
+	__uint8_t	di_ctime_epoch;	/* change time epoch */
+	__uint8_t	di_crtime_epoch;/* create time epoch */
 	__uint8_t	di_pad[6];	/* unused, zeroed space */
 	__uint16_t	di_flushiter;	/* incremented on flush */
 	xfs_ictimestamp_t di_atime;	/* time last accessed */
diff --git a/fs/xfs/xfs_sb.h b/fs/xfs/xfs_sb.h
index c43c2d6..1b3ccd8 100644
--- a/fs/xfs/xfs_sb.h
+++ b/fs/xfs/xfs_sb.h
@@ -509,8 +509,11 @@ xfs_sb_has_ro_compat_feature(
 }
 
 #define XFS_SB_FEAT_INCOMPAT_FTYPE	(1 << 0)	/* filetype in dirent */
+#define XFS_SB_FEAT_INCOMPAT_EPOCH	(1 << 1)	/* Time beyond 2038 */
 #define XFS_SB_FEAT_INCOMPAT_ALL \
-		(XFS_SB_FEAT_INCOMPAT_FTYPE)
+		(XFS_SB_FEAT_INCOMPAT_FTYPE | \
+		 XFS_SB_FEAT_INCOMPAT_EPOCH | \
+		 0)
 
 #define XFS_SB_FEAT_INCOMPAT_UNKNOWN	~XFS_SB_FEAT_INCOMPAT_ALL
 static inline bool
@@ -558,6 +561,13 @@ static inline int xfs_sb_version_hasfinobt(xfs_sb_t *sbp)
 		(sbp->sb_features_ro_compat & XFS_SB_FEAT_RO_COMPAT_FINOBT);
 }
 
+static inline int xfs_sb_version_hasepoch(xfs_sb_t *sbp)
+{
+	return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_5) &&
+		(sbp->sb_features_incompat & XFS_SB_FEAT_INCOMPAT_EPOCH);
+}
+
+
 /*
  * end of superblock version macros
  */
diff --git a/fs/xfs/xfs_trans_inode.c b/fs/xfs/xfs_trans_inode.c
index 50c3f56..cdb4d86 100644
--- a/fs/xfs/xfs_trans_inode.c
+++ b/fs/xfs/xfs_trans_inode.c
@@ -70,7 +70,7 @@ xfs_trans_ichgtime(
 	int			flags)
 {
 	struct inode		*inode = VFS_I(ip);
-	timespec_t		tv;
+	struct timespec		tv;
 
 	ASSERT(tp);
 	ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL));

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-06-01 19:56                 ` Arnd Bergmann
  2014-06-01 20:26                   ` H. Peter Anvin
@ 2014-06-02  1:36                   ` Nicolas Pitre
  2014-06-02  2:22                     ` Dave Chinner
  2014-06-02 10:56                     ` Arnd Bergmann
  1 sibling, 2 replies; 124+ messages in thread
From: Nicolas Pitre @ 2014-06-02  1:36 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: H. Peter Anvin, Dave Chinner, linux-kernel, linux-arch, joseph,
	john.stultz, hch, tglx, geert, lftan, linux-fsdevel, xfs

On Sun, 1 Jun 2014, Arnd Bergmann wrote:

> On Saturday 31 May 2014 11:46:16 Nicolas Pitre wrote:
> > > readonly if not in reality then in practice.
> > 
> > For those (legacy) filesystems with signed 32-bit timestamps, any 
> > attempt to create a timestamp past Jan 19 03:14:07 2038 UTC should be 
> > (silently) clamped to 0x7fffffff and that value (the last representable 
> > time) used as an overflow indicator.  The filesystem driver should 
> > convert that value into a corresponding overflow value for whatever 
> > kernel internal time representation being used when read back, and this 
> > should be propagated up to user space.  It should not be a hard error 
> > otherwise, as you rightfully stated, everything non read-only would come 
> > to a halt on that day.
> 
> I don't think there is much of a difference between not being able to
> write at all and all newly written files having the same timestamp,
> causing random things to break differently.

Well, in one case you have a crash certitude. In the other case you have 
some probability that your system might still be usable.

> The clamp to the maximum supported time stamp sounds like a reasonable
> choice for 'utimens' and related syscalls for the case of someone
> setting an arbitrary future date beyond what the file system can
> represent. Then again, I don't see a reason why that shouldn't just
> cause an error to be returned.

Resilience is better than outright failure.

> For actually running kernels beyond 2038, the best idea I've seen so
> far is to disallow all broken code at compile time. I don't see
> a choice but to audit the entire kernel for invalid uses on both
> 32 and 64 bit in the next few years. A lot of code will get changed
> in the process so we can actually keep running 32-bit kernels and
> file systems, but other code will likely go away:
> 
> * any system calls that pass a time_t, timeval or timespec on
>   32-bit systems return -ENOSYS, to ensure all user land uses
>   the replacements we will put into place
> * The definition of 'time_t', 'timeval' and 'timespec' can be hidden
>   from the kernel, and all code using it left out.
> * ext2 and ext3 file system code will have to be disabled, but that's
>   fine since ext4 can mount old file systems.

Syscalls and libs can be "fixed".  Existing filesystem content might 
not.  So if you need to mount some old media in read-write mode after 
2038 and that happens to contain an ext2 or similarly limited filesystem 
then it'd better just "work".  Having the kernel refuse to modify the 
filesystem would be unacceptable.
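
A minimal sketch of the clamping behaviour argued for above, assuming a
filesystem whose on-disk seconds field is a signed 32-bit integer. The
names legacy_clamp_sec() and legacy_sec_overflowed() are hypothetical
and do not come from any kernel source; the point is only that the
store path saturates instead of failing, and the load path treats the
maximum value as an overflow marker.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limits for a signed 32-bit on-disk seconds field. */
#define LEGACY_TIME_MAX INT32_MAX	/* 0x7fffffff */
#define LEGACY_TIME_MIN INT32_MIN

/* Store path: saturate a 64-bit seconds value into the 32-bit field. */
static int32_t legacy_clamp_sec(int64_t sec)
{
	if (sec > LEGACY_TIME_MAX)
		return LEGACY_TIME_MAX;
	if (sec < LEGACY_TIME_MIN)
		return LEGACY_TIME_MIN;
	return (int32_t)sec;
}

/* Load path: the last representable time doubles as the overflow mark. */
static bool legacy_sec_overflowed(int32_t disk_sec)
{
	return disk_sec == LEGACY_TIME_MAX;
}
```

The cost of this choice is that a genuine timestamp of exactly
0x7fffffff can no longer be told apart from an overflowed one.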


Nicolas

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-06-02  1:36                   ` Nicolas Pitre
@ 2014-06-02  2:22                     ` Dave Chinner
  2014-06-02  7:09                       ` Geert Uytterhoeven
  2014-06-02 10:56                     ` Arnd Bergmann
  1 sibling, 1 reply; 124+ messages in thread
From: Dave Chinner @ 2014-06-02  2:22 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Arnd Bergmann, H. Peter Anvin, linux-kernel, linux-arch, joseph,
	john.stultz, hch, tglx, geert, lftan, linux-fsdevel, xfs

On Sun, Jun 01, 2014 at 09:36:26PM -0400, Nicolas Pitre wrote:
> On Sun, 1 Jun 2014, Arnd Bergmann wrote:
> > On Saturday 31 May 2014 11:46:16 Nicolas Pitre wrote:
> > For actually running kernels beyond 2038, the best idea I've seen so
> > far is to disallow all broken code at compile time. I don't see
> > a choice but to audit the entire kernel for invalid uses on both
> > 32 and 64 bit in the next few years. A lot of code will get changed
> > in the process so we can actually keep running 32-bit kernels and
> > file systems, but other code will likely go away:
> > 
> > * any system calls that pass a time_t, timeval or timespec on
> >   32-bit systems return -ENOSYS, to ensure all user land uses
> >   the replacements we will put into place
> > * The definition of 'time_t', 'timeval' and 'timespec' can be hidden
> >   from the kernel, and all code using it left out.
> > * ext2 and ext3 file system code will have to be disabled, but that's
> >   fine since ext4 can mount old file systems.
> 
> Syscalls and libs can be "fixed".  Existing filesystem content might 
> not.  So if you need to mount some old media in read-write mode after 
> 2038 and that happens to contain an ext2 or similarly limited filesystem 
> then it'd better just "work".  Having the kernel refuse to modify the 
> filesystem would be unacceptable.

We can already tell the VFS/filesystems not to update timestamps:

	inode->i_flags |= S_NOATIME | S_NOCMTIME;

Just enforce that everywhere (i.e. notify_change()) rather than just
on the IO path and the "legacy filesystem timestamp" problem is
"solved".

New interfaces need to return errors when an out-of-range parameter
is set. And right now, >epoch dates are out of range for most
filesystems, and so we need to handle that condition appropriately.
Silent date overflow == filesystem corruption, and as such I'm going
to error out such conditions in the filesystem regardless of what
the userspace API says.

Filesystems place all sorts of userspace visible limits on storage -
ever tried to create a file >16TB on ext4? The on-disk format
doesn't support it, so it returns an out of range error (E2BIG, I
think) if you try. XFS, OTOH, handles this just fine and so it
continues to work. It's exactly the same with timestamps - there's a
physical limit to what can sanely be stored in any given filesystem
and it's an *error condition* to go beyond that limit....
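
As a rough illustration of the "out of range is an error" position, a
range check of the following shape could sit in front of the store
path. This is a sketch under stated assumptions: fs_check_time_range()
is a hypothetical helper, the limits are passed in by the caller, and
-ERANGE is picked only for illustration since the thread does not
settle on an errno.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: fs_min and fs_max are the first and last
 * seconds values the on-disk format can represent.
 */
static int fs_check_time_range(int64_t sec, int64_t fs_min, int64_t fs_max)
{
	if (sec < fs_min || sec > fs_max)
		return -ERANGE;	/* assumed errno, not taken from the thread */
	return 0;
}
```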

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-06-02  2:22                     ` Dave Chinner
@ 2014-06-02  7:09                       ` Geert Uytterhoeven
  0 siblings, 0 replies; 124+ messages in thread
From: Geert Uytterhoeven @ 2014-06-02  7:09 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Nicolas Pitre, Arnd Bergmann, H. Peter Anvin, linux-kernel,
	Linux-Arch, Joseph S. Myers, John Stultz, Christoph Hellwig,
	Thomas Gleixner, Ley Foon Tan, Linux FS Devel, xfs

On Mon, Jun 2, 2014 at 4:22 AM, Dave Chinner <david@fromorbit.com> wrote:
> Filesystems place all sorts of userspace visible limits on storage -
> ever tried to create a file >16TB on ext4? The on-disk format
> doesn't support it, so it returns an out of range error (E2BIG, I
> think) if you try. XFS, OTOH, handles this just fine and so it
> continues to work. It's exactly the same with timestamps - there's a
> physical limit to what can sanely be stored in any given filesystem
> and it's an *error condition* to go beyond that limit....

This comparison doesn't fly.
File sizes do not depend on the current time (except for the increase of
megapixels in your new camera ;-).
Writing a 15 GiB file to ext4 is not something that magically stops working
tomorrow.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [RFC 17/32] ubifs: convert to struct inode_time
  2014-05-30 20:01 ` [RFC 17/32] ubifs: " Arnd Bergmann
@ 2014-06-02  7:54   ` Artem Bityutskiy
  0 siblings, 0 replies; 124+ messages in thread
From: Artem Bityutskiy @ 2014-06-02  7:54 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-kernel, linux-arch, joseph, john.stultz, hch, tglx, geert,
	lftan, hpa, linux-fsdevel, Adrian Hunter, linux-mtd

On Fri, 2014-05-30 at 22:01 +0200, Arnd Bergmann wrote:
> ubifs uses 64-bit integers for inode timestamps, which will work
> practically forever, but the VFS uses struct timespec for timestamps,
> which is only good until 2038 on 32-bit CPUs.
> 
> This gets us one small step closer to lifting the VFS limit by using
> struct inode_time in ubifs.
> 
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> Cc: Artem Bityutskiy <dedekind1@gmail.com>
> Cc: Adrian Hunter <adrian.hunter@intel.com>
> Cc: linux-mtd@lists.infradead.org

Looks fine from UBIFS POV, thanks!

-- 
Best Regards,
Artem Bityutskiy


^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [RFC 25/32] gfs2: convert to struct inode_time
  2014-05-30 20:01 ` [RFC 25/32] gfs2: " Arnd Bergmann
@ 2014-06-02  9:52   ` Steven Whitehouse
  0 siblings, 0 replies; 124+ messages in thread
From: Steven Whitehouse @ 2014-06-02  9:52 UTC (permalink / raw)
  To: Arnd Bergmann, linux-kernel
  Cc: linux-arch, joseph, john.stultz, hch, tglx, geert, lftan, hpa,
	linux-fsdevel, cluster-devel

Hi,

On 30/05/14 21:01, Arnd Bergmann wrote:
> gfs2 uses 64-bit integers for inode timestamps, which will work
> basically forever, but the VFS uses struct timespec for timestamps,
> which is only good until 2038 on 32-bit CPUs.
>
> This gets us one small step closer to lifting the VFS limit by using
> struct inode_time in gfs2.
>
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> Cc: Steven Whitehouse <swhiteho@redhat.com>
> Cc: cluster-devel@redhat.com
> ---
>   fs/gfs2/dir.c   | 6 +++---
>   fs/gfs2/glops.c | 4 ++--
>   2 files changed, 5 insertions(+), 5 deletions(-)
Subject to deciding the internal representation of struct inode_time, 
this looks good to me.
Acked-by: Steven Whitehouse <swhiteho@redhat.com>

Steve.
> diff --git a/fs/gfs2/dir.c b/fs/gfs2/dir.c
> index 1a349f9..ec57538 100644
> --- a/fs/gfs2/dir.c
> +++ b/fs/gfs2/dir.c
> @@ -835,7 +835,7 @@ static struct gfs2_leaf *new_leaf(struct inode *inode, struct buffer_head **pbh,
>   	struct gfs2_leaf *leaf;
>   	struct gfs2_dirent *dent;
>   	struct qstr name = { .name = "" };
> -	struct timespec tv = CURRENT_TIME;
> +	struct inode_time tv = CURRENT_TIME;
>   
>   	error = gfs2_alloc_blocks(ip, &bn, &n, 0, NULL);
>   	if (error)
> @@ -1716,7 +1716,7 @@ int gfs2_dir_add(struct inode *inode, const struct qstr *name,
>   	struct gfs2_inode *ip = GFS2_I(inode);
>   	struct buffer_head *bh = da->bh;
>   	struct gfs2_dirent *dent = da->dent;
> -	struct timespec tv;
> +	struct inode_time tv;
>   	struct gfs2_leaf *leaf;
>   	int error;
>   
> @@ -1794,7 +1794,7 @@ int gfs2_dir_del(struct gfs2_inode *dip, const struct dentry *dentry)
>   	const struct qstr *name = &dentry->d_name;
>   	struct gfs2_dirent *dent, *prev = NULL;
>   	struct buffer_head *bh;
> -	struct timespec tv = CURRENT_TIME;
> +	struct inode_time tv = CURRENT_TIME;
>   
>   	/* Returns _either_ the entry (if its first in block) or the
>   	   previous entry otherwise */
> diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c
> index fc11007..b55308f 100644
> --- a/fs/gfs2/glops.c
> +++ b/fs/gfs2/glops.c
> @@ -318,7 +318,7 @@ static void gfs2_set_nlink(struct inode *inode, u32 nlink)
>   static int gfs2_dinode_in(struct gfs2_inode *ip, const void *buf)
>   {
>   	const struct gfs2_dinode *str = buf;
> -	struct timespec atime;
> +	struct inode_time atime;
>   	u16 height, depth;
>   
>   	if (unlikely(ip->i_no_addr != be64_to_cpu(str->di_num.no_addr)))
> @@ -341,7 +341,7 @@ static int gfs2_dinode_in(struct gfs2_inode *ip, const void *buf)
>   	gfs2_set_inode_blocks(&ip->i_inode, be64_to_cpu(str->di_blocks));
>   	atime.tv_sec = be64_to_cpu(str->di_atime);
>   	atime.tv_nsec = be32_to_cpu(str->di_atime_nsec);
> -	if (timespec_compare(&ip->i_inode.i_atime, &atime) < 0)
> +	if (inode_time_compare(&ip->i_inode.i_atime, &atime) < 0)
>   		ip->i_inode.i_atime = atime;
>   	ip->i_inode.i_mtime.tv_sec = be64_to_cpu(str->di_mtime);
>   	ip->i_inode.i_mtime.tv_nsec = be32_to_cpu(str->di_mtime_nsec);


^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-06-02  1:36                   ` Nicolas Pitre
  2014-06-02  2:22                     ` Dave Chinner
@ 2014-06-02 10:56                     ` Arnd Bergmann
  2014-06-02 11:57                       ` Theodore Ts'o
  2014-06-02 15:04                       ` Chuck Lever
  1 sibling, 2 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-06-02 10:56 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: H. Peter Anvin, Dave Chinner, linux-kernel, linux-arch, joseph,
	john.stultz, hch, tglx, geert, lftan, linux-fsdevel, xfs

On Sunday 01 June 2014 21:36:26 Nicolas Pitre wrote:
> 
> > For actually running kernels beyond 2038, the best idea I've seen so
> > far is to disallow all broken code at compile time. I don't see
> > a choice but to audit the entire kernel for invalid uses on both
> > 32 and 64 bit in the next few years. A lot of code will get changed
> > in the process so we can actually keep running 32-bit kernels and
> > file systems, but other code will likely go away:
> > 
> > * any system calls that pass a time_t, timeval or timespec on
> >   32-bit systems return -ENOSYS, to ensure all user land uses
> >   the replacements we will put into place
> > * The definition of 'time_t', 'timeval' and 'timespec' can be hidden
> >   from the kernel, and all code using it left out.
> > * ext2 and ext3 file system code will have to be disabled, but that's
> >   fine since ext4 can mount old file systems.
> 
> Syscalls and libs can be "fixed".  Existing filesystem content might 
> not.  So if you need to mount some old media in read-write mode after 
> 2038 and that happens to contain an ext2 or similarly limited filesystem 
> then it'd better just "work".  Having the kernel refuse to modify the 
> filesystem would be unacceptable.

I think you misunderstood what I suggested: the intent is to avoid
seeing things break in 2038 by making them break much earlier. We have
a solution for ext2 file systems, it's called ext4, and we just need
to ensure that everybody knows they have to migrate eventually.

At some point before the mid-2030s, you should no longer be able to
build a kernel that has support for ext2 or any other module that will
run into bugs later. Until then (rather sooner than later), I'd like
to get to the point where you can choose whether to include those
modules at build time or not, and then get everybody to turn off that
option and fix the bugs they run into. You wouldn't need that for a
2014-generation long-term support distro (rhel 7, sles 12, debian 7,
ubuntu 14.04, ...), but perhaps for the next generation, or the
one after that.

	Arnd

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-06-01 20:26                   ` H. Peter Anvin
@ 2014-06-02 11:02                     ` Arnd Bergmann
  0 siblings, 0 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-06-02 11:02 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Nicolas Pitre, Dave Chinner, linux-kernel, linux-arch, joseph,
	john.stultz, hch, tglx, geert, lftan, linux-fsdevel, xfs

On Sunday 01 June 2014 13:26:03 H. Peter Anvin wrote:
> Perhaps we should make this a kernel command line option instead, with the
> settings: error out on outside the standard window, or a date indicating the
> earliest date that should be recognized and do windowing (0 for no windowing,
> 1970 for retconning the Unix epoch as unsigned...)

What's wrong with compile-time errors? We have a pretty good understanding
of how time values are passed in the kernel, and we know they will all break
in 2038 for 32-bit kernels unless we do something about it.
 
> But again, the kernel is probably the least problem here...
 
I agree the glibc side is harder than this, but we have to get the kernel
into shape first (at the minimum we have to do the APIs), and there is enough
work to do here.

	Arnd

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-06-02  0:28             ` Dave Chinner
@ 2014-06-02 11:35               ` Roger Willcocks
  2014-06-02 11:43               ` Arnd Bergmann
  1 sibling, 0 replies; 124+ messages in thread
From: Roger Willcocks @ 2014-06-02 11:35 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Arnd Bergmann, linux-arch, linux-kernel, lftan, hch, john.stultz,
	H. Peter Anvin, linux-fsdevel, geert, tglx, xfs, joseph


On Mon, 2014-06-02 at 10:28 +1000, Dave Chinner wrote:

> 
> The 32 bit second counters in timestamps are too small to represent
> time beyond the unix epoch (jan 2038) correctly. Extend the on-disk
> format for a timestamp to include an 8-bit epoch counter so that we
> can extend time for up to 255 Unix epochs. This should be good for
> representing timestamps from 1970 to somewhere around 19,000 A.D....
> 

I assume you're using an 'epoch' variable and not simply using the
padding byte as an eight-bit prefix to the existing 32-bit counter
because the existing counter is signed ?

For long term sanity it might make more sense for the eight-bit value to
be a simple (sign-extended) prefix from 1970.

So if the feature bit is set it's a 40-bit signed time, which is good
for 1970 +/- 17400 years or so.
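
The 40-bit layout can be sketched as below, assuming the padding byte
becomes a signed high byte and the existing 32-bit field is reread as
the unsigned low word. The struct time40 type and both helpers are
hypothetical names used for illustration. A signed 40-bit count covers
+/- 2^39 seconds, and 2^39 / 31556952 seconds per year is roughly 17421
years, which matches the estimate above.

```c
#include <stdint.h>

/* Hypothetical split of a 40-bit signed seconds value. */
struct time40 {
	int8_t   hi;	/* sign-extended prefix, the former padding byte */
	uint32_t lo;	/* low 32 bits, treated as unsigned in this sketch */
};

/* Caller must ensure -2^39 <= sec < 2^39. */
static struct time40 time40_encode(int64_t sec)
{
	struct time40 t;

	t.lo = (uint32_t)sec;	/* sec modulo 2^32 */
	/* The difference is an exact multiple of 2^32, so no rounding. */
	t.hi = (int8_t)((sec - (int64_t)t.lo) / INT64_C(4294967296));
	return t;
}

static int64_t time40_decode(struct time40 t)
{
	return (int64_t)t.hi * INT64_C(4294967296) + (int64_t)t.lo;
}
```

Multiplication and division are used instead of shifts so that
negative values do not depend on implementation-defined shift
behaviour.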

--
Roger






^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-06-02  0:28             ` Dave Chinner
  2014-06-02 11:35               ` Roger Willcocks
@ 2014-06-02 11:43               ` Arnd Bergmann
  2014-06-03  0:32                 ` Dave Chinner
  1 sibling, 1 reply; 124+ messages in thread
From: Arnd Bergmann @ 2014-06-02 11:43 UTC (permalink / raw)
  To: Dave Chinner
  Cc: H. Peter Anvin, linux-kernel, linux-arch, joseph, john.stultz,
	hch, tglx, geert, lftan, linux-fsdevel, xfs

On Monday 02 June 2014 10:28:22 Dave Chinner wrote:
> On Sun, Jun 01, 2014 at 10:24:37AM +1000, Dave Chinner wrote:
> > On Sat, May 31, 2014 at 05:37:52PM +0200, Arnd Bergmann wrote:
> > > In my list at http://kernelnewbies.org/y2038, I found that almost
> > > all file systems at least support times until 2106, because they treat
> > > the on-disk value as unsigned on 64-bit systems, or they use
> > > a completely different representation. My guess is that somebody
> > > earlier spent a lot of work on making that happen.
> > > 
> > > The exceptions are:
> > > 
> > > * exofs uses signed values, which can probably be changed to be
> > >   consistent with the others.
> > > * isofs has a bug that limits it until 2027 on architectures with
> > >   a signed 'char' type (otherwise it's 2155).
> > > * udf can represent times for many thousands of years through a
> > >   16-bit year representation, but the code to convert to epoch
> > >   uses a const array that ends at 2038.
> > > * afs uses signed seconds and can probably be fixed
> > > * coda relies on user space time representation getting passed
> > >   through an ioctl.
> > > * I miscategorized xfs/ext2/ext3 as having unsigned 32-bit seconds,
> > >   where they really use signed.
> > > 
> > > I was confused about XFS since I didn't notice that there are
> > > separate xfs_ictimestamp_t and xfs_timestamp_t types, so I expected
> > > XFS to also use the 1970-2106 time range on 64-bit systems today.
> > 
> > You've missed an awful lot more than just the implications for the
> > core kernel code.
> > 
> > There's a good chance such changes propagate to APIs elsewhere in
> > the filesystems, because something you haven't realised is that XFS
> > effectively exposes the on-disk timestamp format directly to
> > userspace via the bulkstat interface (see struct xfs_bstat). It also
> > affects the XFS open-by-handle ioctl and the swap extent ioctl used
> > by the online defragmenter.

I really didn't look at them at all, as ioctl is very late on my
mental list of things to change. I do realize that a lot of drivers
and file systems do have ioctls that pass time values and we need to
address them one by one.

I just looked at the ioctls you mentioned but don't see how open-by-handle
is affected by this. Can you point me to what you mean?

> Just to put that in context, here's the kernel patch to add extended
> epoch support to XFS. It's completely untested as I haven't done any
> userspace code changes to enable the feature. However, it should
> give you an indication of how far the simple act of changing the
> kernel time representation spread through the filesystem. This does
> not include any of the VFS infrastructure for specifying the range of
> supported timestamps.  It survives some smoke testing, but dies when
> the online defragmenter starts using the bulkstat and swap extent
> ioctls (the assert in xfs_inode_time_from_epoch() fires), so I
> probably don't have that all sorted correctly yet...
> 
> To test extended epoch support, however, I need some fstests that
> define and validate the behaviour of the new syscalls - until we get
> those we can't validate that the filesystem follows the spec
> properly. I also suspect we are going to need an interface to query
> the supported range of timestamps from a filesystem so that we can
> test boundary conditions in an automated fashion....

Thanks a lot for having an initial look at this yourself!

I'd still consider the two problems largely orthogonal. My patch set
(at least with the 64-bit tv_sec) just gets 32-bit kernels to behave
more like 64-bit kernels regarding inode time stamps, which does
impact all the file systems that use a 64-bit time or the NFS
unsigned epoch (1970-2106), while your patch extends the file
system internal epoch (1901-2038 for XFS) so it can be used by
anything that knows how to handle larger than 32-bit second values
(either 64-bit kernel or 32-bit with inode_time patch).

> diff --git a/fs/xfs/xfs_dinode.h b/fs/xfs/xfs_dinode.h
> index 623bbe8..79f94722 100644
> --- a/fs/xfs/xfs_dinode.h
> +++ b/fs/xfs/xfs_dinode.h
> @@ -21,11 +21,53 @@
>  #define        XFS_DINODE_MAGIC                0x494e  /* 'IN' */
>  #define XFS_DINODE_GOOD_VERSION(v)     ((v) >= 1 && (v) <= 3)
>  
> +/*
> + * Inode timestamps get more complex when we consider supporting times beyond
> + * the standard unix epoch of Jan 2038. The struct xfs_timestamp cannot support
> + * more than a single extension by playing sign games, and that is still not
> + * reliable. We also can't extend the timestamp structure because there is no
> + * free space around them in the on-disk inode.
> + *
> + * Hence the simplest thing to do is to add an epoch counter for each timestamp
> + * in the inode. This can be a single byte for each timestamp and make use of
> + * a hole we currently pad. This gives us another 255 epochs range for the
> + * timestamps, but requires a superblock feature bit to indicate that these
> + * fields have meaning and can be non-zero.

Nice trick!

> +static inline __uint8_t
> +xfs_timestamp_epoch(
> +       struct timespec         *time)
> +{
> +       /* will be zero until the extended struct inode_time is introduced */
> +       return 0;
> +}
> +
> +static inline __int32_t
> +xfs_timestamp_sec(
> +       struct timespec         *time)
> +{
> +       return time->tv_sec;
> +}
> +
> +static inline __kernel_time_t
> +xfs_inode_time_from_epoch(
> +       __uint8_t       epoch,
> +       __int32_t       seconds)
> +{
> +       /* need to handle non-zero epoch when struct inode_time is introduced */
> +       ASSERT(epoch == 0);
> +       return seconds;
> +}

Why don't you already implement epoch conversion for 64-bit kernels that
are able to represent the time today? This is how ext4 does it (I mean
the sizeof() trick, not the bit stuffing they do):

static inline __le32 ext4_encode_extra_time(struct inode_time *time)
{
       return cpu_to_le32((sizeof(time->tv_sec) > 4 ?
                           (time->tv_sec >> 32) & EXT4_EPOCH_MASK : 0) |
                          ((time->tv_nsec << EXT4_EPOCH_BITS) & EXT4_NSEC_MASK));
}

static inline void ext4_decode_extra_time(struct inode_time *time, __le32 extra)
{
       if (sizeof(time->tv_sec) > 4)
               time->tv_sec |= (__u64)(le32_to_cpu(extra) & EXT4_EPOCH_MASK)
                               << 32;
       time->tv_nsec = (le32_to_cpu(extra) & EXT4_NSEC_MASK) >> EXT4_EPOCH_BITS;
}

I guess if there is general agreement on introducing 'struct inode_time',
we can skip that intermediate step.

> @@ -509,8 +509,11 @@ xfs_sb_has_ro_compat_feature(
>  }
>  
>  #define XFS_SB_FEAT_INCOMPAT_FTYPE     (1 << 0)        /* filetype in dirent */
> +#define XFS_SB_FEAT_INCOMPAT_EPOCH     (1 << 1)        /* Time beyond 2038 */
>  #define XFS_SB_FEAT_INCOMPAT_ALL \
> -               (XFS_SB_FEAT_INCOMPAT_FTYPE)
> +               (XFS_SB_FEAT_INCOMPAT_FTYPE | \
> +                XFS_SB_FEAT_INCOMPAT_EPOCH | \
> +                0)
>  
>  #define XFS_SB_FEAT_INCOMPAT_UNKNOWN   ~XFS_SB_FEAT_INCOMPAT_ALL

How does this flag get set? Do you have to manually change it in the
superblock? Since most of the time I'd suspect you wouldn't actually
use it for the foreseeable future, would it make sense to have a mount
option that allows it to be set, but doesn't actually change the
superblock until the first inode gets written with a nonzero epoch?

That way, you'd still be able to mount it with an older kernel but
also be forward compatible with time moving on.
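
The lazy approach could look roughly like the following sketch. It is
not XFS code: struct sketch_sb, the allow_epoch mount option and
sketch_prepare_epoch() are all hypothetical, and only the bit value
mirrors XFS_SB_FEAT_INCOMPAT_EPOCH from the patch. The incompat bit is
left clear until an inode with a nonzero epoch is about to be written.

```c
#include <stdbool.h>
#include <stdint.h>

#define FEAT_INCOMPAT_EPOCH (1u << 1)	/* same bit as in the patch */

/* Hypothetical, heavily reduced superblock state. */
struct sketch_sb {
	uint32_t features_incompat;	/* on-disk incompat feature bits */
	bool     allow_epoch;		/* hypothetical mount option */
};

/*
 * Call before writing an inode timestamp epoch. Returns false if the
 * epoch cannot be stored; sets the feature bit only on first real use.
 */
static bool sketch_prepare_epoch(struct sketch_sb *sb, uint8_t epoch)
{
	if (epoch == 0)
		return true;	/* nothing new on disk, old kernels still mount */
	if (sb->features_incompat & FEAT_INCOMPAT_EPOCH)
		return true;	/* feature already enabled */
	if (!sb->allow_epoch)
		return false;	/* caller has to reject or clamp the time */
	sb->features_incompat |= FEAT_INCOMPAT_EPOCH;
	return true;
}
```

A real implementation would also have to log the superblock change in
the same transaction as the inode update.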

	Arnd

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-06-02 10:56                     ` Arnd Bergmann
@ 2014-06-02 11:57                       ` Theodore Ts'o
  2014-06-02 12:38                         ` Arnd Bergmann
                                           ` (2 more replies)
  2014-06-02 15:04                       ` Chuck Lever
  1 sibling, 3 replies; 124+ messages in thread
From: Theodore Ts'o @ 2014-06-02 11:57 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Nicolas Pitre, H. Peter Anvin, Dave Chinner, linux-kernel,
	linux-arch, joseph, john.stultz, hch, tglx, geert, lftan,
	linux-fsdevel, xfs

On Mon, Jun 02, 2014 at 12:56:42PM +0200, Arnd Bergmann wrote:
> 
> I think you misunderstood what I suggested: the intent is to avoid
> seeing things break in 2038 by making them break much earlier. We have
> a solution for ext2 file systems, it's called ext4, and we just need
> to ensure that everybody knows they have to migrate eventually.
> 
> At some point before the mid-2030s, you should no longer be able to
> build a kernel that has support for ext2 or any other module that will
> run into bugs later....

Even for ext4, it's not quite so simple as that.  You only have
support for times post 2038 if you are using an inode size > 128
bytes.  There are a very, very large number of machines which even
today, are using 128 byte inodes with ext4 for performance reasons.

The vast majority of those machines which I know of can probably move
to 256 byte inodes relatively easily, since hard drive replacement
cycles are order 5-6 years tops, so I'm not that concerned, but it
just goes to show this is a very complicated problem.

And even if we're talking about flash and embedded devices, the good
news is if you assume that 10 years is enough time for people to
update their embedded OS builds, and that the vast majority of
deployed devices will probably only be in service for 10-15 years, we
do have enough time to make file system format changes, although
admittedly we can't afford to dilly-dally.

Regards,

					- Ted

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-06-02 11:57                       ` Theodore Ts'o
@ 2014-06-02 12:38                         ` Arnd Bergmann
  2014-06-02 13:15                           ` Theodore Ts'o
  2014-06-02 12:52                         ` Arnd Bergmann
  2014-06-02 14:52                         ` H. Peter Anvin
  2 siblings, 1 reply; 124+ messages in thread
From: Arnd Bergmann @ 2014-06-02 12:38 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Nicolas Pitre, H. Peter Anvin, Dave Chinner, linux-kernel,
	linux-arch, joseph, john.stultz, hch, tglx, geert, lftan,
	linux-fsdevel, xfs

On Monday 02 June 2014 07:57:37 Theodore Ts'o wrote:
> On Mon, Jun 02, 2014 at 12:56:42PM +0200, Arnd Bergmann wrote:
> > 
> > I think you misunderstood what I suggested: the intent is to avoid
> > seeing things break in 2038 by making them break much earlier. We have
> > a solution for ext2 file systems, it's called ext4, and we just need
> > to ensure that everybody knows they have to migrate eventually.
> > 
> > At some point before the mid 2030ies, you should no longer be able to
> > build a kernel that has support for ext2 or any other module that will
> > run into bugs later....
> 
> Even for ext4, it's not quite so simple as that.  You only have
> support for times post 2038 if you are using an inode size > 128
> bytes.  There are a very, very large number of machines which even
> today, are using 128 byte inodes with ext4 for performance reasons.
> 
> The vast majority of those machines which I know of can probably move
> to 256 byte inodes relatively easily, since hard drive replacement
> cycles are order 5-6 years tops, so I'm not that concerned, but it
> just goes to show this is a very complicated problem.

Ok, I see.

I also now noticed this comment above EXT4_FITS_IN_INODE():

"For new inodes we always reserve enough space for the kernel's known
extended fields, but for inodes created with an old kernel this might
not have been the case. None of the extended inode fields is critical
for correct filesystem operation."

Do we have to worry about this for inodes that contain extended
attributes and that get updated after 2038?

	Arnd

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-06-02 11:57                       ` Theodore Ts'o
  2014-06-02 12:38                         ` Arnd Bergmann
@ 2014-06-02 12:52                         ` Arnd Bergmann
  2014-06-02 13:07                           ` Theodore Ts'o
  2014-06-02 14:52                         ` H. Peter Anvin
  2 siblings, 1 reply; 124+ messages in thread
From: Arnd Bergmann @ 2014-06-02 12:52 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Nicolas Pitre, H. Peter Anvin, Dave Chinner, linux-kernel,
	linux-arch, joseph, john.stultz, hch, tglx, geert, lftan,
	linux-fsdevel, xfs

On Monday 02 June 2014 07:57:37 Theodore Ts'o wrote:
> On Mon, Jun 02, 2014 at 12:56:42PM +0200, Arnd Bergmann wrote:
> > 
> > I think you misunderstood what I suggested: the intent is to avoid
> > seeing things break in 2038 by making them break much earlier. We have
> > a solution for ext2 file systems, it's called ext4, and we just need
> > to ensure that everybody knows they have to migrate eventually.
> > 
> > At some point before the mid 2030ies, you should no longer be able to
> > build a kernel that has support for ext2 or any other module that will
> > run into bugs later....
> 
> Even for ext4, it's not quite so simple as that.  You only have
> support for times post 2038 if you are using an inode size > 128
> bytes.  There are a very, very large number of machines which even
> today, are using 128 byte inodes with ext4 for performance reasons.
> 
> The vast majority of those machines which I know of can probably move
> to 256 byte inodes relatively easily, since hard drive replacement
> cycles are order 5-6 years tops, so I'm not that concerned, but it
> just goes to show this is a very complicated problem.

One stupid question about the current code:

static inline void ext4_decode_extra_time(struct inode_time *time, __le32 extra)
{                               
       if (sizeof(time->tv_sec) > 4)
               time->tv_sec |= (__u64)(le32_to_cpu(extra) & EXT4_EPOCH_MASK)
                               << 32;
       time->tv_nsec = (le32_to_cpu(extra) & EXT4_NSEC_MASK) >> EXT4_EPOCH_BITS;
}       

#define EXT4_EINODE_GET_XTIME(xtime, einode, raw_inode)                        \
do {                                                                           \
        if (EXT4_FITS_IN_INODE(raw_inode, einode, xtime))                      \
                (einode)->xtime.tv_sec =                                       \
                        (signed)le32_to_cpu((raw_inode)->xtime);               \
        else                                                                   \
                (einode)->xtime.tv_sec = 0;                                    \
        if (EXT4_FITS_IN_INODE(raw_inode, einode, xtime ## _extra))            \
                ext4_decode_extra_time(&(einode)->xtime,                       \
                                       raw_inode->xtime ## _extra);            \
        else                                                                   \
                (einode)->xtime.tv_nsec = 0;                                   \
} while (0)

For a time between 2038 and 2106, this looks like xtime.tv_sec is
negative when ext4_decode_extra_time gets called, so the '|=' operator
doesn't actually do anything. Shouldn't that be '+='?
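
The effect described above can be reproduced outside the kernel. The following is a minimal user-space sketch, not the ext4 code itself: `decode_or()`, `decode_add()` and the `EPOCH_*` macros are hypothetical stand-ins, and the two-bit epoch width is an assumption. It relies on the usual two's-complement wraparound when converting an out-of-range `uint32_t` to `int32_t`.

```c
#include <stdint.h>

/* Assumed for illustration: two epoch bits in the low end of "extra". */
#define EPOCH_BITS 2
#define EPOCH_MASK ((1u << EPOCH_BITS) - 1)

/* Mirrors the quoted logic: sign-extend the 32-bit seconds, then OR in
 * the epoch bits shifted up by 32. */
static int64_t decode_or(uint32_t raw_sec, uint32_t extra)
{
	int64_t sec = (int32_t)raw_sec;	/* sign-extends: upper 32 bits all set if negative */

	sec |= (int64_t)(extra & EPOCH_MASK) << 32;
	return sec;
}

/* The variant the question proposes: add the epoch bits instead. */
static int64_t decode_add(uint32_t raw_sec, uint32_t extra)
{
	int64_t sec = (int32_t)raw_sec;

	sec += (int64_t)(extra & EPOCH_MASK) << 32;
	return sec;
}
```

For an on-disk value of 0x80000000 with epoch bits 1, `decode_or()` leaves the sign-extended value untouched because bit 32 is already set, while `decode_add()` yields 2^31. Whether `+=` is the right fix depends on which on-disk encoding is eventually chosen for post-2038 times.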

	Arnd

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-06-02 12:52                         ` Arnd Bergmann
@ 2014-06-02 13:07                           ` Theodore Ts'o
  2014-06-02 15:01                             ` Arnd Bergmann
  0 siblings, 1 reply; 124+ messages in thread
From: Theodore Ts'o @ 2014-06-02 13:07 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Nicolas Pitre, H. Peter Anvin, Dave Chinner, linux-kernel,
	linux-arch, joseph, john.stultz, hch, tglx, geert, lftan,
	linux-fsdevel, xfs

Yes, there are some ongoing discussions about changing the post-2038
encoding of the timestamp in ext4, which is why this hasn't been fixed
yet.  The main thing that's been missing is time for me to review the
patches, and a good way of writing regression tests that will work (or
at least not fail) on build environments with a 32-bit time_t and
32-bit-only capable versions of functions such as gmtime(3).

And given current discussions, I may want to think about some kind of
superblock flag to allow the use of a 32-bit unsigned encoding for
file systems using a 128-byte inode, with a way of setting that flag
after scanning the file system to make sure there are no times that
are previous to January 1, 1970.  (Or more generally, allow any epoch
to be defined using a 64-bit time_t offset stored in the superblock...)
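
As a rough illustration of the epoch-offset idea, assuming an unsigned 32-bit on-disk seconds field and a per-filesystem 64-bit offset, the conversion could look like the sketch below. The function names and the -1 error return are hypothetical; nothing here reflects an existing ext4 interface, and overflow of `sec - epoch_offset` for extreme inputs is not handled.

```c
#include <stdint.h>

/* On-disk value is unsigned seconds relative to a per-filesystem epoch. */
static int64_t decode_with_epoch(uint32_t raw, int64_t epoch_offset)
{
	return epoch_offset + (int64_t)raw;
}

/* Returns 0 and stores the on-disk value, or -1 if "sec" cannot be
 * represented relative to this epoch. */
static int encode_with_epoch(int64_t sec, int64_t epoch_offset, uint32_t *raw)
{
	int64_t rel = sec - epoch_offset;

	if (rel < 0 || rel > (int64_t)UINT32_MAX)
		return -1;
	*raw = (uint32_t)rel;
	return 0;
}
```

With an offset of 0 this is the plain unsigned encoding, valid from 1970 to 2106; any time before the epoch is rejected, which is why the file system would have to be scanned for pre-1970 times before the flag is set.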

Cheers,

					- Ted

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-06-02 12:38                         ` Arnd Bergmann
@ 2014-06-02 13:15                           ` Theodore Ts'o
  0 siblings, 0 replies; 124+ messages in thread
From: Theodore Ts'o @ 2014-06-02 13:15 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Nicolas Pitre, H. Peter Anvin, Dave Chinner, linux-kernel,
	linux-arch, joseph, john.stultz, hch, tglx, geert, lftan,
	linux-fsdevel, xfs

On Mon, Jun 02, 2014 at 02:38:09PM +0200, Arnd Bergmann wrote:
> 
> "For new inodes we always reserve enough space for the kernel's known
> extended fields, but for inodes created with an old kernel this might
> not have been the case. None of the extended inode fields is critical
> for correct filesystem operation."
> 
> Do we have to worry about this for inodes that contain extended
> attributes and that get updated after 2038?

In practice, the extended timestamps were one of the first things added
to ext4, so the vast majority of ext4 file systems with inode sizes >
128 bytes will have room for the extended timestamps.  There are some
legacy ext3 file systems with 256-byte inodes (enabled for fast
storage of SELinux xattrs) that in theory, could have been converted
to ext4 and had enough xattrs so that the extended timestamps couldn't
be added.  That would be a vanishingly small use case, and in
practice, it's not likely to be the case for the embedded market.

I could imagine someone worrying about file systems originally
formatted using RHEL 4 post-2038 (perhaps running in a VM), but I
don't work for IBM any more, and hopefully even IBM would just tell
such customers that they need to suck it up, and do a
backup/reformat/restore pass.

Cheers,
						- Ted

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [RFC 00/32] making inode time stamps y2038 ready
  2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
                   ` (33 preceding siblings ...)
  2014-05-31 14:51 ` Richard Cochran
@ 2014-06-02 13:52 ` Joseph S. Myers
  2014-06-02 19:19   ` Arnd Bergmann
  34 siblings, 1 reply; 124+ messages in thread
From: Joseph S. Myers @ 2014-06-02 13:52 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-kernel, linux-arch, john.stultz, hch, tglx, geert, lftan,
	hpa, linux-fsdevel, ceph-devel, cluster-devel, coda, codalist,
	fuse-devel, linux-afs, linux-btrfs, linux-cifs, linux-ext4,
	linux-f2fs-devel, linux-mtd, linux-nfs, linux-ntfs-dev,
	linux-scsi, logfs, ocfs2-devel, reiserfs-devel, samba-technical,
	xfs

On Fri, 30 May 2014, Arnd Bergmann wrote:

> a) is this the right approach in general? The previous discussion
>    pointed this way, but there may be other opinions.

The syscall changes seem like the sort of thing I'd expect, although 
patches adding new syscalls or otherwise affecting the kernel/userspace 
interface (as opposed to those relating to an individual filesystem) 
should go to linux-api as well as other relevant lists.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-05-31  5:54           ` Dave Chinner
  2014-05-31  8:41             ` H. Peter Anvin
@ 2014-06-02 14:00             ` Joseph S. Myers
  1 sibling, 0 replies; 124+ messages in thread
From: Joseph S. Myers @ 2014-06-02 14:00 UTC (permalink / raw)
  To: Dave Chinner
  Cc: H. Peter Anvin, Arnd Bergmann, linux-kernel, linux-arch,
	john.stultz, hch, tglx, geert, lftan, linux-fsdevel, xfs

On Sat, 31 May 2014, Dave Chinner wrote:

> If we are changing the in-kernel timestamp to have a greater dynamic
> range than anything we currently support on disk, then we need support
> for all filesystems for similar translation and constraint. The
> filesystems need to be able to tell the kernel what timestamp
> range they support, and then the kernel needs to follow those
> guidelines. And if the filesystem is mounted on a kernel that
> doesn't support the current filesystem's timestamp format, then at
> minimum that filesystem cannot do anything that writes a
> timestamp....
> 
> Put simply: the filesystem defines the timestamp range that can be
> used safely, not the userspace API. If the filesystem can't support
> the date it is handed then that is an out-of-range error. Since
> when have we accepted that it's OK to handle out-of-range data with
> silent overflows or corruption of the data that we are attempting to
> store? We're defining a new API to support a wider date range -
> there is nothing that prevents us from saying ERANGE can be returned
> to a timestamp that the file cannot store correctly....

I don't see anything new about this issue.  All problems that could arise 
from the kernel being able to represent a timestamp some filesystems can't 
are problems that already apply with 64-bit kernels using 64-bit time_t 
internally.  So while as part of Y2038-preparedness we do need a clear 
understanding of which filesystems have what timestamp limits and what 
happens with timestamps beyond those limits, I think this is a separate 
strand of the problem - one that applies to both 32-bit and 64-bit systems 
- from the more general issue for 32-bit systems.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-06-02 11:57                       ` Theodore Ts'o
  2014-06-02 12:38                         ` Arnd Bergmann
  2014-06-02 12:52                         ` Arnd Bergmann
@ 2014-06-02 14:52                         ` H. Peter Anvin
  2 siblings, 0 replies; 124+ messages in thread
From: H. Peter Anvin @ 2014-06-02 14:52 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Arnd Bergmann, Nicolas Pitre, Dave Chinner, linux-kernel,
	linux-arch, joseph, john.stultz, hch, tglx, geert, lftan,
	linux-fsdevel, xfs


> On Jun 2, 2014, at 4:57, "Theodore Ts'o" <tytso@mit.edu> wrote:
> 
>> On Mon, Jun 02, 2014 at 12:56:42PM +0200, Arnd Bergmann wrote:
>> 
>> I think you misunderstood what I suggested: the intent is to avoid
>> seeing things break in 2038 by making them break much earlier. We have
>> a solution for ext2 file systems, it's called ext4, and we just need
>> to ensure that everybody knows they have to migrate eventually.
>> 
>> At some point before the mid 2030ies, you should no longer be able to
>> build a kernel that has support for ext2 or any other module that will
>> run into bugs later....
> 
> Even for ext4, it's not quite so simple as that.  You only have
> support for times post 2038 if you are using an inode size > 128
> bytes.  There are a very, very large number of machines which even
> today, are using 128 byte inodes with ext4 for performance reasons.
> 
> The vast majority of those machines which I know of can probably move
> to 256 byte inodes relatively easily, since hard drive replacement
> cycles are order 5-6 years tops, so I'm not that concerned, but it
> just goes to show this is a very complicated problem.
> 
> And even if we're talking about flash and embedded devices, the good
> news is if you assume that 10 years is enough time for people to
> update their embedded OS builds, and that the vast majority of
> deployed devices will probably only be in service for 10-15 years, we
> do have enough time to make file system format changes, although
> admittedly we can't afford to dilly-dally.

I have a number of file systems older than any device they are sitting
on.  RAID allows individual disks to be swapped out, and when all disks
have been swapped out, extend the file system online.  The system
doesn't even have to be taken offline in the process if it is possible
to physically get to the drives with the system powered (e.g. hot plug
bays), which is really damned nice.

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-06-02 13:07                           ` Theodore Ts'o
@ 2014-06-02 15:01                             ` Arnd Bergmann
  0 siblings, 0 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-06-02 15:01 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Nicolas Pitre, H. Peter Anvin, Dave Chinner, linux-kernel,
	linux-arch, joseph, john.stultz, hch, tglx, geert, lftan,
	linux-fsdevel, xfs

On Monday 02 June 2014 09:07:00 Theodore Ts'o wrote:
> Yes, there are some ongoing discussions about changing the post-2038
> encoding of the timestamp in ext4, which is why this hasn't been fixed
> yet.  The main thing that's been missing is time for me to review the
> patches, and a good way of writing regression tests that will work (or
> at least not fail) on build environments with a 32-bit time_t and
> 32-bit-only capable versions of functions such as gmtime(3).
> 
> And given current discussions, I may want to think about some kind of
> superblock flag to allow the use of a 32-bit unsigned encoding for
> file systems using a 128-byte inode, with a way of setting that flag
> after scanning the file system to make sure there are no times that
> are previous to January 1, 1970.  (Or more generally, allow any epoch
> to be defined using a 64-bit time_t offset stored in the superblock...)

FWIW, I've gone through the other file system implementations once
more. The most common pattern I've encountered is to have a read_inode
function with

	inode->i_mtime = le32_to_cpu(raw_inode->mtime);

which results in interpreting the time as 'signed' on 32-bit
kernels, but as 'unsigned' on 64-bit kernels. This could have been
done intentionally to extend the valid time range to 2106 on 64-bit
kernels, but it seems more likely that the code was written with
no thought given to 64-bit time_t at all. I see this pattern on
p9fs (old protocol only), afs, bfs, ceph, efs, freevxfs, hpfs, jffs2,
jfs, minix, nfsv2/v3 (this was clearly intentional and is
spelled out in the RFC), qnx4, qnx6, reiserfs, squashfs, sysv,
and ufs (protocol version 1 only).

The other behavior I see is to treat the on-disk 32-bit value
as signed on both 32-bit and 64-bit kernels:

	inode->i_mtime = (signed)le32_to_cpu(raw_inode->mtime);

this seems to be done intentionally in all cases, to maintain
compatibility between 32-bit and 64-bit kernels, but it's
relatively rare: exofs, ext2/3/4 (good old inodes) and xfs
are the only ones doing this.
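
The difference between the two patterns can be shown with fixed-width types in user space. This is only a sketch: the helper names are made up for illustration, `int64_t` and `int32_t` stand in for a 64-bit and a 32-bit `time_t`, and the on-disk value is assumed to be already converted to CPU byte order.

```c
#include <stdint.h>

/* Pattern 1 on a 64-bit time_t: the unsigned 32-bit value is
 * zero-extended, so 0xffffffff becomes a date in 2106. */
static int64_t mtime_plain_64bit(uint32_t disk)
{
	return disk;
}

/* Pattern 1 on a 32-bit time_t: the same assignment wraps into a
 * signed 32-bit value, so 0xffffffff becomes -1 (one second before 1970). */
static int32_t mtime_plain_32bit(uint32_t disk)
{
	return (int32_t)disk;
}

/* Pattern 2: the explicit signed cast gives the same result on both
 * kernel word sizes. */
static int64_t mtime_signed(uint32_t disk)
{
	return (int32_t)disk;
}
```

For an on-disk value of 0xffffffff, the first helper returns 4294967295 while the other two return -1, which is the 32-bit versus 64-bit disagreement described above.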

In case of ext2/3/4, the sign handling was introduced here:
http://www.spinics.net/lists/linux-ext4/msg01758.html

exofs and xfs seem to have done it like this for all of git
history.

	Arnd

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-06-02 10:56                     ` Arnd Bergmann
  2014-06-02 11:57                       ` Theodore Ts'o
@ 2014-06-02 15:04                       ` Chuck Lever
  2014-06-02 15:31                         ` Theodore Ts'o
                                           ` (2 more replies)
  1 sibling, 3 replies; 124+ messages in thread
From: Chuck Lever @ 2014-06-02 15:04 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Nicolas Pitre, H. Peter Anvin, Dave Chinner, LKML Kernel,
	linux-arch, joseph, john.stultz, Christoph Hellwig, tglx, geert,
	lftan, linux-fsdevel, xfs, Linux NFS Mailing List


On Jun 2, 2014, at 6:56 AM, Arnd Bergmann <arnd@arndb.de> wrote:

> On Sunday 01 June 2014 21:36:26 Nicolas Pitre wrote:
>> 
>>> For actually running kernels beyond 2038, the best idea I've seen so
>>> far is to disallow all broken code at compile time. I don't see
>>> a choice but to audit the entire kernel for invalid uses on both
>>> 32 and 64 bit in the next few years. A lot of code will get changed
>>> in the process so we can actually keep running 32-bit kernels and
>>> file systems, but other code will likely go away:
>>> 
>>> * any system calls that pass a time_t, timeval or timespec on
>>>  32-bit systems return -ENOSYS, to ensure all user land uses
>>>  the replacements we will put into place
>>> * The definition of 'time_t', 'timeval' and 'timespec' can be hidden
>>>  from the kernel, and all code using it left out.
>>> * ext2 and ext3 file system code will have to be disabled, but that's
>>>  fine since ext4 can mount old file systems.
>> 
>> Syscalls and libs can be "fixed".  Existing filesystem content might 
>> not.  So if you need to mount some old media in read-write mode after 
>> 2038 and that happens to contain an ext2 or similarly limited filesystem 
>> then it'd better just "work".  Having the kernel refuse to modify the 
>> filesystem would be unacceptable.
> 
> I think you misunderstood what I suggested: the intent is to avoid
> seeing things break in 2038 by making them break much earlier. We have
> a solution for ext2 file systems, it's called ext4, and we just need
> to ensure that everybody knows they have to migrate eventually.
> 
> At some point before the mid 2030ies, you should no longer be able to
> build a kernel that has support for ext2 or any other module that will
> run into bugs later. Until then (rather sooner than later), I'd like
> to get to the point where you can choose whether to include those
> modules at build time or not, and then get everybody to turn off that
> option and fix the bugs they run into. You wouldn't need that for a
> 2014-generation long-term support distro (rhel 7, sles 12, debian 7,
> ubuntu 14.04, ...), but perhaps for the next generation, or the
> one after that.

I’m wondering what should be done about NFS. A solution for NFS should
match any scheme that is considered for local file systems, IMO.

NFSv2/3 timestamps are a pair of unsigned 32-bit values: one value for
seconds since midnight GMT Jan 1, 1970, and one value for nanoseconds.
(See the definition of nfstime3 in RFC 1813).

NFSv4 uses a signed 64-bit value where zero represents midnight UTC
on January 1, 1970, and an unsigned 32-bit value for nanoseconds. (See
the definition of nfstime4 in RFC 5661).

The NFSv4 protocol is probably not problematic, and NFSv3 should be out
of the picture by 2038. But if changes are planned for dealing _now_
with timestamp issues, compatibility with NFSv3 is a consideration.

It is already the case that, via NFSv3, the Linux NFS client transmits
timestamps earlier than 1970 as large positive numbers. Try this with
xfstests generic/258.

Maybe nfs3_proc_setattr() should recognize pre-epoch timestamps and
timestamps larger than can be represented in an unsigned 32-bit field
and return an immediate error to the requesting application (like EINVAL).

If the Linux NFS server encounters a local file with a timestamp that
cannot be represented via a u32, should it also return NFS3ERR_INVAL?

RFC 1813 does not provide guidance on the behavior nor does it suggest
a particular error status code. The Solaris 11 server appears to return
NFS3ERR_INVAL in this case.

An alternative would be to “cap” the timestamps transmitted via NFSv3 by
Linux, so that a pre-epoch timestamp is transmitted as zero, and a large
timestamp is transmitted as UINT_MAX.
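
The two options discussed above, rejecting versus capping, can be sketched as plain C helpers. These are hypothetical user-space functions for illustration, not the Linux NFS client or server code, and the choice of EINVAL simply follows the suggestion made earlier in this message.

```c
#include <errno.h>
#include <stdint.h>

/* Option 1: refuse timestamps that do not fit an unsigned 32-bit field.
 * Returns 0 and stores the wire value, or -EINVAL if out of range. */
static int example_check_seconds(int64_t sec, uint32_t *wire)
{
	if (sec < 0 || sec > (int64_t)UINT32_MAX)
		return -EINVAL;
	*wire = (uint32_t)sec;
	return 0;
}

/* Option 2: cap instead of failing. Pre-epoch times become 0 and
 * times past 2106 become UINT32_MAX. */
static uint32_t example_clamp_seconds(int64_t sec)
{
	if (sec < 0)
		return 0;
	if (sec > (int64_t)UINT32_MAX)
		return UINT32_MAX;
	return (uint32_t)sec;
}
```

Capping loses information silently, since a receiver cannot tell a clamped value from a genuine timestamp of 0 or UINT32_MAX, whereas the error variant makes the limit visible to the application.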

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com




^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-06-02 15:04                       ` Chuck Lever
@ 2014-06-02 15:31                         ` Theodore Ts'o
  2014-06-02 17:12                           ` H. Peter Anvin
  2014-06-02 18:52                         ` Arnd Bergmann
  2014-06-02 18:58                         ` Roger Willcocks
  2 siblings, 1 reply; 124+ messages in thread
From: Theodore Ts'o @ 2014-06-02 15:31 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Arnd Bergmann, Nicolas Pitre, H. Peter Anvin, Dave Chinner,
	LKML Kernel, linux-arch, joseph, john.stultz, Christoph Hellwig,
	tglx, geert, lftan, linux-fsdevel, xfs, Linux NFS Mailing List

On Mon, Jun 02, 2014 at 11:04:23AM -0400, Chuck Lever wrote:
> I’m wondering what should be done about NFS. A solution for NFS should
> match any scheme that is considered for local file systems, IMO.
> 
> An alternative would be to “cap” the timestamps transmitted via NFSv3 by
> Linux, so that a pre-epoch timestamp is transmitted as zero, and a large
> timestamp is transmitted as UINT_MAX.


I wonder if it would make sense to try to promulgate via the Austin
group, and possibly the C standards committee the concept of a bit
pattern (that might commonly be INT_MAX or UINT_MAX) that means "time
unknown", or "time indefinite" or "we couldn't encode the time".

We would then teach gmtime(3) and asctime(3) to print some appropriate
message, and we could teach programs like find (with the -mtime)
option, make, tmpwatch, et al., that they can't make any presumption
about the comparability of any timestamp which has a value of
TIME_UNDEFINED.

It would be problematic for time(2) or gettimeofday(2) to return
TIME_UNDEFINED, since there are programs that care about time ticking
forward, but I could imagine a new interface which would be permitted
to return a flag indicating that we don't know the current time
(because the CMOS battery had run down, etc.) so instead we're going
to be counting the number of seconds since the system was booted.

    	      	      	    	       - Ted

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-06-02 15:31                         ` Theodore Ts'o
@ 2014-06-02 17:12                           ` H. Peter Anvin
  2014-06-02 18:50                             ` Arnd Bergmann
  2014-06-02 22:29                             ` Theodore Ts'o
  0 siblings, 2 replies; 124+ messages in thread
From: H. Peter Anvin @ 2014-06-02 17:12 UTC (permalink / raw)
  To: Theodore Ts'o, Chuck Lever, Arnd Bergmann, Nicolas Pitre,
	Dave Chinner, LKML Kernel, linux-arch, joseph, john.stultz,
	Christoph Hellwig, tglx, geert, lftan, linux-fsdevel, xfs,
	Linux NFS Mailing List

On 06/02/2014 08:31 AM, Theodore Ts'o wrote:
> 
> I wonder if it would make sense to try to promulgate via the Austin
> group, and possibly the C standards committee the concept of a bit
> pattern (that might commonly be INT_MAX or UINT_MAX) that means "time
> unknown", or "time indefinite" or "we couldn't encode the time".
> 

(time_t)-1 already has this meaning for some calls (e.g. time(2)).
However, this also means Wed Dec 31 23:59:59 UTC 1969, and unfortunately
something similar applies to all possible bit patterns, certainly within
the range of an int.

> We would then teach gmtime(3) and asctime(3) to print some appropriate
> message, and we could teach programs like find (with the -mtime)
> option, make, tmpwatch, et al., that they can't make any presumption
> about the comparability of any timestamp which has a value of
> TIME_UNDEFINED.
> 
> It would be problematic for time(2) or gettimeofday(2) to return
> TIME_UNDEFINED, since there are programs that care about time ticking
> forward, but I could imagine a new interface which would be permitted
> to return a flag indicating that we don't know the current time
> (because the CMOS battery had run down, etc.) so instead we're going
> to be counting the number of seconds since the system was booted.

This assumes that we actually know that that is the case, which may be
an aggressive assumption.

	-hpa



^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-06-02 17:12                           ` H. Peter Anvin
@ 2014-06-02 18:50                             ` Arnd Bergmann
  2014-06-02 22:29                             ` Theodore Ts'o
  1 sibling, 0 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-06-02 18:50 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Theodore Ts'o, Chuck Lever, Nicolas Pitre, Dave Chinner,
	LKML Kernel, linux-arch, joseph, john.stultz, Christoph Hellwig,
	tglx, geert, lftan, linux-fsdevel, xfs, Linux NFS Mailing List

On Monday 02 June 2014 10:12:37 H. Peter Anvin wrote:
> On 06/02/2014 08:31 AM, Theodore Ts'o wrote:
> > 
> > I wonder if it would make sense to try to promulgate via the Austin
> > group, and possibly the C standards committee the concept of a bit
> > pattern (that might commonly be INT_MAX or UINT_MAX) that means "time
> > unknown", or "time indefinite" or "we couldn't encode the time".
> > 
> 
> (time_t)-1 already has this meaning for some calls (e.g. time(2)).
> However, this also means Wed Dec 31 23:59:59 UTC 1969, and unfortunately
> something similar applies to all possible bit patterns, certainly within
> the range of an int.

Worse than Wed Dec 31 23:59:59 UTC 1969, on NFSv3 it also means
"Sun Feb  7 07:28:15 CET 2106", and that is much harder to distinguish
from a real future date.

If we had the choice, I'd go for something like 1, i.e.
"Thu Jan  1 01:00:01 CET 1970".

> > We would then teach gmtime(3) and asctime(3) to print some appropriate
> > message, and we could teach programs like find (with the -mtime)
> > option, make, tmpwatch, et al., that they can't make any presumption
> > about the comparability of any timestamp which has a value of
> > TIME_UNDEFINED.
> > 
> > It would be problematic for time(2) or gettimeofday(2) to return
> > TIME_UNDEFINED, since there are programs that care about time ticking
> > forward, but I could imagine a new interface which would be permitted
> > to return a flag indicating that we don't know the current time
> > (because the CMOS battery had run down, etc.) so instead we're going
> > to be counting the number of seconds since the system was booted.
> 
> This assumes that we actually know that that is the case, which may be
> an aggressive assumption.

It's harder for time(2), but for the inode case, we can definitely
detect when the file system specific representation overflows
or underflows, which may be at a number of very different points
of time.

	Arnd

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-06-02 15:04                       ` Chuck Lever
  2014-06-02 15:31                         ` Theodore Ts'o
@ 2014-06-02 18:52                         ` Arnd Bergmann
  2014-06-02 18:58                         ` Roger Willcocks
  2 siblings, 0 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-06-02 18:52 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Nicolas Pitre, H. Peter Anvin, Dave Chinner, LKML Kernel,
	linux-arch, joseph, john.stultz, Christoph Hellwig, tglx, geert,
	lftan, linux-fsdevel, xfs, Linux NFS Mailing List

On Monday 02 June 2014 11:04:23 Chuck Lever wrote:
> I’m wondering what should be done about NFS. A solution for NFS should
> match any scheme that is considered for local file systems, IMO.
> 
> NFSv2/3 timestamps are a pair of unsigned 32-bit values: one value for
> seconds since midnight GMT Jan 1, 1970, and one value for nanoseconds.
> (See the definition of nfstime3 in RFC 1813).
> 
> NFSv4 uses a signed 64-bit value where zero represents midnight UTC
> on January 1, 1970, and an unsigned 32-bit value for nanoseconds. (See
> the definition of nfstime4 in RFC 5661).
> 
> The NFSv4 protocol is probably not problematic, and NFSv3 should be out
> of the picture by 2038. But if changes are planned for dealing _now_
> with timestamp issues, compatibility with NFSv3 is a consideration.
> 
> It is already the case that, via NFSv3, the Linux NFS client transmits
> timestamps earlier than 1970 as large positive numbers. Try this with
> xfstests generic/258.

If I read the code correctly, a pre-1970 timestamp will be sent as
a large unsigned integer, but received as a post-2038 timestamp on
64-bit kernels, both in the nfs client and server code.

This behavior is clearly wrong, but it's the same bug that we have
in lots of other file systems, and it makes sense to have the
same fix everywhere, at least in the cases where we know what interpretation
we actually want. NFS has the luxury of having an actual specification
saying that the value is unsigned. For most of the legacy file systems,
we can only make a guess at how other OSs would interpret the same
numbers.

	Arnd

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-06-02 15:04                       ` Chuck Lever
  2014-06-02 15:31                         ` Theodore Ts'o
  2014-06-02 18:52                         ` Arnd Bergmann
@ 2014-06-02 18:58                         ` Roger Willcocks
  2014-06-02 19:04                           ` Chuck Lever
  2 siblings, 1 reply; 124+ messages in thread
From: Roger Willcocks @ 2014-06-02 18:58 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Arnd Bergmann, Nicolas Pitre, linux-arch, Linux NFS Mailing List,
	LKML Kernel, lftan, Christoph Hellwig, john.stultz,
	H. Peter Anvin, linux-fsdevel, geert, tglx, xfs, joseph


On Mon, 2014-06-02 at 11:04 -0400, Chuck Lever wrote:

> NFSv2/3 timestamps are a pair of unsigned 32-bit values: one value for
> seconds since midnight GMT Jan 1, 1970, and one value for nanoseconds.
> (See the definition of nfstime3 in RFC 1813).
> 

nfstime3 could be extended by redefining the otherwise unused
nanoseconds bits{31,30} as seconds{33,32}, to give a (signed) 34-bit
seconds field and an unsigned 30-bit nanoseconds field.

This could represent 1970 +/- 272 years.
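
(2^33 seconds is roughly 272 years, hence the range.) A minimal sketch of
this bit layout follows. The struct and function names are hypothetical and
only model the proposal as worded above; nothing like this exists in RFC 1813
or in the Linux NFS code.

```c
#include <assert.h>
#include <stdint.h>

/* The two unsigned 32-bit words that nfstime3 puts on the wire. */
struct nfstime3_wire {
	uint32_t seconds;
	uint32_t nseconds;
};

/* Pack a signed 34-bit seconds value: bits 31..0 go into 'seconds',
 * bits 33..32 go into the otherwise unused top two bits of 'nseconds'. */
struct nfstime3_wire nfstime3_ex_pack(int64_t sec, uint32_t nsec)
{
	struct nfstime3_wire w;
	uint64_t s;

	assert(sec >= -(1LL << 33) && sec < (1LL << 33));
	assert(nsec < 1000000000u);	/* always fits in 30 bits */

	s = (uint64_t)sec;		/* two's complement bit pattern */
	w.seconds = (uint32_t)s;
	w.nseconds = nsec | ((uint32_t)((s >> 32) & 0x3u) << 30);
	return w;
}

/* Recover the signed 34-bit seconds value. */
int64_t nfstime3_ex_seconds(struct nfstime3_wire w)
{
	uint64_t s = ((uint64_t)(w.nseconds >> 30) << 32) | w.seconds;

	if (s & (1ULL << 33))		/* sign-extend from bit 33 */
		return (int64_t)s - (1LL << 34);
	return (int64_t)s;
}

/* Recover the 30-bit nanoseconds value. */
uint32_t nfstime3_ex_nseconds(struct nfstime3_wire w)
{
	return w.nseconds & 0x3FFFFFFFu;
}
```

For any time between 1970 and 2106 the two extra bits are zero, so the
packed words are identical to what a legacy implementation sends.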

Servers could indicate they can understand the extended time format by
adding a new FSINFO capability - FSF3_CANSETTIME_EX.

Clients would use a new SET_TO_CLIENT_TIME_EX time_how enum when sending
timestamps so old servers would be protected from new clients.

Old clients don't need to be protected from new servers because the
on-the-wire bit pattern for dates between 1970 and 2106 stays the same,
so they're no worse off than they were before.

Arguably the new server ought to clamp out-of-range timestamps before
sending them to old clients but that would need per-client state (and
nfs3 is stateless.)

--
Roger




* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-06-02 18:58                         ` Roger Willcocks
@ 2014-06-02 19:04                           ` Chuck Lever
  2014-06-02 19:10                             ` Arnd Bergmann
  0 siblings, 1 reply; 124+ messages in thread
From: Chuck Lever @ 2014-06-02 19:04 UTC (permalink / raw)
  To: Roger Willcocks
  Cc: Arnd Bergmann, Nicolas Pitre, linux-arch, Linux NFS Mailing List,
	LKML Kernel, lftan, Christoph Hellwig, john.stultz,
	H. Peter Anvin, linux-fsdevel, geert, tglx, xfs, joseph


On Jun 2, 2014, at 2:58 PM, Roger Willcocks <roger@filmlight.ltd.uk> wrote:

> 
> On Mon, 2014-06-02 at 11:04 -0400, Chuck Lever wrote:
> 
>> NFSv2/3 timestamps are a pair of unsigned 32-bit values: one value for
>> seconds since midnight GMT Jan 1, 1970, and one value for nanoseconds.
>> (See the definition of nfstime3 in RFC 1813).
>> 
> 
> nfstime3 could be extended by redefining the otherwise unused
> nanoseconds bits{31,30} as seconds{33,32}, to give a (signed) 34-bit
> seconds field and an unsigned 30-bit nanoseconds field.
> 
> This could represent 1970 +/- 272 years.
> 
> Servers could indicate they can understand the extended time format by
> adding a new FSINFO capability - FSF3_CANSETTIME_EX.
> 
> Clients would use a new SET_TO_CLIENT_TIME_EX time_how enum when sending
> timestamps so old servers would be protected from new clients.

You would have to get the IETF’s NFSv4 working group to sign off on
this change. Otherwise, Linux would be the only NFSv3 implementation
that supports the extension.

But I suspect the answer you’d get is “Use NFSv4.”

> Old clients don't need to be protected from new servers because the
> on-the-wire bit pattern for dates between 1970 and 2106 stays the same,
> so they're no worse off than they were before.
> 
> Arguably the new server ought to clamp out-of-range timestamps before
> sending them to old clients but that would need per-client state (and
> nfs3 is stateless.)

There’s no reliable way in NFSv3 for clients and servers to identify
the software running on the peer.

Practically speaking, you should assume that the NFSv3 protocol is never
going to change.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com





* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-06-02 19:04                           ` Chuck Lever
@ 2014-06-02 19:10                             ` Arnd Bergmann
  0 siblings, 0 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-06-02 19:10 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Roger Willcocks, Nicolas Pitre, linux-arch,
	Linux NFS Mailing List, LKML Kernel, lftan, Christoph Hellwig,
	john.stultz, H. Peter Anvin, linux-fsdevel, geert, tglx, xfs,
	joseph

On Monday 02 June 2014 15:04:27 Chuck Lever wrote:
> On Jun 2, 2014, at 2:58 PM, Roger Willcocks <roger@filmlight.ltd.uk> wrote:
> 
> > 
> > On Mon, 2014-06-02 at 11:04 -0400, Chuck Lever wrote:
> > 
> >> NFSv2/3 timestamps are a pair of unsigned 32-bit values: one value for
> >> seconds since midnight GMT Jan 1, 1970, and one value for nanoseconds.
> >> (See the definition of nfstime3 in RFC 1813).
> >> 
> > 
> > nfstime3 could be extended by redefining the otherwise unused
> > nanoseconds bits{31,30} as seconds{33,32}, to give a (signed) 34-bit
> > seconds field and an unsigned 30-bit nanoseconds field.
> > 
> > This could represent 1970 +/- 272 years.
> > 
> > Servers could indicate they can understand the extended time format by
> > adding a new FSINFO capability - FSF3_CANSETTIME_EX.
> > 
> > Clients would use a new SET_TO_CLIENT_TIME_EX time_how enum when sending
> > timestamps so old servers would be protected from new clients.
> 
> You would have to get the IETF’s NFSv4 working group to sign off on
> this change. Otherwise, Linux would be the only NFSv3 implementation
> that supports the extension.
> 
> But I suspect the answer you’d get is “Use NFSv4.”

While I've never dealt with an NFS standardization, I'd assume this is
a workable answer. The NFSv2 and NFSv3 definitions clearly define a valid
range of times until 2106 using unsigned seconds, and that should really
give enough time to migrate to something better (not necessarily NFSv4).

	Arnd


* Re: [RFC 00/32] making inode time stamps y2038 ready
  2014-06-02 13:52 ` Joseph S. Myers
@ 2014-06-02 19:19   ` Arnd Bergmann
  2014-06-02 19:26     ` H. Peter Anvin
  2014-06-02 21:02     ` Joseph S. Myers
  0 siblings, 2 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-06-02 19:19 UTC (permalink / raw)
  To: Joseph S. Myers
  Cc: linux-kernel, linux-arch, john.stultz, hch, tglx, geert, lftan,
	hpa, linux-fsdevel, ceph-devel, cluster-devel, coda, codalist,
	fuse-devel, linux-afs, linux-btrfs, linux-cifs, linux-ext4,
	linux-f2fs-devel, linux-mtd, linux-nfs, linux-ntfs-dev,
	linux-scsi, logfs, ocfs2-devel, reiserfs-devel, samba-technical,
	xfs

On Monday 02 June 2014 13:52:19 Joseph S. Myers wrote:
> On Fri, 30 May 2014, Arnd Bergmann wrote:
> 
> > a) is this the right approach in general? The previous discussion
> >    pointed this way, but there may be other opinions.
> 
> The syscall changes seem like the sort of thing I'd expect, although 
> patches adding new syscalls or otherwise affecting the kernel/userspace 
> interface (as opposed to those relating to an individual filesystem) 
> should go to linux-api as well as other relevant lists.

Ok. Sorry about missing linux-api, I confused it with linux-arch, which
may not be as relevant here, except for the one question whether we
actually want to have the new ABI on all 32-bit architectures or only
as an opt-in for those that expect to stay around for another 24 years.

Two more questions for you:

- are you (and others) happy with adding this type of stat syscall
  (fstatat64/fstat64) as opposed to the more generic xstat that has
  been discussed in the past and that never made it through the bike-
  shedding discussion?

- once we have enough buy-in from reviewers to merge this initial
  series, should we proceed to define the rest of the syscall ABI
  (minus driver ioctls) so glibc and kernel can do the conversion
  on top of that, or should we better try to do things one syscall
  family at a time and actually get the kernel to handle them
  correctly internally?

	Arnd


* Re: [RFC 00/32] making inode time stamps y2038 ready
  2014-06-02 19:19   ` Arnd Bergmann
@ 2014-06-02 19:26     ` H. Peter Anvin
  2014-06-02 19:55       ` Arnd Bergmann
  2014-06-02 21:02     ` Joseph S. Myers
  1 sibling, 1 reply; 124+ messages in thread
From: H. Peter Anvin @ 2014-06-02 19:26 UTC (permalink / raw)
  To: Arnd Bergmann, Joseph S. Myers
  Cc: linux-kernel, linux-arch, john.stultz, hch, tglx, geert, lftan,
	linux-fsdevel, ceph-devel, cluster-devel, coda, codalist,
	fuse-devel, linux-afs, linux-btrfs, linux-cifs, linux-ext4,
	linux-f2fs-devel, linux-mtd, linux-nfs, linux-ntfs-dev,
	linux-scsi, logfs, ocfs2-devel, reiserfs-devel, samba-technical,
	xfs

On 06/02/2014 12:19 PM, Arnd Bergmann wrote:
> On Monday 02 June 2014 13:52:19 Joseph S. Myers wrote:
>> On Fri, 30 May 2014, Arnd Bergmann wrote:
>>
>>> a) is this the right approach in general? The previous discussion
>>>    pointed this way, but there may be other opinions.
>>
>> The syscall changes seem like the sort of thing I'd expect, although 
>> patches adding new syscalls or otherwise affecting the kernel/userspace 
>> interface (as opposed to those relating to an individual filesystem) 
>> should go to linux-api as well as other relevant lists.
> 
> Ok. Sorry about missing linux-api, I confused it with linux-arch, which
> may not be as relevant here, except for the one question whether we
> actually want to have the new ABI on all 32-bit architectures or only
> as an opt-in for those that expect to stay around for another 24 years.
> 
> Two more questions for you:
> 
> - are you (and others) happy with adding this type of stat syscall
>   (fstatat64/fstat64) as opposed to the more generic xstat that has
>   been discussed in the past and that never made it through the bike-
>   shedding discussion?
> 
> - once we have enough buy-in from reviewers to merge this initial
>   series, should we proceed to define rest of the syscall ABI
>   (minus driver ioctls) so glibc and kernel can do the conversion
>   on top of that, or should we better try to do things one syscall
>   family at a time and actually get the kernel to handle them
>   correctly internally?
> 

The bit that is really going to hurt is every single ioctl that uses a
timespec.

Honestly, though, I really don't understand the point with "struct
inode_time".  It seems like the zeroeth-order thing is to change the
kernel internal version of struct timespec to have a 64-bit time... it
isn't just about inodes.  We then should be explicit about the external
uses of time, and use accessors.

	-hpa




* Re: [RFC 00/32] making inode time stamps y2038 ready
  2014-06-02 19:26     ` H. Peter Anvin
@ 2014-06-02 19:55       ` Arnd Bergmann
  2014-06-02 21:57         ` H. Peter Anvin
  0 siblings, 1 reply; 124+ messages in thread
From: Arnd Bergmann @ 2014-06-02 19:55 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Joseph S. Myers, linux-kernel, linux-arch, john.stultz, hch,
	tglx, geert, lftan, linux-fsdevel, ceph-devel, cluster-devel,
	coda, codalist, fuse-devel, linux-afs, linux-btrfs, linux-cifs,
	linux-ext4, linux-f2fs-devel, linux-mtd, linux-nfs,
	linux-ntfs-dev, linux-scsi, logfs, ocfs2-devel, reiserfs-devel,
	samba-technical, xfs

On Monday 02 June 2014 12:26:22 H. Peter Anvin wrote:
> On 06/02/2014 12:19 PM, Arnd Bergmann wrote:
> > On Monday 02 June 2014 13:52:19 Joseph S. Myers wrote:
> >> On Fri, 30 May 2014, Arnd Bergmann wrote:
> >>
> >>> a) is this the right approach in general? The previous discussion
> >>>    pointed this way, but there may be other opinions.
> >>
> >> The syscall changes seem like the sort of thing I'd expect, although 
> >> patches adding new syscalls or otherwise affecting the kernel/userspace 
> >> interface (as opposed to those relating to an individual filesystem) 
> >> should go to linux-api as well as other relevant lists.
> > 
> > Ok. Sorry about missing linux-api, I confused it with linux-arch, which
> > may not be as relevant here, except for the one question whether we
> > actually want to have the new ABI on all 32-bit architectures or only
> > as an opt-in for those that expect to stay around for another 24 years.
> > 
> > Two more questions for you:
> > 
> > - are you (and others) happy with adding this type of stat syscall
> >   (fstatat64/fstat64) as opposed to the more generic xstat that has
> >   been discussed in the past and that never made it through the bike-
> >   shedding discussion?
> > 
> > - once we have enough buy-in from reviewers to merge this initial
> >   series, should we proceed to define rest of the syscall ABI
> >   (minus driver ioctls) so glibc and kernel can do the conversion
> >   on top of that, or should we better try to do things one syscall
> >   family at a time and actually get the kernel to handle them
> >   correctly internally?
> > 
> 
> The bit that is really going to hurt is every single ioctl that uses a
> timespec.
> 
> Honestly, though, I really don't understand the point with "struct
> inode_time".  It seems like the zeroeth-order thing is to change the
> kernel internal version of struct timespec to have a 64-bit time... it
> isn't just about inodes.  We then should be explicit about the external
> uses of time, and use accessors.

I picked these because they are fairly isolated from all other uses,
in particular since inode times are the only things where we really
care about times in the distant past or future (decades away as opposed
to things that happened between boot and shutdown).

For other kernel-internal uses, we may be better off migrating to
a completely different representation, such as nanoseconds since
boot or the architecture specific ktime_t, but this is really something
to decide for each subsystem.

I just tried building an arm32 kernel with an s64 time_t, and that
failed horribly: I get linker errors for missing 64-bit divides
and lots of warnings for code that expects time_t pointers to
functions taking a 'long' or vice versa. I also think the only
way to maintain ABI compatibility is to separate the internal uses
from the interface, which means auditing all code in the end.

	Arnd


* Re: [RFC 00/32] making inode time stamps y2038 ready
  2014-06-02 19:19   ` Arnd Bergmann
  2014-06-02 19:26     ` H. Peter Anvin
@ 2014-06-02 21:02     ` Joseph S. Myers
  2014-06-04 15:05       ` Arnd Bergmann
  1 sibling, 1 reply; 124+ messages in thread
From: Joseph S. Myers @ 2014-06-02 21:02 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-kernel, linux-arch, john.stultz, hch, tglx, geert, lftan,
	hpa, linux-fsdevel, ceph-devel, cluster-devel, coda, codalist,
	fuse-devel, linux-afs, linux-btrfs, linux-cifs, linux-ext4,
	linux-f2fs-devel, linux-mtd, linux-nfs, linux-ntfs-dev,
	linux-scsi, logfs, ocfs2-devel, reiserfs-devel, samba-technical,
	xfs

On Mon, 2 Jun 2014, Arnd Bergmann wrote:

> Ok. Sorry about missing linux-api, I confused it with linux-arch, which
> may not be as relevant here, except for the one question whether we
> actually want to have the new ABI on all 32-bit architectures or only
> as an opt-in for those that expect to stay around for another 24 years.

For glibc I think it will make the most sense to add the support for 
64-bit time_t across all architectures that currently have 32-bit time_t 
(with the new interfaces having fallback support to implementation in 
terms of the 32-bit kernel interfaces, if the 64-bit syscalls are 
unavailable either at runtime or in the kernel headers against which glibc 
is compiled - this fallback code will of course need to check for overflow 
when passing a time value to the kernel, hopefully with error handling 
consistent with whatever the kernel ends up doing when a filesystem can't 
support a timestamp).  If some architectures don't provide the new 
interfaces in the kernel then that will mean the fallback code in glibc 
can't be removed until glibc support for those architectures is removed 
(as opposed to removing it when glibc no longer supports kernels predating 
the kernel support).
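
A minimal sketch of the overflow check such fallback code would need. The
struct layouts and the function name are hypothetical, and reporting the
failure as EOVERFLOW is only an assumption: as noted above, the error
handling should follow whatever the kernel ends up doing.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layouts for this sketch, not the real glibc/kernel types. */
struct demo_timespec32 {
	int32_t tv_sec;
	int32_t tv_nsec;
};

struct demo_timespec64 {
	int64_t tv_sec;
	int32_t tv_nsec;
};

/* Narrow a 64-bit time value before handing it to a 32-bit kernel
 * interface. Returns 0 on success, or -1 with errno set to EOVERFLOW
 * (an assumed choice) when the seconds do not fit in 32 bits. */
int demo_timespec64_to_32(const struct demo_timespec64 *in,
			  struct demo_timespec32 *out)
{
	assert(in != NULL && out != NULL);

	if (in->tv_sec < INT32_MIN || in->tv_sec > INT32_MAX) {
		errno = EOVERFLOW;
		return -1;
	}
	out->tv_sec = (int32_t)in->tv_sec;
	out->tv_nsec = in->tv_nsec;
	return 0;
}
```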

> Two more questions for you:
> 
> - are you (and others) happy with adding this type of stat syscall
>   (fstatat64/fstat64) as opposed to the more generic xstat that has
>   been discussed in the past and that never made it through the bike-
>   shedding discussion?

I am.

> - once we have enough buy-in from reviewers to merge this initial
>   series, should we proceed to define rest of the syscall ABI
>   (minus driver ioctls) so glibc and kernel can do the conversion
>   on top of that, or should we better try to do things one syscall
>   family at a time and actually get the kernel to handle them
>   correctly internally?

I don't have any comments on that ordering question.

-- 
Joseph S. Myers
joseph@codesourcery.com


* Re: [RFC 00/32] making inode time stamps y2038 ready
  2014-06-02 19:55       ` Arnd Bergmann
@ 2014-06-02 21:57         ` H. Peter Anvin
  2014-06-03 14:22           ` Arnd Bergmann
  0 siblings, 1 reply; 124+ messages in thread
From: H. Peter Anvin @ 2014-06-02 21:57 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Joseph S. Myers, linux-kernel, linux-arch, john.stultz, hch,
	tglx, geert, lftan, linux-fsdevel, ceph-devel, cluster-devel,
	coda, codalist, fuse-devel, linux-afs, linux-btrfs, linux-cifs,
	linux-ext4, linux-f2fs-devel, linux-mtd, linux-nfs,
	linux-ntfs-dev, linux-scsi, logfs, ocfs2-devel, reiserfs-devel,
	samba-technical, xfs

On 06/02/2014 12:55 PM, Arnd Bergmann wrote:
>>
>> The bit that is really going to hurt is every single ioctl that uses a
>> timespec.
>>
>> Honestly, though, I really don't understand the point with "struct
>> inode_time".  It seems like the zeroeth-order thing is to change the
>> kernel internal version of struct timespec to have a 64-bit time... it
>> isn't just about inodes.  We then should be explicit about the external
>> uses of time, and use accessors.
> 
> I picked these because they are fairly isolated from all other uses,
> in particular since inode times are the only things where we really
> care about times in the distant past or future (decades away as opposed
> to things that happened between boot and shutdown).
> 

If nothing else, I would expect to be able to set the system time to
weird values for testing.  So I'm not so sure I agree with that...

> For other kernel-internal uses, we may be better off migrating to
> a completely different representation, such as nanoseconds since
> boot or the architecture specific ktime_t, but this is really something
> to decide for each subsystem.

Having a bunch of different time representations in the kernel seems
like a real headache...

	-hpa




* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-06-02 17:12                           ` H. Peter Anvin
  2014-06-02 18:50                             ` Arnd Bergmann
@ 2014-06-02 22:29                             ` Theodore Ts'o
  2014-06-02 22:32                               ` H. Peter Anvin
  1 sibling, 1 reply; 124+ messages in thread
From: Theodore Ts'o @ 2014-06-02 22:29 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Chuck Lever, Arnd Bergmann, Nicolas Pitre, Dave Chinner,
	LKML Kernel, linux-arch, joseph, john.stultz, Christoph Hellwig,
	tglx, geert, lftan, linux-fsdevel, xfs, Linux NFS Mailing List

On Mon, Jun 02, 2014 at 10:12:37AM -0700, H. Peter Anvin wrote:
> > It would be problematic for time(2) or gettimeofday(2) to return
> > TIME_UNDEFINED, since there are programs that care about time ticking
> > forward, but I could imagine a new interface which would be permitted
> > to return a flag indicating that we don't know the current time
> > (because the CMOS battery had run down, etc.) so instead we're going
> > to be counting the number of seconds since the system was booted.
> 
> This assumes that we actually know that that is the case, which may be
> an aggressive assumption.

We won't know if the RTC clock is wrong, true --- but the kernel will
know if (a) the hardware doesn't have an RTC clock at all, or if (b) the
RTC clock is ticking some time that can't be encoded using the current
time_t type.  So in that case, the fallback would be for the
kernel to tick starting with time_t == 0 when the system is initially
booted, and the "time indefinite flag" would be set.

Now assume that we have a new system call, gettimestampofday(2), which
returns a new timestamp structure which has a 64-bit ts_sec field, the
ts_nsec field (ala struct timespec), and a ts_flags field, where the
kernel could signal things like "time invalid", or "time can't be
encoded in the legacy time_t type", or "I'm not sure if the time is
correct" --- i.e., because the RTC battery isn't working.
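
As a rough sketch of what such a structure could look like: the field names
ts_sec, ts_nsec and ts_flags come from the description above, while the flag
names, their values and the helper function are invented here purely for
illustration. No such system call exists.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits for ts_flags. */
#define DEMO_TS_INVALID		0x1u	/* time is known to be invalid */
#define DEMO_TS_NO_LEGACY	0x2u	/* does not fit a 32-bit time_t */
#define DEMO_TS_UNCERTAIN	0x4u	/* e.g. RTC battery not working */

struct demo_timestamp {
	int64_t  ts_sec;
	uint32_t ts_nsec;
	uint32_t ts_flags;
};

/* Model of how the kernel side might fill in the structure.
 * 'rtc_trusted' stands in for whatever the hardware can report. */
struct demo_timestamp demo_make_timestamp(int64_t sec, uint32_t nsec,
					  int rtc_trusted)
{
	struct demo_timestamp ts;

	assert(nsec < 1000000000u);

	ts.ts_sec = sec;
	ts.ts_nsec = nsec;
	ts.ts_flags = 0;
	if (sec < INT32_MIN || sec > INT32_MAX)
		ts.ts_flags |= DEMO_TS_NO_LEGACY;
	if (!rtc_trusted)
		ts.ts_flags |= DEMO_TS_UNCERTAIN;
	return ts;
}
```

A caller that only understands the legacy time_t could then test
DEMO_TS_NO_LEGACY instead of silently truncating ts_sec.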

Not all hardware might be able to support the last, of course, but if
the battery is low, or the system has been exposed to very low
temperatures (or large amounts of cosmic radiation, etc.)  the RTC
time may just be plain wrong.  No system is going to be perfect, but
it should be possible to make things better, at least for certain classes of
hardware.

And since we are already returning (time_t) -1 in some cases, we might
as well try to make things a bit more formal.

					- Ted



* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-06-02 22:29                             ` Theodore Ts'o
@ 2014-06-02 22:32                               ` H. Peter Anvin
  2014-06-02 23:32                                 ` Theodore Ts'o
  0 siblings, 1 reply; 124+ messages in thread
From: H. Peter Anvin @ 2014-06-02 22:32 UTC (permalink / raw)
  To: Theodore Ts'o, Chuck Lever, Arnd Bergmann, Nicolas Pitre,
	Dave Chinner, LKML Kernel, linux-arch, joseph, john.stultz,
	Christoph Hellwig, tglx, geert, lftan, linux-fsdevel, xfs,
	Linux NFS Mailing List

On 06/02/2014 03:29 PM, Theodore Ts'o wrote:
> 
> And since we are already returning (time_t) -1 in some cases, we might
> as well try to make things a bit more formal.
> 

Are we?  I am not aware of *Linux* actually using that.

	-hpa




* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-06-02 22:32                               ` H. Peter Anvin
@ 2014-06-02 23:32                                 ` Theodore Ts'o
  2014-06-02 23:33                                   ` H. Peter Anvin
  2014-06-03 13:09                                   ` Roger Willcocks
  0 siblings, 2 replies; 124+ messages in thread
From: Theodore Ts'o @ 2014-06-02 23:32 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Chuck Lever, Arnd Bergmann, Nicolas Pitre, Dave Chinner,
	LKML Kernel, linux-arch, joseph, john.stultz, Christoph Hellwig,
	tglx, geert, lftan, linux-fsdevel, xfs, Linux NFS Mailing List

On Mon, Jun 02, 2014 at 03:32:35PM -0700, H. Peter Anvin wrote:
> On 06/02/2014 03:29 PM, Theodore Ts'o wrote:
> > 
> > And since we are already returning (time_t) -1 in some cases, we might
> > as well try to make things a bit more formal.
> > 
> 
> Are we?  I am not aware of *Linux* actually using that.

Linux's time(2) can return (time_t) -1 and set errno to EFAULT, per
the POSIX specification:

SYSCALL_DEFINE1(time, time_t __user *, tloc)
{
	time_t i = get_seconds();

	if (tloc) {
		if (put_user(i,tloc))
			return -EFAULT;
	}
	force_successful_syscall_return();
	return i;
}
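
From user space, the -EFAULT return above is what the C library turns into
(time_t) -1 with errno set. A minimal model of that convention, assuming the
common Linux rule that raw return values in the range [-4095, -1] are error
codes (the function names are made up for this sketch and are not glibc
internals):

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model of a syscall wrapper: map a raw kernel return
 * value to the "-1 and errno" convention seen by applications. */
long demo_syscall_result(long raw)
{
	if (raw < 0 && raw >= -4095) {
		errno = (int)-raw;
		return -1;
	}
	return raw;
}

/* Self-check: a faulting pointer versus an ordinary timestamp. */
void demo_check(void)
{
	errno = 0;
	assert(demo_syscall_result(-EFAULT) == -1 && errno == EFAULT);
	assert(demo_syscall_result(1401751920L) == 1401751920L);
}
```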

Cheers,

						- Ted


* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-06-02 23:32                                 ` Theodore Ts'o
@ 2014-06-02 23:33                                   ` H. Peter Anvin
  2014-06-03 13:09                                   ` Roger Willcocks
  1 sibling, 0 replies; 124+ messages in thread
From: H. Peter Anvin @ 2014-06-02 23:33 UTC (permalink / raw)
  To: Theodore Ts'o, Chuck Lever, Arnd Bergmann, Nicolas Pitre,
	Dave Chinner, LKML Kernel, linux-arch, joseph, john.stultz,
	Christoph Hellwig, tglx, geert, lftan, linux-fsdevel, xfs,
	Linux NFS Mailing List

On 06/02/2014 04:32 PM, Theodore Ts'o wrote:
> On Mon, Jun 02, 2014 at 03:32:35PM -0700, H. Peter Anvin wrote:
>> On 06/02/2014 03:29 PM, Theodore Ts'o wrote:
>>>
>>> And since we are already returning (time_t) -1 in some cases, we might
>>> as well try to make things a bit more formal.
>>>
>>
>> Are we?  I am not aware of *Linux* actually using that.
> 
> Linux's time(2) can return (time_t) -1 and set errno to EFAULT, per
> the Posix specification:
> 
> SYSCALL_DEFINE1(time, time_t __user *, tloc)
> {
> 	time_t i = get_seconds();
> 
> 	if (tloc) {
> 		if (put_user(i,tloc))
> 			return -EFAULT;
> 	}
> 	force_successful_syscall_return();
> 	return i;
> }
> 

OK, I guess I should have said... other than for -EFAULT.

I just don't know of anyone using time(2) with an argument other than NULL.

	-hpa




* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-06-02 11:43               ` Arnd Bergmann
@ 2014-06-03  0:32                 ` Dave Chinner
  2014-06-03  7:33                   ` Arnd Bergmann
  0 siblings, 1 reply; 124+ messages in thread
From: Dave Chinner @ 2014-06-03  0:32 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: H. Peter Anvin, linux-kernel, linux-arch, joseph, john.stultz,
	hch, tglx, geert, lftan, linux-fsdevel, xfs

On Mon, Jun 02, 2014 at 01:43:44PM +0200, Arnd Bergmann wrote:
> On Monday 02 June 2014 10:28:22 Dave Chinner wrote:
> > On Sun, Jun 01, 2014 at 10:24:37AM +1000, Dave Chinner wrote:
> > > On Sat, May 31, 2014 at 05:37:52PM +0200, Arnd Bergmann wrote:
> > > > In my list at http://kernelnewbies.org/y2038, I found that almost
> > > > all file systems at least support times until 2106, because they treat
> > > > the on-disk value as unsigned on 64-bit systems, or they use
> > > > a completely different representation. My guess is that somebody
> > > > earlier spent a lot of work on making that happen.
> > > > 
> > > > The exceptions are:
> > > > 
> > > > * exofs uses signed values, which can probably be changed to be
> > > >   consistent with the others.
> > > > * isofs has a bug that limits it until 2027 on architectures with
> > > >   a signed 'char' type (otherwise it's 2155).
> > > > * udf can represent times for many thousands of years through a
> > > >   16-bit year representation, but the code to convert to epoch
> > > >   uses a const array that ends at 2038.
> > > > * afs uses signed seconds and can probably be fixed
> > > > * coda relies on user space time representation getting passed
> > > >   through an ioctl.
> > > > * I miscategorized xfs/ext2/ext3 as having unsigned 32-bit seconds,
> > > >   where they really use signed.
> > > > 
> > > > I was confused about XFS since I didn't notice that there are
> > > > separate xfs_ictimestamp_t and xfs_timestamp_t types, so I expected
> > > > XFS to also use the 1970-2106 time range on 64-bit systems today.
> > > 
> > > You've missed an awful lot more than just the implications for the
> > > core kernel code.
> > > 
> > > There's a good chance such changes propagate to APIs elsewhere in
> > > the filesystems, because something you haven't realised is that XFS
> > > effectively exposes the on-disk timestamp format directly to
> > > userspace via the bulkstat interface (see struct xfs_bstat). It also
> > > affects the XFS open-by-handle ioctl and the swap extent ioctl used
> > > by the online defragmenter.
> 
> I really didn't look at them at all, as ioctl is very late on my
> mental list of things to change. I do realize that a lot of drivers
> and file systems do have ioctls that pass time values and we need to
> address them one by one.
> 
> I just looked at the ioctls you mentioned but don't see how open-by-handle
> is affected by this. Can you point me to what you mean?

Sorry, I misremembered how some of the XFS open-by-handle code works
in userspace (XFS has a pretty rich open-by-handle ioctl() interface
that predates the kernel syscalls by at least 10 years).  Basically
there is code in userspace that uses the information returned from
bulkstat to construct file handles to pass to the open-by-handle
ioctls. xfs_fsr then uses the combination of open-by-handle from the
bulkstat output and the bulkstat output to feed into the swap extent
ioctls....

i.e. the filesystem's idea of what time is is passed to userspace as
an opaque cookie in this case, but it is not used directly by the
open-by-handle interfaces like I implied it was.

> > Just to put that in context, here's the kernel patch to add extended
> > epoch support to XFS. It's completely untested as I haven't done any
> > userspace code changes to enable the feature. However, it should
> > give you an indication of how far the simple act of changing the
> > kernel time representation spread through the filesystem. This does
> > not include any of the VFS infrastructure to specifying the range of
> > supported timestamps.  It survives some smoke testing, but dies when
> > the online defragmenter starts using the bulkstat and swap extent
> > ioctls (the assert in xfs_inode_time_from_epoch() fires), so I
> > probably don't have that all sorted correctly yet...
> > 
> > To test extended epoch support, however, I need some fstests that
> > define and validate the behaviour of the new syscalls - until we get
> > those we can't validate that the filesystem follows the spec
> > properly. I also suspect we are going to need an interface to query
> > the supported range of timestamps from a filesystem so that we can
> > test boundary conditions in an automated fashion....
> 
> Thanks a lot for having an initial look at this yourself!
> 
> I'd still consider the two problems largely orthogonal.

Depends how you look at it. You can't extend the kernel's idea of
time without permanent storage being able to specify the supported
bounds - that's a non-negotiable aspect of introducing extended
epoch timestamp support.

The actual addition of extended timestamp support to each individual
filesystem is orthogonal to the introduction of the struct
inode_time, but doing this addition properly is dependent on the VFS
infrastructure being there in the first place.

> My patch set
> (at least with the 64-bit tv_sec) just gets 32-bit kernels to behave
> more like 64-bit kernels regarding inode time stamps, which does
> impact all the file systems that use a 64-bit time or the NFS
> unsigned epoch (1970-2106), while your patch extends the file
> system internal epoch (1901-2038 for XFS) so it can be used by
> anything that knows how to handle larger than 32-bit second values
> (either 64-bit kernel or 32-bit with inode_time patch).

Right, but the issue is that 64 bit second counters are broken right
now because most filesystems can't support more than 32 bit values.
So it doesn't matter whether it's 32 bit or 64 bit machines, just
adding explicit support for >32 bit second counters without doing
anything else just extends that brokenness into the indefinite
future.

If we don't fix it now (i.e. in the new user API and supporting
infrastructure), then we'll *never be able to fix it* and we'll be
stuck with timestamps that do really weird things when you pass
arbitrary future dates to the kernel.
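
To sketch what "specifying the supported bounds" could mean in code: the
structure and function below are hypothetical, not an existing VFS
interface, and whether an out-of-range time should be rejected or clamped
is left open here. The idea is that each filesystem advertises a range and
the common code checks a timestamp against it before storing it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-filesystem description of representable times. */
struct demo_fs_time_range {
	int64_t min_sec;
	int64_t max_sec;
};

enum demo_time_status {
	DEMO_TIME_OK = 0,
	DEMO_TIME_TOO_EARLY,
	DEMO_TIME_TOO_LATE,
};

/* Classify a timestamp against what the on-disk format can store. */
enum demo_time_status demo_check_time(const struct demo_fs_time_range *r,
				      int64_t sec)
{
	assert(r->min_sec <= r->max_sec);

	if (sec < r->min_sec)
		return DEMO_TIME_TOO_EARLY;
	if (sec > r->max_sec)
		return DEMO_TIME_TOO_LATE;
	return DEMO_TIME_OK;
}
```

With a signed 32-bit on-disk seconds field (1901-2038), the range would be
{ INT32_MIN, INT32_MAX }, and a timestamp of 2^31 would be reported as too
late rather than wrapping around.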

> > diff --git a/fs/xfs/xfs_dinode.h b/fs/xfs/xfs_dinode.h
> > index 623bbe8..79f94722 100644
> > --- a/fs/xfs/xfs_dinode.h
> > +++ b/fs/xfs/xfs_dinode.h
> > @@ -21,11 +21,53 @@
> >  #define        XFS_DINODE_MAGIC                0x494e  /* 'IN' */
> >  #define XFS_DINODE_GOOD_VERSION(v)     ((v) >= 1 && (v) <= 3)
> >  
> > +/*
> > + * Inode timestamps get more complex when we consider supporting times beyond
> > + * the standard unix epoch of Jan 2038. The struct xfs_timestamp cannot support
> > + * more than a single extension by playing sign games, and that is still not
> > + * reliable. We also can't extend the timestamp structure because there is no
> > + * free space around them in the on-disk inode.
> > + *
> > + * Hence the simplest thing to do is to add an epoch counter for each timestamp
> > + * in the inode. This can be a single byte for each timestamp and make use of
> > + * a hole we currently pad. This gives us another 255 epochs range for the
> > + * timestamps, but requires a superblock feature bit to indicate that these
> > + * fields have meaning and can be non-zero.
> 
> Nice trick!

It's a pretty common way of extending the range of a variable for
on-disk formats. The on-disk format is completely disconnected from
the in-memory representation, so it's "easy" to play games like this
within the on-disk format.

If you look closely at ext4, you'll see all the lo/hi variables
where extension of 16->32 bits or 32->48 bits has occurred from
the ext2/3 variable formats... ;)
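
As a rough sketch of that kind of range extension, assuming each step of
the one-byte epoch counter adds 2^32 seconds on top of a signed 32-bit
seconds field. The names ts_decode() and ts_encode() are hypothetical and
are not part of the patch, which still asserts that the epoch is zero:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: one epoch step covers 2^32 seconds. */
#define EPOCH_SPAN	((int64_t)1 << 32)

/* Combine the on-disk epoch byte and signed 32-bit seconds. */
static int64_t ts_decode(uint8_t epoch, int32_t seconds)
{
	return (int64_t)epoch * EPOCH_SPAN + seconds;
}

/*
 * Split 64-bit seconds into an epoch byte and signed 32-bit seconds.
 * Returns 0 on success, -1 if the value cannot be encoded.
 */
static int ts_encode(int64_t t, uint8_t *epoch, int32_t *seconds)
{
	const int64_t min = INT32_MIN;
	const int64_t max = 255 * EPOCH_SPAN + INT32_MAX;
	uint32_t low;
	int32_t s;

	if (t < min || t > max)
		return -1;

	/*
	 * Reinterpret the low 32 bits as a signed value by hand, so the
	 * result does not depend on implementation-defined narrowing.
	 */
	low = (uint32_t)(uint64_t)t;
	if (low > (uint32_t)INT32_MAX)
		s = (int32_t)(low - 0x80000000u) + INT32_MIN;
	else
		s = (int32_t)low;

	*epoch = (uint8_t)((t - s) / EPOCH_SPAN);
	*seconds = s;
	assert(ts_decode(*epoch, *seconds) == t);
	return 0;
}
```

With this layout, existing inodes (epoch byte zero) decode to the same
values as before, which is what lets the field live in previously
zeroed padding.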

> 
> > +static inline __uint8_t
> > +xfs_timestamp_epoch(
> > +       struct timespec         *time)
> > +{
> > +       /* will be zero until the extended struct inode_time is introduced */
> > +       return 0;
> > +}
> > +
> > +static inline __int32_t
> > +xfs_timestamp_sec(
> > +       struct timespec         *time)
> > +{
> > +       return time->tv_sec;
> > +}
> > +
> > +static inline __kernel_time_t
> > +xfs_inode_time_from_epoch(
> > +       __uint8_t       epoch,
> > +       __int32_t       seconds)
> > +{
> > +       /* need to handle non-zero epoch when struct inode_time is introduced */
> > +       ASSERT(epoch == 0);
> > +       return seconds;
> > +}
> 
> Why don't you already implement epoch conversion for 64-bit kernels that
> are able to represent the time today?

Because I wasn't trying to solve the entire problem, just
demonstrate the infrastructure needed to support extended
timestamps.....

> This is how ext4 does it (I mean
> the sizeof() trick, not the bit stuffing they do):
....
> I guess if there is general agreement on introducing 'struct inode_time',
> we can skip that intermediate step.

Also, I don't like the concept of having filesystems that will work
on 64 bit but not 32 bit machines. Over the past 10 years, we've
managed to remove most of those differences from the VFS and XFS,
so adding new distinctions between 32/64 bit machines is not the
direction I want to head in.

As it is, I'm expecting to do this only after the struct inode_time
and the superblock "time range" infrastructure have been added to
the kernel and VFS.  If that change is not made, then we've still
only got 32 bit time....

> > @@ -509,8 +509,11 @@ xfs_sb_has_ro_compat_feature(
> >  }
> >  
> >  #define XFS_SB_FEAT_INCOMPAT_FTYPE     (1 << 0)        /* filetype in dirent */
> > +#define XFS_SB_FEAT_INCOMPAT_EPOCH     (1 << 1)        /* Time beyond 2038 */
> >  #define XFS_SB_FEAT_INCOMPAT_ALL \
> > -               (XFS_SB_FEAT_INCOMPAT_FTYPE)
> > +               (XFS_SB_FEAT_INCOMPAT_FTYPE | \
> > +                XFS_SB_FEAT_INCOMPAT_EPOCH | \
> > +                0)
> >  
> >  #define XFS_SB_FEAT_INCOMPAT_UNKNOWN   ~XFS_SB_FEAT_INCOMPAT_ALL
> 
> How does this flag get set?

mkfs.xfs

> Do you have to manually change it in the
> superblock? Since most of the time I'd suspect you wouldn't actually
> use it for the foreseeable future, would it make sense to have a mount
> option that allows it to be set, but doesn't actually change the
> superblock until the first inode gets written with a nonzero epoch?

Yes, we could set the flag on the first timestamp that goes beyond
the current epoch, but that has two problems:

	1. filesystem silently becomes incompatible with older
	kernels so failed upgrade rollbacks become problematic; and

	2. It adds unnecessary complexity, as this will end up being
	the default behaviour for all new filesystems within a year.
	Then we end up with a mount option and conversion functions
	that never get used but we have to support for years....
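
To illustrate the first problem, here is a simplified sketch of how an
incompat feature mask gates mounting. The two flag values are taken from
the hunk quoted above; the helper names sb_unknown_incompat() and
sb_can_mount() are made up for this example and are not XFS functions:

```c
#include <assert.h>
#include <stdint.h>

#define XFS_SB_FEAT_INCOMPAT_FTYPE	(1u << 0)	/* filetype in dirent */
#define XFS_SB_FEAT_INCOMPAT_EPOCH	(1u << 1)	/* time beyond 2038 */
#define XFS_SB_FEAT_INCOMPAT_DEFINED \
		(XFS_SB_FEAT_INCOMPAT_FTYPE | XFS_SB_FEAT_INCOMPAT_EPOCH)

/* Incompat bits set in the superblock that this kernel does not know. */
static uint32_t sb_unknown_incompat(uint32_t sb_features, uint32_t kernel_known)
{
	/* A kernel can only claim to know bits that have been defined. */
	assert((kernel_known & ~XFS_SB_FEAT_INCOMPAT_DEFINED) == 0);
	return sb_features & ~kernel_known;
}

/* A mount has to be refused as soon as one unknown incompat bit is set. */
static int sb_can_mount(uint32_t sb_features, uint32_t kernel_known)
{
	return sb_unknown_incompat(sb_features, kernel_known) == 0;
}
```

If the EPOCH bit were set lazily on the first post-2038 timestamp, a
filesystem that mounted fine on an older kernel yesterday would fail
this check today without the administrator having changed anything.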

> That way, you'd still be able to mount it with an older kernel but
> also be forward compatible with time moving on.

We've got plenty of time to roll this out so I don't see any need
for putting in place temporary support mechanisms that unnecessarily
complicate the code.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-06-03  0:32                 ` Dave Chinner
@ 2014-06-03  7:33                   ` Arnd Bergmann
  2014-06-03  8:41                     ` Dave Chinner
  0 siblings, 1 reply; 124+ messages in thread
From: Arnd Bergmann @ 2014-06-03  7:33 UTC (permalink / raw)
  To: Dave Chinner
  Cc: H. Peter Anvin, linux-kernel, linux-arch, joseph, john.stultz,
	hch, tglx, geert, lftan, linux-fsdevel, xfs

On Tuesday 03 June 2014 10:32:27 Dave Chinner wrote:
> On Mon, Jun 02, 2014 at 01:43:44PM +0200, Arnd Bergmann wrote:
> > On Monday 02 June 2014 10:28:22 Dave Chinner wrote:
> > > On Sun, Jun 01, 2014 at 10:24:37AM +1000, Dave Chinner wrote:
> > > > On Sat, May 31, 2014 at 05:37:52PM +0200, Arnd Bergmann wrote:
> > > > > In my list at http://kernelnewbies.org/y2038, I found that almost
> > > > > all file systems at least support times until 2106, because they treat
> > > > > the on-disk value as unsigned on 64-bit systems, or they use
> > > > > a completely different representation. My guess is that somebody
> > > > > earlier spent a lot of work on making that happen.
> > > > > 
> > > > > The exceptions are:
> > > > > 
> > > > > * exofs uses signed values, which can probably be changed to be
> > > > >   consistent with the others.
> > > > > * isofs has a bug that limits it until 2027 on architectures with
> > > > >   a signed 'char' type (otherwise it's 2155).
> > > > > * udf can represent times for many thousands of years through a
> > > > >   16-bit year representation, but the code to convert to epoch
> > > > >   uses a const array that ends at 2038.
> > > > > * afs uses signed seconds and can probably be fixed
> > > > > * coda relies on user space time representation getting passed
> > > > >   through an ioctl.
> > > > > * I miscategorized xfs/ext2/ext3 as having unsigned 32-bit seconds,
> > > > >   where they really use signed.
> > > > > 
> > > > > I was confused about XFS since I didn't notice that there are
> > > > > separate xfs_ictimestamp_t and xfs_timestamp_t types, so I expected
> > > > > XFS to also use the 1970-2106 time range on 64-bit systems today.
> > > > 
> > > > You've missed an awful lot more than just the implications for the
> > > > core kernel code.
> > > > 
> > > > There's a good chance such changes propagate to APIs elsewhere in
> > > > the filesystems, because something you haven't realised is that XFS
> > > > effectively exposes the on-disk timestamp format directly to
> > > > userspace via the bulkstat interface (see struct xfs_bstat). It also
> > > > affects the XFS open-by-handle ioctl and the swap extent ioctl used
> > > > by the online defragmenter.
> > 
> > I really didn't look at them at all, as ioctl is very late on my
> > mental list of things to change. I do realize that a lot of drivers
> > and file systems do have ioctls that pass time values and we need to
> > address them one by one.
> > 
> > I just looked at the ioctls you mentioned but don't see how open-by-handle
> > is affected by this. Can you point me to what you mean?
> 
> Sorry, I misremembered how some of the XFS open-by-handle code works
> in userspace (XFS has a pretty rich open-by-handle ioctl() interface
> that predates the kernel syscalls by at least 10 years).  Basically
> there is code in userspace that uses the information returned from
> bulkstat to construct file handles to pass to the open-by-handle
> ioctls. xfs_fsr then uses the combination of open-by-handle from the
> bulkstat output and the bulkstat output to feed into the swap extent
> ioctls....
> 
> i.e. the filesystem's idea of what time is is passed to userspace as
> an opaque cookie in this case, but it is not used directly by the
> open-by-handle interfaces like I implied it was.

Ok, I see.

> > My patch set
> > (at least with the 64-bit tv_sec) just gets 32-bit kernels to behave
> > more like 64-bit kernels regarding inode time stamps, which does
> > impact all the file systems that use a 64-bit time or the NFS
> > unsigned epoch (1970-2106), while your patch extends the file
> > system internal epoch (1901-2038 for XFS) so it can be used by
> > anything that knows how to handle larger than 32-bit second values
> > (either 64-bit kernel or 32-bit with inode_time patch).
> 
> Right, but the issue is that 64 bit second counters are broken right
> now because most filesystems can't support more than 32 bit values.
> So it doesn't matter whether it's 32 bit or 64 bit machines, just
> adding explicit support for >32 bit second counters without doing
> anything else just extends that brokenness into the indefinite
> future.

Of course, "most filesystems" are obsolete, and most of the modern
file systems already support >32 bit timestamps: ext4, btrfs, cifs,
f2fs, 9p, nfsv4, ntfs, gfs2, ocfs2, fuse, ufs2. Everything else
except xfs, ext2/3 and exofs uses the nfsv3 interpretation on
64-bit systems, which interprets time stamps with the high bit
set as years 2038-2106 rather than 1901-1969.

> If we don't fix it now (i.e. in the new user API and supporting
> infrastructure), then we'll *never be able to fix it* and we'll be
> stuck with timestamps that do really weird things when you pass
> arbitrary future dates to the kernel.

We already have that. I agree it's fixable and we should fix it,
but I don't see how this is different from what we had 20 years
ago when Linux on Alpha first introduced a 64-bit time_t. It's
been this way on every 64-bit Linux system since.

> > This is how ext4 does it (I mean
> > the sizeof() trick, not the bit stuffing they do):
> ....
> > I guess if there is general agreement on introducing 'struct inode_time',
> > we can skip that intermediate step.
> 
> Also, I don't like the concept of having filesystems that will work
> on 64 bit but not 32 bit machines. Over the past 10 years, we've
> managed to remove most of those differences from the VFS and XFS,
> so adding new distinctions between 32/64 bit machines is not the
> direction I want to head in.
> 
> As it is, I'm expecting to do this only after the struct inode_time
> and the superblock "time range" infrastructure have been added to
> the kernel and VFS.  If that change is not made, then we've still
> only got 32 bit time....

Ok.

> > Do you have to manually change it in the
> > superblock? Since most of the time I'd suspect you wouldn't actually
> > use it for the foreseeable future, would it make sense to have a mount
> > option that allows it to be set, but doesn't actually change the
> > superblock until the first inode gets written with a nonzero epoch?
> 
> Yes, we could set the flag on the first timestamp that goes beyond
> the current epoch, but that has two problems:
> 
> 	1. filesystem silently becomes incompatible with older
> 	kernels so failed upgrade rollbacks become problematic; and
> 
> 	2. It adds unnecessary complexity, as this will end up being
> 	the default behaviour for all new filesystems within a year.
> 	Then we end up with a mount option and conversion functions
> 	that never get used but we have to support for years....
> 
> > That way, you'd still be able to mount it with an older kernel but
> > also be forward compatible with time moving on.
> 
> We've got plenty of time to roll this out so I don't see any need
> for putting in place temporary support mechanisms that unnecessarily
> complicate the code.

Ok, fair enough.

	Arnd


* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-06-03  7:33                   ` Arnd Bergmann
@ 2014-06-03  8:41                     ` Dave Chinner
  2014-06-03  9:16                       ` Arnd Bergmann
  0 siblings, 1 reply; 124+ messages in thread
From: Dave Chinner @ 2014-06-03  8:41 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: H. Peter Anvin, linux-kernel, linux-arch, joseph, john.stultz,
	hch, tglx, geert, lftan, linux-fsdevel, xfs

On Tue, Jun 03, 2014 at 09:33:36AM +0200, Arnd Bergmann wrote:
> On Tuesday 03 June 2014 10:32:27 Dave Chinner wrote:
> > On Mon, Jun 02, 2014 at 01:43:44PM +0200, Arnd Bergmann wrote:
> > > On Monday 02 June 2014 10:28:22 Dave Chinner wrote:
> > > > On Sun, Jun 01, 2014 at 10:24:37AM +1000, Dave Chinner wrote:
> > > > > On Sat, May 31, 2014 at 05:37:52PM +0200, Arnd Bergmann wrote:
> > > My patch set
> > > (at least with the 64-bit tv_sec) just gets 32-bit kernels to behave
> > > more like 64-bit kernels regarding inode time stamps, which does
> > > impact all the file systems that use a 64-bit time or the NFS
> > > unsigned epoch (1970-2106), while your patch extends the file
> > > system internal epoch (1901-2038 for XFS) so it can be used by
> > > anything that knows how to handle larger than 32-bit second values
> > > (either 64-bit kernel or 32-bit with inode_time patch).
> > 
> > Right, but the issue is that 64 bit second counters are broken right
> > now because most filesystems can't support more than 32 bit values.
> > So it doesn't matter whether it's 32 bit or 64 bit machines, just
> > adding explicit support for >32 bit second counters without doing
> > anything else just extends that brokenness into the indefinite
> > future.
> 
> Of course, "most filesystems" are obsolete, and most of the modern
> file systems already support >32 bit timestamps: ext4, btrfs, cifs,
> f2fs, 9p, nfsv4, ntfs, gfs2, ocfs2, fuse, ufs2. Everything else
> except xfs, ext2/3 and exofs uses the nfsv3 interpretation on
> 64-bit systems, which interprets time stamps with the high bit
> set as years 2038-2106 rather than 1901-1969.

I'm not sure that's an entirely correct representation - the
remainder of the 32 bit-only timestamp filesystems don't actively
interpret the time stamp at all - it's just an opaque 32 bit value.
Hence the interpretation of the value is dependent on whether the
kernel treats it as signed or unsigned....

> > infrastructure), then we'll *never be able to fix it* and we'll be
> > stuck with timestamps that do really weird things when you pass
> > arbitrary future dates to the kernel.
> 
> We already have that. I agree it's fixable and we should fix it,
> but I don't see how this is different from what we had 20 years
> ago when Linux on Alpha first introduced a 64-bit time_t. It's
> been this way on every 64-bit Linux system since.

I see it differently: we've got 20 years more experience than when
the 64 bit time_t was introduced. That experience tells us that best
practices for API design are to range check every input to prevent
unintended side effects from occurring due to out-of-range data....
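
A minimal sketch of what such an input check could look like. The
struct ts64 type, the ts64_check() helper and the choice of error codes
are assumptions made for this example; the thread does not settle what
the kernel should return for an unrepresentable timestamp:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical 64-bit timestamp as passed in from user space. */
struct ts64 {
	int64_t sec;
	int32_t nsec;
};

/*
 * Validate a timestamp against the supported range [min_sec, max_sec].
 * Returns 0 if acceptable, -EINVAL for a malformed nanosecond field,
 * -ERANGE if the seconds value cannot be represented.
 */
static int ts64_check(const struct ts64 *ts, int64_t min_sec, int64_t max_sec)
{
	assert(min_sec <= max_sec);

	if (ts->nsec < 0 || ts->nsec >= 1000000000)
		return -EINVAL;
	if (ts->sec < min_sec || ts->sec > max_sec)
		return -ERANGE;
	return 0;
}
```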

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-06-03  8:41                     ` Dave Chinner
@ 2014-06-03  9:16                       ` Arnd Bergmann
  0 siblings, 0 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-06-03  9:16 UTC (permalink / raw)
  To: Dave Chinner
  Cc: H. Peter Anvin, linux-kernel, linux-arch, joseph, john.stultz,
	hch, tglx, geert, lftan, linux-fsdevel, xfs

On Tuesday 03 June 2014 18:41:30 Dave Chinner wrote:
> On Tue, Jun 03, 2014 at 09:33:36AM +0200, Arnd Bergmann wrote:
> > On Tuesday 03 June 2014 10:32:27 Dave Chinner wrote:
> > > On Mon, Jun 02, 2014 at 01:43:44PM +0200, Arnd Bergmann wrote:
> > > > On Monday 02 June 2014 10:28:22 Dave Chinner wrote:
> > > > > On Sun, Jun 01, 2014 at 10:24:37AM +1000, Dave Chinner wrote:
> > > > > > On Sat, May 31, 2014 at 05:37:52PM +0200, Arnd Bergmann wrote:
> > > > My patch set
> > > > (at least with the 64-bit tv_sec) just gets 32-bit kernels to behave
> > > > more like 64-bit kernels regarding inode time stamps, which does
> > > > impact all the file systems that use a 64-bit time or the NFS
> > > > unsigned epoch (1970-2106), while your patch extends the file
> > > > system internal epoch (1901-2038 for XFS) so it can be used by
> > > > anything that knows how to handle larger than 32-bit second values
> > > > (either 64-bit kernel or 32-bit with inode_time patch).
> > > 
> > > Right, but the issue is that 64 bit second counters are broken right
> > > now because most filesystems can't support more than 32 bit values.
> > > So it doesn't matter whether it's 32 bit or 64 bit machines, just
> > > adding explicit support for >32 bit second counters without doing
> > > anything else just extends that brokenness into the indefinite
> > > future.
> > 
> > Of course, "most filesystems" are obsolete, and most of the modern
> > file systems already support >32 bit timestamps: ext4, btrfs, cifs,
> > f2fs, 9p, nfsv4, ntfs, gfs2, ocfs2, fuse, ufs2. Everything else
> > except xfs, ext2/3 and exofs uses the nfsv3 interpretation on
> > 64-bit systems, which interprets time stamps with the high bit
> > set as years 2038-2106 rather than 1901-1969.
> 
> I'm not sure that's an entirely correct representation - the
> remainder of the 32 bit-only timestamp filesystems don't actively
> interpret the time stamp at all - it's just an opaque 32 bit value.
> Hence the interpretation of the value is dependent on whether the
> kernel treats it as signed or unsigned....

As I mentioned elsewhere in the thread, I don't think the way it's
handled is intentional, but it's definitely the file system code that
does the assignment to the timespec and decides on the interpretation,
doing either

	inode->i_mtime.tv_sec = (signed)le32_to_cpu(raw_inode.mtime);

or

	inode->i_mtime.tv_sec = le32_to_cpu(raw_inode.mtime);
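
A standalone sketch of the difference between the two assignments on a
system with a 64-bit tv_sec. The helper names are hypothetical, and the
raw value is assumed to be in CPU byte order already, so the
le32_to_cpu() step is left out:

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned reading: 0x80000000..0xffffffff land in 2038..2106. */
static int64_t raw_as_unsigned(uint32_t raw)
{
	return (int64_t)raw;
}

/* Signed reading: the same bit patterns land in 1901..1969. */
static int64_t raw_as_signed(uint32_t raw)
{
	/*
	 * Subtract 2^32 by hand instead of casting to int32_t, which is
	 * implementation-defined for values above INT32_MAX.
	 */
	if (raw > (uint32_t)INT32_MAX)
		return (int64_t)raw - ((int64_t)1 << 32);
	return (int64_t)raw;
}

/* The readings agree until the high bit is set, then differ by 2^32 s. */
static int64_t reading_gap(uint32_t raw)
{
	int64_t gap = raw_as_unsigned(raw) - raw_as_signed(raw);

	assert(gap == 0 || gap == ((int64_t)1 << 32));
	return gap;
}
```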


	Arnd

* Re: [RFC 00/32] making inode time stamps y2038 ready
  2014-05-31 14:30 ` [RFC 00/32] making inode time stamps y2038 ready Vyacheslav Dubeyko
@ 2014-06-03 12:21   ` Arnd Bergmann
  0 siblings, 0 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-06-03 12:21 UTC (permalink / raw)
  To: Vyacheslav Dubeyko
  Cc: linux-kernel, linux-arch, joseph, john.stultz, hch, tglx, geert,
	lftan, hpa, linux-fsdevel, ceph-devel, cluster-devel, coda,
	codalist, fuse-devel, linux-afs, linux-btrfs, linux-cifs,
	linux-ext4, linux-f2fs-devel, linux-mtd, linux-nfs,
	linux-ntfs-dev, linux-scsi, logfs, ocfs2-devel, reiserfs-devel,
	samba-technical, xfs

On Saturday 31 May 2014 18:30:49 Vyacheslav Dubeyko wrote:
> By the way, what about NILFS2? Is NILFS2 ready for suggested approach
> without any changes?

nilfs2 and a lot of other file systems don't need any changes for
this, because they don't assign the inode time stamp fields to
a 'struct timespec'.

FWIW, nilfs2 uses a 64-bit seconds value, which is always safe and
can represent the full range of user space timespec on all machines.

	Arnd

* Re: [RFC 11/32] xfs: convert to struct inode_time
  2014-06-02 23:32                                 ` Theodore Ts'o
  2014-06-02 23:33                                   ` H. Peter Anvin
@ 2014-06-03 13:09                                   ` Roger Willcocks
  1 sibling, 0 replies; 124+ messages in thread
From: Roger Willcocks @ 2014-06-03 13:09 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: H. Peter Anvin, Nicolas Pitre, linux-arch,
	Linux NFS Mailing List, Arnd Bergmann, LKML Kernel, xfs,
	Christoph Hellwig, Chuck Lever, john.stultz, lftan,
	linux-fsdevel, geert, tglx, joseph


On Mon, 2014-06-02 at 19:32 -0400, Theodore Ts'o wrote:

> Linux's time(2) can return (time_t) -1 and set errno to EFAULT, per
> the Posix specification:
> 
> SYSCALL_DEFINE1(time, time_t __user *, tloc)
> {
> 	time_t i = get_seconds();
> 
> 	if (tloc) {
> 		if (put_user(i,tloc))
> 			return -EFAULT;
> 	}
> 	force_successful_syscall_return();
> 	return i;
> }

get_seconds() returns an unsigned long so there's potential for overflow
here.

--
Roger




* Re: [RFC 00/32] making inode time stamps y2038 ready
  2014-06-02 21:57         ` H. Peter Anvin
@ 2014-06-03 14:22           ` Arnd Bergmann
  2014-06-03 14:33             ` Joseph S. Myers
  2014-06-03 21:38             ` Dave Chinner
  0 siblings, 2 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-06-03 14:22 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Joseph S. Myers, linux-kernel, linux-arch, john.stultz, hch,
	tglx, geert, lftan, linux-fsdevel, ceph-devel, cluster-devel,
	coda, codalist, fuse-devel, linux-afs, linux-btrfs, linux-cifs,
	linux-ext4, linux-f2fs-devel, linux-mtd, linux-nfs,
	linux-ntfs-dev, linux-scsi, logfs, ocfs2-devel, reiserfs-devel,
	samba-technical, xfs

On Monday 02 June 2014 14:57:26 H. Peter Anvin wrote:
> On 06/02/2014 12:55 PM, Arnd Bergmann wrote:
> >>
> >> The bit that is really going to hurt is every single ioctl that uses a
> >> timespec.
> >>
> >> Honestly, though, I really don't understand the point with "struct
> >> inode_time".  It seems like the zeroeth-order thing is to change the
> >> kernel internal version of struct timespec to have a 64-bit time... it
> >> isn't just about inodes.  We then should be explicit about the external
> >> uses of time, and use accessors.
> > 
> > I picked these because they are fairly isolated from all other uses,
> > in particular since inode times are the only things where we really
> > care about times in the distant past or future (decades away as opposed
> > to things that happened between boot and shutdown).
> > 
> 
> If nothing else, I would expect to be able to set the system time to
> weird values for testing.  So I'm not so sure I agree with that...

I think John Stultz and Thomas Gleixner have already started looking
at how the timekeeping code can be updated. Once that is done, we should
be able to add a functional 64-bit gettimeofday/settimeofday syscall
pair. While I definitely agree this is one of the most basic things to
have, it's also not an area of the kernel that is easy to change.

> > For other kernel-internal uses, we may be better off migrating to
> > a completely different representation, such as nanoseconds since
> > boot or the architecture specific ktime_t, but this is really something
> > to decide for each subsystem.
> 
> Having a bunch of different time representations in the kernel seems
> like a real headache...

We already have time_t, ktime_t, timeval, timespec, compat_timespec,
clock_t, cputime_t, cputime64_t, tm, nanoseconds, jiffies, jiffies64,
and lots of driver or file system specific representations. I'm all for
removing a bunch of these from the kernel, but my feeling is that this is
one of the cases where we first have to add new ones in order to remove
those that are already there.
To complicate things further, we also have various time bases
(realtime/utc, realtime/tai, monotonic, monotonic_raw, boottime, ...),
and at least for the timespec values we pass around, it's not always
obvious which one is used, or if that's the right one.

We probably don't want to add a lot of new representations, and it's
possible that we can change most of the internal code we have to
ktime_t and then convert that to whatever user space wants at the
interfaces.

The possible uses I can see for non-ktime_t types in the kernel are:
* inodes need 96 bit timestamps to represent the full range of values
  that can be stored in a file system, you made a convincing argument
  for that. Almost everything else can fit into 64 bit on a 32-bit
  kernel, in theory also on a 64-bit kernel if we want that.
* A number of interfaces pass relative timespecs: nanosleep(), poll(),
  select(), sigtimedwait(), alarm(), futex() and probably more. There is
  nothing wrong with the use of timespec here, and it may be good to
  annotate that by using a new type (e.g. struct timeout) that is defined
  as compatible with the current timespec.
* For new user interfaces, we need a new type such as the
  __kernel_timespec64 I introduced, so it doesn't clash with the normal
  user timespec that may be smaller, depending on the libc.
* A lot of drivers will need new ioctl commands, and for drivers that
  just need time stamps (audio, v4l, sockets, ...) it may be more
  efficient and more correct to use a new timestamp_t (e.g. boot time
  64-bit nanoseconds) than __kernel_timespec64, which is not normally
  monotonic and requires a normalization step. If we end up introducing
  such a type in the user interface, we can also start using it in the
  kernel.
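
To make the last point concrete, here is a rough sketch of a flat 64-bit
nanosecond timestamp next to a seconds/nanoseconds pair, including the
normalization step the pair needs. The names timestamp_ns_t and
struct sec_nsec are invented for this example and are not proposed
interfaces:

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC	1000000000LL

typedef int64_t timestamp_ns_t;	/* hypothetical: signed ns since some base */

struct sec_nsec {		/* hypothetical timespec-like pair */
	int64_t sec;
	int64_t nsec;
};

/* Bring nsec into 0 <= nsec < NSEC_PER_SEC, carrying into sec. */
static struct sec_nsec sec_nsec_normalize(struct sec_nsec ts)
{
	ts.sec += ts.nsec / NSEC_PER_SEC;
	ts.nsec %= NSEC_PER_SEC;
	if (ts.nsec < 0) {
		ts.nsec += NSEC_PER_SEC;
		ts.sec -= 1;
	}
	return ts;
}

/* Flatten to nanoseconds; valid for roughly +/-292 years around the base. */
static timestamp_ns_t sec_nsec_to_ns(struct sec_nsec ts)
{
	ts = sec_nsec_normalize(ts);
	assert(ts.sec > INT64_MIN / NSEC_PER_SEC &&
	       ts.sec < INT64_MAX / NSEC_PER_SEC);
	return ts.sec * NSEC_PER_SEC + ts.nsec;
}

/* Split a flat nanosecond value back into a normalized pair. */
static struct sec_nsec ns_to_sec_nsec(timestamp_ns_t ns)
{
	struct sec_nsec ts = { ns / NSEC_PER_SEC, ns % NSEC_PER_SEC };

	return sec_nsec_normalize(ts);
}
```

The flat form compares and subtracts with plain integer arithmetic,
while the pair needs the carry handling above after every operation.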

	Arnd

* Re: [RFC 00/32] making inode time stamps y2038 ready
  2014-06-03 14:22           ` Arnd Bergmann
@ 2014-06-03 14:33             ` Joseph S. Myers
  2014-06-03 14:37               ` Arnd Bergmann
  2014-06-03 21:38             ` Dave Chinner
  1 sibling, 1 reply; 124+ messages in thread
From: Joseph S. Myers @ 2014-06-03 14:33 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: H. Peter Anvin, linux-kernel, linux-arch, john.stultz, hch, tglx,
	geert, lftan, linux-fsdevel, ceph-devel, cluster-devel, coda,
	codalist, fuse-devel, linux-afs, linux-btrfs, linux-cifs,
	linux-ext4, linux-f2fs-devel, linux-mtd, linux-nfs,
	linux-ntfs-dev, linux-scsi, logfs, ocfs2-devel, reiserfs-devel,
	samba-technical, xfs

On Tue, 3 Jun 2014, Arnd Bergmann wrote:

> I think John Stultz and Thomas Gleixner have already started looking
> at how the timekeeping code can be updated. Once that is done, we should
> be able to add a functional 64-bit gettimeofday/settimeofday syscall
> pair. While I definitely agree this is one of the most basic things to
> have, it's also not an area of the kernel that is easy to change.

64-bit clock_gettime / clock_settime instead of gettimeofday / 
settimeofday should avoid the need for the kernel to have a 64-bit version 
of struct timeval.  (Userspace 64-bit gettimeofday / settimeofday would 
need to use a combination of the syscalls if the tz pointer is non-NULL.)
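
A user-space sketch of that layering, assuming a hypothetical
struct timeval64 and a wrapper named gettimeofday64(). It calls the
existing clock_gettime() as a stand-in for the future 64-bit syscall,
so on a 32-bit time_t system it still inherits the 2038 limit, and the
tz argument is left out entirely:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical user-space timeval with 64-bit seconds. */
struct timeval64 {
	int64_t tv_sec;
	int64_t tv_usec;
};

/* Reduce a (seconds, nanoseconds) pair to microsecond resolution. */
static struct timeval64 timespec_to_timeval64(int64_t sec, long nsec)
{
	struct timeval64 tv;

	assert(nsec >= 0 && nsec < 1000000000L);
	tv.tv_sec = sec;
	tv.tv_usec = nsec / 1000;
	return tv;
}

/* gettimeofday() replacement built on clock_gettime(); no tz handling. */
static int gettimeofday64(struct timeval64 *tv)
{
	struct timespec ts;

	if (clock_gettime(CLOCK_REALTIME, &ts) != 0)
		return -1;
	*tv = timespec_to_timeval64((int64_t)ts.tv_sec, ts.tv_nsec);
	return 0;
}
```

With this split the kernel only ever has to deal in timespec-style
values, and the microsecond struct exists purely in the C library.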

-- 
Joseph S. Myers
joseph@codesourcery.com

* Re: [RFC 00/32] making inode time stamps y2038 ready
  2014-06-03 14:33             ` Joseph S. Myers
@ 2014-06-03 14:37               ` Arnd Bergmann
  0 siblings, 0 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-06-03 14:37 UTC (permalink / raw)
  To: Joseph S. Myers
  Cc: H. Peter Anvin, linux-kernel, linux-arch, john.stultz, hch, tglx,
	geert, lftan, linux-fsdevel, ceph-devel, cluster-devel, coda,
	codalist, fuse-devel, linux-afs, linux-btrfs, linux-cifs,
	linux-ext4, linux-f2fs-devel, linux-mtd, linux-nfs,
	linux-ntfs-dev, linux-scsi, logfs, ocfs2-devel, reiserfs-devel,
	samba-technical, xfs

On Tuesday 03 June 2014 14:33:10 Joseph S. Myers wrote:
> On Tue, 3 Jun 2014, Arnd Bergmann wrote:
> 
> > I think John Stultz and Thomas Gleixner have already started looking
> > at how the timekeeping code can be updated. Once that is done, we should
> > be able to add a functional 64-bit gettimeofday/settimeofday syscall
> > pair. While I definitely agree this is one of the most basic things to
> > have, it's also not an area of the kernel that is easy to change.
> 
> 64-bit clock_gettime / clock_settime instead of gettimeofday / 
> settimeofday should avoid the need for the kernel to have a 64-bit version 
> of struct timeval.  (Userspace 64-bit gettimeofday / settimeofday would 
> need to use a combination of the syscalls if the tz pointer is non-NULL.)

Yes, that's what I meant.

	Arnd

* Re: [RFC 00/32] making inode time stamps y2038 ready
  2014-06-03 14:22           ` Arnd Bergmann
  2014-06-03 14:33             ` Joseph S. Myers
@ 2014-06-03 21:38             ` Dave Chinner
  2014-06-04 15:03               ` Arnd Bergmann
  1 sibling, 1 reply; 124+ messages in thread
From: Dave Chinner @ 2014-06-03 21:38 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: H. Peter Anvin, Joseph S. Myers, linux-kernel, linux-arch,
	john.stultz, hch, tglx, geert, lftan, linux-fsdevel, ceph-devel,
	cluster-devel, coda, codalist, fuse-devel, linux-afs,
	linux-btrfs, linux-cifs, linux-ext4, linux-f2fs-devel, linux-mtd,
	linux-nfs, linux-ntfs-dev, linux-scsi, logfs, ocfs2-devel,
	reiserfs-devel, samba-technical, xfs

On Tue, Jun 03, 2014 at 04:22:19PM +0200, Arnd Bergmann wrote:
> On Monday 02 June 2014 14:57:26 H. Peter Anvin wrote:
> > On 06/02/2014 12:55 PM, Arnd Bergmann wrote:
> The possible uses I can see for non-ktime_t types in the kernel are:
> * inodes need 96 bit timestamps to represent the full range of values
>   that can be stored in a file system, you made a convincing argument
>   for that. Almost everything else can fit into 64 bit on a 32-bit
>   kernel, in theory also on a 64-bit kernel if we want that.

Just to be pedantic, inodes don't *need* 96 bit timestamps - some
filesystems can *support up to* 96 bit timestamps. If the kernel
only supports 64 bit timestamps and that's all the kernel can
represent, then the upper bits of the 96 bit on-disk inode
timestamps simply remain zero.

If you move the filesystem between kernels with different time
ranges, then the filesystem needs to be able to tell the kernel what
its supported range is.  This is where having the VFS limit the
range of supported timestamps is important: the limit is the
min(kernel range, filesystem range). This allows the filesystems
to be independent of the kernel time representation, and the kernel
to be independent of the physical filesystem time encoding....
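
A small sketch of that intersection, using a hypothetical
struct time_range that a filesystem would fill in at mount time. The
type and helpers are invented for this example, and whether an
out-of-range value should be clamped (as here) or rejected with an
error is a policy choice this sketch does not try to settle:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of a supported timestamp range, in seconds. */
struct time_range {
	int64_t min_sec;
	int64_t max_sec;
};

/* Effective range: the overlap of what kernel and filesystem can represent. */
static struct time_range time_range_intersect(struct time_range kernel,
					      struct time_range fs)
{
	struct time_range r;

	r.min_sec = kernel.min_sec > fs.min_sec ? kernel.min_sec : fs.min_sec;
	r.max_sec = kernel.max_sec < fs.max_sec ? kernel.max_sec : fs.max_sec;
	assert(r.min_sec <= r.max_sec);	/* the ranges are expected to overlap */
	return r;
}

/* Pull a seconds value into the effective range. */
static int64_t time_range_clamp(struct time_range r, int64_t sec)
{
	if (sec < r.min_sec)
		return r.min_sec;
	if (sec > r.max_sec)
		return r.max_sec;
	return sec;
}
```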

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

* Re: [RFC 00/32] making inode time stamps y2038 ready
  2014-06-03 21:38             ` Dave Chinner
@ 2014-06-04 15:03               ` Arnd Bergmann
  2014-06-04 17:30                 ` Nicolas Pitre
  0 siblings, 1 reply; 124+ messages in thread
From: Arnd Bergmann @ 2014-06-04 15:03 UTC (permalink / raw)
  To: Dave Chinner
  Cc: H. Peter Anvin, Joseph S. Myers, linux-kernel, linux-arch,
	john.stultz, hch, tglx, geert, lftan, linux-fsdevel, ceph-devel,
	cluster-devel, coda, codalist, fuse-devel, linux-afs,
	linux-btrfs, linux-cifs, linux-ext4, linux-f2fs-devel, linux-mtd,
	linux-nfs, linux-ntfs-dev, linux-scsi, logfs, ocfs2-devel,
	reiserfs-devel, samba-technical, xfs

On Tuesday 03 June 2014, Dave Chinner wrote:
> On Tue, Jun 03, 2014 at 04:22:19PM +0200, Arnd Bergmann wrote:
> > On Monday 02 June 2014 14:57:26 H. Peter Anvin wrote:
> > > On 06/02/2014 12:55 PM, Arnd Bergmann wrote:
> > The possible uses I can see for non-ktime_t types in the kernel are:
> > * inodes need 96 bit timestamps to represent the full range of values
> >   that can be stored in a file system, you made a convincing argument
> >   for that. Almost everything else can fit into 64 bit on a 32-bit
> >   kernel, in theory also on a 64-bit kernel if we want that.
> 
> Just to be pedantic, inodes don't need 96 bit timestamps - some
> filesystems can *support up to* 96 bit timestamps. If the kernel
> only supports 64 bit timestamps and that's all the kernel can
> represent, then the upper bits of the 96 bit on-disk inode
> timestamps simply remain zero.

I meant the reverse: since we have file systems that can store
96-bit timestamps when using 64-bit kernels, we need to extend
32-bit kernels to have the same internal representation so we
can actually read those file systems correctly.

> If you move the filesystem between kernels with different time
> ranges, then the filesystem needs to be able to tell the kernel what
> its supported range is.  This is where having the VFS limit the
> range of supported timestamps is important: the limit is the
> min(kernel range, filesystem range). This allows the filesystems
> to be independent of the kernel time representation, and the kernel
> to be independent of the physical filesystem time encoding....

I agree it makes sense to let the kernel know about the limits
of the file system it accesses, but for the reverse, we're probably
better off just making the kernel representation large enough (i.e.
96 bits) so it can work with any known file system. We need another
check at the user space boundary to turn that into a value that the
user can understand, but that's another problem.
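
For illustration only, the min(kernel range, filesystem range) rule
quoted above could be sketched roughly as follows. Nothing here is taken
from the patch series: struct time_range, effective_range() and
clamp_sec() are hypothetical names, and clamping (as opposed to returning
an error) is just one assumed policy.

```c
#include <stdint.h>

/* Hypothetical range descriptor; not an existing kernel structure. */
struct time_range {
	int64_t min_sec;	/* earliest representable second */
	int64_t max_sec;	/* latest representable second */
};

/* Intersection of what the kernel and the file system can represent. */
static struct time_range effective_range(struct time_range kernel,
					 struct time_range fs)
{
	struct time_range r;

	r.min_sec = kernel.min_sec > fs.min_sec ? kernel.min_sec : fs.min_sec;
	r.max_sec = kernel.max_sec < fs.max_sec ? kernel.max_sec : fs.max_sec;
	return r;
}

/*
 * Assumed policy: clamp out-of-range seconds to the nearest limit.
 * Rejecting the value with an error would be the other obvious choice.
 */
static int64_t clamp_sec(int64_t sec, struct time_range r)
{
	if (sec < r.min_sec)
		return r.min_sec;
	if (sec > r.max_sec)
		return r.max_sec;
	return sec;
}
```

With a kernel range wide enough for every known file system,
effective_range() simply returns the file system's own limits, which is
the case argued for above.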

	Arnd


* Re: [RFC 00/32] making inode time stamps y2038 ready
  2014-06-02 21:02     ` Joseph S. Myers
@ 2014-06-04 15:05       ` Arnd Bergmann
  0 siblings, 0 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-06-04 15:05 UTC (permalink / raw)
  To: Joseph S. Myers
  Cc: linux-kernel, linux-arch, john.stultz, hch, tglx, geert, lftan,
	hpa, linux-fsdevel, ceph-devel, cluster-devel, coda, codalist,
	fuse-devel, linux-afs, linux-btrfs, linux-cifs, linux-ext4,
	linux-f2fs-devel, linux-mtd, linux-nfs, linux-ntfs-dev,
	linux-scsi, logfs, ocfs2-devel, reiserfs-devel, samba-technical,
	xfs

On Monday 02 June 2014, Joseph S. Myers wrote:
> On Mon, 2 Jun 2014, Arnd Bergmann wrote:
> 
> > Ok. Sorry about missing linux-api, I confused it with linux-arch, which
> > may not be as relevant here, except for the one question whether we
> > actually want to have the new ABI on all 32-bit architectures or only
> > as an opt-in for those that expect to stay around for another 24 years.
> 
> For glibc I think it will make the most sense to add the support for 
> 64-bit time_t across all architectures that currently have 32-bit time_t 
> (with the new interfaces having fallback support to implementation in 
> terms of the 32-bit kernel interfaces, if the 64-bit syscalls are 
> unavailable either at runtime or in the kernel headers against which glibc 
> is compiled - this fallback code will of course need to check for overflow 
> when passing a time value to the kernel, hopefully with error handling 
> consistent with whatever the kernel ends up doing when a filesystem can't 
> support a timestamp).  If some architectures don't provide the new 
> interfaces in the kernel then that will mean the fallback code in glibc 
> can't be removed until glibc support for those architectures is removed 
> (as opposed to removing it when glibc no longer supports kernels predating 
> the kernel support).

Ok, that's a good reason to just provide the new interfaces on all
architectures right away. Thanks for the insight!
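
A minimal sketch of the overflow check such a fallback path would need
before handing a 64-bit time value to a 32-bit kernel interface. The
struct layouts and the ts64_to_ts32() helper are hypothetical, and
-EOVERFLOW is only an assumed error code, since the error handling has
not been decided yet.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical 32-bit and 64-bit timespec layouts, for illustration. */
struct ts32 { int32_t tv_sec; int32_t tv_nsec; };
struct ts64 { int64_t tv_sec; int64_t tv_nsec; };

/*
 * Narrow a 64-bit timestamp for a 32-bit kernel interface.
 * Returns 0 on success, or -EOVERFLOW (assumed) if the seconds value
 * does not fit. tv_nsec is assumed to be validated already (< 10^9).
 */
static int ts64_to_ts32(const struct ts64 *in, struct ts32 *out)
{
	if (in->tv_sec < INT32_MIN || in->tv_sec > INT32_MAX)
		return -EOVERFLOW;

	out->tv_sec = (int32_t)in->tv_sec;
	out->tv_nsec = (int32_t)in->tv_nsec;
	return 0;
}
```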

	Arnd


* Re: [RFC 00/32] making inode time stamps y2038 ready
  2014-06-04 15:03               ` Arnd Bergmann
@ 2014-06-04 17:30                 ` Nicolas Pitre
  2014-06-04 19:24                   ` Arnd Bergmann
  0 siblings, 1 reply; 124+ messages in thread
From: Nicolas Pitre @ 2014-06-04 17:30 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Dave Chinner, hch, linux-mtd, H. Peter Anvin, logfs, linux-afs,
	Joseph S. Myers, linux-arch, linux-cifs, linux-scsi, ceph-devel,
	cluster-devel, coda, geert, linux-ext4, codalist, fuse-devel,
	reiserfs-devel, xfs, john.stultz, tglx, linux-nfs,
	linux-ntfs-dev, samba-technical, linux-kernel, linux-f2fs-devel,
	ocfs2-devel, linux-fsdevel, lftan, linux-btrfs

On Wed, 4 Jun 2014, Arnd Bergmann wrote:

> On Tuesday 03 June 2014, Dave Chinner wrote:
> > Just to be pedantic, inodes don't need 96 bit timestamps - some
> > filesystems can *support up to* 96 bit timestamps. If the kernel
> > only supports 64 bit timestamps and that's all the kernel can
> > represent, then the upper bits of the 96 bit on-disk inode
> > timestamps simply remain zero.
> 
> I meant the reverse: since we have file systems that can store
> 96-bit timestamps when using 64-bit kernels, we need to extend
> 32-bit kernels to have the same internal representation so we
> can actually read those file systems correctly.
> 
> > If you move the filesystem between kernels with different time
> > ranges, then the filesystem needs to be able to tell the kernel what
> > its supported range is.  This is where having the VFS limit the
> > range of supported timestamps is important: the limit is the
> > min(kernel range, filesystem range). This allows the filesystems
> > to be independent of the kernel time representation, and the kernel
> > to be independent of the physical filesystem time encoding....
> 
> I agree it makes sense to let the kernel know about the limits
> of the file system it accesses, but for the reverse, we're probably
> better off just making the kernel representation large enough (i.e.
> 96 bits) so it can work with any known file system.

Depends...  96 bit handling may get prohibitive on 32-bit archs.

The important point here is for the kernel to be able to represent the 
time _range_ used by any known filesystem, not necessarily the time 
_precision_.

For example, a 64 bit representation can be made of 40 bits for seconds 
spanning 34865 years, and 24 bits for fractional seconds providing 
precision down to 60 nanosecs.  That ought to be plenty good on 32 bit 
systems while still being cheap to handle.
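
As a rough sketch of that 40/24 split (2^40 seconds is about 34865
years of 365 days, and 2^-24 seconds is about 59.6 ns), packing and
unpacking could look like the code below. The helper names are made up
for illustration, and unsigned seconds counted from some epoch are an
assumption; a signed variant would need an offset or sign handling.

```c
#include <stdint.h>

#define FRAC_BITS	24
#define FRAC_MASK	((UINT64_C(1) << FRAC_BITS) - 1)
#define NSEC_PER_SEC	UINT64_C(1000000000)

/*
 * Pack seconds (assumed to fit in 40 bits) and nanoseconds (< 10^9)
 * into one 64-bit value: upper 40 bits seconds, lower 24 bits a binary
 * fraction of a second.
 */
static uint64_t pack_time(uint64_t sec, uint32_t nsec)
{
	/* nsec * 2^24 / 10^9, rounded down; cannot overflow 64 bits */
	uint64_t frac = ((uint64_t)nsec << FRAC_BITS) / NSEC_PER_SEC;

	return (sec << FRAC_BITS) | frac;
}

static uint64_t unpack_sec(uint64_t t)
{
	return t >> FRAC_BITS;
}

/* Convert the 24-bit fraction back to nanoseconds, rounded down. */
static uint32_t unpack_nsec(uint64_t t)
{
	return (uint32_t)(((t & FRAC_MASK) * NSEC_PER_SEC) >> FRAC_BITS);
}
```

A round trip loses roughly 60 ns at most in the nanosecond field, while
the seconds are preserved exactly.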


Nicolas


* Re: [RFC 00/32] making inode time stamps y2038 ready
  2014-06-04 17:30                 ` Nicolas Pitre
@ 2014-06-04 19:24                   ` Arnd Bergmann
  2014-06-05  0:10                     ` H. Peter Anvin
  0 siblings, 1 reply; 124+ messages in thread
From: Arnd Bergmann @ 2014-06-04 19:24 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Dave Chinner, hch, linux-mtd, H. Peter Anvin, logfs, linux-afs,
	Joseph S. Myers, linux-arch, linux-cifs, linux-scsi, ceph-devel,
	cluster-devel, coda, geert, linux-ext4, codalist, fuse-devel,
	reiserfs-devel, xfs, john.stultz, tglx, linux-nfs,
	linux-ntfs-dev, samba-technical, linux-kernel, linux-f2fs-devel,
	ocfs2-devel, linux-fsdevel, lftan, linux-btrfs

On Wednesday 04 June 2014 13:30:32 Nicolas Pitre wrote:
> On Wed, 4 Jun 2014, Arnd Bergmann wrote:
> 
> > On Tuesday 03 June 2014, Dave Chinner wrote:
> > > Just to be pedantic, inodes don't need 96 bit timestamps - some
> > > filesystems can *support up to* 96 bit timestamps. If the kernel
> > > only supports 64 bit timestamps and that's all the kernel can
> > > represent, then the upper bits of the 96 bit on-disk inode
> > > timestamps simply remain zero.
> > 
> > I meant the reverse: since we have file systems that can store
> > 96-bit timestamps when using 64-bit kernels, we need to extend
> > 32-bit kernels to have the same internal representation so we
> > can actually read those file systems correctly.
> > 
> > > If you move the filesystem between kernels with different time
> > > ranges, then the filesystem needs to be able to tell the kernel what
> > > its supported range is.  This is where having the VFS limit the
> > > range of supported timestamps is important: the limit is the
> > > min(kernel range, filesystem range). This allows the filesystems
> > > to be independent of the kernel time representation, and the kernel
> > > to be independent of the physical filesystem time encoding....
> > 
> > I agree it makes sense to let the kernel know about the limits
> > of the file system it accesses, but for the reverse, we're probably
> > better off just making the kernel representation large enough (i.e.
> > 96 bits) so it can work with any known file system.
> 
> Depends...  96 bit handling may get prohibitive on 32-bit archs.
> 
> The important point here is for the kernel to be able to represent the 
> time _range_ used by any known filesystem, not necessarily the time 
> _precision_.
> 
> For example, a 64 bit representation can be made of 40 bits for seconds 
> spanning 34865 years, and 24 bits for fractional seconds providing 
> precision down to 60 nanosecs.  That ought to be plenty good on 32 bit 
> systems while still being cheap to handle.

I have checked earlier that we don't do any computation on inode
time stamps in common code, we just pass them around, so there is
very little runtime overhead. There is a small bit of space overhead
(12 bytes) per inode, but that structure is already on the order of
500 bytes.

For other timekeeping stuff in the kernel, I agree that using some
64-bit representation (nanoseconds, 32/32 unsigned seconds/nanoseconds,
...) has advantages, that's exactly the point I was making earlier
against simply extending the internal time_t/timespec to 64-bit
seconds for everything.

	Arnd


* Re: [RFC 00/32] making inode time stamps y2038 ready
  2014-06-04 19:24                   ` Arnd Bergmann
@ 2014-06-05  0:10                     ` H. Peter Anvin
  2014-06-10  9:54                       ` Arnd Bergmann
  0 siblings, 1 reply; 124+ messages in thread
From: H. Peter Anvin @ 2014-06-05  0:10 UTC (permalink / raw)
  To: Arnd Bergmann, Nicolas Pitre
  Cc: Dave Chinner, hch, linux-mtd, logfs, linux-afs, Joseph S. Myers,
	linux-arch, linux-cifs, linux-scsi, ceph-devel, cluster-devel,
	coda, geert, linux-ext4, codalist, fuse-devel, reiserfs-devel,
	xfs, john.stultz, tglx, linux-nfs, linux-ntfs-dev,
	samba-technical, linux-kernel, linux-f2fs-devel, ocfs2-devel,
	linux-fsdevel, lftan, linux-btrfs

On 06/04/2014 12:24 PM, Arnd Bergmann wrote:
> 
> For other timekeeping stuff in the kernel, I agree that using some
> 64-bit representation (nanoseconds, 32/32 unsigned seconds/nanoseconds,
> ...) has advantages, that's exactly the point I was making earlier
> against simply extending the internal time_t/timespec to 64-bit
> seconds for everything.
> 

How much of a performance issue is it to make time_t 64 bits, and for
the bits there are, how hard are they to fix?

	-hpa




* Re: [RFC 00/32] making inode time stamps y2038 ready
  2014-06-05  0:10                     ` H. Peter Anvin
@ 2014-06-10  9:54                       ` Arnd Bergmann
  0 siblings, 0 replies; 124+ messages in thread
From: Arnd Bergmann @ 2014-06-10  9:54 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Nicolas Pitre, Dave Chinner, hch, linux-mtd, logfs, linux-afs,
	Joseph S. Myers, linux-arch, linux-cifs, linux-scsi, ceph-devel,
	cluster-devel, coda, geert, linux-ext4, codalist, fuse-devel,
	reiserfs-devel, xfs, john.stultz, tglx, linux-nfs,
	linux-ntfs-dev, samba-technical, linux-kernel, linux-f2fs-devel,
	ocfs2-devel, linux-fsdevel, lftan, linux-btrfs

On Wednesday 04 June 2014 17:10:24 H. Peter Anvin wrote:
> On 06/04/2014 12:24 PM, Arnd Bergmann wrote:
> > 
> > For other timekeeping stuff in the kernel, I agree that using some
> > 64-bit representation (nanoseconds, 32/32 unsigned seconds/nanoseconds,
> > ...) has advantages, that's exactly the point I was making earlier
> > against simply extending the internal time_t/timespec to 64-bit
> > seconds for everything.
> > 
> 
> How much of a performance issue is it to make time_t 64 bits, and for
> the bits there are, how hard are they to fix?

Probably very little overhead for most uses, it's more the regression
potential in the less common parts of the kernel I'm worried about.

There is a significant but not overwhelming number of uses of the
main problematic types in the kernel:

arnd@wuerfel:~/arm-soc$ git grep -wl time_t | wc
    188     188    5566
arnd@wuerfel:~/arm-soc$ git grep -wl timeval | wc
    320     320   10353
arnd@wuerfel:~/arm-soc$ git grep -wl timespec | wc
    406     406   10886

I believe we have to audit all of them anyway if we want to change
the kernel to less problematic types and introduce new user
interfaces.

IMHO this work is helped if we change the uses to a new type
as we find the problems. This lets us do the work one subsystem
at a time and avoid accidental ABI changes. I don't care much what
type that will be, and having a 96-bit type will certainly work
well in a lot of cases, but I don't see a strong reason to use
that over other types, especially when they can be more efficient.

	Arnd



Thread overview: 124+ messages
2014-05-30 20:01 [RFC 00/32] making inode time stamps y2038 ready Arnd Bergmann
2014-05-30 20:01 ` [RFC 01/32] fs: introduce new 'struct inode_time' Arnd Bergmann
2014-05-31  7:56   ` Geert Uytterhoeven
2014-05-31  8:39     ` Andreas Schwab
2014-05-31 13:19       ` Geert Uytterhoeven
2014-05-31 13:46         ` Andreas Schwab
2014-05-31 14:54       ` Arnd Bergmann
2014-05-31 16:15         ` Geert Uytterhoeven
2014-05-31  9:03   ` H. Peter Anvin
2014-05-31 14:53     ` Arnd Bergmann
2014-05-31 14:55       ` H. Peter Anvin
2014-05-30 20:01 ` [RFC 02/32] uapi: add struct __kernel_timespec{32,64} Arnd Bergmann
2014-05-30 20:18   ` H. Peter Anvin
2014-05-31 15:09     ` Arnd Bergmann
2014-05-30 20:01 ` [RFC 03/32] fs: introduce sys_utimens64at Arnd Bergmann
2014-05-31  9:22   ` Andreas Schwab
2014-05-31 14:55     ` Arnd Bergmann
2014-05-30 20:01 ` [RFC 04/32] fs: introduce sys_newfstat64/sys_newfstatat64 Arnd Bergmann
2014-05-30 20:01 ` [RFC 05/32] arch: hook up new stat and utimes syscalls Arnd Bergmann
2014-05-30 20:01 ` [RFC 06/32] isofs: fix timestamps beyond 2027 Arnd Bergmann
2014-05-31  7:59   ` Geert Uytterhoeven
2014-05-31  8:47     ` H. Peter Anvin
2014-05-30 20:01 ` [RFC 07/32] fs/nfs: convert to struct inode_time Arnd Bergmann
2014-05-30 20:01 ` [RFC 08/32] fs/ceph: convert to 'struct inode_time' Arnd Bergmann
2014-05-30 20:01 ` [RFC 09/32] fs/pstore: convert to struct inode_time Arnd Bergmann
2014-05-30 21:14   ` Kees Cook
2014-05-30 20:01 ` [RFC 10/32] fs/coda: " Arnd Bergmann
2014-05-30 20:01 ` [RFC 11/32] xfs: " Arnd Bergmann
2014-05-31  0:37   ` Dave Chinner
2014-05-31  0:41     ` H. Peter Anvin
2014-05-31  1:14       ` Dave Chinner
2014-05-31  1:22         ` H. Peter Anvin
2014-05-31  5:54           ` Dave Chinner
2014-05-31  8:41             ` H. Peter Anvin
2014-05-31 15:46               ` Nicolas Pitre
2014-06-01 19:56                 ` Arnd Bergmann
2014-06-01 20:26                   ` H. Peter Anvin
2014-06-02 11:02                     ` Arnd Bergmann
2014-06-02  1:36                   ` Nicolas Pitre
2014-06-02  2:22                     ` Dave Chinner
2014-06-02  7:09                       ` Geert Uytterhoeven
2014-06-02 10:56                     ` Arnd Bergmann
2014-06-02 11:57                       ` Theodore Ts'o
2014-06-02 12:38                         ` Arnd Bergmann
2014-06-02 13:15                           ` Theodore Ts'o
2014-06-02 12:52                         ` Arnd Bergmann
2014-06-02 13:07                           ` Theodore Ts'o
2014-06-02 15:01                             ` Arnd Bergmann
2014-06-02 14:52                         ` H. Peter Anvin
2014-06-02 15:04                       ` Chuck Lever
2014-06-02 15:31                         ` Theodore Ts'o
2014-06-02 17:12                           ` H. Peter Anvin
2014-06-02 18:50                             ` Arnd Bergmann
2014-06-02 22:29                             ` Theodore Ts'o
2014-06-02 22:32                               ` H. Peter Anvin
2014-06-02 23:32                                 ` Theodore Ts'o
2014-06-02 23:33                                   ` H. Peter Anvin
2014-06-03 13:09                                   ` Roger Willcocks
2014-06-02 18:52                         ` Arnd Bergmann
2014-06-02 18:58                         ` Roger Willcocks
2014-06-02 19:04                           ` Chuck Lever
2014-06-02 19:10                             ` Arnd Bergmann
2014-06-01  0:39               ` Dave Chinner
2014-06-02 14:00             ` Joseph S. Myers
2014-05-31 15:37         ` Arnd Bergmann
2014-06-01  0:24           ` Dave Chinner
2014-06-02  0:28             ` Dave Chinner
2014-06-02 11:35               ` Roger Willcocks
2014-06-02 11:43               ` Arnd Bergmann
2014-06-03  0:32                 ` Dave Chinner
2014-06-03  7:33                   ` Arnd Bergmann
2014-06-03  8:41                     ` Dave Chinner
2014-06-03  9:16                       ` Arnd Bergmann
2014-05-30 20:01 ` [RFC 12/32] btrfs: " Arnd Bergmann
2014-05-30 20:01 ` [RFC 13/32] ext3: " Arnd Bergmann
2014-05-31  9:10   ` H. Peter Anvin
2014-05-31 14:32     ` Arnd Bergmann
2014-05-30 20:01 ` [RFC 14/32] ext4: " Arnd Bergmann
2014-05-30 20:01 ` [RFC 15/32] cifs: " Arnd Bergmann
2014-05-30 20:01 ` [RFC 16/32] ntfs: " Arnd Bergmann
2014-05-30 20:01 ` [RFC 17/32] ubifs: " Arnd Bergmann
2014-06-02  7:54   ` Artem Bityutskiy
2014-05-30 20:01 ` [RFC 18/32] ocfs2: " Arnd Bergmann
2014-05-30 20:01 ` [RFC 19/32] fs/fat: " Arnd Bergmann
2014-05-30 20:01 ` [RFC 20/32] afs: " Arnd Bergmann
2014-05-30 20:01 ` [RFC 21/32] udf: " Arnd Bergmann
2014-05-30 20:01 ` [RFC 22/32] fs: convert simple fs to inode_time Arnd Bergmann
2014-05-30 23:06   ` Greg Kroah-Hartman
2014-05-30 20:01 ` [RFC 23/32] logfs: convert to struct inode_time Arnd Bergmann
2014-05-30 20:01 ` [RFC 24/32] hfs, hfsplus: " Arnd Bergmann
2014-05-31 14:23   ` Vyacheslav Dubeyko
2014-05-30 20:01 ` [RFC 25/32] gfs2: " Arnd Bergmann
2014-06-02  9:52   ` Steven Whitehouse
2014-05-30 20:01 ` [RFC 26/32] reiserfs: " Arnd Bergmann
2014-05-30 20:01 ` [RFC 27/32] jffs2: " Arnd Bergmann
2014-05-30 20:01 ` [RFC 28/32] adfs: " Arnd Bergmann
2014-05-30 20:01 ` [RFC 29/32] f2fs: " Arnd Bergmann
2014-05-30 20:01 ` [RFC 30/32] fuse: " Arnd Bergmann
2014-05-30 20:01 ` [RFC 31/32] scsi: fnic: use current_kernel_time() for timestamp Arnd Bergmann
2014-05-30 20:01 ` [RFC 32/32] fs: use new inode_time definition unconditionally Arnd Bergmann
2014-05-31 14:30 ` [RFC 00/32] making inode time stamps y2038 ready Vyacheslav Dubeyko
2014-06-03 12:21   ` Arnd Bergmann
2014-05-31 14:51 ` Richard Cochran
2014-05-31 15:23   ` Arnd Bergmann
2014-05-31 18:22     ` Richard Cochran
2014-05-31 19:34       ` H. Peter Anvin
2014-06-01  4:46         ` Richard Cochran
2014-06-01  4:44     ` Richard Cochran
2014-06-02 13:52 ` Joseph S. Myers
2014-06-02 19:19   ` Arnd Bergmann
2014-06-02 19:26     ` H. Peter Anvin
2014-06-02 19:55       ` Arnd Bergmann
2014-06-02 21:57         ` H. Peter Anvin
2014-06-03 14:22           ` Arnd Bergmann
2014-06-03 14:33             ` Joseph S. Myers
2014-06-03 14:37               ` Arnd Bergmann
2014-06-03 21:38             ` Dave Chinner
2014-06-04 15:03               ` Arnd Bergmann
2014-06-04 17:30                 ` Nicolas Pitre
2014-06-04 19:24                   ` Arnd Bergmann
2014-06-05  0:10                     ` H. Peter Anvin
2014-06-10  9:54                       ` Arnd Bergmann
2014-06-02 21:02     ` Joseph S. Myers
2014-06-04 15:05       ` Arnd Bergmann
