From: Arnd Bergmann
To: Nicolas Pitre
Cc: Dave Chinner , hch@infradead.org, linux-mtd@lists.infradead.org,
    "H. Peter Anvin" , logfs@logfs.org, linux-afs@lists.infradead.org,
    "Joseph S. Myers" , linux-arch@vger.kernel.org,
    linux-cifs@vger.kernel.org, linux-scsi@vger.kernel.org,
    ceph-devel@vger.kernel.org, cluster-devel@redhat.com, coda@cs.cmu.edu,
    geert@linux-m68k.org, linux-ext4@vger.kernel.org,
    codalist@telemann.coda.cs.cmu.edu, fuse-devel@lists.sourceforge.net,
    reiserfs-devel@vger.kernel.org, xfs@oss.sgi.com, john.stultz@linaro.org,
    tglx@linutronix.de, linux-nfs@vger.kernel.org,
    linux-ntfs-dev@lists.sourceforge.net, samba-technical@lists.samba.org,
    linux-kernel@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net,
    ocfs2-devel@oss.oracle.com, linux-fsdevel@vger.kernel.org,
    lftan@altera.com, linux-btrfs@vger.kernel.org
Subject: Re: [RFC 00/32] making inode time stamps y2038 ready
Date: Wed, 04 Jun 2014 21:24:42 +0200
Message-ID: <8770583.6XeZxCxOY8@wuerfel>
References: <1401480116-1973111-1-git-send-email-arnd@arndb.de>
    <201406041703.47592.arnd@arndb.de>

On Wednesday 04 June 2014 13:30:32 Nicolas Pitre wrote:
> On Wed, 4 Jun 2014, Arnd Bergmann wrote:
>
> > On Tuesday 03 June 2014, Dave Chinner wrote:
> > > Just to be pedantic, inodes don't need 96 bit timestamps - some
> > > filesystems can *support up to* 96 bit timestamps. If the kernel
> > > only supports 64 bit timestamps and that's all the kernel can
> > > represent, then the upper bits of the 96 bit on-disk inode
> > > timestamps simply remain zero.
> >
> > I meant the reverse: since we have file systems that can store
> > 96-bit timestamps when using 64-bit kernels, we need to extend
> > 32-bit kernels to have the same internal representation so we
> > can actually read those file systems correctly.
> >
> > > If you move the filesystem between kernels with different time
> > > ranges, then the filesystem needs to be able to tell the kernel what
> > > its supported range is. This is where having the VFS limit the
> > > range of supported timestamps is important: the limit is the
> > > min(kernel range, filesystem range). This allows the filesystems
> > > to be independent of the kernel time representation, and the kernel
> > > to be independent of the physical filesystem time encoding....
> >
> > I agree it makes sense to let the kernel know about the limits
> > of the file system it accesses, but for the reverse, we're probably
> > better off just making the kernel representation large enough (i.e.
> > 96 bits) so it can work with any known file system.
>
> Depends... 96 bit handling may get prohibitive on 32-bit archs.
>
> The important point here is for the kernel to be able to represent the
> time _range_ used by any known filesystem, not necessarily the time
> _precision_.
>
> For example, a 64 bit representation can be made of 40 bits for seconds
> spanning 34865 years, and 24 bits for fractional seconds providing
> precision down to 60 nanosecs. That ought to be plenty good on 32 bit
> systems while still being cheap to handle.

I have checked earlier that we don't do any computation on inode
time stamps in common code; we just pass them around, so there is very
little runtime overhead. There is a small bit of space overhead (12 bytes)
per inode, but that structure is already on the order of 500 bytes.

For other timekeeping stuff in the kernel, I agree that using some
64-bit representation (nanoseconds, 32/32 unsigned seconds/nanoseconds,
...) has advantages; that's exactly the point I was making earlier
against simply extending the internal time_t/timespec to 64-bit
seconds for everything.

	Arnd
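
The 40/24 split that Nicolas suggests above could look roughly like the
following in C. This is only an illustrative sketch, not code from the
patch series: the names ts64_pack(), ts64_sec() and ts64_nsec() and the
TS_* macros are made up here, and it assumes unsigned seconds counted
from some arbitrary epoch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 40.24 fixed-point layout: 40 bits of seconds, 24 bits of fraction. */
#define TS_FRAC_BITS	24
#define TS_FRAC_MASK	((UINT64_C(1) << TS_FRAC_BITS) - 1)
#define TS_NSEC_PER_SEC	UINT64_C(1000000000)

/*
 * Pack seconds and nanoseconds into one 64-bit value.
 * Only the low 40 bits of sec survive the shift; the caller is assumed
 * to have range-checked it beforehand.
 */
static inline uint64_t ts64_pack(uint64_t sec, uint32_t nsec)
{
	uint64_t frac;

	assert(nsec < TS_NSEC_PER_SEC);
	/* nsec < 2^30, so nsec << 24 stays below 2^54 and cannot overflow. */
	frac = ((uint64_t)nsec << TS_FRAC_BITS) / TS_NSEC_PER_SEC;
	return (sec << TS_FRAC_BITS) | frac;
}

/* Extract the whole seconds (upper 40 bits). */
static inline uint64_t ts64_sec(uint64_t ts)
{
	return ts >> TS_FRAC_BITS;
}

/* Convert the 24-bit binary fraction back to nanoseconds, rounding down. */
static inline uint32_t ts64_nsec(uint64_t ts)
{
	return (uint32_t)(((ts & TS_FRAC_MASK) * TS_NSEC_PER_SEC) >> TS_FRAC_BITS);
}
```

One fractional unit is 2^-24 seconds, or roughly 59.6 ns, so a value
that goes through ts64_pack() and back comes out at most 60 ns lower
than the nanosecond count that went in. Both directions need only
shifts, a mask and one 64-bit multiply or divide.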