From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S934007AbcALIaS (ORCPT ); Tue, 12 Jan 2016 03:30:18 -0500 Received: from ipmail04.adl6.internode.on.net ([150.101.137.141]:63425 "EHLO ipmail04.adl6.internode.on.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1759102AbcALIaQ (ORCPT ); Tue, 12 Jan 2016 03:30:16 -0500 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: A2A+EgAguZRWPGu7LXleKAECgw+BP4Jig36BeZ98BotdiUSGCQICAQECgSZNAQEBAQEBBwEBAQFBP0ESAYNgAQEBAwEnExwjBQsIAxgJJQ8FJQMHGhMbiAsHv2UBAQgCASAZhXWFR4RAhH0FjT+JVIg2hRqBZ4dthTGFZYhugnIfgXEqNIQmQAEfBIEnAQEB Date: Tue, 12 Jan 2016 19:29:57 +1100 From: Dave Chinner To: Deepa Dinamani Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, y2038@lists.linaro.org Subject: Re: [RFC 02/15] vfs: Change all structures to support 64 bit time Message-ID: <20160112082957.GG6033@dastard> References: <1452144972-15802-1-git-send-email-deepa.kernel@gmail.com> <1452144972-15802-3-git-send-email-deepa.kernel@gmail.com> <20160110230339.GD10456@dastard> <20160112054235.GA63984@myubuntu.deepaubuntu.g7.internal.cloudapp.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160112054235.GA63984@myubuntu.deepaubuntu.g7.internal.cloudapp.net> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Jan 11, 2016 at 09:42:36PM -0800, Deepa Dinamani wrote: > > On Jan 11, 2016, at 04:33, Dave Chinner wrote: > > > >> On Wed, Jan 06, 2016 at 09:35:59PM -0800, Deepa Dinamani wrote: > >> The current representation of inode times in struct inode, struct iattr, > >> and struct kstat use struct timespec. timespec is not y2038 safe. > >> > >> Use scalar data types (seconds and nanoseconds stored separately) to > >> represent timestamps in struct inode in order to maintain same size for > >> times across 32 bit and 64 bit architectures. > >> In addition, lay them out such that they are packed on a naturally > >> aligned boundary on 64 bit arch as 4 bytes are used to store nsec. > >> This makes each tuple(sec, nscec) use 12 bytes instead of 16 bytes. > >> This will help save RAM space as inode structure is cached in memory. > >> The other structures are transient and do not benefit from these > >> changes. > > > > IMO, this decisions sends the patch series immediately down the > > wrong path. > > There are other things the patch does that I would like to get comments > on: inode_timespec aliases, range check, individual fs changes etc. > These are independent of the inode timestamp representation changes. The inode_timespec alias is part of the problem - AFAICT it exists because you need a representation that is independent of both the old and new in-memory structures. The valid timestamp range stuff in the superblock is absolutely necessary, but that's something that can be done completely independently to the changes for supporting a differnet time storage format. And the fs changes cannot really be commented on until the VFs time representation is sorted out properly... > >TO me, this is a severe case of premature optimisation > > because everything gets way more complex just to save those 8 bytes, > > especially as those holes can be filled simply by changing the > > variable declaration order in the structure and adding a simple > > comment. > > I had tried rearranging the structure and the pahole tool does not show > any difference unless you pack and align the struct to 4 bytes on 64 > bit arch. The change actually saves 16 bytes on x86_64 and adds 12 > bytes on i386. > > Here is the breakdown for struct inode before and after the patch: > > x86_64: > /* size: 544, cachelines: 9, members: 44 */ | > /* size: 528, cachelines: 9, members: 47 */ > /* sum members: 534, holes: 3, sum holes: 10 */ | > /* sum members: 522, holes: 2, sum holes: 6 */ > > i386: > /* size: 328, cachelines: 6, members: 45 */ | > /* size: 340, cachelines: 6, members: 48 */ > /* sum members: 326, holes: 1, sum holes: 2 */ | > /* sum members: 338, holes: 1, sum holes: 2 */ > > According to /proc/slabinfo I estimated savings of 4MB on a lightly > loaded system. > > > And, really, I don't like those VFS_INODE_[GS]ET_XTIME macros at > > all; you've got to touch lots of code(*), making it all shouty and > > harder to read. They seem only to exist because of the above > > structural change requires an abstract timestamp accessor while > > CONFIG_FS_USES_64BIT_TIME exists. > > Given that goes away at the end o > > the series, so should the macro - if we use a struct timespec64 in > > the first place, it isn't even necessary as a temporary construct > > timespec64 was the first option considered here. The problem with using > timespec64 is the long data type to represent nsec. If it were possible > to change timespec64 nsec to int data type then it might be okay to use > that if we are not worried about holes. I do not see why time stamps > should have different representations on a 32 bit vs a 64 bit arch. What's it matter? iot's irrelevant to the problem at hand. Besides, have you looked at the existing timestamp definitions? they use a struct timespec, which on a 64 bit system: struct timespec i_atime; /* 88 16 */ struct timespec i_mtime; /* 104 16 */ struct timespec i_ctime; /* 120 16 */ use 2 longs and are identical to a timespec64 in everything but name. These should just be changed to a timespec64, and we suck up the fact it increases the size of the 32 bit inode as we have to increase it's size to support time > y2038 anyway. This is what I meant about premature optimisation - you've got a wonderfully complex solution to a problem that we don't need to solve to support timestamps >y2038. It's also why it goes down the wrong path at this point - most of the changes are not necessary if all we need to do is a simple timespec -> timespec64 type change and the addition timestamp range limiting in the existing truncation function... > The problem really is that > there is more than one way of updating these attributes(timestamps in > this particular case). The side effect of this is that we don't always > call timespec_trunc() before assigning timestamps which can lead to > inconsistencies between on disk and in memory inode timestamps. That's a problem that can be fixed independently of y2038 support. Indeed, we can be quite lazy about updating timestamps - by intent and design we usually have different timestamps in memory compared to on disk, which is one of the reasons why there are so many different ways to change and update timestamps.... Cheers, Dave. -- Dave Chinner david@fromorbit.com