Date: Wed, 5 Sep 2018 09:08:45 -0400
From: "Theodore Y. Ts'o"
To: 焦晓冬
Cc: jlayton@redhat.com, R.E.Wolff@bitwizard.nl,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: POSIX violation by writeback error
Message-ID: <20180905130845.GE23909@thunk.org>
References: <20180904075347.GH11854@BitWizard.nl>
    <82ffc434137c2ca47a8edefbe7007f5cbecd1cca.camel@redhat.com>

On Wed, Sep 05, 2018 at 04:09:42PM +0800, 焦晓冬 wrote:
> Well, since the reader application and the writer application are reading
> a same file, they are indeed related. The reader here is expecting
> to read the lasted data the writer offers, not any data available. The
> reader is surely not expecting to read partially new and partially old data.
> Right? And, that `read() should return the lasted write()` by POSIX
> supports this expectation.

The core assumption of Unix, and therefore of Linux, is that the
primary abstraction is the file.  So if you say that all applications
which read or write the same file are related, that's equivalent to
saying, "all applications are related".  Consider that a text editor
can read a config file, or a source file, or any other text file.
Consider shell script commands such as "cat", "sort", "uniq".  Heck,
/bin/cp copies any type of file.  Does that mean that /bin/cp, as a
reader application, is related to all applications on the system?

The real problem here is that we're trying to guess the motivations and
usage of programs that are reading the file, and there's no good way to
do that.
It could be that the reader is someone who wants to be informed that
the file is in the page cache, but was never persisted to disk.  It
could be that the user has figured out something has gone terribly
wrong, and is desperately trying to rescue all the data she can by
copying it to another disk.  In that case, stopping the reader from
being able to access the contents is exactly the wrong thing to do if
what you care about is preventing data loss.

The other thing which you seem to be assuming is that applications
which care about precious data won't use fsync(2).  And in general,
it's been fairly well known for decades that if you care about your
data, you have to use fsync(2) or O_DIRECT writes; and you *must* check
the error return of both the fsync(2) and the close(2) system calls.
Emacs got that right in the mid-1980s --- over 30 years ago.  We mocked
GNOME and KDE's toy notepad applications for getting this wrong a
decade ago, and they've since fixed it.

Actually, the GNOME and KDE applications, because they were too lazy to
persist the xattrs and ACLs, decided it was better to truncate the file
and then rewrite it.  So if you crashed after the truncate... your data
was toast.  This was a decade ago, and again, it was considered
spectacularly bad application programming then, and it's since been
fixed.

The point here is that there will always be lousy application programs.
And it is a genuine systems design question how much we should
sacrifice performance and efficiency to accommodate stupid application
programs.  For example, we could make close(2) imply an fsync(2), and
return the error in close(2).  But *that* assumes that applications
actually check the return value of close(2) --- and there will be those
that don't.  This would completely trash performance for builds, since
it would slow down writing generated files such as all the *.o object
files.  Since they are generated files, they aren't precious.
So forcing an fsync(2) after writing all of those files will destroy
your system performance.

					- Ted
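
A minimal sketch, in Python, of the fsync(2)/close(2) discipline
described above.  It is not taken from Emacs or from any other program
named in this thread.  The function name save_file and the ".tmp"
suffix are hypothetical.  Writing to a temporary file and renaming it
over the original is a common alternative to truncate-and-rewrite and
is an assumption of this sketch, not something the message spells out.
The sketch does not fsync the containing directory, and it does not
carry over xattrs or ACLs.

```python
import os


def save_file(path: str, data: bytes) -> None:
    """Replace `path` with `data`; raise OSError if any step fails.

    The old contents of `path` are never truncated in place, so a crash
    part-way through leaves either the old file or the new one.
    """
    tmp_path = path + ".tmp"  # hypothetical temp-file naming scheme
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        try:
            # os.write() may write fewer bytes than asked; loop until done.
            view = memoryview(data)
            while view:
                written = os.write(fd, view)
                view = view[written:]
            # fsync(2): a writeback error surfaces here as OSError.
            os.fsync(fd)
        finally:
            # close(2) can also report an error; os.close() raises OSError
            # in that case, so the failure is not silently dropped.
            os.close(fd)
        # Only after fsync and close both succeeded: swap in the new file.
        os.replace(tmp_path, path)
    except OSError:
        # Leave the original file alone and clean up the temporary one.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

A caller would invoke save_file("notes.txt", b"...") and treat any
OSError as "the data may not be on disk".  For generated files such as
*.o objects, the os.fsync() call is the step one would leave out, for
the performance reason given above.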