linux-kernel.vger.kernel.org archive mirror
From: Jamie Lokier <jamie@shareable.org>
To: Hans Reiser <reiser@namesys.com>
Cc: Daniel Phillips <phillips@arcor.de>,
	Helge Hafting <helgehaf@aitel.hist.no>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: atomic kernel operations are very tricky to export to user space (was  [RFC] Improved inode number allocation for HTree )
Date: Tue, 11 Mar 2003 23:49:38 +0000
Message-ID: <20030311234938.GB16507@bjl1.jlokier.co.uk>
In-Reply-To: <3E6E545F.5060608@namesys.com>

Hans Reiser wrote:
> Allowing arbitrary filesystem operations to be 
> combined into one atomic transaction seems problematic for either user 
> space or the kernel, depending on what you do.
> 
> In general, allowing user space to lock things means that you trust user 
> space  to unlock.  This creates all sorts of trust troubles, and if you 
> force the unlock after some timeout, then the user space application 
> becomes vulnerable to DOS from other processes causing it to exceed the 
> timeout.
>
> Ideas on this are welcome.

You can allow user space to begin a transaction, do some operations
and end a transaction, possibly returning an "abort" result which
means userspace should assume the transaction did not commit any
results and/or whatever was read in the transaction was not reliable.
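
As a rough illustration only, the begin/operate/end shape could look like
the sketch below.  VersionedDir, Transaction and the change counter are
hypothetical names invented for this example, not an existing kernel or
filesystem interface; the abort check is a simple version comparison,
which is just one possible way to detect interference.

```python
class VersionedDir:
    """Hypothetical stand-in for a directory with a change counter."""

    def __init__(self, entries):
        self.entries = dict(entries)
        self.version = 0

    def modify(self, name, inode):
        # A change made outside any transaction (i.e. by another program).
        self.entries[name] = inode
        self.version += 1


class Transaction:
    """Hypothetical optimistic transaction: buffer writes, validate at end."""

    def __init__(self, directory):
        self.dir = directory
        self.start_version = directory.version  # "begin"
        self.pending = {}

    def read_all(self):
        # Only reliable if end() later reports a commit.
        return dict(self.dir.entries)

    def write(self, name, inode):
        self.pending[name] = inode

    def end(self):
        """Return True on commit, False on abort."""
        if self.dir.version != self.start_version:
            # Someone changed the directory: nothing is committed and
            # anything read inside the transaction must be discarded.
            return False
        if self.pending:
            self.dir.entries.update(self.pending)
            self.dir.version += 1
        return True
```

Userspace would normally wrap this in a retry loop and start again from
"begin" whenever end() reports an abort.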

On the face of it this leaves userspace susceptible to DOS or indeed
fairness/livelock problems.  For example if another program is always
changing a directory entry, how can you read that whole directory
in a transaction?

Fairness/livelock problems are hard to avoid with any kind of lock.
Even the kernel's internal locks have these problems in corner cases
(for example, remember when gettimeofday()'s clock access had to be
converted from using a spinlock to a sequence lock - and that still
doesn't _guarantee_ there is no problem in principle, it just reduces
the probability in all reasonable scenarios).
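
To show why a sequence lock reduces the problem without removing it,
here is a single-threaded sketch.  It is not the kernel's seqlock code;
SeqLock, the interfere hook and max_attempts are assumptions made up for
the example, and the hook merely simulates a writer running in the middle
of a read.

```python
class SeqLock:
    """Toy sequence lock: writers never wait, readers retry on interference."""

    def __init__(self, value):
        self.seq = 0          # even: stable, odd: write in progress
        self.value = value

    def write(self, value):
        self.seq += 1         # odd: readers must not trust self.value
        self.value = value
        self.seq += 1         # even again: write complete

    def read(self, interfere=None, max_attempts=100):
        """Return (snapshot, attempts_used), retrying while writes interfere."""
        for attempt in range(1, max_attempts + 1):
            start = self.seq
            if start % 2:
                continue      # a write is in progress, try again
            snapshot = self.value
            if interfere is not None:
                interfere(attempt)   # hypothetical hook: simulated writer
            if self.seq == start:
                return snapshot, attempt
        # A writer that never pauses can starve the reader indefinitely.
        raise RuntimeError("reader starved: every attempt raced with a write")
```

A reader that is interrupted twice succeeds on its third attempt, while a
writer that interferes on every attempt starves the reader until the
attempt limit is reached.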

However, some remedies can be applied to filesystem transactions.  If
an operation would cause some other task's transaction to eventually
return an abort code, consider sleeping for a short duration.
Randomise that duration.  If the other transaction(s) have been
aborting repeatedly, consider lengthening the sleep duration and/or
specifically waiting for the other transaction to complete, to boost
the other task(s)' likelihood of transaction success.  Randomise this
decision too.  If you know something about the type of other
transactions (such as it is trying to implement a read-write lock by
doing atomic operations on bytes in a file), consider exactly what
policy you hope to offer (writer preference?  reader preference?
something in between?)
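
A minimal sketch of the randomised remedy, assuming the filesystem can
count how often the other task's transaction has already aborted.  The
function name, the doubling rule and every constant below are
illustrative assumptions, not a proposal for actual values.

```python
import random


def choose_remedy(victim_aborts, rng, base=0.001, cap=0.1):
    """Decide how a conflicting operation yields to another transaction.

    victim_aborts: how many times the other transaction has aborted so far.
    Returns ("wait", 0.0) to wait for the other transaction to complete,
    or ("sleep", seconds) to sleep for a randomised duration first.
    """
    # The more often the victim has aborted, the longer we may sleep.
    ceiling = min(cap, base * (2 ** victim_aborts))
    # The decision to wait outright is itself randomised, and becomes
    # more likely as the victim keeps aborting.
    p_wait = min(0.9, 0.2 * victim_aborts)
    if rng.random() < p_wait:
        return ("wait", 0.0)
    return ("sleep", rng.uniform(0.0, ceiling))


rng = random.Random()
action, delay = choose_remedy(victim_aborts=3, rng=rng)
```

Passing the random generator in explicitly keeps the policy testable; a
writer-preference or reader-preference policy would replace the
probability rule with one that depends on the kind of transaction.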

By which point it has remarkable similarities to the problems of
fairness in the task scheduler, and fairness/livelock in locks.

-- Jamie
