From: Andrea Arcangeli <andrea@suse.de>
To: Benjamin LaHaise <bcrl@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>,
	Linus Torvalds <torvalds@transmeta.com>,
	linux-kernel@vger.kernel.org, linux-aio@kvack.org
Subject: Re: async-io API registration for 2.5.29
Date: Tue, 30 Jul 2002 23:26:57 +0200
Message-ID: <20020730212657.GL1181@dualathlon.random>
In-Reply-To: <20020730125943.B10315@redhat.com>

On Tue, Jul 30, 2002 at 12:59:43PM -0400, Benjamin LaHaise wrote:
> is what the 2.4.18 patches are, as they still cause that VM to OOM under 
> rather trivial io patterns).

I would like you to try to reproduce it with the aio in my tree after an:

	echo 1 >/proc/sys/vm/vm_gfp_debug

That will give you stack traces that you should send back to me, and I
will tell you exactly what the problem is (if you can reproduce it).

> 
> > Really last thing: one of the major reasons I don't like the above code
> > besides the overhead and complexity it introduces is that it doesn't
> > guarantee 100% that it will be forward compatible with 2.5 applications
> > (the syscall 250 looks not to check even for the payload, I guess they
> > changed it because it was too slow to be forward compatible in most
> > cases), the /dev/urandom payload may match the user arguments if you're
> > unlucky and since we can guarantee correct operations by doing a syscall
> > registration, I don't see why we should make it work by luck.
> 
> You haven't looked at the code very closely then.  It checks that the 
> payload matches, and that the caller is coming from the vsyscall pages.  

I didn't notice that the caller needed to come from the vsyscall pages.
That makes it safer, but it's still a huge amount of complexity, which
you apparently disabled in your test tree because it was harming
performance.

> that x86-64 gets wrong by not requiring the vsyscall page to need an 
> mmap into the user's address space: UML cannot emulate vsyscalls by 

I don't want vma overhead in the rbtree or in the mm_struct, nor do I
want mmap in general to deal with vsyscalls, for obvious performance
reasons.

> faking the mmap.

The fix for uml is trivial. The simplest approach is to add a prctl that
disables vsyscalls for a certain process and that cannot be re-enabled
by userspace (so a one-way prctl). The vsyscall will be swapped with
a vsyscall that invokes the real syscall, and uml will trap the
gettimeofday syscall like it does on x86. We also discussed some more
complicated and sophisticated approaches, but I like the prctl that
forces the gettimeofday/time syscalls because it could be used trivially
for strace too. (Of course ltrace will just show the gettimeofday call
because we pass through glibc. In fact, for 99% of cases uml could simply
use LD_PRELOAD, but Jeff didn't like that for good reasons: it's not
transparent enough for userspace, and of course it doesn't work with
statically linked binaries.)

In short, the prctl that redirects the program and all its children to
use the real syscall would be my preferred approach. As said, the uml
kernel should still be able to use vgettimeofday; only the children (the
userspace running under the uml kernel) will be executed with the prctl
enabled, and the fact that userspace cannot disable the prctl (once it
is enabled before execve) will guarantee the system functions correctly.
It will require per-task information and a switch_to hack that changes
the fixmap entry and does an invlpg if needed.
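
To make the mechanics concrete, here is a minimal userspace simulation of
the idea, not kernel code. Every name in it (struct task, struct cpu,
task_set_force_syscalls, task_fork, sim_switch_to) is hypothetical and
invented for illustration. It assumes the flag lives in per-task state,
is copied to children, can only be turned on, and that the context
switch only touches the mapping when the incoming task needs a different
vsyscall page than the one currently mapped.

```c
#include <stdbool.h>

/* Hypothetical per-task state; this is NOT the real task_struct. */
struct task {
    bool force_real_syscalls;   /* one-way: once set, never cleared */
};

/* The two variants of the vsyscall page a CPU could have mapped. */
enum vsyscall_page { VSYSCALL_FAST, VSYSCALL_REDIRECT };

/* Simulated per-CPU state. */
struct cpu {
    enum vsyscall_page mapped;  /* what the fixmap slot points at */
    unsigned long tlb_flushes;  /* counts simulated invlpg operations */
};

/*
 * Hypothetical prctl handler. Enabling always succeeds; a request to
 * disable is refused once the flag has been set.
 */
int task_set_force_syscalls(struct task *t, bool enable)
{
    if (!enable)
        return t->force_real_syscalls ? -1 : 0;
    t->force_real_syscalls = true;
    return 0;
}

/* Children inherit the flag, so it survives fork and execve. */
struct task task_fork(const struct task *parent)
{
    struct task child = { .force_real_syscalls = parent->force_real_syscalls };
    return child;
}

/*
 * Simulated context-switch hook: remap and flush only when the next
 * task needs a different page than the one currently mapped.
 */
void sim_switch_to(struct cpu *cpu, const struct task *next)
{
    enum vsyscall_page want =
        next->force_real_syscalls ? VSYSCALL_REDIRECT : VSYSCALL_FAST;

    if (cpu->mapped != want) {
        cpu->mapped = want;   /* stands in for rewriting the fixmap entry */
        cpu->tlb_flushes++;   /* stands in for invlpg on that address */
    }
}
```

In this sketch, switching between two tasks that want the same page costs
nothing extra, which is the "if needed" part: the flush is only paid when
crossing between the uml kernel and the userspace it runs.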

Now I don't remember anymore whether I already suggested the above prctl
approach to Jeff and he already found a weakness in it that would make
it unfeasible for uml, but in that case he will remind me about it now :)

In fact we will use the same technique of a vsyscall that redirects to
a real syscall for all kinds of vsyscalls that on some hardware may need
to know what cpu they are running on to return the result. This has
never been needed so far, but it was one of the possibilities that our
vsyscall design offered.
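
As a rough illustration of what a redirecting vsyscall amounts to, the
userspace function below has the same signature as gettimeofday but
always enters the kernel through the generic syscall(2) wrapper. The
name vgettimeofday_redirect is hypothetical, and the sketch assumes a
Linux target where SYS_gettimeofday is defined; the real stub would live
in the vsyscall page rather than in the program.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical redirect stub: no fast path, just the real syscall, so
 * a tracer such as strace or uml sees an ordinary gettimeofday entry.
 */
int vgettimeofday_redirect(struct timeval *tv, struct timezone *tz)
{
    return (int)syscall(SYS_gettimeofday, tv, tz);
}
```

Callers cannot tell the difference apart from the cost, which is why the
same stub could back any vsyscall whose fast path turns out not to be
usable on a given machine.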

Andrea
