linux-kernel.vger.kernel.org archive mirror
From: Terje Eggestad <terje.eggestad@scali.com>
To: Christoph Hellwig <hch@infradead.org>,
	Arjan van de Ven <arjanv@redhat.com>
Cc: linux-kernel <linux-kernel@vger.kernel.org>, D.A.Fedorov@inp.nsk.su
Subject: Re: The disappearing sys_call_table export.
Date: 05 May 2003 11:33:36 +0200
Message-ID: <1052127216.2821.51.camel@pc-16.office.scali.no>
In-Reply-To: <20030505092324.A13336@infradead.org>

Unfortunately we live in an insane world. 

First of all, in the ChangeLog for 2.5.41, where the export was removed:

http://www.kernel.org/pub/linux/kernel/v2.5/ChangeLog-2.5.41

Arjan lists 4 reasons for having the export in the first place, and I'm
on point 3. Here Arjan pretty much acknowledges that there is a
legitimate need for an event/hook system to be informed of a syscall.
The exact quote is: "Eg the use of the export in this just a bandaid due
to lack of a proper mechanism". 

My argument for *why* there should be a mechanism stops here. 


Since you're bright and inquisitive, here is the exact problem I'm
facing, which is pretty complex:


1. Performance is everything.
2. We're making an MPI library, and as such we don't have any control
over the application.
3a. The various hardware for cluster interconnects all works with DMA.
3b. The performance loss from copying from a receive area to the
userspace buffer is unacceptable.
3c. It's therefore necessary for the HW to access user pages.
4. In order to do 3, the user pages must be pinned down.
5. The way MPI is written, it doesn't use a special malloc() to allocate
the send/receive buffers. It can't, since that would break the language
binding to Fortran. Thus ANY writeable user page may be used.
6. Re point 4: pinning is VERY expensive (see point 1), so I can't pin
the buffers every time they're used.
7. The only way to cache buffers (to see whether they've been used
before and are hence already pinned) is by user space virtual address.
A syscall, and thus an ioctl to a device file, is prohibitively
expensive under point 1.
8a. If the app (glibc in practice, but you never know) uses sbrk() with
a negative arg and then a positive arg, I can get a different set of
user pages at the same address.
8b. Ditto with a pair of munmap()/mmap() calls.
9. Since the number of times any 'realloc' may happen is << the number
of times any buffer may be used, it's necessary under point 1 to trace
changes in the mapping from virtual addresses to phys pages, rather
than test every time an address is used.
10. Kernel patches are impractical; I must be able to do this with
standard stock, Red Hat, AND SuSE kernels.
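
Points 6 through 9 can be sketched roughly as a cache keyed on the user
virtual address. This is a userspace illustration only, not the actual
library or driver code: the names pin_cache, pin_cache_use(),
pin_cache_invalidate(), and pin_pages() are hypothetical, the pin
operation is a stub that just counts calls, and there is no eviction
policy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PIN_CACHE_SLOTS 64              /* hypothetical fixed capacity */

struct pin_entry {
    uintptr_t start;                    /* user virtual address of the buffer */
    size_t    len;                      /* length in bytes */
    bool      valid;                    /* true while the pages are still pinned */
};

struct pin_cache {
    struct pin_entry slot[PIN_CACHE_SLOTS];
    unsigned long    pin_calls;         /* how often the expensive path ran */
};

/* Stand-in for the expensive pin operation (points 4 and 6). */
void pin_pages(struct pin_cache *c, uintptr_t start, size_t len)
{
    (void)start;
    (void)len;
    c->pin_calls++;
}

/* Fast path (point 7): run on every send/receive, no syscall on a hit. */
void pin_cache_use(struct pin_cache *c, uintptr_t start, size_t len)
{
    struct pin_entry *free_slot = NULL;

    for (size_t i = 0; i < PIN_CACHE_SLOTS; i++) {
        struct pin_entry *e = &c->slot[i];

        /* Hit: the requested range lies inside an already pinned range. */
        if (e->valid && e->start <= start &&
            start + len <= e->start + e->len)
            return;
        if (!e->valid && free_slot == NULL)
            free_slot = e;
    }

    assert(free_slot != NULL);          /* sketch only: no eviction */
    pin_pages(c, start, len);
    free_slot->start = start;
    free_slot->len = len;
    free_slot->valid = true;
}

/*
 * Slow path (points 8 and 9): run when an munmap() or a shrinking sbrk()
 * over [start, start + len) has been observed. Every overlapping entry is
 * dropped, because the same address may now be backed by different pages.
 */
void pin_cache_invalidate(struct pin_cache *c, uintptr_t start, size_t len)
{
    for (size_t i = 0; i < PIN_CACHE_SLOTS; i++) {
        struct pin_entry *e = &c->slot[i];

        if (e->valid && e->start < start + len && start < e->start + e->len)
            e->valid = false;
    }
}
```

The whole point of the hook I'm asking for is to have something that
calls the invalidate step; without it, the fast path cannot trust a hit.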
 



On Mon, 2003-05-05 at 10:23, Christoph Hellwig wrote:
> On Mon, May 05, 2003 at 10:19:45AM +0200, Terje Eggestad wrote:
> > Now that it seem that all are in agreement that the sys_call_table
> > symbol shall not be exported to modules, are there any work in progress
> > to allow modules to get an event/notification whenever a specific
> > syscall is being called?
> 
> No.
> 
> > We have a specific need to trace mmap() and sbrk() calls. 
> 
> Well, you get mmap events for your driver and I can't imagine a sane
> reason for intercepting sbrk().  Do you have a pointer to the driver
> source doing such strange things?
-- 
_________________________________________________________________________

Terje Eggestad                  mailto:terje.eggestad@scali.no
Scali Scalable Linux Systems    http://www.scali.com

Olaf Helsets Vei 6              tel:    +47 22 62 89 61 (OFFICE)
P.O.Box 150, Oppsal                     +47 975 31 574  (MOBILE)
N-0619 Oslo                     fax:    +47 22 62 89 51
NORWAY            
_________________________________________________________________________

