linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andi Kleen <ak@suse.de>
To: "Michael S. Tsirkin" <mst@mellanox.co.il>
Cc: Andi Kleen <ak@suse.de>,
	discuss@x86-64.org, linux-kernel@vger.kernel.org
Subject: Re: [discuss] f_ops flag to speed up compatible ioctls in linux kernel
Date: Tue, 7 Sep 2004 14:14:18 +0200	[thread overview]
Message-ID: <20040907121418.GC25051@wotan.suse.de> (raw)
In-Reply-To: <20040907104017.GB10096@mellanox.co.il>

On Tue, Sep 07, 2004 at 01:40:17PM +0300, Michael S. Tsirkin wrote:
> Hello!
> Quoting Andi Kleen (ak@suse.de) "Re: [discuss] f_ops flag to speed up compatible ioctls in linux kernel":
> > On Wed, Sep 01, 2004 at 10:22:45AM +0300, Michael S. Tsirkin wrote:
> > > Hello!
> > > Currently, on the x86_64 architecture, its quite tricky to make
> > > a char device ioctl work for an x86 executables.
> > > In particular,
> > >    1. there is a requirement that ioctl number is unique -
> > >       which is hard to guarantee especially for out of kernel modules
> > 
> > Yes, that is a problem for some people. But you should
> > have used an unique number in the first place.
> 
> Do you mean the _IOC macro and friends?
> But their uniqueness depends on allocating a unique magic number
> in the first place.

Yep. It's not bullet proof, but works pretty well in practice with
a little care.

> 
> > There are some hackish ways to work around it for non modules[1], but at some
> > point we should probably support it better.
> > 
> > [1] it can be handled, except for module unloading, so you have
> > to disable that.
> 
> Why use the global hash at all?
> Why not, for example, pass a parameter to the ioctl function
> to make it possible to figure out this is a compat call?

The main reason is that traditionally there was some resistance
to put compat code into the drivers itself because it "looked too
ugly". So it was just put into a few centralized files. Patching 
all the f_ops wouldn't have been practical for this. 

Maybe it could be added as an additional mechanism now though.

> > >    2. there's a performance huge overhead for each compat call - there's
> > >       a hash lookup in a global hash inside a lock_kernel -
> > >       and I think compat performance *is* important.
> > 
> > Did you actually measure it? I doubt it is a big issue.
> > 
> 
> But that would depend on what the driver actually does inside
> the ioctl and on how many ioctls are already registered, would it not?

Most ioctls should be registered at boot, the additional ones
are probably negligible.

> 
> I built a silly driver example which just used a semaphore and a switch
> statement inside the ioctl.
> 
> ~/<1>tavor/tools/driver_new>time /tmp/ioctltest64 /dev/mst/mt23108_pci_cr0
> 0.357u 4.760s 0:05.11 100.0%    0+0k 0+0io 0pf+0w
> ~/<1>tavor/tools/driver_new>time /tmp/ioctltest32 /dev/mst/mt23108_pci_cr0
> 0.641u 6.486s 0:07.12 100.0%    0+0k 0+0io 0pf+0w
> 
> So just looking at system time there seems to be an overhead of
> about 20%.

That's with an empty ioctl? I would expect most ioctls to do
more work, so the overhead would be less.

Still it could be probably made better. 

> The overhead is bigger if there are collisions in the hash.
> 
> For muti-processor scenarious, the difference is much more pronounced
> (note I have dual-cpu Opteron system):
> 
> ~>time /tmp/ioctltest32 /dev/mst/mt23108_pci_cr0 & ;time /tmp/ioctltest32
> /dev/mst/mt23108_pci_cr0 &
> [2] 10829
> [3] 10830
> [2]    Done                          /tmp/ioctltest32 /dev/mst/mt23108_pci_cr0
> 0.435u 21.322s 0:21.76 99.9%    0+0k 0+0io 0pf+0w
> [3]    Done                          /tmp/ioctltest32 /dev/mst/mt23108_pci_cr0
> 0.683u 21.231s 0:21.92 99.9%    0+0k 0+0io 0pf+0w
> ~>
> 
> 
> ~>time /tmp/ioctltest64 /dev/mst/mt23108_pci_cr0 & ;time /tmp/ioctltest64
> /dev/mst/mt23108_pci_cr0 &
> [2] 10831
> [3] 10832
> [3]    Done                          /tmp/ioctltest64 /dev/mst/mt23108_pci_cr0
> 0.474u 11.194s 0:11.70 99.6%    0+0k 0+0io 0pf+0w
> [2]    Done                          /tmp/ioctltest64 /dev/mst/mt23108_pci_cr0
> 0.476u 11.277s 0:11.75 99.9%    0+0k 0+0io 0pf+0w
> ~>
> 
> So we get 50% slowdown.
> I imagine this is the result of BKL contention during the hash lookup.


Ok, this could be improved agreed (although I still think your microbenchmark
is a bit too artificial) 

In theory the BKL could be dropped from the lookup anyways
if RCU is needed for the cleanup. For locking the handler 
itself into memory it doesn't make any difference.

What happens when you just remove the lock_kernel() there? 
(as long as you don't unload any modules this should be safe) 

-Andi



  reply	other threads:[~2004-09-07 12:21 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-09-01  7:22 f_ops flag to speed up compatible ioctls in linux kernel Michael S. Tsirkin
2004-09-01  7:32 ` viro
2004-09-01  7:44   ` Michael S. Tsirkin
2004-09-01  7:47   ` Lee Revell
2004-09-01  8:19     ` Michael S. Tsirkin
2004-09-01 15:55   ` Roland Dreier
2004-09-01 18:02     ` Chris Wright
2004-09-01 18:12       ` Roland Dreier
2004-09-01 18:31         ` viro
2004-09-01 20:54       ` Roland Dreier
     [not found]         ` <20040901170800.K1924@build.pdx.osdl.net>
     [not found]           ` <20040901190122.L1924@build.pdx.osdl.net>
2004-09-02  3:46             ` Roland Dreier
2004-09-01 18:06   ` Bill Davidsen
2004-09-01  8:30 ` Arjan van de Ven
2004-09-01 15:40 ` [PATCH] fs/compat.c: rwsem instead of BKL around ioctl32_hash_table Roland Dreier
2004-09-01 23:27   ` Andrew Morton
2004-09-02 21:14   ` Andi Kleen
2004-09-02 22:26     ` Roland Dreier
2004-09-03 14:37       ` [discuss] " Andi Kleen
2004-09-03 14:55         ` Roland Dreier
2004-09-03 15:02           ` Andi Kleen
2004-09-03  8:00 ` [discuss] f_ops flag to speed up compatible ioctls in linux kernel Andi Kleen
2004-09-07 10:40   ` Michael S. Tsirkin
2004-09-07 12:14     ` Andi Kleen [this message]
2004-09-07 13:45       ` Michael S. Tsirkin
2004-09-07 14:15         ` Andi Kleen
2004-09-07 14:25           ` Michael S. Tsirkin
2004-09-07 14:29             ` Andi Kleen
2004-09-07 14:37               ` Michael S. Tsirkin
2004-09-07 14:44                 ` Andi Kleen
2004-09-07 14:45                   ` Michael S. Tsirkin
2004-09-07 15:10                     ` Andi Kleen
2004-09-07 18:16                       ` Michael S. Tsirkin
2004-09-08  6:55                         ` Andi Kleen
2004-09-08 14:28                           ` [patch] " Michael S. Tsirkin
2004-09-08 14:38                             ` Andi Kleen
2004-09-08 14:54                               ` Michael S. Tsirkin
2004-09-08 14:58                                 ` Andi Kleen
2004-09-12 20:05                                   ` Michael S. Tsirkin
2004-09-15 13:19                               ` [patch] Re: [discuss] " Michael S. Tsirkin
2004-09-07 21:36                       ` Is FIOQSIZE compatible? ( was Re: f_ops flag to speed up compatible ioctls in linux kernel) Michael S. Tsirkin
2004-09-08  6:54                         ` Andi Kleen
2004-09-07 15:03               ` [discuss] f_ops flag to speed up compatible ioctls in linux kernel Herbert Poetzl
2004-09-07 18:07                 ` Michael S. Tsirkin
2004-09-09 13:54                   ` Herbert Poetzl
2004-12-12 21:51       ` how to detect a 32 bit process on 64 bit kernel Michael S. Tsirkin
2004-12-12 22:01         ` Jan Engelhardt
2004-12-12 22:23         ` Christoph Hellwig
2004-12-13 19:50           ` Michael S. Tsirkin
2004-12-13 21:01             ` Jan Engelhardt
2004-12-13 21:32             ` Brian Gerst
2004-12-13 21:37               ` Michael S. Tsirkin
2004-12-12 22:37         ` Willy Tarreau
2004-12-12 23:30           ` Bongani Hlope
2004-12-14  7:28         ` Andi Kleen
2004-09-20 14:49   ` [patch] speed up ioctls in linux kernel Michael S. Tsirkin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20040907121418.GC25051@wotan.suse.de \
    --to=ak@suse.de \
    --cc=discuss@x86-64.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mst@mellanox.co.il \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).