* Inline hunt results for 4.3.0-rc1
@ 2015-10-27 14:32 Denys Vlasenko
  2015-10-27 14:55 ` Hannes Frederic Sowa
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Denys Vlasenko @ 2015-10-27 14:32 UTC (permalink / raw)
  To: Linux Kernel Mailing List, Oleg Nesterov

Hi,

I have created a set of semi-automated scripts which look for
large inlines in the kernel.

Recently I taught it to even generate "git format-patch" patches
(unfortunately, only for inlines in *.c files, not *.h),
and here they are for 4.3.0-rc1 - i.e. current Linus tree.

Submitting 300+ patches separately would amount to spamming.
Instead, I encourage people to take a look at the patches
on the Web:

    http://busybox.net/~vda/inline_hunt/4.3.0-rc1/
    http://busybox.net/~vda/inline_hunt/4.3.0-rc1/README

and in particular, at the set of most juicy patches, each of which
shaves 1000 or more bytes off its *.c module:

    http://busybox.net/~vda/inline_hunt/4.3.0-rc1/patch_saves1000/


The kernel was built for the x86_64 architecture with the .config file
stored in this directory.

Then deinlining of every inline was attempted: the kernel was rebuilt and
the size of the resulting vmlinux was compared with the original.
(The procedure is somewhat non-trivial, since inlines in *.h files,
when automatically deinlined, are replicated several times.)

I don't see an easy method of automatically generating patches
for inlines in *.h files: they have to be moved to a *.c file.

However, preparation of patches for inlines in *.c files can be automated.
Here are the results. Patches are formatted as follows:

    Subject: [PATCH] path/to/file.c: Deinline funcname, save NNNN bytes

    This function compiles to MMMM bytes of machine code.
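
For anyone who wants to post-process the patch lists, the subject format
above can be parsed with a short script. This is only a sketch in Python,
not part of the original scripts; the names SUBJECT_RE and parse_subject
are made up for illustration.

```python
import re

# Matches "path/to/file.c: Deinline funcname, save NNNN bytes", with an
# optional "Subject: [PATCH] " prefix. Group names are illustrative only.
SUBJECT_RE = re.compile(
    r"^(?:Subject: \[PATCH\] )?"
    r"(?P<path>\S+\.c): Deinline (?P<func>\w+), save (?P<saved>\d+) bytes$"
)

def parse_subject(line):
    """Return (path, function, bytes_saved), or None if the line does not match."""
    # Collapse runs of whitespace so that entries wrapped by a mail client
    # ("..., \nsave 1008 bytes") are handled like unwrapped ones.
    m = SUBJECT_RE.match(" ".join(line.split()))
    if m is None:
        return None
    return m.group("path"), m.group("func"), int(m.group("saved"))

print(parse_subject("drivers/block/cciss.c: Deinline finish_cmd, save 1296 bytes"))
```

Summing the third field over a list of such lines gives the total claimed
savings for that list.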


The patches are sorted into directories as follows:

patch_saves1000/* are patches which save 1000 or more bytes of code each.
None of them deinlines a particularly tiny function: the smallest function
compiles to 101 bytes of machine code, and only 18 compile to less than
200 bytes.
These patches are prime candidates for review and inclusion into the kernel.

patch_saves300_funcsize100/* are patches which save from 300 to 1000
bytes of code each, and whose deinlined function generates at least 100
bytes of code. These patches also look likely to be worthy of inclusion.

patch_funcsize200/* are patches whose deinlined function generates
at least 200 bytes of code, but the savings are less than 300 bytes
overall (otherwise, the patch would be in one of the previous directories).
This means that the function either has two callsites:

    drivers/isdn/capi/capilib.c: Deinline mq_init, save 190 bytes
    This function compiles to 200 bytes of machine code.

or has just one callsite, but it is a largish function where inlining
actually hurts:

    drivers/gpu/drm/i915/i915_gem_context.c: Deinline mi_set_context, save 112 bytes
    This function compiles to 772 bytes of machine code.
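
A back-of-the-envelope model of these two cases, as a Python sketch. It is
not how the numbers above were obtained (those come from comparing vmlinux
sizes). The function name and the assumed 5-byte cost per callsite (one
x86_64 "call rel32" instruction, ignoring argument setup and alignment)
are assumptions made for illustration.

```python
CALL_BYTES = 5  # assumption: one x86_64 "call rel32" instruction per callsite

def estimated_saving(func_bytes, callsites, call_bytes=CALL_BYTES):
    """Estimate bytes saved by replacing inlined copies with one shared copy."""
    inlined = callsites * func_bytes                 # one copy of the body per callsite
    deinlined = func_bytes + callsites * call_bytes  # one body plus one call per site
    return inlined - deinlined

# mq_init: 200 bytes of machine code, assumed to have exactly two callsites.
print(estimated_saving(200, 2))   # 190

# A single callsite: the model predicts a small loss, not a gain.
print(estimated_saving(772, 1))   # -5
```

With two callsites the model happens to reproduce the 190 bytes reported
for mq_init. For one callsite it predicts a 5-byte loss, so the 112 bytes
saved for mi_set_context must come from effects this model ignores, which
is the sense in which inlining "actually hurts" there.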

patch_saves300/* are patches which save from 300 to 1000
bytes of code each, but whose deinlined function generates less than
100 bytes of code.

patch_remaining/* are the remaining patches.
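
The sorting rules above can be summarized as a small decision function.
This is a Python sketch, not the actual script. The function name is
hypothetical, and the handling of the exact boundaries (a saving of
exactly 300 or 1000 bytes) is my reading of the descriptions above.

```python
def classify(saved, func_bytes):
    """Map (bytes saved, size of deinlined function) to a patch directory."""
    if saved >= 1000:
        return "patch_saves1000"
    if saved >= 300:
        # Savings of 300..999 bytes: split on the size of the function.
        if func_bytes >= 100:
            return "patch_saves300_funcsize100"
        return "patch_saves300"
    # Savings below 300 bytes: only largish functions get their own directory.
    if func_bytes >= 200:
        return "patch_funcsize200"
    return "patch_remaining"

print(classify(190, 200))   # mq_init
print(classify(112, 772))   # mi_set_context
```

Both examples quoted above land in patch_funcsize200 under these rules.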

To retrieve all patches, you can use this command:

    wget -r -np -nH --cut-dirs=2 http://busybox.net/~vda/inline_hunt/4.3.0-rc1/

Patch counts per directory:

patch_saves1000:            117 patches
patch_saves300_funcsize100: 239 patches
patch_funcsize200:          108 patches
patch_saves300:             115 patches
patch_remaining:            656 patches
Total:                     1235 patches

List of patches in patch_saves1000/*:

drivers/block/cciss.c: Deinline complete_command, save 1072 bytes
drivers/block/cciss.c: Deinline finish_cmd, save 1296 bytes
drivers/block/mtip32xx/mtip32xx.c: Deinline mtip_workq_sdbfx, save 1648 bytes
drivers/block/skd_main.c: Deinline skd_reg_read32, save 1584 bytes
drivers/block/skd_main.c: Deinline skd_reg_write32, save 1584 bytes
drivers/cpufreq/intel_pstate.c: Deinline intel_pstate_sample, save 1136 bytes
drivers/gpu/drm/i915/intel_uncore.c: Deinline __force_wake_get, save 5392 bytes
drivers/gpu/drm/nouveau/nvkm/subdev/bios/init.c: Deinline init_nvreg, save 1008 bytes
drivers/gpu/drm/radeon/cik.c: Deinline cik_irq_ack, save 1840 bytes
drivers/gpu/drm/radeon/si.c: Deinline si_irq_ack, save 1904 bytes
drivers/i2c/busses/i2c-pasemi.c: Deinline reg_read, save 1041 bytes
drivers/i2c/busses/i2c-pasemi.c: Deinline reg_write, save 1493 bytes
drivers/infiniband/hw/nes/nes_verbs.c: Deinline nes_free_qp_mem, save 1056 bytes
drivers/infiniband/hw/qib/qib_iba6120.c: Deinline qib_write_kreg, save 1136 bytes
drivers/infiniband/hw/qib/qib_iba7322.c: Deinline qib_write_kreg_port, save 2992 bytes
drivers/input/joystick/adi.c: Deinline adi_get_bits, save 1072 bytes
drivers/isdn/gigaset/bas-gigaset.c: Deinline dump_urb, save 1536 bytes
drivers/isdn/gigaset/capi.c: Deinline dump_cmsg, save 2736 bytes
drivers/isdn/gigaset/capi.c: Deinline dump_rawmsg, save 1648 bytes
drivers/isdn/hardware/mISDN/hfcmulti.c: Deinline hfcmulti_resync, save 5696 bytes
drivers/isdn/hardware/mISDN/mISDNipac.c: Deinline ph_command, save 1021 bytes
drivers/isdn/hardware/mISDN/mISDNisar.c: Deinline deliver_status, save 1200 bytes
drivers/isdn/hardware/mISDN/mISDNisar.c: Deinline isar_rcv_frame, save 1613 bytes
drivers/isdn/hisax/avm_pci.c: Deinline hdlc_fill_fifo, save 1711 bytes
drivers/isdn/hisax/hscx_irq.c: Deinline waitforCEC, save 4848 bytes
drivers/isdn/hisax/hscx_irq.c: Deinline WriteHSCXCMDR, save 7598 bytes
drivers/isdn/hisax/isar.c: Deinline rcv_mbox, save 2602 bytes
drivers/isdn/hisax/isdnl2.c: Deinline enquiry_cr, save 2339 bytes
drivers/isdn/hisax/s0box.c: Deinline read_fifo, save 1539 bytes
drivers/isdn/mISDN/dsp_blowfish.c: Deinline encrypt_block, save 1828 bytes
drivers/md/raid5.c: Deinline sync_request, save 3644 bytes
drivers/media/i2c/tvp5150.c: Deinline tvp5150_write, save 2624 bytes
drivers/misc/mei/hw-me.c: Deinline mei_hcsr_read, save 3584 bytes
drivers/misc/mei/hw-me.c: Deinline mei_hcsr_set, save 1648 bytes
drivers/misc/mei/hw-me.c: Deinline mei_hcsr_write, save 2480 bytes
drivers/misc/mei/hw-me.c: Deinline mei_me_d0i3c_read, save 2544 bytes
drivers/mmc/core/core.c: Deinline mmc_set_ios, save 1328 bytes
drivers/mtd/chips/cfi_cmdset_0020.c: Deinline do_erase_oneblock, save 2331 bytes
drivers/mtd/chips/cfi_cmdset_0020.c: Deinline do_write_buffer, save 5428 bytes
drivers/mtd/devices/docg3.c: Deinline doc_writeb, save 14241 bytes
drivers/mtd/devices/docg3.c: Deinline doc_writew, save 2459 bytes
drivers/net/ethernet/atheros/atl1c/atl1c_main.c: Deinline atl1c_clean_buffer, save 1136 bytes
drivers/net/ethernet/freescale/fs_enet/mii-bitbang.c: Deinline mdio_read, save 3180 bytes
drivers/net/ethernet/jme.c: Deinline jme_reset_mac_processor, save 2880 bytes
drivers/net/ethernet/myricom/myri10ge/myri10ge.c: Deinline myri10ge_clean_rx_done, save 1840 bytes
drivers/net/ethernet/natsemi/ns83820.c: Deinline rx_refill, save 1032 bytes
drivers/net/ethernet/smsc/smsc911x.c: Deinline smsc911x_reg_write, save 1584 bytes
drivers/net/ethernet/ti/tlan.c: Deinline tlan_set_timer, save 1200 bytes
drivers/net/ppp/ppp_generic.c: Deinline ppp_pernet, save 1560 bytes
drivers/net/ppp/pppoe.c: Deinline pppoe_pernet, save 3200 bytes
drivers/net/wireless/ath/ath6kl/wmi.c: Deinline ath6kl_wmi_get_new_buf, save 2016 bytes
drivers/net/wireless/ipw2x00/ipw2100.c: Deinline write_nic_byte, save 2192 bytes
drivers/net/wireless/rt2x00/rt2500usb.c: Deinline rt2500usb_register_write, save 1136 bytes
drivers/pcmcia/yenta_socket.c: Deinline config_readb, save 1264 bytes
drivers/pcmcia/yenta_socket.c: Deinline config_writeb, save 1136 bytes
drivers/pcmcia/yenta_socket.c: Deinline exca_readb, save 1072 bytes
drivers/pcmcia/yenta_socket.c: Deinline exca_writeb, save 1328 bytes
drivers/scsi/fnic/fnic_scsi.c: Deinline fnic_queue_abort_io_req, save 1824 bytes
drivers/scsi/hpsa.c: Deinline hpsa_show_dev_msg, save 2096 bytes
drivers/scsi/ppa.c: Deinline ppa_connect, save 1472 bytes
drivers/scsi/ppa.c: Deinline ppa_c_pulse, save 1280 bytes
drivers/scsi/qla2xxx/qla_target.c: Deinline qlt_build_ctio_crc2_pkt, save 1644 bytes
drivers/spmi/spmi.c: Deinline spmi_read_cmd, save 1200 bytes
drivers/spmi/spmi.c: Deinline spmi_write_cmd, save 1776 bytes
drivers/tty/isicom.c: Deinline WaitTillCardIsFree, save 1120 bytes
drivers/usb/gadget/udc/m66592-udc.c: Deinline control_reg_set_pid, save 1008 bytes
drivers/usb/gadget/udc/r8a66597-udc.c: Deinline pipe_change, save 2176 bytes
drivers/usb/host/oxu210hp-hcd.c: Deinline itd_submit, save 2596 bytes
drivers/usb/host/sl811-hcd.c: Deinline start_transfer, save 1062 bytes
drivers/video/fbdev/pm3fb.c: Deinline PM3_WRITE_DAC_REG, save 1264 bytes
drivers/video/fbdev/ssd1307fb.c: Deinline ssd1307fb_write_cmd, save 1552 bytes
drivers/virtio/virtio_ring.c: Deinline virtqueue_add, save 1016 bytes
fs/affs/file.c: Deinline affs_bread_ino, save 1008 bytes
fs/direct-io.c: Deinline dio_bio_add_page, save 1228 bytes
fs/direct-io.c: Deinline dio_bio_submit, save 1392 bytes
fs/direct-io.c: Deinline dio_new_bio, save 2316 bytes
fs/direct-io.c: Deinline dio_send_cur_page, save 6104 bytes
fs/direct-io.c: Deinline dio_zero_block, save 2484 bytes
fs/direct-io.c: Deinline submit_page_section, save 4552 bytes
fs/dlm/lock.c: Deinline unhold_lkb, save 1040 bytes
fs/gfs2/glock.c: Deinline do_error, save 1392 bytes
fs/minix/itree_common.c: Deinline get_block, save 4378 bytes
fs/minix/itree_common.c: Deinline get_branch, save 1621 bytes
fs/nfsd/nfs4state.c: Deinline renew_client_locked, save 1648 bytes
fs/ntfs/runlist.c: Deinline ntfs_rl_realloc, save 1512 bytes
fs/reiserfs/stree.c: Deinline comp_keys, save 1949 bytes
fs/reiserfs/stree.c: Deinline key_in_buffer, save 1136 bytes
ipc/compat.c: Deinline __put_compat_ipc_perm, save 1120 bytes
kernel/locking/lockdep.c: Deinline graph_unlock, save 1104 bytes
kernel/locking/lockdep.c: Deinline hlock_class, save 3696 bytes
kernel/locking/lockdep.c: Deinline look_up_lock_class, save 1184 bytes
kernel/locking/lockdep.c: Deinline register_lock_class, save 2600 bytes
kernel/locking/mutex.c: Deinline __mutex_lock_common, save 7228 bytes
kernel/sched/fair.c: Deinline decay_load, save 1520 bytes
kernel/sched/fair.c: Deinline update_cfs_rq_load_avg, save 5296 bytes
kernel/sched/fair.c: Deinline update_load_avg, save 7152 bytes
kernel/sched/fair.c: Deinline __update_load_avg, save 7872 bytes
kernel/time/timekeeping.c: Deinline timekeeping_get_delta, save 2928 bytes
kernel/time/timekeeping.c: Deinline timekeeping_get_ns, save 3184 bytes
kernel/time/timer.c: Deinline debug_activate, save 1072 bytes
kernel/time/timer.c: Deinline __mod_timer, save 1300 bytes
lib/xz/xz_dec_lzma2.c: Deinline rc_bit, save 1392 bytes
mm/page_alloc.c: Deinline __free_one_page, save 1000 bytes
mm/page-writeback.c: Deinline wb_dirty_limits, save 1744 bytes
mm/slub.c: Deinline __cmpxchg_double_slab, save 1456 bytes
mm/slub.c: Deinline slab_alloc_node, save 3223 bytes
mm/slub.c: Deinline slab_pre_alloc_hook, save 1328 bytes
net/l2tp/l2tp_core.c: Deinline l2tp_pernet, save 2128 bytes
net/netfilter/ipset/ip_set_core.c: Deinline ip_set_rcu_get, save 1520 bytes
net/netfilter/ipvs/ip_vs_core.c: Deinline ip_vs_in_stats, save 2196 bytes
net/netfilter/ipvs/ip_vs_core.c: Deinline ip_vs_out_stats, save 1416 bytes
net/netfilter/ipvs/ip_vs_xmit.c: Deinline ip_vs_nat_send_or_cont, save 1428 bytes
net/netfilter/ipvs/ip_vs_xmit.c: Deinline ip_vs_send_or_cont, save 1556 bytes
net/netfilter/nft_hash.c: Deinline nft_hash_obj, save 1152 bytes
net/openvswitch/datapath.c: Deinline get_dp, save 1456 bytes
security/selinux/hooks.c: Deinline task_sid, save 4656 bytes


* Re: Inline hunt results for 4.3.0-rc1
From: Hannes Frederic Sowa @ 2015-10-27 14:55 UTC (permalink / raw)
  To: Denys Vlasenko, Linux Kernel Mailing List, Oleg Nesterov

Hello,

On Tue, Oct 27, 2015, at 15:32, Denys Vlasenko wrote:
> I have created a set of semi-automated scripts which look for
> large inlines in the kernel.
> 
> Recently I taught it to even generate "git format-patch" patches
> (unfortunately, only for inlines in *.c files, not *.h),
> and here they are for 4.3.0-rc1 - i.e. current Linus tree.
> 
> Submitting 300+ patches separately would amount to spamming,
> instead I encourage people to take a look at the patches
> on the Web:
> 
>     http://busybox.net/~vda/inline_hunt/4.3.0-rc1/
>     http://busybox.net/~vda/inline_hunt/4.3.0-rc1/README
> 
> and in particular, at the set of most juicy patches, each of which
> shaves off more than 1000 bytes off its *.c module:
> 
>     http://busybox.net/~vda/inline_hunt/4.3.0-rc1/patch_saves1000/

Does gcc -finline-limit=2000 somehow have the same effect?

Thanks,
Hannes


* Re: Inline hunt results for 4.3.0-rc1
From: Denys Vlasenko @ 2015-10-27 15:17 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: Linux Kernel Mailing List, Oleg Nesterov

On Tue, Oct 27, 2015 at 3:55 PM, Hannes Frederic Sowa
<hannes@stressinduktion.org> wrote:
> On Tue, Oct 27, 2015, at 15:32, Denys Vlasenko wrote:
>> I have created a set of semi-automated scripts which look for
>> large inlines in the kernel.
>>
>> Recently I taught it to even generate "git format-patch" patches
>> (unfortunately, only for inlines in *.c files, not *.h),
>> and here they are for 4.3.0-rc1 - i.e. current Linus tree.
>>
>> Submitting 300+ patches separately would amount to spamming,
>> instead I encourage people to take a look at the patches
>> on the Web:
>>
>>     http://busybox.net/~vda/inline_hunt/4.3.0-rc1/
>>     http://busybox.net/~vda/inline_hunt/4.3.0-rc1/README
>>
>> and in particular, at the set of most juicy patches, each of which
>> shaves off more than 1000 bytes off its *.c module:
>>
>>     http://busybox.net/~vda/inline_hunt/4.3.0-rc1/patch_saves1000/
>
> Does gcc -finline-limit=2000 somehow have the same effect?

I'm afraid that's not a solution.

Any compiler-option-based fix would only work for inlines in *.c
files, but at the same time it would replicate inlines in *.h files
many times (once per module which calls the "auto-deinlined" inline).
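
To make the replication argument concrete, here is a rough size model as
a Python sketch. Everything in it is an assumption for illustration: the
function name, the 5-byte cost per call, and the example numbers are not
measurements from the kernel.

```python
def header_inline_sizes(func_bytes, callsites_per_file, call_bytes=5):
    """Compare code size for a header inline under three strategies.

    callsites_per_file lists, for each *.c file that includes the header,
    how many times that file calls the function.
    """
    total_calls = sum(callsites_per_file)
    users = sum(1 for n in callsites_per_file if n > 0)
    return {
        # Body expanded at every callsite.
        "inlined": total_calls * func_bytes,
        # Compiler declines to inline: one out-of-line copy per using file.
        "compiler_deinlined": users * func_bytes + total_calls * call_bytes,
        # Function moved to a single *.c file: exactly one copy.
        "moved_to_c_file": func_bytes + total_calls * call_bytes,
    }

# Hypothetical 300-byte header inline, called once from each of 10 files.
print(header_inline_sizes(300, [1] * 10))
```

In this hypothetical case the compiler-option route ends up no smaller
than plain inlining, because each using file still carries its own copy.
Only moving the function into one *.c file removes the duplicates.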


* Re: Inline hunt results for 4.3.0-rc1
From: Hannes Frederic Sowa @ 2015-10-27 15:25 UTC (permalink / raw)
  To: Denys Vlasenko; +Cc: Linux Kernel Mailing List, Oleg Nesterov

On Tue, Oct 27, 2015, at 16:17, Denys Vlasenko wrote:
> On Tue, Oct 27, 2015 at 3:55 PM, Hannes Frederic Sowa
> <hannes@stressinduktion.org> wrote:
> > On Tue, Oct 27, 2015, at 15:32, Denys Vlasenko wrote:
> >> I have created a set of semi-automated scripts which look for
> >> large inlines in the kernel.
> >>
> >> Recently I taught it to even generate "git format-patch" patches
> >> (unfortunately, only for inlines in *.c files, not *.h),
> >> and here they are for 4.3.0-rc1 - i.e. current Linus tree.
> >>
> >> Submitting 300+ patches separately would amount to spamming,
> >> instead I encourage people to take a look at the patches
> >> on the Web:
> >>
> >>     http://busybox.net/~vda/inline_hunt/4.3.0-rc1/
> >>     http://busybox.net/~vda/inline_hunt/4.3.0-rc1/README
> >>
> >> and in particular, at the set of most juicy patches, each of which
> >> shaves off more than 1000 bytes off its *.c module:
> >>
> >>     http://busybox.net/~vda/inline_hunt/4.3.0-rc1/patch_saves1000/
> >
> > Does gcc -finline-limit=2000 somehow have the same effect?
> 
> I'm afraid that's not a solution.

Ok, thank you.

> Any compiler-option-based fix would only work for inlines in *.c
> files, but at the same time it would replicate inlines in *.h files
> many times (once per module which calls the "auto-deinlined" inline).

Do your patches also detect functions which are defined as static inline
in header files and whose address we take often? Maybe dst_output(_sk)
could be an example?

Bye,
Hannes


* Re: Inline hunt results for 4.3.0-rc1
From: Peter Hurley @ 2015-10-27 15:31 UTC (permalink / raw)
  To: Denys Vlasenko; +Cc: Linux Kernel Mailing List, Oleg Nesterov, Greg KH

[ +cc Greg KH]

On 10/27/2015 10:32 AM, Denys Vlasenko wrote:
> Hi,
> 
> I have created a set of semi-automated scripts which look for
> large inlines in the kernel.
> 
> Recently I taught it to even generate "git format-patch" patches
> (unfortunately, only for inlines in *.c files, not *.h),
> and here they are for 4.3.0-rc1 - i.e. current Linus tree.
> 
> Submitting 300+ patches separately would amount to spamming,
> instead I encourage people to take a look at the patches
> on the Web:
> 
>     http://busybox.net/~vda/inline_hunt/4.3.0-rc1/

Looking over the drivers/tty/* patches, all of the patches
that save 300+ and 1000+ look ok to me, so if you want to
send those as a series (to the tty maintainers), I'll be
happy to mark them reviewed.

Wrt the "saves 100+" patches, some are ok, some are not,
and some I'd like to eliminate in a different way.
If you want to send those as a separate series, I could
mark as reviewed the ok patches, indicate which ones
to drop and which patches I'd be willing to rework.

Regards,
Peter Hurley




* Re: Inline hunt results for 4.3.0-rc1
From: Denys Vlasenko @ 2015-10-29  9:33 UTC (permalink / raw)
  To: Linux Kernel Mailing List, Oleg Nesterov

On Tue, Oct 27, 2015 at 3:32 PM, Denys Vlasenko
<vda.linux@googlemail.com> wrote:
> patch_remaining/* are the remaining patches.
>
> To retrieve all patches, you can use this command:
>
>     wget -r -np -nH --cut-dirs=2 http://busybox.net/~vda/inline_hunt/4.3.0-rc1/
>
> Patch counts per directory:
>
> patch_saves1000:            117 patches
> patch_saves300_funcsize100: 239 patches
> patch_funcsize200:          108 patches
> patch_saves300:             115 patches
> patch_remaining:            656 patches
> Total:                     1235 patches

Update after a deeper search:

Patch counts per directory:

patch_saves1000:            125 patches
patch_saves300_funcsize100: 249 patches
patch_funcsize200:          117 patches
patch_saves300:             119 patches
patch_remaining:            685 patches
Total:                     1295 patches

New patches in patch_saves1000/*:

arch/x86/kvm/vmx.c: Deinline vmcs_readl, save 9728 bytes
crypto/cast6_generic.c: Deinline QBAR, save 2864 bytes
crypto/cast6_generic.c: Deinline Q, save 2800 bytes
drivers/android/binder.c: Deinline binder_lock, save 4016 bytes
drivers/android/binder.c: Deinline binder_unlock, save 1968 bytes
drivers/atm/eni.c: Deinline put_dma, save 1200 bytes
drivers/atm/he.c: Deinline he_writel_internal, save 3200 bytes
drivers/atm/lanai.c: Deinline reg_write, save 1239 bytes
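
As a quick cross-check of the two sets of counts, here is a small Python
sketch. The numbers are copied from this thread; the variable names are
only illustrative.

```python
# Patch counts per directory, from the original mail and from this update.
before = {
    "patch_saves1000": 117,
    "patch_saves300_funcsize100": 239,
    "patch_funcsize200": 108,
    "patch_saves300": 115,
    "patch_remaining": 656,
}
after = {
    "patch_saves1000": 125,
    "patch_saves300_funcsize100": 249,
    "patch_funcsize200": 117,
    "patch_saves300": 119,
    "patch_remaining": 685,
}

# Number of patches the deeper search added to each directory.
delta = {name: after[name] - before[name] for name in before}

for name, added in delta.items():
    print(f"{name}: +{added}")
print("totals:", sum(before.values()), "->", sum(after.values()))
```

Both totals agree with the figures given (1235 and 1295), and the
difference of 8 for patch_saves1000 matches the eight new patches listed
above.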
