linux-kernel.vger.kernel.org archive mirror
* mmotm 2010-11-23-16-12 uploaded
@ 2010-11-24  0:13 akpm
  2010-11-24  4:52 ` mmotm 2010-11-23 - lockdep whinge in e1000e driver Valdis.Kletnieks
                   ` (5 more replies)
  0 siblings, 6 replies; 27+ messages in thread
From: akpm @ 2010-11-24  0:13 UTC
  To: mm-commits, linux-kernel, linux-mm, linux-fsdevel

The mm-of-the-moment snapshot 2010-11-23-16-12 has been uploaded to

   http://userweb.kernel.org/~akpm/mmotm/

and will soon be available at

   git://zen-kernel.org/kernel/mmotm.git

It contains the following patches against 2.6.37-rc3:

leds-fix-bug-with-reading-nas-ss4200-dmi-code.patch
include-linux-fsh-fix-userspace-build.patch
nommu-yield-cpu-while-disposing-vm.patch
uml-disable-winch-irq-before-freeing-handler-data.patch
arch-x86-kernel-entry_64s-fix-build-with-gas-2161.patch
memcg-fix-false-positive-vm_bug-on-non-smp.patch
memcg-fix-false-positive-vm_bug-on-non-smp-fix.patch
linux-next.patch
next-remove-localversion.patch
i-need-old-gcc.patch
aesni-nfg.patch
arch-alpha-kernel-systblss-remove-debug-check.patch
sgi-xpc-xpc-fails-to-discover-partitions-with-all-nasids-above-128.patch
fuse-fix-attributes-after-openo_trunc.patch
drivers-leds-leds-lp5521c-change-some-macros-to-functions.patch
drivers-leds-leds-lp5523c-change-some-macros-to-functions.patch
drivers-leds-leds-lp5521c-adjust-delays-and-add-comments-to-them.patch
drivers-leds-leds-lp5523c-adjust-delays-and-add-comments-to-them.patch
drivers-leds-leds-lp5521c-perform-sw-reset-before-detection.patch
drivers-leds-leds-lp5523c-perform-sw-reset-before-detection.patch
memcg-avoid-deadlock-between-move-charge-and-try_charge.patch
cgroups-make-swap-accounting-default-behavior-configurable.patch
cgroups-make-swap-accounting-default-behavior-configurable-update.patch
mm-page_allocc-fix-build_all_zonelist-where-percpu_alloc-is-wrongly-called-under-stop_machine_run.patch
mm-page_allocc-fix-build_all_zonelist-where-percpu_alloc-is-wrongly-called-under-stop_machine_run-cleanup.patch
mm-remove-call-to-find_vma-in-pagewalk-for-non-hugetlbfs.patch
pagemap-set-pagemap-walk-limit-to-pmd-boundary.patch
drivers-misc-isl29020c-remove-incorrect-kfree-in-isl29020_remove.patch
backlight-grab-ops_lock-before-testing-bd-ops.patch
reiserfs-fix-inode-mutex-reiserfs-lock-misordering.patch
scripts-fix-gfp-translate-for-recent-changes-to-gfph.patch
scripts-fix-gfp-translate-for-recent-changes-to-gfph-fix.patch
mm-vmap-area-cache.patch
arch-arm-plat-omap-iovmmc-fix-end-address-of-vm-area-comparation-in-alloc_iovm_area.patch
backlight-fix-88pm860x_bl-macro-collision.patch
cciss-fix-botched-tag-masking-for-scsi-tape-commands.patch
arch-x86-kernel-entry_32s-i386-too.patch
arch-x86-include-asm-fixmaph-mark-__set_fixmap_offset-as-__always_inline.patch
ibm_rtl-fix-printk-format-warning.patch
acerhdf-add-support-for-aspire-1410-bios-v13314.patch
arch-x86-kernel-apic-io_apicc-fix-warning.patch
x86-olpc-add-xo-1-suspend-resume-support.patch
fs-btrfs-inodec-eliminate-memory-leak.patch
btrfs-dont-dereference-extent_mapping-if-null.patch
cifs-dont-overwrite-dentry-name-in-d_revalidate.patch
cpufreq-fix-ondemand-governor-powersave_bias-execution-time-misuse.patch
drivers-dma-use-the-ccflag-y-instead-of-extra_cflags.patch
drivers-dma-ioat-use-the-ccflag-y-instead-of-extra_cflags.patch
jfs-dont-overwrite-dentry-name-in-d_revalidate.patch
powerpc-enable-arch_dma_addr_t_64bit-with-arch_phys_addr_t_64bit.patch
debugfs-remove-module_exit.patch
drivers-gpu-drm-radeon-atomc-fix-warning.patch
irq-use-per_cpu-kstat_irqs.patch
irq-use-per_cpu-kstat_irqs-checkpatch-fixes.patch
drivers-leds-leds-lp5521c-fix-potential-buffer-overflow.patch
leds-route-kbd-leds-through-the-generic-leds-layer.patch
mips-enable-arch_dma_addr_t_64bit-with-highmem-64bit_phys_addr-64bit.patch
isdn-capi-unregister-capictr-notifier-after-init-failure.patch
isdn-capi-make-kcapi-use-a-separate-workqueue.patch
drivers-video-backlight-l4f00242t03c-make-1-bit-signed-field-unsigned.patch
drivers-video-backlight-l4f00242t03c-full-implement-fb-power-states-for-this-lcd.patch
btusb-patch-add_apple_macbookpro62.patch
atmel_serial-fix-rts-high-after-initialization-in-rs485-mode.patch
atmel_serial-fix-rts-high-after-initialization-in-rs485-mode-fix.patch
drivers-message-fusion-mptsasc-fix-warning.patch
hpsa-remove-incorrect-redefinition-of-pci_device_id_hp_cissf.patch
drivers-block-makefile-replace-the-use-of-module-objs-with-module-y.patch
drivers-block-aoe-makefile-replace-the-use-of-module-objs-with-module-y.patch
vfs-remove-a-warning-on-open_fmode.patch
vfs-add-__fmode_exec.patch
n_hdlc-fix-read-and-write-locking.patch
n_hdlc-fix-read-and-write-locking-update.patch
mm.patch
mm-page-allocator-adjust-the-per-cpu-counter-threshold-when-memory-is-low.patch
mm-vmstat-use-a-single-setter-function-and-callback-for-adjusting-percpu-thresholds.patch
mm-vmstat-use-a-single-setter-function-and-callback-for-adjusting-percpu-thresholds-fix.patch
mm-vmstat-use-a-single-setter-function-and-callback-for-adjusting-percpu-thresholds-update.patch
mm-vmstat-use-a-single-setter-function-and-callback-for-adjusting-percpu-thresholds-fix-set_pgdat_percpu_threshold-dont-use-for_each_online_cpu.patch
mm-mempolicyc-add-rcu-read-lock-to-protect-pid-structure.patch
writeback-integrated-background-writeback-work.patch
writeback-trace-wakeup-event-for-background-writeback.patch
writeback-stop-background-kupdate-works-from-livelocking-other-works.patch
writeback-stop-background-kupdate-works-from-livelocking-other-works-update.patch
writeback-avoid-livelocking-wb_sync_all-writeback.patch
writeback-avoid-livelocking-wb_sync_all-writeback-update.patch
writeback-check-skipped-pages-on-wb_sync_all.patch
writeback-check-skipped-pages-on-wb_sync_all-update.patch
writeback-check-skipped-pages-on-wb_sync_all-update-fix.patch
writeback-io-less-balance_dirty_pages.patch
writeback-consolidate-variable-names-in-balance_dirty_pages.patch
writeback-per-task-rate-limit-on-balance_dirty_pages.patch
writeback-per-task-rate-limit-on-balance_dirty_pages-fix.patch
writeback-prevent-duplicate-balance_dirty_pages_ratelimited-calls.patch
writeback-account-per-bdi-accumulated-written-pages.patch
writeback-bdi-write-bandwidth-estimation.patch
writeback-bdi-write-bandwidth-estimation-fix.patch
writeback-show-bdi-write-bandwidth-in-debugfs.patch
writeback-quit-throttling-when-bdi-dirty-pages-dropped-low.patch
writeback-reduce-per-bdi-dirty-threshold-ramp-up-time.patch
writeback-make-reasonable-gap-between-the-dirty-background-thresholds.patch
writeback-scale-down-max-throttle-bandwidth-on-concurrent-dirtiers.patch
writeback-add-trace-event-for-balance_dirty_pages.patch
writeback-make-nr_to_write-a-per-file-limit.patch
writeback-make-nr_to_write-a-per-file-limit-fix.patch
sync_inode_metadata-fix-comment.patch
mm-page-writebackc-fix-__set_page_dirty_no_writeback-return-value.patch
vmscan-factor-out-kswapd-sleeping-logic-from-kswapd.patch
mm-find_get_pages_contig-fixlet.patch
fs-mpagec-consolidate-code.patch
fs-mpagec-consolidate-code-checkpatch-fixes.patch
mm-convert-sprintf_symbol-to-%ps.patch
mm-smaps-export-mlock-information.patch
mm-compaction-add-trace-events-for-memory-compaction-activity.patch
mm-vmscan-convert-lumpy_mode-into-a-bitmask.patch
mm-vmscan-reclaim-order-0-and-use-compaction-instead-of-lumpy-reclaim.patch
mm-vmscan-reclaim-order-0-and-use-compaction-instead-of-lumpy-reclaim-fix.patch
mm-migration-allow-migration-to-operate-asynchronously-and-avoid-synchronous-compaction-in-the-faster-path.patch
mm-migration-allow-migration-to-operate-asynchronously-and-avoid-synchronous-compaction-in-the-faster-path-fix.patch
mm-migration-cleanup-migrate_pages-api-by-matching-types-for-offlining-and-sync.patch
mm-compaction-perform-a-faster-migration-scan-when-migrating-asynchronously.patch
mm-vmscan-rename-lumpy_mode-to-reclaim_mode.patch
mm-deactivate-invalidated-pages.patch
mm-deactivate-invalidated-pages-fix.patch
mm-remove-unused-get_vm_area_node.patch
mm-remove-gfp-mask-from-pcpu_get_vm_areas.patch
mm-unify-module_alloc-code-for-vmalloc.patch
oom-allow-a-non-cap_sys_resource-proces-to-oom_score_adj-down.patch
mm-clear-pageerror-bit-in-msync-fsync.patch
frv-duplicate-output_buffer-of-e03.patch
frv-duplicate-output_buffer-of-e03-checkpatch-fixes.patch
hpet-factor-timer-allocate-from-open.patch
kernel-power-changed-makefile-to-use-proper-ccflag-flag.patch
um-mark-config_highmem-as-broken.patch
arch-um-drivers-linec-safely-iterate-over-list-of-winch-handlers.patch
kmsg_dump-constrain-mtdoops-and-ramoops-to-perform-their-actions-only-for-kmsg_dump_panic.patch
kmsg_dump-add-kmsg_dump-calls-to-the-reboot-halt-poweroff-and-emergency_restart-paths.patch
set_rtc_mmss-show-warning-message-only-once.patch
include-linux-kernelh-abs-fix-handling-of-32-bit-unsigneds-on-64-bit.patch
include-linux-kernelh-abs-fix-handling-of-32-bit-unsigneds-on-64-bit-fix.patch
add-the-common-dma_addr_t-typedef-to-include-linux-typesh.patch
dca-remove-unneeded-null-check.patch
scripts-get_maintainerpl-make-rolestats-the-default.patch
scripts-get_maintainerpl-use-git-fallback-more-often.patch
maintainers-intel-gfx-is-a-subscribers-only-mailing-list.patch
percpucounter-optimize-__percpu_counter_add-a-bit-through-the-use-of-this_cpu-operations.patch
drivers-mmc-host-omapc-use-resource_size.patch
drivers-mmc-host-omap_hsmmcc-use-resource_size.patch
scripts-checkpatchpl-add-check-for-multiple-terminating-semicolons-and-casts-of-vmalloc.patch
checkpatchpl-fix-cast-detection.patch
fs-select-fix-information-leak-to-userspace.patch
fs-select-fix-information-leak-to-userspace-fix.patch
epoll-convert-max_user_watches-to-long.patch
binfmt_elf-cleanups.patch
drivers-rtc-rtc-omapc-fix-a-memory-leak.patch
rtc-add-real-time-clock-driver-for-nvidia-tegra.patch
drivers-gpio-cs5535-gpioc-add-some-additional-cs5535-specific-gpio-functionality.patch
drivers-staging-olpc_dcon-convert-to-new-cs5535-gpio-api.patch
cyber2000fb-avoid-palette-corruption-at-higher-clocks.patch
jbd-remove-dependency-on-__gfp_nofail.patch
memcg-add-page_cgroup-flags-for-dirty-page-tracking.patch
memcg-document-cgroup-dirty-memory-interfaces.patch
memcg-document-cgroup-dirty-memory-interfaces-fix.patch
memcg-create-extensible-page-stat-update-routines.patch
memcg-add-lock-to-synchronize-page-accounting-and-migration.patch
memcg-use-zalloc-rather-than-mallocmemset.patch
fs-proc-basec-kernel-latencytopc-convert-sprintf_symbol-to-%ps.patch
fs-proc-basec-kernel-latencytopc-convert-sprintf_symbol-to-%ps-checkpatch-fixes.patch
proc-use-unsigned-long-inside-proc-statm.patch
exec_domain-establish-a-linux32-domain-on-config_compat-systems.patch
rapidio-use-common-destid-storage-for-endpoints-and-switches.patch
rapidio-integrate-rio_switch-into-rio_dev.patch
fs-execc-provide-the-correct-process-pid-to-the-pipe-helper.patch
nfc-driver-for-nxp-semiconductors-pn544-nfc-chip.patch
nfc-driver-for-nxp-semiconductors-pn544-nfc-chip-update.patch
remove-dma64_addr_t.patch
pps-trivial-fixes.patch
pps-declare-variables-where-they-are-used-in-switch.patch
pps-fix-race-in-pps_fetch-handler.patch
pps-unify-timestamp-gathering.patch
pps-access-pps-device-by-direct-pointer.patch
pps-convert-printk-pr_-to-dev_.patch
pps-move-idr-stuff-to-ppsc.patch
pps-add-async-pps-event-handler.patch
pps-add-async-pps-event-handler-fix.patch
pps-dont-disable-interrupts-when-using-spin-locks.patch
pps-use-bug_on-for-kernel-api-safety-checks.patch
pps-simplify-conditions-a-bit.patch
ntp-add-hardpps-implementation.patch
pps-capture-monotonic_raw-timestamps-as-well.patch
pps-add-kernel-consumer-support.patch
pps-add-parallel-port-pps-client.patch
pps-add-parallel-port-pps-signal-generator.patch
memstick-a-few-changes-to-core.patch
memstick-add-support-for-legacy-memorysticks.patch
memstick-add-driver-for-ricoh-r5c592-card-reader.patch
memstick-add-driver-for-ricoh-r5c592-card-reader-fix.patch
memstick-core-fix-device_register-error-handling.patch
w1-ds2423-counter-driver-and-documentation.patch
w1-ds2423-counter-driver-and-documentation-fix.patch
romfs-have-romfs_fsh-pull-in-necessary-headers.patch
decompressors-add-missing-init-ie-__init.patch
decompressors-get-rid-of-set_error_fn-macro.patch
decompressors-include-linux-slabh-in-linux-decompress-mmh.patch
decompressors-remove-unused-function-from-lib-decompress_unlzmac.patch
make-sure-nobodys-leaking-resources.patch
journal_add_journal_head-debug.patch
releasing-resources-with-children.patch
make-frame_pointer-default=y.patch
mutex-subsystem-synchro-test-module.patch
mutex-subsystem-synchro-test-module-add-missing-header-file.patch
slab-leaks3-default-y.patch
put_bh-debug.patch
add-debugging-aid-for-memory-initialisation-problems.patch
workaround-for-a-pci-restoring-bug.patch
prio_tree-debugging-patch.patch
single_open-seq_release-leak-diagnostics.patch
add-a-refcount-check-in-dput.patch
getblk-handle-2tb-devices.patch
memblock-add-input-size-checking-to-memblock_find_region.patch
memblock-add-input-size-checking-to-memblock_find_region-fix.patch


* mmotm 2010-11-23 - lockdep whinge in e1000e driver
  2010-11-24  0:13 mmotm 2010-11-23-16-12 uploaded akpm
@ 2010-11-24  4:52 ` Valdis.Kletnieks
  2010-11-24  4:55 ` mmotm 2010-11-23 - WARNING: at drivers/tty/tty_io.c:1331 Valdis.Kletnieks
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 27+ messages in thread
From: Valdis.Kletnieks @ 2010-11-24  4:52 UTC
  To: akpm, Peter Zijlstra, Ingo Molnar, Jesse Brandeburg
  Cc: mm-commits, linux-kernel, netdev


On Tue, 23 Nov 2010 16:13:06 PST, akpm@linux-foundation.org said:
> The mm-of-the-moment snapshot 2010-11-23-16-12 has been uploaded to
> 
>    http://userweb.kernel.org/~akpm/mmotm/

Whinges during boot while bringing up the ethernet interface:

[    1.081504] ===================================================
[    1.081507] [ INFO: suspicious rcu_dereference_check() usage. ]
[    1.081509] ---------------------------------------------------
[    1.081512] include/linux/inetdevice.h:208 invoked rcu_dereference_check() without protection!
[    1.081514] 
[    1.081515] other info that might help us debug this:
[    1.081516] 
[    1.081518] 
[    1.081518] rcu_scheduler_active = 1, debug_locks = 1
[    1.081521] 3 locks held by swapper/1:
[    1.081523]  #0:  (&__lockdep_no_validate__){+.+.+.}, at: [<ffffffff812d0b57>] device_lock+0xf/0x11
[    1.081534]  #1:  (&__lockdep_no_validate__){+.+.+.}, at: [<ffffffff812d0b57>] device_lock+0xf/0x11
[    1.081541]  #2:  (rtnl_mutex){+.+.+.}, at: [<ffffffff8142dee8>] rtnl_lock+0x12/0x14
[    1.081549] 
[    1.081550] stack backtrace:
[    1.081553] Pid: 1, comm: swapper Not tainted 2.6.37-rc3-mmotm1123 #3
[    1.081555] Call Trace:
[    1.081562]  [<ffffffff81069580>] lockdep_rcu_dereference+0x9d/0xa5
[    1.081567]  [<ffffffff8147b235>] __in_dev_get_rcu.clone.12+0x3f/0x47
[    1.081571]  [<ffffffff8147b24d>] inet_get_link_af_size+0x10/0x1f
[    1.081575]  [<ffffffff8142ce16>] if_nlmsg_size+0xd5/0x111
[    1.081579]  [<ffffffff8142ecf6>] rtmsg_ifinfo+0x1f/0xeb
[    1.081584]  [<ffffffff8105d78e>] ? raw_notifier_call_chain+0xf/0x11
[    1.081589]  [<ffffffff81421ee7>] register_netdevice+0x3ea/0x410
[    1.081593]  [<ffffffff81421f47>] register_netdev+0x3a/0x4c
[    1.081599]  [<ffffffff81551cc2>] e1000_probe+0x986/0xb6f
[    1.081604]  [<ffffffff81237b2e>] local_pci_probe+0x3f/0x70
[    1.081608]  [<ffffffff81237eae>] pci_device_probe+0x65/0x96
[    1.081614]  [<ffffffff8115a82a>] ? sysfs_create_link+0xe/0x10
[    1.081617]  [<ffffffff812d0fe0>] driver_probe_device+0xe8/0x182
[    1.081621]  [<ffffffff812d10c4>] __driver_attach+0x4a/0x6b
[    1.081625]  [<ffffffff812d107a>] ? __driver_attach+0x0/0x6b
[    1.081629]  [<ffffffff812d01cf>] bus_for_each_dev+0x57/0x83
[    1.081633]  [<ffffffff812d0ca5>] driver_attach+0x19/0x1b
[    1.081637]  [<ffffffff812d08e7>] bus_add_driver+0xae/0x205
[    1.081641]  [<ffffffff812d1324>] driver_register+0xb5/0x122
[    1.081646]  [<ffffffff81b455cb>] ? e1000_init_module+0x0/0x3e
[    1.081650]  [<ffffffff812380e4>] __pci_register_driver+0x61/0xcd
[    1.081654]  [<ffffffff81b455cb>] ? e1000_init_module+0x0/0x3e
[    1.081658]  [<ffffffff81b45607>] e1000_init_module+0x3c/0x3e
[    1.081663]  [<ffffffff810002ff>] do_one_initcall+0x7a/0x12f
[    1.081668]  [<ffffffff81b1fd08>] kernel_init+0x15d/0x1e7
[    1.081672]  [<ffffffff810035d4>] kernel_thread_helper+0x4/0x10
[    1.081678]  [<ffffffff8102f845>] ? finish_task_switch+0x3f/0xe3
[    1.081682]  [<ffffffff8155b5c0>] ? restore_args+0x0/0x30
[    1.081686]  [<ffffffff81b1fbab>] ? kernel_init+0x0/0x1e7
[    1.081690]  [<ffffffff810035d0>] ? kernel_thread_helper+0x0/0x10
[    1.081731] e1000e 0000:00:19.0: eth0: (PCI Express:2.5GB/s:Width x1) 00:24:e8:c6:ad:17



* mmotm 2010-11-23 - WARNING: at drivers/tty/tty_io.c:1331
  2010-11-24  0:13 mmotm 2010-11-23-16-12 uploaded akpm
  2010-11-24  4:52 ` mmotm 2010-11-23 - lockdep whinge in e1000e driver Valdis.Kletnieks
@ 2010-11-24  4:55 ` Valdis.Kletnieks
  2010-11-25 15:14   ` Kyle McMartin
  2010-11-24  5:01 ` mmotm 2010-11-23 + autogroups -> inconsistent lock state Valdis.Kletnieks
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 27+ messages in thread
From: Valdis.Kletnieks @ 2010-11-24  4:55 UTC
  To: akpm; +Cc: mm-commits, linux-kernel


On Tue, 23 Nov 2010 16:13:06 PST, akpm@linux-foundation.org said:
> The mm-of-the-moment snapshot 2010-11-23-16-12 has been uploaded to
> 
>    http://userweb.kernel.org/~akpm/mmotm/

Seen during boot:

[   22.859616] SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
[   23.015434] ------------[ cut here ]------------
[   23.015443] WARNING: at drivers/tty/tty_io.c:1331 tty_open+0x2a2/0x49a()
[   23.015446] Hardware name: Latitude E6500                  
[   23.015448] Modules linked in:
[   23.015453] Pid: 1207, comm: plymouthd Not tainted 2.6.37-rc3-mmotm1123 #3
[   23.015455] Call Trace:
[   23.015461]  [<ffffffff8103b189>] warn_slowpath_common+0x80/0x98
[   23.015465]  [<ffffffff8103b1b6>] warn_slowpath_null+0x15/0x17
[   23.015469]  [<ffffffff8128a3ab>] tty_open+0x2a2/0x49a
[   23.015475]  [<ffffffff810fd53f>] chrdev_open+0x11d/0x146
[   23.015479]  [<ffffffff810fd422>] ? chrdev_open+0x0/0x146
[   23.015483]  [<ffffffff810f7b4c>] __dentry_open+0x31a/0x483
[   23.015488]  [<ffffffff810f88fe>] nameidata_to_filp+0x50/0x57
[   23.015492]  [<ffffffff81105e53>] do_last+0x448/0x5b2
[   23.015497]  [<ffffffff81229229>] ? __raw_spin_lock_init+0x31/0x50
[   23.015501]  [<ffffffff81106205>] do_filp_open+0x248/0x64a
[   23.015507]  [<ffffffff810a5c4d>] ? trace_preempt_on+0x15/0x28
[   23.015511]  [<ffffffff81110cda>] ? alloc_fd+0x17c/0x18e
[   23.015516]  [<ffffffff8155adfc>] ? _raw_spin_unlock+0x30/0x69
[   23.015521]  [<ffffffff8155e3b8>] ? sub_preempt_count+0x35/0x49
[   23.015525]  [<ffffffff81110cda>] ? alloc_fd+0x17c/0x18e
[   23.015529]  [<ffffffff810f8965>] do_sys_open+0x60/0xfb
[   23.015533]  [<ffffffff8155a64b>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[   23.015537]  [<ffffffff810f8a1b>] sys_open+0x1b/0x1d
[   23.015542]  [<ffffffff8100277b>] system_call_fastpath+0x16/0x1b
[   23.015545] ---[ end trace 12db3a7ab6675b51 ]---



* mmotm 2010-11-23 + autogroups -> inconsistent lock state
  2010-11-24  0:13 mmotm 2010-11-23-16-12 uploaded akpm
  2010-11-24  4:52 ` mmotm 2010-11-23 - lockdep whinge in e1000e driver Valdis.Kletnieks
  2010-11-24  4:55 ` mmotm 2010-11-23 - WARNING: at drivers/tty/tty_io.c:1331 Valdis.Kletnieks
@ 2010-11-24  5:01 ` Valdis.Kletnieks
  2010-11-24 20:25   ` Mike Galbraith
  2010-11-24 13:56 ` mmotm 2010-11-23-16-12 uploaded Zimny Lech
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 27+ messages in thread
From: Valdis.Kletnieks @ 2010-11-24  5:01 UTC
  To: akpm, Ingo Molnar, Mike Galbraith; +Cc: mm-commits, linux-kernel


On Tue, 23 Nov 2010 16:13:06 PST, akpm@linux-foundation.org said:
> The mm-of-the-moment snapshot 2010-11-23-16-12 has been uploaded to
> 
>    http://userweb.kernel.org/~akpm/mmotm/

(I appear to be on a roll tonight - 3 splats before I even had a chance to login. :)

mmotm + Ingo's cleanup of Mike's autogroups patch.

[  114.569222] =================================
[  114.578171] [ INFO: inconsistent lock state ]
[  114.578171] 2.6.37-rc3-mmotm1123 #3
[  114.578171] ---------------------------------
[  114.578171] inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
[  114.578171] kworker/0:0/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
[  114.578171]  (&(&sighand->siglock)->rlock){?.+...}, at: [<ffffffff8104bfb1>] __lock_task_sighand+0x88/0xd6
[  114.578171] {HARDIRQ-ON-W} state was registered at:
[  114.578171]   [<ffffffff8106a9a9>] __lock_acquire+0x358/0xd4e
[  114.578171]   [<ffffffff8106b8b1>] lock_acquire+0x100/0x126
[  114.578171]   [<ffffffff8155a849>] _raw_spin_lock+0x36/0x45
[  114.578171]   [<ffffffff81030bc6>] sched_autogroup_fork+0x30/0x61
[  114.578171]   [<ffffffff8103995a>] copy_process+0x994/0x1325
[  114.578171]   [<ffffffff8103a4ca>] do_fork+0x1ae/0x3e3
[  114.578171]   [<ffffffff81009603>] kernel_thread+0x6b/0x6d
[  114.578171]   [<ffffffff8105832e>] kthreadd+0xdd/0x11f
[  114.578171]   [<ffffffff810035d4>] kernel_thread_helper+0x4/0x10
[  114.578171] irq event stamp: 1137212
[  114.578171] hardirqs last  enabled at (1137209): [<ffffffff8155ae6f>] _raw_spin_unlock_irqrestore+0x3a/0x80
[  114.578171] hardirqs last disabled at (1137210): [<ffffffff8155b467>] save_args+0x67/0x70
[  114.578171] softirqs last  enabled at (1137212): [<ffffffff810414f3>] _local_bh_enable+0xe/0x10
[  114.578171] softirqs last disabled at (1137211): [<ffffffff81041edd>] irq_enter+0x3d/0x6f
[  114.578171] 
[  114.578171] other info that might help us debug this:
[  114.578171] 3 locks held by kworker/0:0/0:
[  114.578171]  #0:  (&(&new_timer->it_lock)->rlock){-.....}, at: [<ffffffff81056e7b>] posix_timer_fn+0x24/0xc7
[  114.578171]  #1:  (rcu_read_lock){.+.+..}, at: [<ffffffff81056d77>] rcu_read_lock+0x0/0x35
[  114.578171]  #2:  (rcu_read_lock){.+.+..}, at: [<ffffffff8104a6e6>] rcu_read_lock+0x0/0x35
[  114.578171] 
[  114.578171] stack backtrace:
[  114.578171] Pid: 0, comm: kworker/0:0 Tainted: G        W   2.6.37-rc3-mmotm1123 #3
[  114.578171] Call Trace:
[  114.578171]  <IRQ>  [<ffffffff8106a467>] valid_state+0x17c/0x18e
[  114.578171]  [<ffffffff81069d2c>] ? check_usage_forwards+0x0/0x87
[  114.578171]  [<ffffffff8106a558>] mark_lock+0xdf/0x1d8
[  114.578171]  [<ffffffff81069d2c>] ? check_usage_forwards+0x0/0x87
[  114.578171]  [<ffffffff8106a928>] __lock_acquire+0x2d7/0xd4e
[  114.578171]  [<ffffffff8106a4a6>] ? mark_lock+0x2d/0x1d8
[  114.578171]  [<ffffffff8104bfb1>] ? __lock_task_sighand+0x88/0xd6
[  114.578171]  [<ffffffff8106b8b1>] lock_acquire+0x100/0x126
[  114.578171]  [<ffffffff8104bfb1>] ? __lock_task_sighand+0x88/0xd6
[  114.578171]  [<ffffffff8155a942>] _raw_spin_lock_irqsave+0x44/0x57
[  114.578171]  [<ffffffff8104bfb1>] ? __lock_task_sighand+0x88/0xd6
[  114.578171]  [<ffffffff8104bfb1>] __lock_task_sighand+0x88/0xd6
[  114.578171]  [<ffffffff8104c6b3>] send_sigqueue+0x51/0x162
[  114.578171]  [<ffffffff81056e42>] posix_timer_event+0x3f/0x54
[  114.578171]  [<ffffffff81056ea1>] posix_timer_fn+0x4a/0xc7
[  114.578171]  [<ffffffff812294fd>] ? do_raw_spin_unlock+0xd0/0xfa
[  114.578171]  [<ffffffff8105bb7e>] __run_hrtimer+0x13e/0x27a
[  114.578171]  [<ffffffff81056e57>] ? posix_timer_fn+0x0/0xc7
[  114.578171]  [<ffffffff8105c5f3>] hrtimer_interrupt+0xea/0x1d6
[  114.578171]  [<ffffffff8101ad4f>] smp_apic_timer_interrupt+0x74/0x87
[  114.578171]  [<ffffffff81003193>] apic_timer_interrupt+0x13/0x20
[  114.578171]  <EOI>  [<ffffffff81000cf5>] ? cpu_idle+0x42/0x14e
[  114.578171]  [<ffffffff81000dd5>] ? cpu_idle+0x122/0x14e
[  114.578171]  [<ffffffff81b57170>] start_secondary+0x1a9/0x1ad

* Re: mmotm 2010-11-23-16-12 uploaded
  2010-11-24  0:13 mmotm 2010-11-23-16-12 uploaded akpm
                   ` (2 preceding siblings ...)
  2010-11-24  5:01 ` mmotm 2010-11-23 + autogroups -> inconsistent lock state Valdis.Kletnieks
@ 2010-11-24 13:56 ` Zimny Lech
  2010-11-24 18:51 ` mmotm 2010-11-23-16-12 uploaded (olpc) Randy Dunlap
  2010-11-24 19:41 ` [PATCH -mmotm/-next] media: fix timblogiw kconfig & build error Randy Dunlap
  5 siblings, 0 replies; 27+ messages in thread
From: Zimny Lech @ 2010-11-24 13:56 UTC
  To: akpm; +Cc: mm-commits, linux-kernel, linux-mm, linux-fsdevel

Ave

2010/11/24  <akpm@linux-foundation.org>:
> The mm-of-the-moment snapshot 2010-11-23-16-12 has been uploaded to

So far, so good - eight builds and one error (AFAICS known issue)

'make CONFIG_DEBUG_SECTION_MISMATCH=y'
  GEN     .version
  CHK     include/generated/compile.h
  UPD     include/generated/compile.h
  CC      init/version.o
  LD      init/built-in.o
  LD      .tmp_vmlinux1
drivers/built-in.o: In function `timblogiw_close':
/home/test/linux-2.6-mm/drivers/media/video/timblogiw.c:704: undefined reference to `dma_release_channel'
drivers/built-in.o: In function `buffer_release':
/home/test/linux-2.6-mm/drivers/media/video/timblogiw.c:595: undefined reference to `dma_sync_wait'
drivers/built-in.o: In function `timblogiw_open':
/home/test/linux-2.6-mm/drivers/media/video/timblogiw.c:671: undefined reference to `__dma_request_channel'
make[1]: *** [.tmp_vmlinux1] Error 1
make: *** [sub-make] Error 2






-- 
Glory!
N.P.S.

Glory to you, Satan, honor in the heights
of Heaven, where you reigned, glory in the depths
of Hell, where, vanquished, you endure in proud silence!
Grant that my soul may rest with You in the shade
of the Tree of Knowledge, when it spreads its boughs
like the vault of a church that shall not pass away!


* Re: mmotm 2010-11-23-16-12 uploaded (olpc)
  2010-11-24  0:13 mmotm 2010-11-23-16-12 uploaded akpm
                   ` (3 preceding siblings ...)
  2010-11-24 13:56 ` mmotm 2010-11-23-16-12 uploaded Zimny Lech
@ 2010-11-24 18:51 ` Randy Dunlap
  2010-11-24 19:13   ` Andres Salomon
  2010-11-26 16:46   ` Daniel Drake
  2010-11-24 19:41 ` [PATCH -mmotm/-next] media: fix timblogiw kconfig & build error Randy Dunlap
  5 siblings, 2 replies; 27+ messages in thread
From: Randy Dunlap @ 2010-11-24 18:51 UTC
  To: akpm, Daniel Drake, Andres Salomon; +Cc: linux-kernel, linux-mm, linux-fsdevel

On Tue, 23 Nov 2010 16:13:06 -0800 akpm@linux-foundation.org wrote:

> The mm-of-the-moment snapshot 2010-11-23-16-12 has been uploaded to
> 
>    http://userweb.kernel.org/~akpm/mmotm/
> 
> and will soon be available at
> 
>    git://zen-kernel.org/kernel/mmotm.git


make[4]: *** No rule to make target `arch/x86/platform/olpc/olpc-xo1-wakeup.c', needed by `arch/x86/platform/olpc/olpc-xo1-wakeup.o'.


It's olpc-xo1-wakeup.S, so I guess it needs a special makefile rule ??

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***


* Re: mmotm 2010-11-23-16-12 uploaded (olpc)
  2010-11-24 18:51 ` mmotm 2010-11-23-16-12 uploaded (olpc) Randy Dunlap
@ 2010-11-24 19:13   ` Andres Salomon
  2010-11-26 16:46   ` Daniel Drake
  1 sibling, 0 replies; 27+ messages in thread
From: Andres Salomon @ 2010-11-24 19:13 UTC
  To: Randy Dunlap; +Cc: akpm, Daniel Drake, linux-kernel, linux-mm, linux-fsdevel

On Wed, 24 Nov 2010 10:51:26 -0800
Randy Dunlap <randy.dunlap@oracle.com> wrote:

> On Tue, 23 Nov 2010 16:13:06 -0800 akpm@linux-foundation.org wrote:
> 
> > The mm-of-the-moment snapshot 2010-11-23-16-12 has been uploaded to
> > 
> >    http://userweb.kernel.org/~akpm/mmotm/
> > 
> > and will soon be available at
> > 
> >    git://zen-kernel.org/kernel/mmotm.git
> 
> 
> make[4]: *** No rule to make target
> `arch/x86/platform/olpc/olpc-xo1-wakeup.c', needed by
> `arch/x86/platform/olpc/olpc-xo1-wakeup.o'.
> 
> 
> It's olpc-xo1-wakeup.S, so I guess it needs a special makefile rule ??
> 

I had trouble with this as well (and after flailing at it a bit, ended
up just dropping the olpc pm stuff from my tree for now).  The build
failure is definitely config-specific.  I suspected that it needs
something like the following, but failed to figure it out:

foo-y := olpc-xo1-wakeup.o
obj-$(CONFIG_OLPC_XO1) += olpc-xo1.o foo.o




* [PATCH -mmotm/-next] media: fix timblogiw kconfig & build error
  2010-11-24  0:13 mmotm 2010-11-23-16-12 uploaded akpm
                   ` (4 preceding siblings ...)
  2010-11-24 18:51 ` mmotm 2010-11-23-16-12 uploaded (olpc) Randy Dunlap
@ 2010-11-24 19:41 ` Randy Dunlap
  5 siblings, 0 replies; 27+ messages in thread
From: Randy Dunlap @ 2010-11-24 19:41 UTC
  To: akpm, Pelagicore AB, linux-media
  Cc: linux-kernel, linux-mm, linux-fsdevel, Zimny Lech

From: Randy Dunlap <randy.dunlap@oracle.com>

timblogiw uses dma() interfaces and it selects TIMB_DMA for that
support.  However, drivers/dma/ is not built unless
CONFIG_DMA_ENGINE is enabled, so select/enable that symbol also.

drivers/built-in.o: In function `timblogiw_close':
timblogiw.c:(.text+0x4419fe): undefined reference to `dma_release_channel'
drivers/built-in.o: In function `buffer_release':
timblogiw.c:(.text+0x441a8d): undefined reference to `dma_sync_wait'
drivers/built-in.o: In function `timblogiw_open':
timblogiw.c:(.text+0x44212b): undefined reference to `__dma_request_channel'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
---
 drivers/media/video/Kconfig |    1 +
 1 file changed, 1 insertion(+)

--- mmotm-2010-1123-1612.orig/drivers/media/video/Kconfig
+++ mmotm-2010-1123-1612/drivers/media/video/Kconfig
@@ -669,6 +669,7 @@ config VIDEO_HEXIUM_GEMINI
 config VIDEO_TIMBERDALE
 	tristate "Support for timberdale Video In/LogiWIN"
 	depends on VIDEO_V4L2 && I2C
+	select DMA_ENGINE
 	select TIMB_DMA
 	select VIDEO_ADV7180
 	select VIDEOBUF_DMA_CONTIG


* Re: mmotm 2010-11-23 + autogroups -> inconsistent lock state
  2010-11-24  5:01 ` mmotm 2010-11-23 + autogroups -> inconsistent lock state Valdis.Kletnieks
@ 2010-11-24 20:25   ` Mike Galbraith
  2010-11-24 20:39     ` Mike Galbraith
                       ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Mike Galbraith @ 2010-11-24 20:25 UTC
  To: Valdis.Kletnieks; +Cc: akpm, Ingo Molnar, mm-commits, linux-kernel

On Wed, 2010-11-24 at 00:01 -0500, Valdis.Kletnieks@vt.edu wrote:
> On Tue, 23 Nov 2010 16:13:06 PST, akpm@linux-foundation.org said:
> > The mm-of-the-moment snapshot 2010-11-23-16-12 has been uploaded to
> > 
> >    http://userweb.kernel.org/~akpm/mmotm/
> 
> (I appear to be on a roll tonight - 3 splats before I even had a chance to login. :)
> 
> mmotm + Ingo's cleanup of Mike's autogroups patch.
...

Sorry for slow response, been trying to use some of my last few vacation
days on vacation stuff ;-)

The below should run gripe free.  Suppose I should learn to turn on
lockdep and whatnot when tinkering/testing.

Unfortunately, tip's update_shares() changes are still being difficult.

static void update_shares(int cpu)
{
        struct cfs_rq *cfs_rq;
        struct rq *rq = cpu_rq(cpu);

        rcu_read_lock();
        for_each_leaf_cfs_rq(rq, cfs_rq)
                update_shares_cpu(cfs_rq->tg, cpu);
        rcu_read_unlock();
}

Despite task groups being freed via rcu, update_shares_cpu() hits freed
memory and explodes, and nothing I've tried has been able to stop it.
The only thing I haven't tried (aside from the right thing;) is to take
rcu out of the picture entirely.


From: Mike Galbraith <efault@gmx.de>
Date: Sat, 20 Nov 2010 12:35:00 -0700
Subject: [PATCH] sched: Improve desktop interactivity: Implement automated per session task groups

A recurring complaint from CFS users is that parallel kbuild has a negative
impact on desktop interactivity.  This patch implements an idea from Linus,
to automatically create task groups.  Currently, only per session autogroups
are implemented, but the patch leaves the way open for enhancement.

Implementation: each task's signal struct contains an inherited pointer to
a refcounted autogroup struct containing a task group pointer, the default
for all tasks pointing to the init_task_group.  When a task calls setsid(),
a new task group is created, the process is moved into the new task group,
and a reference to the previous task group is dropped.  Child processes
inherit this task group thereafter, and increase its refcount.  When the
last thread of a process exits, the process's reference is dropped, such
that when the last process referencing an autogroup exits, the autogroup
is destroyed.
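
The reference counting described above can be sketched in plain userspace C.
This is only a model under stated assumptions: struct ag_model, ag_get(),
ag_put(), ag_setsid() and the ag_destroyed counter are hypothetical names, a
plain int stands in for the kref, and no locking is modeled.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the patch's struct autogroup. */
struct ag_model {
	int refs;	/* plays the role of the kref */
	int id;
};

static int ag_destroyed;	/* counts destroyed groups, illustration only */

/* A child process inheriting the group takes one reference. */
static struct ag_model *ag_get(struct ag_model *ag)
{
	assert(ag);
	ag->refs++;
	return ag;
}

/* An exiting process drops its reference; the last put destroys the group. */
static void ag_put(struct ag_model *ag)
{
	assert(ag && ag->refs > 0);
	if (--ag->refs == 0) {
		ag_destroyed++;
		free(ag);
	}
}

/* setsid(): create a new group, move the caller, drop the previous one. */
static struct ag_model *ag_setsid(struct ag_model *prev, int id)
{
	struct ag_model *ag = calloc(1, sizeof(*ag));

	if (!ag)
		return prev;	/* allocation failed: caller keeps its old group */
	ag->id = id;
	ag->refs = 1;		/* reference now held by the calling process */
	ag_put(prev);
	return ag;
}
```

A process that calls ag_setsid() and then forks (ag_get()) leaves the new
group with two references; the group goes away only after both processes
have called ag_put().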

At runqueue selection time, IFF a task has no cgroup assignment, its current
autogroup is used.
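
As a rough illustration of that selection rule, here is a userspace model of
the checks the patch performs in task_wants_autogroup() and
autogroup_task_group().  The types tg_model and task_model, the function
pick_task_group() and the autogroup_enabled flag are hypothetical stand-ins,
not the kernel's own structures.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for struct task_group and struct task_struct. */
struct tg_model {
	const char *name;
};

struct task_model {
	struct tg_model *cgroup_tg;	/* group assigned via the cpu cgroup */
	struct tg_model *autogroup_tg;	/* group of the task's session */
	bool fair_class;		/* task is in the fair scheduling class */
	bool exiting;			/* models PF_EXITING */
};

static struct tg_model root_tg = { "root" };
static bool autogroup_enabled = true;	/* models the sysctl switch */

static struct tg_model *pick_task_group(const struct task_model *p)
{
	assert(p);

	if (!autogroup_enabled)
		return p->cgroup_tg;
	/* An explicit cgroup assignment always wins. */
	if (p->cgroup_tg != &root_tg)
		return p->cgroup_tg;
	/* Only fair-class tasks that are not exiting use the autogroup. */
	if (!p->fair_class || p->exiting)
		return p->cgroup_tg;

	return p->autogroup_tg;
}
```

The point of the ordering is that the autogroup is only a fallback: any task
an administrator has placed in a non-root cpu cgroup is left where it is.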

Autogroup bandwidth is controllable via setting its nice level through the
proc filesystem.  cat /proc/<pid>/autogroup displays the task's group and the
group's nice level.  echo <nice level> > /proc/<pid>/autogroup sets the task
group's shares to the weight of a nice <level> task.  Setting nice level is rate
limited for !admin users due to the abuse risk of task group locking.
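
To make the nice-to-shares mapping concrete, here is a small userspace
sketch.  The patch indexes the scheduler's prio_to_weight[] table with
nice + 20; the table below is reproduced from the mainline scheduler as I
recall it and should be checked against kernel/sched.c before being relied
on.  nice_weight[] and nice_to_weight() are hypothetical names.

```c
#include <assert.h>

/* Assumed copy of the scheduler's nice-to-weight table (nice -20 .. 19). */
static const int nice_weight[40] = {
	/* -20 */ 88761, 71755, 56483, 46273, 36291,
	/* -15 */ 29154, 23254, 18705, 14949, 11916,
	/* -10 */  9548,  7620,  6100,  4904,  3906,
	/*  -5 */  3121,  2501,  1991,  1586,  1277,
	/*   0 */  1024,   820,   655,   526,   423,
	/*   5 */   335,   272,   215,   172,   137,
	/*  10 */   110,    87,    70,    56,    45,
	/*  15 */    36,    29,    23,    18,    15,
};

/* Returns the group shares for a nice level, or -1 if out of range. */
static int nice_to_weight(int nice)
{
	int idx = nice + 20;

	if (nice < -20 || nice > 19)
		return -1;	/* the patch rejects this with -EINVAL */
	assert(idx >= 0 && idx < 40);
	return nice_weight[idx];
}
```

Each nice step changes the weight by roughly 25%, so a group set to nice 5
gets about a third of the weight of a group left at nice 0.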

The feature is enabled from boot by default if CONFIG_SCHED_AUTOGROUP=y is
selected, but can be disabled via the boot option noautogroup, and can also
be turned on/off on the fly via..
	echo [01] > /proc/sys/kernel/sched_autogroup_enabled.
..which will automatically move tasks to/from the root task group.

Signed-off-by: Mike Galbraith <efault@gmx.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
LKML-Reference: <1290281700.28711.9.camel@maggy.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 Documentation/kernel-parameters.txt |    2 
 fs/proc/base.c                      |   79 +++++++++++
 include/linux/sched.h               |   23 +++
 init/Kconfig                        |   12 +
 kernel/fork.c                       |    5 
 kernel/sched.c                      |   13 +
 kernel/sched_autogroup.c            |  240 ++++++++++++++++++++++++++++++++++++
 kernel/sched_autogroup.h            |   23 +++
 kernel/sched_debug.c                |   29 ++--
 kernel/sys.c                        |    4 
 kernel/sysctl.c                     |   11 +
 11 files changed, 423 insertions(+), 18 deletions(-)

Index: linux-2.6.37.git/include/linux/sched.h
===================================================================
--- linux-2.6.37.git.orig/include/linux/sched.h
+++ linux-2.6.37.git/include/linux/sched.h
@@ -509,6 +509,8 @@ struct thread_group_cputimer {
 	spinlock_t lock;
 };
 
+struct autogroup;
+
 /*
  * NOTE! "signal_struct" does not have it's own
  * locking, because a shared signal_struct always
@@ -576,6 +578,9 @@ struct signal_struct {
 
 	struct tty_struct *tty; /* NULL if no tty */
 
+#ifdef CONFIG_SCHED_AUTOGROUP
+	struct autogroup *autogroup;
+#endif
 	/*
 	 * Cumulative resource counters for dead threads in the group,
 	 * and for reaped dead child processes forked by this group.
@@ -1931,6 +1936,24 @@ int sched_rt_handler(struct ctl_table *t
 
 extern unsigned int sysctl_sched_compat_yield;
 
+#ifdef CONFIG_SCHED_AUTOGROUP
+extern unsigned int sysctl_sched_autogroup_enabled;
+
+extern void sched_autogroup_create_attach(struct task_struct *p);
+extern void sched_autogroup_detach(struct task_struct *p);
+extern void sched_autogroup_fork(struct signal_struct *sig);
+extern void sched_autogroup_exit(struct signal_struct *sig);
+#ifdef CONFIG_PROC_FS
+extern void proc_sched_autogroup_show_task(struct task_struct *p, struct seq_file *m);
+extern int proc_sched_autogroup_set_nice(struct task_struct *p, int *nice);
+#endif
+#else
+static inline void sched_autogroup_create_attach(struct task_struct *p) { }
+static inline void sched_autogroup_detach(struct task_struct *p) { }
+static inline void sched_autogroup_fork(struct signal_struct *sig) { }
+static inline void sched_autogroup_exit(struct signal_struct *sig) { }
+#endif
+
 #ifdef CONFIG_RT_MUTEXES
 extern int rt_mutex_getprio(struct task_struct *p);
 extern void rt_mutex_setprio(struct task_struct *p, int prio);
Index: linux-2.6.37.git/kernel/sched.c
===================================================================
--- linux-2.6.37.git.orig/kernel/sched.c
+++ linux-2.6.37.git/kernel/sched.c
@@ -78,6 +78,7 @@
 
 #include "sched_cpupri.h"
 #include "workqueue_sched.h"
+#include "sched_autogroup.h"
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/sched.h>
@@ -268,6 +269,10 @@ struct task_group {
 	struct task_group *parent;
 	struct list_head siblings;
 	struct list_head children;
+
+#ifdef CONFIG_SCHED_AUTOGROUP
+	struct autogroup *autogroup;
+#endif
 };
 
 #define root_task_group init_task_group
@@ -605,11 +610,14 @@ static inline int cpu_of(struct rq *rq)
  */
 static inline struct task_group *task_group(struct task_struct *p)
 {
+	struct task_group *tg;
 	struct cgroup_subsys_state *css;
 
 	css = task_subsys_state_check(p, cpu_cgroup_subsys_id,
 			lockdep_is_held(&task_rq(p)->lock));
-	return container_of(css, struct task_group, css);
+	tg = container_of(css, struct task_group, css);
+
+	return autogroup_task_group(p, tg);
 }
 
 /* Change a task's cfs_rq and parent entity if it moves across CPUs/groups */
@@ -2006,6 +2014,7 @@ static void sched_irq_time_avg_update(st
 #include "sched_idletask.c"
 #include "sched_fair.c"
 #include "sched_rt.c"
+#include "sched_autogroup.c"
 #include "sched_stoptask.c"
 #ifdef CONFIG_SCHED_DEBUG
 # include "sched_debug.c"
@@ -7979,7 +7988,7 @@ void __init sched_init(void)
 #ifdef CONFIG_CGROUP_SCHED
 	list_add(&init_task_group.list, &task_groups);
 	INIT_LIST_HEAD(&init_task_group.children);
-
+	autogroup_init(&init_task);
 #endif /* CONFIG_CGROUP_SCHED */
 
 #if defined CONFIG_FAIR_GROUP_SCHED && defined CONFIG_SMP
Index: linux-2.6.37.git/kernel/fork.c
===================================================================
--- linux-2.6.37.git.orig/kernel/fork.c
+++ linux-2.6.37.git/kernel/fork.c
@@ -174,8 +174,10 @@ static inline void free_signal_struct(st
 
 static inline void put_signal_struct(struct signal_struct *sig)
 {
-	if (atomic_dec_and_test(&sig->sigcnt))
+	if (atomic_dec_and_test(&sig->sigcnt)) {
+		sched_autogroup_exit(sig);
 		free_signal_struct(sig);
+	}
 }
 
 void __put_task_struct(struct task_struct *tsk)
@@ -904,6 +906,7 @@ static int copy_signal(unsigned long clo
 	posix_cpu_timers_init_group(sig);
 
 	tty_audit_fork(sig);
+	sched_autogroup_fork(sig);
 
 	sig->oom_adj = current->signal->oom_adj;
 	sig->oom_score_adj = current->signal->oom_score_adj;
Index: linux-2.6.37.git/kernel/sys.c
===================================================================
--- linux-2.6.37.git.orig/kernel/sys.c
+++ linux-2.6.37.git/kernel/sys.c
@@ -1080,8 +1080,10 @@ SYSCALL_DEFINE0(setsid)
 	err = session;
 out:
 	write_unlock_irq(&tasklist_lock);
-	if (err > 0)
+	if (err > 0) {
 		proc_sid_connector(group_leader);
+		sched_autogroup_create_attach(group_leader);
+	}
 	return err;
 }
 
Index: linux-2.6.37.git/kernel/sched_debug.c
===================================================================
--- linux-2.6.37.git.orig/kernel/sched_debug.c
+++ linux-2.6.37.git/kernel/sched_debug.c
@@ -87,6 +87,20 @@ static void print_cfs_group_stats(struct
 }
 #endif
 
+#if defined(CONFIG_CGROUP_SCHED) && \
+	(defined(CONFIG_FAIR_GROUP_SCHED) || defined(CONFIG_RT_GROUP_SCHED))
+static void task_group_path(struct task_group *tg, char *buf, int buflen)
+{
+	/* may be NULL if the underlying cgroup isn't fully-created yet */
+	if (!tg->css.cgroup) {
+		if (!autogroup_path(tg, buf, buflen))
+			buf[0] = '\0';
+		return;
+	}
+	cgroup_path(tg->css.cgroup, buf, buflen);
+}
+#endif
+
 static void
 print_task(struct seq_file *m, struct rq *rq, struct task_struct *p)
 {
@@ -115,7 +129,7 @@ print_task(struct seq_file *m, struct rq
 		char path[64];
 
 		rcu_read_lock();
-		cgroup_path(task_group(p)->css.cgroup, path, sizeof(path));
+		task_group_path(task_group(p), path, sizeof(path));
 		rcu_read_unlock();
 		SEQ_printf(m, " %s", path);
 	}
@@ -147,19 +161,6 @@ static void print_rq(struct seq_file *m,
 	read_unlock_irqrestore(&tasklist_lock, flags);
 }
 
-#if defined(CONFIG_CGROUP_SCHED) && \
-	(defined(CONFIG_FAIR_GROUP_SCHED) || defined(CONFIG_RT_GROUP_SCHED))
-static void task_group_path(struct task_group *tg, char *buf, int buflen)
-{
-	/* may be NULL if the underlying cgroup isn't fully-created yet */
-	if (!tg->css.cgroup) {
-		buf[0] = '\0';
-		return;
-	}
-	cgroup_path(tg->css.cgroup, buf, buflen);
-}
-#endif
-
 void print_cfs_rq(struct seq_file *m, int cpu, struct cfs_rq *cfs_rq)
 {
 	s64 MIN_vruntime = -1, min_vruntime, max_vruntime = -1,
Index: linux-2.6.37.git/fs/proc/base.c
===================================================================
--- linux-2.6.37.git.orig/fs/proc/base.c
+++ linux-2.6.37.git/fs/proc/base.c
@@ -1407,6 +1407,82 @@ static const struct file_operations proc
 
 #endif
 
+#ifdef CONFIG_SCHED_AUTOGROUP
+/*
+ * Print out autogroup related information:
+ */
+static int sched_autogroup_show(struct seq_file *m, void *v)
+{
+	struct inode *inode = m->private;
+	struct task_struct *p;
+
+	p = get_proc_task(inode);
+	if (!p)
+		return -ESRCH;
+	proc_sched_autogroup_show_task(p, m);
+
+	put_task_struct(p);
+
+	return 0;
+}
+
+static ssize_t
+sched_autogroup_write(struct file *file, const char __user *buf,
+	    size_t count, loff_t *offset)
+{
+	struct inode *inode = file->f_path.dentry->d_inode;
+	struct task_struct *p;
+	char buffer[PROC_NUMBUF];
+	long nice;
+	int err;
+
+	memset(buffer, 0, sizeof(buffer));
+	if (count > sizeof(buffer) - 1)
+		count = sizeof(buffer) - 1;
+	if (copy_from_user(buffer, buf, count))
+		return -EFAULT;
+
+	err = strict_strtol(strstrip(buffer), 0, &nice);
+	if (err)
+		return -EINVAL;
+
+	p = get_proc_task(inode);
+	if (!p)
+		return -ESRCH;
+
+	err = nice;
+	err = proc_sched_autogroup_set_nice(p, &err);
+	if (err)
+		count = err;
+
+	put_task_struct(p);
+
+	return count;
+}
+
+static int sched_autogroup_open(struct inode *inode, struct file *filp)
+{
+	int ret;
+
+	ret = single_open(filp, sched_autogroup_show, NULL);
+	if (!ret) {
+		struct seq_file *m = filp->private_data;
+
+		m->private = inode;
+	}
+	return ret;
+}
+
+static const struct file_operations proc_pid_sched_autogroup_operations = {
+	.open		= sched_autogroup_open,
+	.read		= seq_read,
+	.write		= sched_autogroup_write,
+	.llseek		= seq_lseek,
+	.release	= single_release,
+};
+
+#endif /* CONFIG_SCHED_AUTOGROUP */
+
 static ssize_t comm_write(struct file *file, const char __user *buf,
 				size_t count, loff_t *offset)
 {
@@ -2733,6 +2809,9 @@ static const struct pid_entry tgid_base_
 #ifdef CONFIG_SCHED_DEBUG
 	REG("sched",      S_IRUGO|S_IWUSR, proc_pid_sched_operations),
 #endif
+#ifdef CONFIG_SCHED_AUTOGROUP
+	REG("autogroup",  S_IRUGO|S_IWUSR, proc_pid_sched_autogroup_operations),
+#endif
 	REG("comm",      S_IRUGO|S_IWUSR, proc_pid_set_comm_operations),
 #ifdef CONFIG_HAVE_ARCH_TRACEHOOK
 	INF("syscall",    S_IRUSR, proc_pid_syscall),
Index: linux-2.6.37.git/kernel/sched_autogroup.h
===================================================================
--- /dev/null
+++ linux-2.6.37.git/kernel/sched_autogroup.h
@@ -0,0 +1,23 @@
+#ifdef CONFIG_SCHED_AUTOGROUP
+
+static inline struct task_group *
+autogroup_task_group(struct task_struct *p, struct task_group *tg);
+
+#else /* !CONFIG_SCHED_AUTOGROUP */
+
+static inline void autogroup_init(struct task_struct *init_task) {  }
+
+static inline struct task_group *
+autogroup_task_group(struct task_struct *p, struct task_group *tg)
+{
+	return tg;
+}
+
+#ifdef CONFIG_SCHED_DEBUG
+static inline int autogroup_path(struct task_group *tg, char *buf, int buflen)
+{
+	return 0;
+}
+#endif
+
+#endif /* CONFIG_SCHED_AUTOGROUP */
Index: linux-2.6.37.git/kernel/sched_autogroup.c
===================================================================
--- /dev/null
+++ linux-2.6.37.git/kernel/sched_autogroup.c
@@ -0,0 +1,240 @@
+#ifdef CONFIG_SCHED_AUTOGROUP
+
+#include <linux/proc_fs.h>
+#include <linux/seq_file.h>
+#include <linux/kallsyms.h>
+#include <linux/utsname.h>
+
+unsigned int __read_mostly sysctl_sched_autogroup_enabled = 1;
+
+struct autogroup {
+	struct kref		kref;
+	struct task_group	*tg;
+	struct rw_semaphore	lock;
+	unsigned long		id;
+	int			nice;
+};
+
+static struct autogroup autogroup_default;
+static atomic_t autogroup_seq_nr;
+
+static void autogroup_init(struct task_struct *init_task)
+{
+	autogroup_default.tg = &init_task_group;
+	init_task_group.autogroup = &autogroup_default;
+	kref_init(&autogroup_default.kref);
+	init_rwsem(&autogroup_default.lock);
+	init_task->signal->autogroup = &autogroup_default;
+}
+
+static inline void autogroup_free(struct task_group *tg)
+{
+	kfree(tg->autogroup);
+}
+
+static inline void autogroup_destroy(struct kref *kref)
+{
+	struct autogroup *ag = container_of(kref, struct autogroup, kref);
+
+	sched_destroy_group(ag->tg);
+}
+
+static inline void autogroup_kref_put(struct autogroup *ag)
+{
+	kref_put(&ag->kref, autogroup_destroy);
+}
+
+static inline struct autogroup *autogroup_kref_get(struct autogroup *ag)
+{
+	kref_get(&ag->kref);
+	return ag;
+}
+
+static inline struct autogroup *autogroup_create(void)
+{
+	struct autogroup *ag = kzalloc(sizeof(*ag), GFP_KERNEL);
+	struct task_group *tg;
+
+	if (!ag)
+		goto out_fail;
+
+	tg = sched_create_group(&init_task_group);
+
+	if (IS_ERR(tg))
+		goto out_free;
+
+	kref_init(&ag->kref);
+	init_rwsem(&ag->lock);
+	ag->id = atomic_inc_return(&autogroup_seq_nr);
+	ag->tg = tg;
+	tg->autogroup = ag;
+
+	return ag;
+
+out_free:
+	kfree(ag);
+out_fail:
+	if (printk_ratelimit())
+		printk(KERN_WARNING "autogroup_create: %s failure.\n",
+			ag ? "sched_create_group()" : "kmalloc()");
+
+	return autogroup_kref_get(&autogroup_default);
+}
+
+static inline bool
+task_wants_autogroup(struct task_struct *p, struct task_group *tg)
+{
+	if (tg != &root_task_group)
+		return false;
+
+	if (p->sched_class != &fair_sched_class)
+		return false;
+
+	/*
+	 * We can only assume the task group can't go away on us if
+	 * autogroup_move_group() can see us on ->thread_group list.
+	 */
+	if (p->flags & PF_EXITING)
+		return false;
+
+	return true;
+}
+
+static inline struct task_group *
+autogroup_task_group(struct task_struct *p, struct task_group *tg)
+{
+	int enabled = ACCESS_ONCE(sysctl_sched_autogroup_enabled);
+
+	if (enabled && task_wants_autogroup(p, tg))
+		return p->signal->autogroup->tg;
+
+	return tg;
+}
+
+static void
+autogroup_move_group(struct task_struct *p, struct autogroup *ag)
+{
+	struct autogroup *prev;
+	struct task_struct *t;
+	unsigned long flags;
+
+	if (!lock_task_sighand(p, &flags)) {
+		WARN_ON(1);
+		return;
+	}
+
+	prev = p->signal->autogroup;
+	if (prev == ag) {
+		unlock_task_sighand(p, &flags);
+		return;
+	}
+
+	p->signal->autogroup = autogroup_kref_get(ag);
+	t = p;
+
+	do {
+		sched_move_task(t);
+	} while_each_thread(p, t);
+
+	unlock_task_sighand(p, &flags);
+	autogroup_kref_put(prev);
+}
+
+/* Allocates GFP_KERNEL, cannot be called under any spinlock */
+void sched_autogroup_create_attach(struct task_struct *p)
+{
+	struct autogroup *ag = autogroup_create();
+
+	autogroup_move_group(p, ag);
+	/* drop extra reference added by autogroup_create() */
+	autogroup_kref_put(ag);
+}
+EXPORT_SYMBOL(sched_autogroup_create_attach);
+
+/* Cannot be called under siglock.  Currently has no users */
+void sched_autogroup_detach(struct task_struct *p)
+{
+	autogroup_move_group(p, &autogroup_default);
+}
+EXPORT_SYMBOL(sched_autogroup_detach);
+
+void sched_autogroup_fork(struct signal_struct *sig)
+{
+	struct task_struct *p = current;
+
+	spin_lock_irq(&p->sighand->siglock);
+	sig->autogroup = autogroup_kref_get(p->signal->autogroup);
+	spin_unlock_irq(&p->sighand->siglock);
+}
+
+void sched_autogroup_exit(struct signal_struct *sig)
+{
+	autogroup_kref_put(sig->autogroup);
+}
+
+static int __init setup_autogroup(char *str)
+{
+	sysctl_sched_autogroup_enabled = 0;
+
+	return 1;
+}
+
+__setup("noautogroup", setup_autogroup);
+
+#ifdef CONFIG_PROC_FS
+
+/* Called with siglock held. */
+int proc_sched_autogroup_set_nice(struct task_struct *p, int *nice)
+{
+	static unsigned long next = INITIAL_JIFFIES;
+	struct autogroup *ag;
+	int err;
+
+	if (*nice < -20 || *nice > 19)
+		return -EINVAL;
+
+	err = security_task_setnice(current, *nice);
+	if (err)
+		return err;
+
+	if (*nice < 0 && !can_nice(current, *nice))
+		return -EPERM;
+
+	/* this is a heavy operation taking global locks.. */
+	if (!capable(CAP_SYS_ADMIN) && time_before(jiffies, next))
+		return -EAGAIN;
+
+	next = HZ / 10 + jiffies;
+	ag = autogroup_kref_get(p->signal->autogroup);
+
+	down_write(&ag->lock);
+	err = sched_group_set_shares(ag->tg, prio_to_weight[*nice + 20]);
+	if (!err)
+		ag->nice = *nice;
+	up_write(&ag->lock);
+
+	autogroup_kref_put(ag);
+
+	return err;
+}
+
+void proc_sched_autogroup_show_task(struct task_struct *p, struct seq_file *m)
+{
+	struct autogroup *ag = autogroup_kref_get(p->signal->autogroup);
+
+	down_read(&ag->lock);
+	seq_printf(m, "/autogroup-%ld nice %d\n", ag->id, ag->nice);
+	up_read(&ag->lock);
+
+	autogroup_kref_put(ag);
+}
+#endif /* CONFIG_PROC_FS */
+
+#ifdef CONFIG_SCHED_DEBUG
+static inline int autogroup_path(struct task_group *tg, char *buf, int buflen)
+{
+	return snprintf(buf, buflen, "%s-%ld", "/autogroup", tg->autogroup->id);
+}
+#endif /* CONFIG_SCHED_DEBUG */
+
+#endif /* CONFIG_SCHED_AUTOGROUP */
Index: linux-2.6.37.git/kernel/sysctl.c
===================================================================
--- linux-2.6.37.git.orig/kernel/sysctl.c
+++ linux-2.6.37.git/kernel/sysctl.c
@@ -382,6 +382,17 @@ static struct ctl_table kern_table[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec,
 	},
+#ifdef CONFIG_SCHED_AUTOGROUP
+	{
+		.procname	= "sched_autogroup_enabled",
+		.data		= &sysctl_sched_autogroup_enabled,
+		.maxlen		= sizeof(unsigned int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec,
+		.extra1		= &zero,
+		.extra2		= &one,
+	},
+#endif
 #ifdef CONFIG_PROVE_LOCKING
 	{
 		.procname	= "prove_locking",
Index: linux-2.6.37.git/init/Kconfig
===================================================================
--- linux-2.6.37.git.orig/init/Kconfig
+++ linux-2.6.37.git/init/Kconfig
@@ -728,6 +728,18 @@ config NET_NS
 
 endif # NAMESPACES
 
+config SCHED_AUTOGROUP
+	bool "Automatic process group scheduling"
+	select CGROUPS
+	select CGROUP_SCHED
+	select FAIR_GROUP_SCHED
+	help
+	  This option optimizes the scheduler for common desktop workloads by
+	  automatically creating and populating task groups.  This separation
+	  of workloads isolates aggressive CPU burners (like build jobs) from
+	  desktop applications.  Task group autogeneration is currently based
+	  upon task session.
+
 config MM_OWNER
 	bool
 
Index: linux-2.6.37.git/Documentation/kernel-parameters.txt
===================================================================
--- linux-2.6.37.git.orig/Documentation/kernel-parameters.txt
+++ linux-2.6.37.git/Documentation/kernel-parameters.txt
@@ -1622,6 +1622,8 @@ and is between 256 and 4096 characters.
 	noapic		[SMP,APIC] Tells the kernel to not make use of any
 			IOAPICs that may be present in the system.
 
+	noautogroup	Disable scheduler automatic task group creation.
+
 	nobats		[PPC] Do not use BATs for mapping kernel lowmem
 			on "Classic" PPC cores.
 

        



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: mmotm 2010-11-23 + autogroups -> inconsistent lock state
  2010-11-24 20:25   ` Mike Galbraith
@ 2010-11-24 20:39     ` Mike Galbraith
  2010-11-25  6:09     ` Valdis.Kletnieks
  2010-12-02 18:16     ` Paul E. McKenney
  2 siblings, 0 replies; 27+ messages in thread
From: Mike Galbraith @ 2010-11-24 20:39 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: akpm, Ingo Molnar, mm-commits, linux-kernel

On Wed, 2010-11-24 at 13:25 -0700, Mike Galbraith wrote:

> The below should run gripe free.

In <= 2.6.37-rc3 kernel I mean.  The tip version is still explosive.

	-Mike


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: mmotm 2010-11-23 + autogroups -> inconsistent lock state
  2010-11-24 20:25   ` Mike Galbraith
  2010-11-24 20:39     ` Mike Galbraith
@ 2010-11-25  6:09     ` Valdis.Kletnieks
  2010-12-02 18:16     ` Paul E. McKenney
  2 siblings, 0 replies; 27+ messages in thread
From: Valdis.Kletnieks @ 2010-11-25  6:09 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: akpm, Ingo Molnar, mm-commits, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 750 bytes --]

On Wed, 24 Nov 2010 13:25:25 MST, Mike Galbraith said:
> On Wed, 2010-11-24 at 00:01 -0500, Valdis.Kletnieks@vt.edu wrote:
> > On Tue, 23 Nov 2010 16:13:06 PST, akpm@linux-foundation.org said:
> > > The mm-of-the-moment snapshot 2010-11-23-16-12 has been uploaded to
> > > 
> > >    http://userweb.kernel.org/~akpm/mmotm/
> > 
> > (I appear to be on a roll tonight - 3 splats before I even had a chance to login. :)
> > 
> > mmotm + Ingo's cleanup of Mike's autogroups patch.
> ...
> 
> Sorry for slow response, been trying to use some of my last few vacation
> days on vacation stuff ;-)
> 
> The below should run gripe free.  Suppose I should learn to turn on
> lockdep and whatnot when tinkering/testing.

Yes, this version runs quietly, thanks.


[-- Attachment #2: Type: application/pgp-signature, Size: 227 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: mmotm 2010-11-23 - WARNING: at drivers/tty/tty_io.c:1331
  2010-11-24  4:55 ` mmotm 2010-11-23 - WARNING: at drivers/tty/tty_io.c:1331 Valdis.Kletnieks
@ 2010-11-25 15:14   ` Kyle McMartin
  2010-11-25 16:44     ` Jiri Slaby
  0 siblings, 1 reply; 27+ messages in thread
From: Kyle McMartin @ 2010-11-25 15:14 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: akpm, mm-commits, linux-kernel, Jiri Slaby

On Tue, Nov 23, 2010 at 11:55:39PM -0500, Valdis.Kletnieks@vt.edu wrote:
> On Tue, 23 Nov 2010 16:13:06 PST, akpm@linux-foundation.org said:
> > The mm-of-the-moment snapshot 2010-11-23-16-12 has been uploaded to
> > 
> >    http://userweb.kernel.org/~akpm/mmotm/
> 
> Seen during boot:
> 
> [   23.015448] Modules linked in:
> [   23.015453] Pid: 1207, comm: plymouthd Not tainted 2.6.37-rc3-mmotm1123 #3
> [   23.015455] Call Trace:

I've been trying to figure this one out for a while, without much luck.
(Users are seeing it in 2.6.36 as well.)

I *think* (I added a rawhide debugging patch to print the tty->name)
that plymouth is always opening tty7 to cause this. My guess is the BKL
removal has exposed some kind of race, but it's not obvious to me (and
there's many other bugs to sort through too. :(

CC-ing Jiri since he seems to be the poor guy who's been poking this
recently (there's a good few threads about this (though the others look
like an ldisc attach race...)) I wouldn't think that's the case here
since N_TTY is the default...

--Kyle

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: mmotm 2010-11-23 - WARNING: at drivers/tty/tty_io.c:1331
  2010-11-25 15:14   ` Kyle McMartin
@ 2010-11-25 16:44     ` Jiri Slaby
  2010-11-25 16:51       ` Jiri Slaby
  0 siblings, 1 reply; 27+ messages in thread
From: Jiri Slaby @ 2010-11-25 16:44 UTC (permalink / raw)
  To: Kyle McMartin
  Cc: Valdis.Kletnieks, akpm, mm-commits, linux-kernel, Alan Cox, Greg KH

On 11/25/2010 04:14 PM, Kyle McMartin wrote:
> On Tue, Nov 23, 2010 at 11:55:39PM -0500, Valdis.Kletnieks@vt.edu wrote:
>> On Tue, 23 Nov 2010 16:13:06 PST, akpm@linux-foundation.org said:
>>> The mm-of-the-moment snapshot 2010-11-23-16-12 has been uploaded to
>>>
>>>    http://userweb.kernel.org/~akpm/mmotm/
>>
>> Seen during boot:
>>
>> [   23.015448] Modules linked in:
>> [   23.015453] Pid: 1207, comm: plymouthd Not tainted 2.6.37-rc3-mmotm1123 #3
>> [   23.015455] Call Trace:
> 
> I've been trying to figure this one out for a while, without much luck.
> (Users are seeing it in 2.6.36 as well.)
> 
> I *think* (I added a rawhide debugging patch to print the tty->name)
> that plymouth is always opening tty7 to cause this. My guess is the BKL
> removal has exposed some kind of race, but it's not obvious to me (and
> there's many other bugs to sort through too. :(
> 
> CC-ing Jiri since he seems to be the poor guy who's been poking this
> recently (there's a good few threads about this (though the others look
> like an ldisc attach race...)) I wouldn't think that's the case here
> since N_TTY is the default...

Ok, tty_reopen is called without TTY_LDISC set. For further
considerations, note tty_lock is held in tty_open. TTY_LDISC is cleared in:

1) __tty_hangup from tty_ldisc_hangup to tty_ldisc_enable. During this
section tty_lock is held.

2) tty_release via tty_ldisc_release till the end of tty existence. If
tty->count <= 1, tty_lock is taken, TTY_CLOSING bit set and then
tty_ldisc_release called. tty_reopen checks TTY_CLOSING before checking
TTY_LDISC.

3) tty_set_ldisc from tty_ldisc_halt to tty_ldisc_enable. We take
tty_lock, set TTY_LDISC_CHANGING, put tty_lock, do some other work, take
tty_lock, call tty_ldisc_enable, put tty_lock.

So the only option I see is 3) and we should do:
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -1310,7 +1310,8 @@ static int tty_reopen(struct tty_struct *tty)
 {
        struct tty_driver *driver = tty->driver;

-       if (test_bit(TTY_CLOSING, &tty->flags))
+       if (test_bit(TTY_CLOSING, &tty->flags) ||
+                       test_bit(TTY_LDISC_CHANGING, &tty->flags))
                return -EIO;

        if (driver->type == TTY_DRIVER_TYPE_PTY &&

Alan, Greg?

thanks,
-- 
js
suse labs

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: mmotm 2010-11-23 - WARNING: at drivers/tty/tty_io.c:1331
  2010-11-25 16:44     ` Jiri Slaby
@ 2010-11-25 16:51       ` Jiri Slaby
  2010-11-25 17:16         ` [PATCH 1/1] TTY: don't allow reopen when ldisc is changing Jiri Slaby
  0 siblings, 1 reply; 27+ messages in thread
From: Jiri Slaby @ 2010-11-25 16:51 UTC (permalink / raw)
  Cc: Kyle McMartin, Valdis.Kletnieks, akpm, mm-commits, linux-kernel,
	Alan Cox, Greg KH

On 11/25/2010 05:44 PM, Jiri Slaby wrote:
> On 11/25/2010 04:14 PM, Kyle McMartin wrote:
>> On Tue, Nov 23, 2010 at 11:55:39PM -0500, Valdis.Kletnieks@vt.edu wrote:
>>> On Tue, 23 Nov 2010 16:13:06 PST, akpm@linux-foundation.org said:
>>>> The mm-of-the-moment snapshot 2010-11-23-16-12 has been uploaded to
>>>>
>>>>    http://userweb.kernel.org/~akpm/mmotm/
>>>
>>> Seen during boot:
>>>
>>> [   23.015448] Modules linked in:
>>> [   23.015453] Pid: 1207, comm: plymouthd Not tainted 2.6.37-rc3-mmotm1123 #3
>>> [   23.015455] Call Trace:
>>
>> I've been trying to figure this one out for a while, without much luck.
>> (Users are seeing it in 2.6.36 as well.)
>>
>> I *think* (I added a rawhide debugging patch to print the tty->name)
>> that plymouth is always opening tty7 to cause this. My guess is the BKL
>> removal has exposed some kind of race, but it's not obvious to me (and
>> there's many other bugs to sort through too. :(
>>
>> CC-ing Jiri since he seems to be the poor guy who's been poking this
>> recently (there's a good few threads about this (though the others look
>> like an ldisc attach race...)) I wouldn't think that's the case here
>> since N_TTY is the default...
> 
> Ok, tty_reopen is called without TTY_LDISC set. For further
> considerations, note tty_lock is held in tty_open. TTY_LDISC is cleared in:
> 
> 1) __tty_hangup from tty_ldisc_hangup to tty_ldisc_enable. During this
> section tty_lock is held.
> 
> 2) tty_release via tty_ldisc_release till the end of tty existence. If
> tty->count <= 1, tty_lock is taken, TTY_CLOSING bit set and then
> tty_ldisc_release called. tty_reopen checks TTY_CLOSING before checking
> TTY_LDISC.
> 
> 3) tty_set_ldisc from tty_ldisc_halt to tty_ldisc_enable. We take
> tty_lock, set TTY_LDISC_CHANGING, put tty_lock, do some other work, take
> tty_lock, call tty_ldisc_enable, put tty_lock.

Oh, "do some other work" includes tty_ldisc_halt where TTY_LDISC is
cleared and tty_lock is _not_ held.

> So the only option I see is 3) and we should do:
> --- a/drivers/tty/tty_io.c
> +++ b/drivers/tty/tty_io.c
> @@ -1310,7 +1310,8 @@ static int tty_reopen(struct tty_struct *tty)
>  {
>         struct tty_driver *driver = tty->driver;
> 
> -       if (test_bit(TTY_CLOSING, &tty->flags))
> +       if (test_bit(TTY_CLOSING, &tty->flags) ||
> +                       test_bit(TTY_LDISC_CHANGING, &tty->flags))
>                 return -EIO;
> 
>         if (driver->type == TTY_DRIVER_TYPE_PTY &&
> 
> Alan, Greg?
> 
> thanks,
-- 
js
suse labs

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH 1/1] TTY: don't allow reopen when ldisc is changing
  2010-11-25 16:51       ` Jiri Slaby
@ 2010-11-25 17:16         ` Jiri Slaby
  2010-11-25 17:59           ` Kyle McMartin
  2010-11-26  0:28           ` Kyle McMartin
  0 siblings, 2 replies; 27+ messages in thread
From: Jiri Slaby @ 2010-11-25 17:16 UTC (permalink / raw)
  To: gregkh; +Cc: akpm, linux-kernel, jirislaby, Kyle McMartin, Alan Cox

There are many WARNINGs like the following reported nowadays:
WARNING: at drivers/tty/tty_io.c:1331 tty_open+0x2a2/0x49a()
Hardware name: Latitude E6500
Modules linked in:
Pid: 1207, comm: plymouthd Not tainted 2.6.37-rc3-mmotm1123 #3
Call Trace:
 [<ffffffff8103b189>] warn_slowpath_common+0x80/0x98
 [<ffffffff8103b1b6>] warn_slowpath_null+0x15/0x17
 [<ffffffff8128a3ab>] tty_open+0x2a2/0x49a
 [<ffffffff810fd53f>] chrdev_open+0x11d/0x146
...

This means tty_reopen is called without TTY_LDISC set. For further
considerations, note tty_lock is held in tty_open. TTY_LDISC is cleared in:
1) __tty_hangup from tty_ldisc_hangup to tty_ldisc_enable. During this
section tty_lock is held.

2) tty_release via tty_ldisc_release till the end of tty existence. If
tty->count <= 1, tty_lock is taken, TTY_CLOSING bit set and then
tty_ldisc_release called. tty_reopen checks TTY_CLOSING before checking
TTY_LDISC.

3) tty_set_ldisc from tty_ldisc_halt to tty_ldisc_enable. We:
   * take tty_lock, set TTY_LDISC_CHANGING, put tty_lock
   * call tty_ldisc_halt (clear TTY_LDISC), tty_lock is _not_ held
   * do some other work
   * take tty_lock, call tty_ldisc_enable (set TTY_LDISC), put
     tty_lock

So the only option I see is 3). The solution is to check
TTY_LDISC_CHANGING along with TTY_CLOSING in tty_reopen.
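
The window in 3) and the effect of the extra check can be modeled in
userspace.  This is only a sketch: struct tty_model, the M_* flag values and
the two reopen_*() helpers are hypothetical, and the 'warned' counter stands
in for the WARN_ON that fires when TTY_LDISC is not set.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical flag bits; the real TTY_* values differ. */
enum {
	M_TTY_CLOSING		= 1 << 0,
	M_TTY_LDISC		= 1 << 1,
	M_TTY_LDISC_CHANGING	= 1 << 2,
};

struct tty_model {
	unsigned long flags;
	int warned;	/* number of times the modeled WARN_ON fired */
};

/* Behaviour before the patch: only TTY_CLOSING is rejected. */
static int reopen_old(struct tty_model *tty)
{
	assert(tty);
	if (tty->flags & M_TTY_CLOSING)
		return -EIO;
	if (!(tty->flags & M_TTY_LDISC))
		tty->warned++;
	return 0;
}

/* Behaviour with the patch: a reopen during an ldisc change is refused. */
static int reopen_fixed(struct tty_model *tty)
{
	assert(tty);
	if (tty->flags & (M_TTY_CLOSING | M_TTY_LDISC_CHANGING))
		return -EIO;
	if (!(tty->flags & M_TTY_LDISC))
		tty->warned++;
	return 0;
}
```

With flags set to M_TTY_LDISC_CHANGING alone (the state between
tty_ldisc_halt and tty_ldisc_enable), reopen_old() succeeds and bumps
'warned', while reopen_fixed() returns -EIO without warning.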

Nicely reproducible with two processes:
while (1) {
	fd = open("/dev/ttyS1", O_RDWR);
	if (fd < 0) {
		warn("open");
		continue;
	}
	close(fd);
}
--------
while (1) {
        fd = open("/dev/ttyS1", O_RDWR);
        ld1 = 0; ld2 = 2;
        while (1) {
                ioctl(fd, TIOCSETD, &ld1);
                ioctl(fd, TIOCSETD, &ld2);
        }
        close(fd);
}

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Reported-by: <Valdis.Kletnieks@vt.edu>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
---
 drivers/tty/tty_io.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index c05c5af..878f6d6 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -1310,7 +1310,8 @@ static int tty_reopen(struct tty_struct *tty)
 {
 	struct tty_driver *driver = tty->driver;
 
-	if (test_bit(TTY_CLOSING, &tty->flags))
+	if (test_bit(TTY_CLOSING, &tty->flags) ||
+			test_bit(TTY_LDISC_CHANGING, &tty->flags))
 		return -EIO;
 
 	if (driver->type == TTY_DRIVER_TYPE_PTY &&
-- 
1.7.3.1
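
The two reproducer loops in the commit message are fragments. A minimal, bounded sketch of them as plain C helpers follows, assuming a Linux system; the function names, the iteration limits, and the failure counting are additions for illustration and are not part of the patch. The line discipline numbers 0 and 2 are the ones used in the commit message.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Opener side of the race: open and close the device `iters` times.
 * Returns the number of open() calls that failed.
 */
static long open_close_loop(const char *dev, long iters)
{
	long failures = 0;

	for (long i = 0; i < iters; i++) {
		int fd = open(dev, O_RDWR);

		if (fd < 0) {
			failures++;
			continue;
		}
		close(fd);
	}
	return failures;
}

/*
 * Switcher side of the race: keep one descriptor open and flip the line
 * discipline between 0 and 2, as in the commit message.
 * Returns -1 if the device cannot be opened, otherwise the number of
 * TIOCSETD calls that failed.
 */
static long ldisc_toggle_loop(const char *dev, long iters)
{
	int ld1 = 0, ld2 = 2;
	long failures = 0;
	int fd = open(dev, O_RDWR);

	if (fd < 0)
		return -1;

	for (long i = 0; i < iters; i++) {
		if (ioctl(fd, TIOCSETD, &ld1) < 0)
			failures++;
		if (ioctl(fd, TIOCSETD, &ld2) < 0)
			failures++;
	}
	close(fd);
	return failures;
}
```

To approximate the original test, one process would call open_close_loop("/dev/ttyS1", n) while a second calls ldisc_toggle_loop("/dev/ttyS1", n) with a large n; unlike the loops in the commit message, both helpers terminate.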



* Re: [PATCH 1/1] TTY: don't allow reopen when ldisc is changing
  2010-11-25 17:16         ` [PATCH 1/1] TTY: don't allow reopen when ldisc is changing Jiri Slaby
@ 2010-11-25 17:59           ` Kyle McMartin
  2010-11-26  0:28           ` Kyle McMartin
  1 sibling, 0 replies; 27+ messages in thread
From: Kyle McMartin @ 2010-11-25 17:59 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: gregkh, akpm, linux-kernel, jirislaby, Kyle McMartin, Alan Cox

On Thu, Nov 25, 2010 at 06:16:23PM +0100, Jiri Slaby wrote:
> -	if (test_bit(TTY_CLOSING, &tty->flags))
> +	if (test_bit(TTY_CLOSING, &tty->flags) ||
> +			test_bit(TTY_LDISC_CHANGING, &tty->flags))
>  		return -EIO;
>  

Doh, nice catch. I just built a couple test images and sent them out to
the reporters for confirmation.

Thanks!
 Kyle

* Re: [PATCH 1/1] TTY: don't allow reopen when ldisc is changing
  2010-11-25 17:16         ` [PATCH 1/1] TTY: don't allow reopen when ldisc is changing Jiri Slaby
  2010-11-25 17:59           ` Kyle McMartin
@ 2010-11-26  0:28           ` Kyle McMartin
  2010-11-26  7:46             ` Jiri Slaby
  1 sibling, 1 reply; 27+ messages in thread
From: Kyle McMartin @ 2010-11-26  0:28 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: gregkh, akpm, linux-kernel, jirislaby, Kyle McMartin, Alan Cox

On Thu, Nov 25, 2010 at 06:16:23PM +0100, Jiri Slaby wrote:
> -	if (test_bit(TTY_CLOSING, &tty->flags))
> +	if (test_bit(TTY_CLOSING, &tty->flags) ||
> +			test_bit(TTY_LDISC_CHANGING, &tty->flags))
>  		return -EIO;
>  
>  	if (driver->type == TTY_DRIVER_TYPE_PTY &&

Unfortunately, users report this doesn't seem to fix things for them
(built against 2.6.36 (plus another patch you wrote iirc.))

https://bugzilla.redhat.com/show_bug.cgi?id=630464#c27

I tried reverting the TTY patches between 2.6.36 and 2.6.35 and getting
them to test that, and it seems ok:

https://bugzilla.redhat.com/show_bug.cgi?id=630464#c30

So I guess there must be a race here somewhere... I'll keep looking. :/

I would imagine it's something that's probably existed since the dawn of
time but the BKL has just papered over entirely.

Thanks for trying!
  --Kyle

* Re: [PATCH 1/1] TTY: don't allow reopen when ldisc is changing
  2010-11-26  0:28           ` Kyle McMartin
@ 2010-11-26  7:46             ` Jiri Slaby
  2010-11-26 13:27               ` Kyle McMartin
  2010-11-27  2:59               ` Kyle McMartin
  0 siblings, 2 replies; 27+ messages in thread
From: Jiri Slaby @ 2010-11-26  7:46 UTC (permalink / raw)
  To: Kyle McMartin; +Cc: Jiri Slaby, gregkh, akpm, linux-kernel, Alan Cox

On 11/26/2010 01:28 AM, Kyle McMartin wrote:
> On Thu, Nov 25, 2010 at 06:16:23PM +0100, Jiri Slaby wrote:
>> -	if (test_bit(TTY_CLOSING, &tty->flags))
>> +	if (test_bit(TTY_CLOSING, &tty->flags) ||
>> +			test_bit(TTY_LDISC_CHANGING, &tty->flags))
>>  		return -EIO;
>>  
>>  	if (driver->type == TTY_DRIVER_TYPE_PTY &&
> 
> Unfortunately, users report this doesn't seem to fix things for them
> (built against 2.6.36 (plus another patch you wrote iirc.))

Which patches exactly do you have? You need three of mine in 2.6.36.

> https://bugzilla.redhat.com/show_bug.cgi?id=630464#c27

regards,
-- 
js
suse labs

* Re: [PATCH 1/1] TTY: don't allow reopen when ldisc is changing
  2010-11-26  7:46             ` Jiri Slaby
@ 2010-11-26 13:27               ` Kyle McMartin
  2010-11-27  2:59               ` Kyle McMartin
  1 sibling, 0 replies; 27+ messages in thread
From: Kyle McMartin @ 2010-11-26 13:27 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: Kyle McMartin, gregkh, akpm, linux-kernel, Alan Cox

On Fri, Nov 26, 2010 at 08:46:18AM +0100, Jiri Slaby wrote:
> On 11/26/2010 01:28 AM, Kyle McMartin wrote:
> > On Thu, Nov 25, 2010 at 06:16:23PM +0100, Jiri Slaby wrote:
> >> -	if (test_bit(TTY_CLOSING, &tty->flags))
> >> +	if (test_bit(TTY_CLOSING, &tty->flags) ||
> >> +			test_bit(TTY_LDISC_CHANGING, &tty->flags))
> >>  		return -EIO;
> >>  
> >>  	if (driver->type == TTY_DRIVER_TYPE_PTY &&
> > 
> > Unfortunately, users report this doesn't seem to fix things for them
> > (built against 2.6.36 (plus another patch you wrote iirc.))
> 
> Which patches exactly do you have? You need three of mine in 2.6.36.
> 

Just tty-restore-tty_ldisc_wait_idle.patch on top of 2.6.36.1, I'll grab
the other two now.

--Kyle

* Re: mmotm 2010-11-23-16-12 uploaded (olpc)
  2010-11-24 18:51 ` mmotm 2010-11-23-16-12 uploaded (olpc) Randy Dunlap
  2010-11-24 19:13   ` Andres Salomon
@ 2010-11-26 16:46   ` Daniel Drake
  1 sibling, 0 replies; 27+ messages in thread
From: Daniel Drake @ 2010-11-26 16:46 UTC (permalink / raw)
  To: Randy Dunlap; +Cc: akpm, Andres Salomon, linux-kernel, linux-mm, linux-fsdevel

On 24 November 2010 18:51, Randy Dunlap <randy.dunlap@oracle.com> wrote:
> make[4]: *** No rule to make target `arch/x86/platform/olpc/olpc-xo1-wakeup.c', needed by `arch/x86/platform/olpc/olpc-xo1-wakeup.o'.
>
>
> It's olpc-xo1-wakeup.S, so I guess it needs a special makefile rule ??

Works if you build it in, but fails as above as a module.

And it looks like making it work as a module is not as easy as we
thought. I'll discuss this with Andres and get a new patch submitted
soon.

Daniel

* Re: [PATCH 1/1] TTY: don't allow reopen when ldisc is changing
  2010-11-26  7:46             ` Jiri Slaby
  2010-11-26 13:27               ` Kyle McMartin
@ 2010-11-27  2:59               ` Kyle McMartin
  2010-11-27  8:50                 ` Jiri Slaby
  1 sibling, 1 reply; 27+ messages in thread
From: Kyle McMartin @ 2010-11-27  2:59 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: Kyle McMartin, gregkh, akpm, linux-kernel, Alan Cox

On Fri, Nov 26, 2010 at 08:46:18AM +0100, Jiri Slaby wrote:
> >> -	if (test_bit(TTY_CLOSING, &tty->flags))
> >> +	if (test_bit(TTY_CLOSING, &tty->flags) ||
> >> +			test_bit(TTY_LDISC_CHANGING, &tty->flags))
> >>  		return -EIO;
> >>  
> >>  	if (driver->type == TTY_DRIVER_TYPE_PTY &&
> > 
> > Unfortunately, users report this doesn't seem to fix things for them
> > (built against 2.6.36 (plus another patch you wrote iirc.))
> 
> Which patches exactly do you have? You need three of mine in 2.6.36.
> 
> > https://bugzilla.redhat.com/show_bug.cgi?id=630464#c27
> 

Hrm, I'm still seeing it on top of Linus' latest with that patch. :/

Even more bizarrely, I tried to come up with ways this could be failing,
and decided to test a few things...

I set_bit(TTY_DEBUG, &tty->flags) just before returning from
tty_init_dev (which, afaict, should be called for vc/tty$n and ptys?)
and then checked it with a similar WARN_ON in tty_reopen, and found that
I was hitting it fairly regularly.

As far as I can tell, for this to occur, we'd need something to open
/dev/tty1 first, which hits the tty_init_dev, and something else to very
closely follow that, hit the linking of driver->ttys[idx] and so skip
into tty_reopen, and smack into my WARN_ON.

Of course, given the locking, I have no idea how it could possibly be
happening.

I'm poking around to see, I think maybe something might be dropping
locks in the callchain that gives us a window where this might be
possible... I don't see any other way we could end up with tty1 having
TTY_LDISC unset.

(I'm poking in some more debugging, and moving the 'linking in' of the
 device until after tty_ldisc_setup in tty_init_dev, but I'm not
 particularly hopeful.)

 --Kyle

* Re: [PATCH 1/1] TTY: don't allow reopen when ldisc is changing
  2010-11-27  2:59               ` Kyle McMartin
@ 2010-11-27  8:50                 ` Jiri Slaby
  2010-11-27  9:43                   ` Jiri Slaby
  0 siblings, 1 reply; 27+ messages in thread
From: Jiri Slaby @ 2010-11-27  8:50 UTC (permalink / raw)
  To: Kyle McMartin; +Cc: gregkh, akpm, linux-kernel, Alan Cox

On 11/27/2010 03:59 AM, Kyle McMartin wrote:
> I'm poking around to see, I think maybe something might be dropping
> locks in the callchain that gives us a window where this might be
> possible...

Of course, that's the case:
        clear_bit(TTY_LDISC, &tty->flags);
        tty_unlock();
        cancel_delayed_work_sync(&tty->buf.work);
        mutex_unlock(&tty->ldisc_mutex);

        tty_lock();
        mutex_lock(&tty->ldisc_mutex);

in tty_ldisc_hangup. Hence my point 1) from previous posts doesn't hold either:
1) __tty_hangup from tty_ldisc_hangup to tty_ldisc_enable. During this
section tty_lock is held.

I will check how to fix this.

thanks,
-- 
js
suse labs

* Re: [PATCH 1/1] TTY: don't allow reopen when ldisc is changing
  2010-11-27  8:50                 ` Jiri Slaby
@ 2010-11-27  9:43                   ` Jiri Slaby
  2010-11-27 15:11                     ` Jiri Slaby
  0 siblings, 1 reply; 27+ messages in thread
From: Jiri Slaby @ 2010-11-27  9:43 UTC (permalink / raw)
  Cc: Kyle McMartin, gregkh, akpm, linux-kernel, Alan Cox

[-- Attachment #1: Type: text/plain, Size: 813 bytes --]

On 11/27/2010 09:50 AM, Jiri Slaby wrote:
> On 11/27/2010 03:59 AM, Kyle McMartin wrote:
>> I'm poking around to see, I think maybe something might be dropping
>> locks in the callchain that gives us a window where this might be
>> possible...
> 
> Of course, that's the case:
>         clear_bit(TTY_LDISC, &tty->flags);
>         tty_unlock();
>         cancel_delayed_work_sync(&tty->buf.work);
>         mutex_unlock(&tty->ldisc_mutex);
> 
>         tty_lock();
>         mutex_lock(&tty->ldisc_mutex);
> 
> in tty_ldisc_hangup. Hence my point 1) from previous posts doesn't hold either:
> 1) __tty_hangup from tty_ldisc_hangup to tty_ldisc_enable. During this
> section tty_lock is held.
> 
> I will check how to fix this.

Reproducible with 2 running processes from the attachment.

regards,
-- 
js
suse labs

[-- Attachment #2: tty_reopen.c --]
[-- Type: text/x-csrc, Size: 1227 bytes --]

#include <err.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#include <sys/ioctl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>

static void do_work(const char *tty)
{
	char buf[256];
	unsigned int cnt = 0;
	unsigned int errc = 0;
	int fd, con;

	if (signal(SIGHUP, SIG_IGN) == SIG_ERR)
		err(1, "signal(SIGHUP)");

	setsid();

	/* O_CREAT requires an explicit mode argument */
	con = open("/tmp/aaa", O_WRONLY|O_NOCTTY|O_CREAT, 0600);
	if (con < 0)
		err(2, "open cons");

	while (1) {
		if (!(cnt++ % 10000)) {
			int len = sprintf(buf, "err=%x\n", errc);
			write(con, buf, len);
			errc = 0;
		}
		fd = open(tty, O_RDWR|O_NOCTTY);
		if (fd < 0) {
			errc |= 1;
			continue;
		}
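		if (ioctl(fd, TIOCSCTTY)) {
			errc |= 2;
			close(fd);	/* don't leak the descriptor on failure */
			continue;
		}

		if (vhangup()) {
			errc |= 4;
			close(fd);	/* don't leak the descriptor on failure */
			continue;
		}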
		close(fd);
	}
	close(con);
	exit(errc);
}

int main(int argc, char **argv)
{
	pid_t pid;

	switch (pid = fork()) {
	case 0:
		do_work(argv[1]);
		break;
	case -1:
		err(1, "fork");
		break;
	default:
	{
		int stat;
		waitpid(pid, &stat, 0);
		if (stat) {
			fprintf(stderr, "exited with: %d sig=%d signr=%u\n",
					WEXITSTATUS(stat), WIFSIGNALED(stat),
					WTERMSIG(stat));
		}
		break;
	}
	}

	return 0;
}

* Re: [PATCH 1/1] TTY: don't allow reopen when ldisc is changing
  2010-11-27  9:43                   ` Jiri Slaby
@ 2010-11-27 15:11                     ` Jiri Slaby
  2010-11-27 23:53                       ` Kyle McMartin
  0 siblings, 1 reply; 27+ messages in thread
From: Jiri Slaby @ 2010-11-27 15:11 UTC (permalink / raw)
  Cc: Kyle McMartin, gregkh, akpm, linux-kernel, Alan Cox

[-- Attachment #1: Type: text/plain, Size: 1114 bytes --]

On 11/27/2010 10:43 AM, Jiri Slaby wrote:
> On 11/27/2010 09:50 AM, Jiri Slaby wrote:
>> On 11/27/2010 03:59 AM, Kyle McMartin wrote:
>>> I'm poking around to see, I think maybe something might be dropping
>>> locks in the callchain that gives us a window where this might be
>>> possible...
>>
>> Of course, that's the case:
>>         clear_bit(TTY_LDISC, &tty->flags);
>>         tty_unlock();
>>         cancel_delayed_work_sync(&tty->buf.work);
>>         mutex_unlock(&tty->ldisc_mutex);
>>
>>         tty_lock();
>>         mutex_lock(&tty->ldisc_mutex);
>>
>> in tty_ldisc_hangup. Hence my point 1) from previous posts doesn't hold either:
>> 1) __tty_hangup from tty_ldisc_hangup to tty_ldisc_enable. During this
>> section tty_lock is held.
>>
>> I will check how to fix this.
> 
> Reproducible with 2 running processes from the attachment.

Is it fixed with the attached proof-of-concept patch?

So you need:
THIS ONE
TTY: don't allow reopen when ldisc is changing
TTY: ldisc, fix open flag handling
Char: TTY, restore tty_ldisc_wait_idle

The last one is in 2.6.37-rc2 already.

thanks,
-- 
js
suse labs

[-- Attachment #2: 0001-TTY-open-hangup-race-fixup.patch --]
[-- Type: text/x-patch, Size: 2328 bytes --]

>From 9e88e8b9915b5e067507a087437d80e6a133d612 Mon Sep 17 00:00:00 2001
From: Jiri Slaby <jslaby@suse.cz>
Date: Sat, 27 Nov 2010 16:06:46 +0100
Subject: [PATCH 1/1] TTY: open/hangup race fixup


Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 drivers/tty/tty_io.c |   10 +++++++++-
 include/linux/tty.h  |    1 +
 2 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index 878f6d6..35480dd 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -559,6 +559,9 @@ void __tty_hangup(struct tty_struct *tty)
 
 	tty_lock();
 
+	/* some functions below drop BTM, so we need this bit */
+	set_bit(TTY_HUPPING, &tty->flags);
+
 	/* inuse_filps is protected by the single tty lock,
 	   this really needs to change if we want to flush the
 	   workqueue with the lock held */
@@ -578,6 +581,10 @@ void __tty_hangup(struct tty_struct *tty)
 	}
 	spin_unlock(&tty_files_lock);
 
+	/*
+	 * it drops BTM and thus races with reopen
+	 * we protect the race by TTY_HUPPING
+	 */
 	tty_ldisc_hangup(tty);
 
 	read_lock(&tasklist_lock);
@@ -615,7 +622,6 @@ void __tty_hangup(struct tty_struct *tty)
 	tty->session = NULL;
 	tty->pgrp = NULL;
 	tty->ctrl_status = 0;
-	set_bit(TTY_HUPPED, &tty->flags);
 	spin_unlock_irqrestore(&tty->ctrl_lock, flags);
 
 	/* Account for the p->signal references we killed */
@@ -641,6 +647,7 @@ void __tty_hangup(struct tty_struct *tty)
 	 * can't yet guarantee all that.
 	 */
 	set_bit(TTY_HUPPED, &tty->flags);
+	clear_bit(TTY_HUPPING, &tty->flags);
 	tty_ldisc_enable(tty);
 
 	tty_unlock();
@@ -1311,6 +1318,7 @@ static int tty_reopen(struct tty_struct *tty)
 	struct tty_driver *driver = tty->driver;
 
 	if (test_bit(TTY_CLOSING, &tty->flags) ||
+			test_bit(TTY_HUPPING, &tty->flags) ||
 			test_bit(TTY_LDISC_CHANGING, &tty->flags))
 		return -EIO;
 
diff --git a/include/linux/tty.h b/include/linux/tty.h
index 032d79f..54e4eaa 100644
--- a/include/linux/tty.h
+++ b/include/linux/tty.h
@@ -366,6 +366,7 @@ struct tty_file_private {
 #define TTY_HUPPED 		18	/* Post driver->hangup() */
 #define TTY_FLUSHING		19	/* Flushing to ldisc in progress */
 #define TTY_FLUSHPENDING	20	/* Queued buffer flush pending */
+#define TTY_HUPPING 		21	/* ->hangup() in progress */
 
 #define TTY_WRITE_FLUSH(tty) tty_write_flush((tty))
 
-- 
1.7.3.1
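
The window this patch closes can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the M_* bit values, struct model_tty, and both functions are hypothetical stand-ins, not kernel code. The concurrent opener is simulated by calling the reopen check at the point where tty_ldisc_hangup drops the lock.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical bit values for this model; not the kernel's definitions. */
enum {
	M_TTY_LDISC          = 1u << 0,
	M_TTY_CLOSING        = 1u << 1,
	M_TTY_LDISC_CHANGING = 1u << 2,
	M_TTY_HUPPING        = 1u << 3,
};

struct model_tty {
	unsigned int flags;
	bool use_hupping;	/* true: behave as with the proposed patch */
	int warnings;		/* reopen attempts that saw TTY_LDISC clear */
};

/* Model of the check at the top of tty_reopen. */
static int model_reopen(struct model_tty *tty)
{
	if (tty->flags & (M_TTY_CLOSING | M_TTY_HUPPING | M_TTY_LDISC_CHANGING))
		return -EIO;
	if (!(tty->flags & M_TTY_LDISC))
		tty->warnings++;	/* stands in for the reported WARNING */
	return 0;
}

/*
 * Model of __tty_hangup. Returns what a concurrent opener would get if it
 * ran while the lock is dropped inside tty_ldisc_hangup.
 */
static int model_hangup(struct model_tty *tty)
{
	int seen;

	if (tty->use_hupping)
		tty->flags |= M_TTY_HUPPING;

	tty->flags &= ~M_TTY_LDISC;	/* tty_ldisc_hangup clears TTY_LDISC */
	seen = model_reopen(tty);	/* lock dropped: an opener may run here */

	tty->flags &= ~M_TTY_HUPPING;
	tty->flags |= M_TTY_LDISC;	/* tty_ldisc_enable sets it again */
	return seen;
}
```

With use_hupping false the simulated opener gets through and bumps the warning counter; with it set, the opener is turned away with -EIO for the duration of the hangup and succeeds again afterwards.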


* Re: [PATCH 1/1] TTY: don't allow reopen when ldisc is changing
  2010-11-27 15:11                     ` Jiri Slaby
@ 2010-11-27 23:53                       ` Kyle McMartin
  0 siblings, 0 replies; 27+ messages in thread
From: Kyle McMartin @ 2010-11-27 23:53 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: Kyle McMartin, gregkh, akpm, linux-kernel, Alan Cox

On Sat, Nov 27, 2010 at 04:11:06PM +0100, Jiri Slaby wrote:
> Is it fixed with the attached proof-of-concept patch?
> 
> So you need:
> THIS ONE
> TTY: don't allow reopen when ldisc is changing
> TTY: ldisc, fix open flag handling
> Char: TTY, restore tty_ldisc_wait_idle
> 
> The last one is in 2.6.37-rc2 already.

Shoved them all into a build and sent it out for testing, the tester who
was hitting it very frequently reports he hasn't seen it, so I think
you've managed to close the race window, awesome!

I also rebooted my laptop repeatedly and didn't hit the WARN_ON, which
confirms it.

--Kyle

* Re: mmotm 2010-11-23 + autogroups -> inconsistent lock state
  2010-11-24 20:25   ` Mike Galbraith
  2010-11-24 20:39     ` Mike Galbraith
  2010-11-25  6:09     ` Valdis.Kletnieks
@ 2010-12-02 18:16     ` Paul E. McKenney
  2010-12-03  3:58       ` Mike Galbraith
  2 siblings, 1 reply; 27+ messages in thread
From: Paul E. McKenney @ 2010-12-02 18:16 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Valdis.Kletnieks, akpm, Ingo Molnar, mm-commits, linux-kernel

On Wed, Nov 24, 2010 at 01:25:25PM -0700, Mike Galbraith wrote:
> On Wed, 2010-11-24 at 00:01 -0500, Valdis.Kletnieks@vt.edu wrote:
> > On Tue, 23 Nov 2010 16:13:06 PST, akpm@linux-foundation.org said:
> > > The mm-of-the-moment snapshot 2010-11-23-16-12 has been uploaded to
> > > 
> > >    http://userweb.kernel.org/~akpm/mmotm/
> > 
> > (I appear to be on a roll tonight - 3 splats before I even had a chance to login. :)
> > 
> > mmotm + Ingo's cleanup of Mike's autogroups patch.
> ...
> 
> Sorry for slow response, been trying to use some of my last few vacation
> days on vacation stuff ;-)
> 
> The below should run gripe free.  Suppose I should learn to turn on
> lockdep and whatnot when tinkering/testing.
> 
> Unfortunately, tip's update_shares() changes are still being difficult.
> 
> static void update_shares(int cpu)
> {
>         struct cfs_rq *cfs_rq;
>         struct rq *rq = cpu_rq(cpu);
> 
>         rcu_read_lock();
>         for_each_leaf_cfs_rq(rq, cfs_rq)
>                 update_shares_cpu(cfs_rq->tg, cpu);
>         rcu_read_unlock();
> }
> 
> Despite task groups being freed via rcu, update_shares_cpu() hits freed
> memory and explodes, and nothing I've tried has been able to stop it.
> The only thing I haven't tried (aside from the right thing;) is to take
> rcu out of the picture entirely.

Is your new autogroup structure retaining a pointer to memory that
is freed by RCU?

If so, you will need to NULL out that pointer before the memory
in question is passed to call_rcu().  (Or before the call to
synchronize_rcu(), as the case may be.)

							Thanx, Paul
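
The ordering suggested above (clear the pointer first, then hand the memory to the deferred free) can be sketched in userspace as follows. This is a model only, under loud assumptions: defer_free() and grace_period_elapsed() are hypothetical stand-ins for call_rcu() and the end of a grace period, struct group and published_group are invented names, and real RCU readers would additionally run inside rcu_read_lock()/rcu_read_unlock().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct group {
	int shares;
};

/* Pointer that readers dereference (hypothetical name). */
static _Atomic(struct group *) published_group;

/* Stand-in for call_rcu(): remembers a single object to free later. */
static struct group *pending_free;

static void defer_free(struct group *g)
{
	pending_free = g;
}

/* Stand-in for the end of a grace period: the deferred free runs now. */
static void grace_period_elapsed(void)
{
	free(pending_free);
	pending_free = NULL;
}

static void publish_group(struct group *g)
{
	atomic_store(&published_group, g);
}

/* Retire: unpublish the pointer first, only then defer the free. */
static void retire_group(void)
{
	struct group *g = atomic_exchange(&published_group, NULL);

	if (g)
		defer_free(g);
}

/* Reader: returns the shares value, or -1 when nothing is published. */
static int read_shares(void)
{
	struct group *g = atomic_load(&published_group);

	return g ? g->shares : -1;
}
```

Once retire_group() has run, any new reader sees NULL rather than a pointer whose target is about to be freed; the single-slot pending_free is a simplification that only holds one retired object at a time.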

> From: Mike Galbraith <efault@gmx.de>
> Date: Sat, 20 Nov 2010 12:35:00 -0700
> Subject: [PATCH] sched: Improve desktop interactivity: Implement automated per session task groups
> 
> A recurring complaint from CFS users is that parallel kbuild has a negative
> impact on desktop interactivity.  This patch implements an idea from Linus,
> to automatically create task groups.  Currently, only per session autogroups
> are implemented, but the patch leaves the way open for enhancement.
> 
> Implementation: each task's signal struct contains an inherited pointer to
> a refcounted autogroup struct containing a task group pointer, the default
> for all tasks pointing to the init_task_group.  When a task calls setsid(),
> a new task group is created, the process is moved into the new task group,
> and a reference to the previous task group is dropped.  Child processes
> inherit this task group thereafter, and increase its refcount.  When the
> last thread of a process exits, the process's reference is dropped, such
> that when the last process referencing an autogroup exits, the autogroup
> is destroyed.
> 
> At runqueue selection time, IFF a task has no cgroup assignment, its current
> autogroup is used.
> 
> Autogroup bandwidth is controllable via setting its nice level through the
> proc filesystem.  cat /proc/<pid>/autogroup displays the task's group and the
> group's nice level.  echo <nice level> > /proc/<pid>/autogroup sets the task
> group's shares to the weight of a nice <level> task.  Setting nice level is rate
> limited for !admin users due to the abuse risk of task group locking.
> 
> The feature is enabled from boot by default if CONFIG_SCHED_AUTOGROUP=y is
> selected, but can be disabled via the boot option noautogroup, and can also
> be turned on/off on the fly via..
> 	echo [01] > /proc/sys/kernel/sched_autogroup_enabled.
> ..which will automatically move tasks to/from the root task group.
> 
> Signed-off-by: Mike Galbraith <efault@gmx.de>
> Cc: Oleg Nesterov <oleg@redhat.com>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> LKML-Reference: <1290281700.28711.9.camel@maggy.simson.net>
> Signed-off-by: Ingo Molnar <mingo@elte.hu>
> ---
>  Documentation/kernel-parameters.txt |    2 
>  fs/proc/base.c                      |   79 +++++++++++
>  include/linux/sched.h               |   23 +++
>  init/Kconfig                        |   12 +
>  kernel/fork.c                       |    5 
>  kernel/sched.c                      |   13 +
>  kernel/sched_autogroup.c            |  240 ++++++++++++++++++++++++++++++++++++
>  kernel/sched_autogroup.h            |   23 +++
>  kernel/sched_debug.c                |   29 ++--
>  kernel/sys.c                        |    4 
>  kernel/sysctl.c                     |   11 +
>  11 files changed, 423 insertions(+), 18 deletions(-)
> 
> Index: linux-2.6.37.git/include/linux/sched.h
> ===================================================================
> --- linux-2.6.37.git.orig/include/linux/sched.h
> +++ linux-2.6.37.git/include/linux/sched.h
> @@ -509,6 +509,8 @@ struct thread_group_cputimer {
>  	spinlock_t lock;
>  };
> 
> +struct autogroup;
> +
>  /*
>   * NOTE! "signal_struct" does not have it's own
>   * locking, because a shared signal_struct always
> @@ -576,6 +578,9 @@ struct signal_struct {
> 
>  	struct tty_struct *tty; /* NULL if no tty */
> 
> +#ifdef CONFIG_SCHED_AUTOGROUP
> +	struct autogroup *autogroup;
> +#endif
>  	/*
>  	 * Cumulative resource counters for dead threads in the group,
>  	 * and for reaped dead child processes forked by this group.
> @@ -1931,6 +1936,24 @@ int sched_rt_handler(struct ctl_table *t
> 
>  extern unsigned int sysctl_sched_compat_yield;
> 
> +#ifdef CONFIG_SCHED_AUTOGROUP
> +extern unsigned int sysctl_sched_autogroup_enabled;
> +
> +extern void sched_autogroup_create_attach(struct task_struct *p);
> +extern void sched_autogroup_detach(struct task_struct *p);
> +extern void sched_autogroup_fork(struct signal_struct *sig);
> +extern void sched_autogroup_exit(struct signal_struct *sig);
> +#ifdef CONFIG_PROC_FS
> +extern void proc_sched_autogroup_show_task(struct task_struct *p, struct seq_file *m);
> +extern int proc_sched_autogroup_set_nice(struct task_struct *p, int *nice);
> +#endif
> +#else
> +static inline void sched_autogroup_create_attach(struct task_struct *p) { }
> +static inline void sched_autogroup_detach(struct task_struct *p) { }
> +static inline void sched_autogroup_fork(struct signal_struct *sig) { }
> +static inline void sched_autogroup_exit(struct signal_struct *sig) { }
> +#endif
> +
>  #ifdef CONFIG_RT_MUTEXES
>  extern int rt_mutex_getprio(struct task_struct *p);
>  extern void rt_mutex_setprio(struct task_struct *p, int prio);
> Index: linux-2.6.37.git/kernel/sched.c
> ===================================================================
> --- linux-2.6.37.git.orig/kernel/sched.c
> +++ linux-2.6.37.git/kernel/sched.c
> @@ -78,6 +78,7 @@
> 
>  #include "sched_cpupri.h"
>  #include "workqueue_sched.h"
> +#include "sched_autogroup.h"
> 
>  #define CREATE_TRACE_POINTS
>  #include <trace/events/sched.h>
> @@ -268,6 +269,10 @@ struct task_group {
>  	struct task_group *parent;
>  	struct list_head siblings;
>  	struct list_head children;
> +
> +#ifdef CONFIG_SCHED_AUTOGROUP
> +	struct autogroup *autogroup;
> +#endif
>  };
> 
>  #define root_task_group init_task_group
> @@ -605,11 +610,14 @@ static inline int cpu_of(struct rq *rq)
>   */
>  static inline struct task_group *task_group(struct task_struct *p)
>  {
> +	struct task_group *tg;
>  	struct cgroup_subsys_state *css;
> 
>  	css = task_subsys_state_check(p, cpu_cgroup_subsys_id,
>  			lockdep_is_held(&task_rq(p)->lock));
> -	return container_of(css, struct task_group, css);
> +	tg = container_of(css, struct task_group, css);
> +
> +	return autogroup_task_group(p, tg);
>  }
> 
>  /* Change a task's cfs_rq and parent entity if it moves across CPUs/groups */
> @@ -2006,6 +2014,7 @@ static void sched_irq_time_avg_update(st
>  #include "sched_idletask.c"
>  #include "sched_fair.c"
>  #include "sched_rt.c"
> +#include "sched_autogroup.c"
>  #include "sched_stoptask.c"
>  #ifdef CONFIG_SCHED_DEBUG
>  # include "sched_debug.c"
> @@ -7979,7 +7988,7 @@ void __init sched_init(void)
>  #ifdef CONFIG_CGROUP_SCHED
>  	list_add(&init_task_group.list, &task_groups);
>  	INIT_LIST_HEAD(&init_task_group.children);
> -
> +	autogroup_init(&init_task);
>  #endif /* CONFIG_CGROUP_SCHED */
> 
>  #if defined CONFIG_FAIR_GROUP_SCHED && defined CONFIG_SMP
> Index: linux-2.6.37.git/kernel/fork.c
> ===================================================================
> --- linux-2.6.37.git.orig/kernel/fork.c
> +++ linux-2.6.37.git/kernel/fork.c
> @@ -174,8 +174,10 @@ static inline void free_signal_struct(st
> 
>  static inline void put_signal_struct(struct signal_struct *sig)
>  {
> -	if (atomic_dec_and_test(&sig->sigcnt))
> +	if (atomic_dec_and_test(&sig->sigcnt)) {
> +		sched_autogroup_exit(sig);
>  		free_signal_struct(sig);
> +	}
>  }
> 
>  void __put_task_struct(struct task_struct *tsk)
> @@ -904,6 +906,7 @@ static int copy_signal(unsigned long clo
>  	posix_cpu_timers_init_group(sig);
> 
>  	tty_audit_fork(sig);
> +	sched_autogroup_fork(sig);
> 
>  	sig->oom_adj = current->signal->oom_adj;
>  	sig->oom_score_adj = current->signal->oom_score_adj;
> Index: linux-2.6.37.git/kernel/sys.c
> ===================================================================
> --- linux-2.6.37.git.orig/kernel/sys.c
> +++ linux-2.6.37.git/kernel/sys.c
> @@ -1080,8 +1080,10 @@ SYSCALL_DEFINE0(setsid)
>  	err = session;
>  out:
>  	write_unlock_irq(&tasklist_lock);
> -	if (err > 0)
> +	if (err > 0) {
>  		proc_sid_connector(group_leader);
> +		sched_autogroup_create_attach(group_leader);
> +	}
>  	return err;
>  }
> 
> Index: linux-2.6.37.git/kernel/sched_debug.c
> ===================================================================
> --- linux-2.6.37.git.orig/kernel/sched_debug.c
> +++ linux-2.6.37.git/kernel/sched_debug.c
> @@ -87,6 +87,20 @@ static void print_cfs_group_stats(struct
>  }
>  #endif
> 
> +#if defined(CONFIG_CGROUP_SCHED) && \
> +	(defined(CONFIG_FAIR_GROUP_SCHED) || defined(CONFIG_RT_GROUP_SCHED))
> +static void task_group_path(struct task_group *tg, char *buf, int buflen)
> +{
> +	/* may be NULL if the underlying cgroup isn't fully-created yet */
> +	if (!tg->css.cgroup) {
> +		if (!autogroup_path(tg, buf, buflen))
> +			buf[0] = '\0';
> +		return;
> +	}
> +	cgroup_path(tg->css.cgroup, buf, buflen);
> +}
> +#endif
> +
>  static void
>  print_task(struct seq_file *m, struct rq *rq, struct task_struct *p)
>  {
> @@ -115,7 +129,7 @@ print_task(struct seq_file *m, struct rq
>  		char path[64];
> 
>  		rcu_read_lock();
> -		cgroup_path(task_group(p)->css.cgroup, path, sizeof(path));
> +		task_group_path(task_group(p), path, sizeof(path));
>  		rcu_read_unlock();
>  		SEQ_printf(m, " %s", path);
>  	}
> @@ -147,19 +161,6 @@ static void print_rq(struct seq_file *m,
>  	read_unlock_irqrestore(&tasklist_lock, flags);
>  }
> 
> -#if defined(CONFIG_CGROUP_SCHED) && \
> -	(defined(CONFIG_FAIR_GROUP_SCHED) || defined(CONFIG_RT_GROUP_SCHED))
> -static void task_group_path(struct task_group *tg, char *buf, int buflen)
> -{
> -	/* may be NULL if the underlying cgroup isn't fully-created yet */
> -	if (!tg->css.cgroup) {
> -		buf[0] = '\0';
> -		return;
> -	}
> -	cgroup_path(tg->css.cgroup, buf, buflen);
> -}
> -#endif
> -
>  void print_cfs_rq(struct seq_file *m, int cpu, struct cfs_rq *cfs_rq)
>  {
>  	s64 MIN_vruntime = -1, min_vruntime, max_vruntime = -1,
> Index: linux-2.6.37.git/fs/proc/base.c
> ===================================================================
> --- linux-2.6.37.git.orig/fs/proc/base.c
> +++ linux-2.6.37.git/fs/proc/base.c
> @@ -1407,6 +1407,82 @@ static const struct file_operations proc
> 
>  #endif
> 
> +#ifdef CONFIG_SCHED_AUTOGROUP
> +/*
> + * Print out autogroup related information:
> + */
> +static int sched_autogroup_show(struct seq_file *m, void *v)
> +{
> +	struct inode *inode = m->private;
> +	struct task_struct *p;
> +
> +	p = get_proc_task(inode);
> +	if (!p)
> +		return -ESRCH;
> +	proc_sched_autogroup_show_task(p, m);
> +
> +	put_task_struct(p);
> +
> +	return 0;
> +}
> +
> +static ssize_t
> +sched_autogroup_write(struct file *file, const char __user *buf,
> +	    size_t count, loff_t *offset)
> +{
> +	struct inode *inode = file->f_path.dentry->d_inode;
> +	struct task_struct *p;
> +	char buffer[PROC_NUMBUF];
> +	long nice;
> +	int err;
> +
> +	memset(buffer, 0, sizeof(buffer));
> +	if (count > sizeof(buffer) - 1)
> +		count = sizeof(buffer) - 1;
> +	if (copy_from_user(buffer, buf, count))
> +		return -EFAULT;
> +
> +	err = strict_strtol(strstrip(buffer), 0, &nice);
> +	if (err)
> +		return -EINVAL;
> +
> +	p = get_proc_task(inode);
> +	if (!p)
> +		return -ESRCH;
> +
> +	err = nice;
> +	err = proc_sched_autogroup_set_nice(p, &err);
> +	if (err)
> +		count = err;
> +
> +	put_task_struct(p);
> +
> +	return count;
> +}
> +
> +static int sched_autogroup_open(struct inode *inode, struct file *filp)
> +{
> +	int ret;
> +
> +	ret = single_open(filp, sched_autogroup_show, NULL);
> +	if (!ret) {
> +		struct seq_file *m = filp->private_data;
> +
> +		m->private = inode;
> +	}
> +	return ret;
> +}
> +
> +static const struct file_operations proc_pid_sched_autogroup_operations = {
> +	.open		= sched_autogroup_open,
> +	.read		= seq_read,
> +	.write		= sched_autogroup_write,
> +	.llseek		= seq_lseek,
> +	.release	= single_release,
> +};
> +
> +#endif /* CONFIG_SCHED_AUTOGROUP */
> +
>  static ssize_t comm_write(struct file *file, const char __user *buf,
>  				size_t count, loff_t *offset)
>  {
> @@ -2733,6 +2809,9 @@ static const struct pid_entry tgid_base_
>  #ifdef CONFIG_SCHED_DEBUG
>  	REG("sched",      S_IRUGO|S_IWUSR, proc_pid_sched_operations),
>  #endif
> +#ifdef CONFIG_SCHED_AUTOGROUP
> +	REG("autogroup",  S_IRUGO|S_IWUSR, proc_pid_sched_autogroup_operations),
> +#endif
>  	REG("comm",      S_IRUGO|S_IWUSR, proc_pid_set_comm_operations),
>  #ifdef CONFIG_HAVE_ARCH_TRACEHOOK
>  	INF("syscall",    S_IRUSR, proc_pid_syscall),
> Index: linux-2.6.37.git/kernel/sched_autogroup.h
> ===================================================================
> --- /dev/null
> +++ linux-2.6.37.git/kernel/sched_autogroup.h
> @@ -0,0 +1,23 @@
> +#ifdef CONFIG_SCHED_AUTOGROUP
> +
> +static inline struct task_group *
> +autogroup_task_group(struct task_struct *p, struct task_group *tg);
> +
> +#else /* !CONFIG_SCHED_AUTOGROUP */
> +
> +static inline void autogroup_init(struct task_struct *init_task) {  }
> +
> +static inline struct task_group *
> +autogroup_task_group(struct task_struct *p, struct task_group *tg)
> +{
> +	return tg;
> +}
> +
> +#ifdef CONFIG_SCHED_DEBUG
> +static inline int autogroup_path(struct task_group *tg, char *buf, int buflen)
> +{
> +	return 0;
> +}
> +#endif
> +
> +#endif /* CONFIG_SCHED_AUTOGROUP */
> Index: linux-2.6.37.git/kernel/sched_autogroup.c
> ===================================================================
> --- /dev/null
> +++ linux-2.6.37.git/kernel/sched_autogroup.c
> @@ -0,0 +1,240 @@
> +#ifdef CONFIG_SCHED_AUTOGROUP
> +
> +#include <linux/proc_fs.h>
> +#include <linux/seq_file.h>
> +#include <linux/kallsyms.h>
> +#include <linux/utsname.h>
> +
> +unsigned int __read_mostly sysctl_sched_autogroup_enabled = 1;
> +
> +struct autogroup {
> +	struct kref		kref;
> +	struct task_group	*tg;
> +	struct rw_semaphore	lock;
> +	unsigned long		id;
> +	int			nice;
> +};
> +
> +static struct autogroup autogroup_default;
> +static atomic_t autogroup_seq_nr;
> +
> +static void autogroup_init(struct task_struct *init_task)
> +{
> +	autogroup_default.tg = &init_task_group;
> +	init_task_group.autogroup = &autogroup_default;
> +	kref_init(&autogroup_default.kref);
> +	init_rwsem(&autogroup_default.lock);
> +	init_task->signal->autogroup = &autogroup_default;
> +}
> +
> +static inline void autogroup_free(struct task_group *tg)
> +{
> +	kfree(tg->autogroup);
> +}
> +
> +static inline void autogroup_destroy(struct kref *kref)
> +{
> +	struct autogroup *ag = container_of(kref, struct autogroup, kref);
> +
> +	sched_destroy_group(ag->tg);
> +}
> +
> +static inline void autogroup_kref_put(struct autogroup *ag)
> +{
> +	kref_put(&ag->kref, autogroup_destroy);
> +}
> +
> +static inline struct autogroup *autogroup_kref_get(struct autogroup *ag)
> +{
> +	kref_get(&ag->kref);
> +	return ag;
> +}
> +
> +static inline struct autogroup *autogroup_create(void)
> +{
> +	struct autogroup *ag = kzalloc(sizeof(*ag), GFP_KERNEL);
> +	struct task_group *tg;
> +
> +	if (!ag)
> +		goto out_fail;
> +
> +	tg = sched_create_group(&init_task_group);
> +
> +	if (IS_ERR(tg))
> +		goto out_free;
> +
> +	kref_init(&ag->kref);
> +	init_rwsem(&ag->lock);
> +	ag->id = atomic_inc_return(&autogroup_seq_nr);
> +	ag->tg = tg;
> +	tg->autogroup = ag;
> +
> +	return ag;
> +
> +out_free:
> +	kfree(ag);
> +out_fail:
> +	if (printk_ratelimit())
> +		printk(KERN_WARNING "autogroup_create: %s failure.\n",
> +			ag ? "sched_create_group()" : "kmalloc()");
> +
> +	return autogroup_kref_get(&autogroup_default);
> +}
> +
> +static inline bool
> +task_wants_autogroup(struct task_struct *p, struct task_group *tg)
> +{
> +	if (tg != &root_task_group)
> +		return false;
> +
> +	if (p->sched_class != &fair_sched_class)
> +		return false;
> +
> +	/*
> +	 * We can only assume the task group can't go away on us if
> +	 * autogroup_move_group() can see us on ->thread_group list.
> +	 */
> +	if (p->flags & PF_EXITING)
> +		return false;
> +
> +	return true;
> +}
> +
> +static inline struct task_group *
> +autogroup_task_group(struct task_struct *p, struct task_group *tg)
> +{
> +	int enabled = ACCESS_ONCE(sysctl_sched_autogroup_enabled);
> +
> +	if (enabled && task_wants_autogroup(p, tg))
> +		return p->signal->autogroup->tg;
> +
> +	return tg;
> +}
> +
> +static void
> +autogroup_move_group(struct task_struct *p, struct autogroup *ag)
> +{
> +	struct autogroup *prev;
> +	struct task_struct *t;
> +	unsigned long flags;
> +
> +	if (!lock_task_sighand(p, &flags)) {
> +		WARN_ON(1);
> +		return;
> +	}
> +
> +	prev = p->signal->autogroup;
> +	if (prev == ag) {
> +		unlock_task_sighand(p, &flags);
> +		return;
> +	}
> +
> +	p->signal->autogroup = autogroup_kref_get(ag);
> +	t = p;
> +
> +	do {
> +		sched_move_task(t);
> +	} while_each_thread(p, t);
> +
> +	unlock_task_sighand(p, &flags);
> +	autogroup_kref_put(prev);
> +}
> +
> +/* Allocates GFP_KERNEL, cannot be called under any spinlock */
> +void sched_autogroup_create_attach(struct task_struct *p)
> +{
> +	struct autogroup *ag = autogroup_create();
> +
> +	autogroup_move_group(p, ag);
> +	/* drop extra reference added by autogroup_create() */
> +	autogroup_kref_put(ag);
> +}
> +EXPORT_SYMBOL(sched_autogroup_create_attach);
> +
> +/* Cannot be called under siglock.  Currently has no users */
> +void sched_autogroup_detach(struct task_struct *p)
> +{
> +	autogroup_move_group(p, &autogroup_default);
> +}
> +EXPORT_SYMBOL(sched_autogroup_detach);
> +
> +void sched_autogroup_fork(struct signal_struct *sig)
> +{
> +	struct task_struct *p = current;
> +
> +	spin_lock_irq(&p->sighand->siglock);
> +	sig->autogroup = autogroup_kref_get(p->signal->autogroup);
> +	spin_unlock_irq(&p->sighand->siglock);
> +}
> +
> +void sched_autogroup_exit(struct signal_struct *sig)
> +{
> +	autogroup_kref_put(sig->autogroup);
> +}
> +
> +static int __init setup_autogroup(char *str)
> +{
> +	sysctl_sched_autogroup_enabled = 0;
> +
> +	return 1;
> +}
> +
> +__setup("noautogroup", setup_autogroup);
> +
> +#ifdef CONFIG_PROC_FS
> +
> +/* Called with siglock held. */
> +int proc_sched_autogroup_set_nice(struct task_struct *p, int *nice)
> +{
> +	static unsigned long next = INITIAL_JIFFIES;
> +	struct autogroup *ag;
> +	int err;
> +
> +	if (*nice < -20 || *nice > 19)
> +		return -EINVAL;
> +
> +	err = security_task_setnice(current, *nice);
> +	if (err)
> +		return err;
> +
> +	if (*nice < 0 && !can_nice(current, *nice))
> +		return -EPERM;
> +
> +	/* this is a heavy operation taking global locks.. */
> +	if (!capable(CAP_SYS_ADMIN) && time_before(jiffies, next))
> +		return -EAGAIN;
> +
> +	next = HZ / 10 + jiffies;
> +	ag = autogroup_kref_get(p->signal->autogroup);
> +
> +	down_write(&ag->lock);
> +	err = sched_group_set_shares(ag->tg, prio_to_weight[*nice + 20]);
> +	if (!err)
> +		ag->nice = *nice;
> +	up_write(&ag->lock);
> +
> +	autogroup_kref_put(ag);
> +
> +	return err;
> +}
> +
> +void proc_sched_autogroup_show_task(struct task_struct *p, struct seq_file *m)
> +{
> +	struct autogroup *ag = autogroup_kref_get(p->signal->autogroup);
> +
> +	down_read(&ag->lock);
> +	seq_printf(m, "/autogroup-%ld nice %d\n", ag->id, ag->nice);
> +	up_read(&ag->lock);
> +
> +	autogroup_kref_put(ag);
> +}
> +#endif /* CONFIG_PROC_FS */
> +
> +#ifdef CONFIG_SCHED_DEBUG
> +static inline int autogroup_path(struct task_group *tg, char *buf, int buflen)
> +{
> +	return snprintf(buf, buflen, "%s-%ld", "/autogroup", tg->autogroup->id);
> +}
> +#endif /* CONFIG_SCHED_DEBUG */
> +
> +#endif /* CONFIG_SCHED_AUTOGROUP */
> Index: linux-2.6.37.git/kernel/sysctl.c
> ===================================================================
> --- linux-2.6.37.git.orig/kernel/sysctl.c
> +++ linux-2.6.37.git/kernel/sysctl.c
> @@ -382,6 +382,17 @@ static struct ctl_table kern_table[] = {
>  		.mode		= 0644,
>  		.proc_handler	= proc_dointvec,
>  	},
> +#ifdef CONFIG_SCHED_AUTOGROUP
> +	{
> +		.procname	= "sched_autogroup_enabled",
> +		.data		= &sysctl_sched_autogroup_enabled,
> +		.maxlen		= sizeof(unsigned int),
> +		.mode		= 0644,
> +		.proc_handler	= proc_dointvec_minmax,
> +		.extra1		= &zero,
> +		.extra2		= &one,
> +	},
> +#endif
>  #ifdef CONFIG_PROVE_LOCKING
>  	{
>  		.procname	= "prove_locking",
> Index: linux-2.6.37.git/init/Kconfig
> ===================================================================
> --- linux-2.6.37.git.orig/init/Kconfig
> +++ linux-2.6.37.git/init/Kconfig
> @@ -728,6 +728,18 @@ config NET_NS
> 
>  endif # NAMESPACES
> 
> +config SCHED_AUTOGROUP
> +	bool "Automatic process group scheduling"
> +	select CGROUPS
> +	select CGROUP_SCHED
> +	select FAIR_GROUP_SCHED
> +	help
> +	  This option optimizes the scheduler for common desktop workloads by
> +	  automatically creating and populating task groups.  This separation
> +	  of workloads isolates aggressive CPU burners (like build jobs) from
> +	  desktop applications.  Task group autogeneration is currently based
> +	  upon task session.
> +
>  config MM_OWNER
>  	bool
> 
> Index: linux-2.6.37.git/Documentation/kernel-parameters.txt
> ===================================================================
> --- linux-2.6.37.git.orig/Documentation/kernel-parameters.txt
> +++ linux-2.6.37.git/Documentation/kernel-parameters.txt
> @@ -1622,6 +1622,8 @@ and is between 256 and 4096 characters.
>  	noapic		[SMP,APIC] Tells the kernel to not make use of any
>  			IOAPICs that may be present in the system.
> 
> +	noautogroup	Disable scheduler automatic task group creation.
> +
>  	nobats		[PPC] Do not use BATs for mapping kernel lowmem
>  			on "Classic" PPC cores.
> 
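For readers following the nice handling in proc_sched_autogroup_set_nice() above: the patch range-checks the value and then indexes prio_to_weight[] with nice + 20 to get the group's shares. Below is a minimal userspace sketch of just that step. The helper name autogroup_nice_to_shares() is hypothetical, the table is reproduced from the scheduler's prio_to_weight[] as best understood (check it against the tree before relying on it), and the security, can_nice() and rate-limit checks from the patch are left out.

```c
#include <errno.h>

/*
 * Nice-to-weight table, index = nice + 20.  Reproduced from the
 * scheduler's prio_to_weight[]; treat the values as an assumption
 * and verify against the kernel source.
 */
static const int prio_to_weight[40] = {
 /* -20 */ 88761, 71755, 56483, 46273, 36291,
 /* -15 */ 29154, 23254, 18705, 14949, 11916,
 /* -10 */  9548,  7620,  6100,  4904,  3906,
 /*  -5 */  3121,  2501,  1991,  1586,  1277,
 /*   0 */  1024,   820,   655,   526,   423,
 /*   5 */   335,   272,   215,   172,   137,
 /*  10 */   110,    87,    70,    56,    45,
 /*  15 */    36,    29,    23,    18,    15,
};

/*
 * Hypothetical helper (not part of the patch): same range check and
 * table lookup as proc_sched_autogroup_set_nice(), nothing else.
 * Returns 0 and stores the shares value, or -EINVAL if nice is
 * outside [-20, 19].
 */
static int autogroup_nice_to_shares(int nice, unsigned long *shares)
{
	if (nice < -20 || nice > 19)
		return -EINVAL;

	*shares = (unsigned long)prio_to_weight[nice + 20];
	return 0;
}
```

With this mapping, writing 0 to /proc/<pid>/autogroup would give the group the default weight, and each step of nice scales the group's share of CPU relative to the other autogroups by roughly the same ratio as it does between individual tasks.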

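The reference counting in sched_autogroup_create_attach() and autogroup_move_group() from the quoted patch can be easy to lose track of, so here is a rough userspace model of it. Everything in it is an assumption made for illustration: plain int counters stand in for struct kref, free() stands in for sched_destroy_group(), all locking (siglock, the rwsem) is omitted, and the names ag_get(), ag_put(), ag_create(), ag_move() and ag_create_attach() are hypothetical.

```c
#include <stdlib.h>

/* Simplified stand-ins for struct autogroup and struct signal_struct. */
struct ag {
	int refs;
	unsigned long id;
};

struct sig {
	struct ag *autogroup;
};

/* The default group starts with one reference that is never dropped. */
static struct ag ag_default = { .refs = 1, .id = 0 };
static unsigned long ag_seq;
static int ag_freed;	/* number of groups destroyed so far */

static struct ag *ag_get(struct ag *ag)
{
	ag->refs++;
	return ag;
}

static void ag_put(struct ag *ag)
{
	if (--ag->refs == 0) {
		ag_freed++;
		free(ag);
	}
}

/* On allocation failure, fall back to the default group with an extra ref. */
static struct ag *ag_create(void)
{
	struct ag *ag = calloc(1, sizeof(*ag));

	if (!ag)
		return ag_get(&ag_default);

	ag->refs = 1;		/* creation reference */
	ag->id = ++ag_seq;
	return ag;
}

/* Take the new group's reference before dropping the old one. */
static void ag_move(struct sig *sig, struct ag *ag)
{
	struct ag *prev = sig->autogroup;

	if (prev == ag)
		return;

	sig->autogroup = ag_get(ag);
	ag_put(prev);
}

static void ag_create_attach(struct sig *sig)
{
	struct ag *ag = ag_create();

	ag_move(sig, ag);
	ag_put(ag);		/* drop the creation reference */
}
```

In this model, after ag_create_attach() the signal struct holds the only reference to the new group, and the previous group is destroyed once its last holder moves away. The fallback path also balances: ag_move() returns early when the group is unchanged, and the final ag_put() drops the extra reference that ag_create() took on the default group.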

* Re: mmotm 2010-11-23 + autogroups -> inconsistent lock state
  2010-12-02 18:16     ` Paul E. McKenney
@ 2010-12-03  3:58       ` Mike Galbraith
  0 siblings, 0 replies; 27+ messages in thread
From: Mike Galbraith @ 2010-12-03  3:58 UTC (permalink / raw)
  To: paulmck; +Cc: Valdis.Kletnieks, akpm, Ingo Molnar, mm-commits, linux-kernel

On Thu, 2010-12-02 at 10:16 -0800, Paul E. McKenney wrote:
> On Wed, Nov 24, 2010 at 01:25:25PM -0700, Mike Galbraith wrote:
>  
> > Despite task groups being freed via rcu, update_shares_cpu() hits freed
> > memory and explodes, and nothing I've tried has been able to stop it.
> > The only thing I haven't tried (aside from the right thing;) is to take
> > rcu out of the picture entirely.
> 
> Is your new autogroup structure retaining a pointer to memory that
> is freed by RCU?

That turned out to be a typo that left freed cfs_rq registered.  No dark
elves (memory ordering), just a defenseless little typo.

	-Mike



Thread overview: 27+ messages
2010-11-24  0:13 mmotm 2010-11-23-16-12 uploaded akpm
2010-11-24  4:52 ` mmotm 2010-11-23 - lockdep whinge in e1000e driver Valdis.Kletnieks
2010-11-24  4:55 ` mmotm 2010-11-23 - WARNING: at drivers/tty/tty_io.c:1331 Valdis.Kletnieks
2010-11-25 15:14   ` Kyle McMartin
2010-11-25 16:44     ` Jiri Slaby
2010-11-25 16:51       ` Jiri Slaby
2010-11-25 17:16         ` [PATCH 1/1] TTY: don't allow reopen when ldisc is changing Jiri Slaby
2010-11-25 17:59           ` Kyle McMartin
2010-11-26  0:28           ` Kyle McMartin
2010-11-26  7:46             ` Jiri Slaby
2010-11-26 13:27               ` Kyle McMartin
2010-11-27  2:59               ` Kyle McMartin
2010-11-27  8:50                 ` Jiri Slaby
2010-11-27  9:43                   ` Jiri Slaby
2010-11-27 15:11                     ` Jiri Slaby
2010-11-27 23:53                       ` Kyle McMartin
2010-11-24  5:01 ` mmotm 2010-11-23 + autogroups -> inconsistent lock state Valdis.Kletnieks
2010-11-24 20:25   ` Mike Galbraith
2010-11-24 20:39     ` Mike Galbraith
2010-11-25  6:09     ` Valdis.Kletnieks
2010-12-02 18:16     ` Paul E. McKenney
2010-12-03  3:58       ` Mike Galbraith
2010-11-24 13:56 ` mmotm 2010-11-23-16-12 uploaded Zimny Lech
2010-11-24 18:51 ` mmotm 2010-11-23-16-12 uploaded (olpc) Randy Dunlap
2010-11-24 19:13   ` Andres Salomon
2010-11-26 16:46   ` Daniel Drake
2010-11-24 19:41 ` [PATCH -mmotm/-next] media: fix timblogiw kconfig & build error Randy Dunlap
