linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* 2.6.15-rt17
@ 2006-02-21 15:55 Ingo Molnar
  2006-02-21 16:11 ` 2.6.15-rt17 K.R. Foley
                   ` (6 more replies)
  0 siblings, 7 replies; 29+ messages in thread
From: Ingo Molnar @ 2006-02-21 15:55 UTC (permalink / raw)
  To: linux-kernel; +Cc: Esben Nielsen, Thomas Gleixner, Steven Rostedt

i have released the 2.6.15-rt17 tree, which can be downloaded from the 
usual place:

   http://redhat.com/~mingo/realtime-preempt/

lots of changes all across the map. There are several bigger changes:

the biggest change is the new PI code from Esben Nielsen, Thomas 
Gleixner and Steven Rostedt. This big rework simplifies and streamlines 
the PI code, and fixes a couple of bugs and races:

  - only the top priority waiter on a lock is enqueued into the pi_list
    of the task which holds the lock. No more pi list walking in the
    boost case.

  - simpler locking rules

  - fast Atomic acquire for the non contended case and atomic release 
    for non waiter case is fully functional now

  - use task_t references instead of thread_info pointers

  - BKL handling for semaphore style locks changed so that BKL is
    dropped before the scheduler is entered and reaquired in the return
    path. This solves a possible deadlock situation in the BKL reacquire
    path of the scheduler.

another change is the reworking of the SLAB code: it now closely matches 
the upstream SLAB code, and it should now work on NUMA systems too 
(untested though).

the tasklet code was reworked too to be PREEMPT_RT friendly: the new PI 
code unearthed a fundamental livelock scenario with PREEMPT_RT, and the 
fix was to rework the tasklet code to get rid of the 'retrigger 
softirqs' approach.

other changes: various hrtimers fixes, latency tracer enhancements - and 
more. (Robust-futexes are not expected to work in this release.)

please report any new breakages, and re-report any old breakages as 
well.

to build a 2.6.15-rt17 tree, the following patches should be applied:

  http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.15.tar.bz2
  http://redhat.com/~mingo/realtime-preempt/patch-2.6.15-rt17

	Ingo

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.15-rt17
  2006-02-21 15:55 2.6.15-rt17 Ingo Molnar
@ 2006-02-21 16:11 ` K.R. Foley
  2006-02-21 16:21 ` 2.6.15-rt17 K.R. Foley
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 29+ messages in thread
From: K.R. Foley @ 2006-02-21 16:11 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Esben Nielsen, Thomas Gleixner, Steven Rostedt

Ingo Molnar wrote:
> i have released the 2.6.15-rt17 tree, which can be downloaded from the 
> usual place:
> 
>    http://redhat.com/~mingo/realtime-preempt/
> 
> lots of changes all across the map. There are several bigger changes:
> 
> the biggest change is the new PI code from Esben Nielsen, Thomas 
> Gleixner and Steven Rostedt. This big rework simplifies and streamlines 
> the PI code, and fixes a couple of bugs and races:
> 
>   - only the top priority waiter on a lock is enqueued into the pi_list
>     of the task which holds the lock. No more pi list walking in the
>     boost case.
> 
>   - simpler locking rules
> 
>   - fast Atomic acquire for the non contended case and atomic release 
>     for non waiter case is fully functional now
> 
>   - use task_t references instead of thread_info pointers
> 
>   - BKL handling for semaphore style locks changed so that BKL is
>     dropped before the scheduler is entered and reaquired in the return
>     path. This solves a possible deadlock situation in the BKL reacquire
>     path of the scheduler.
> 
> another change is the reworking of the SLAB code: it now closely matches 
> the upstream SLAB code, and it should now work on NUMA systems too 
> (untested though).
> 
> the tasklet code was reworked too to be PREEMPT_RT friendly: the new PI 
> code unearthed a fundamental livelock scenario with PREEMPT_RT, and the 
> fix was to rework the tasklet code to get rid of the 'retrigger 
> softirqs' approach.
> 
> other changes: various hrtimers fixes, latency tracer enhancements - and 
> more. (Robust-futexes are not expected to work in this release.)
> 
> please report any new breakages, and re-report any old breakages as 
> well.
> 
> to build a 2.6.15-rt17 tree, the following patches should be applied:
> 
>   http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.15.tar.bz2
>   http://redhat.com/~mingo/realtime-preempt/patch-2.6.15-rt17
> 
> 	Ingo

Ingo I think you fat fingered the name of the patchfile on your site. :)

-- 
   kr

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.15-rt17
  2006-02-21 15:55 2.6.15-rt17 Ingo Molnar
  2006-02-21 16:11 ` 2.6.15-rt17 K.R. Foley
@ 2006-02-21 16:21 ` K.R. Foley
  2006-02-22  7:51   ` 2.6.15-rt17 Ingo Molnar
  2006-02-21 17:16 ` 2.6.15-rt17 Michal Piotrowski
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 29+ messages in thread
From: K.R. Foley @ 2006-02-21 16:21 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Esben Nielsen, Thomas Gleixner, Steven Rostedt

[-- Attachment #1: Type: text/plain, Size: 2118 bytes --]

Ingo Molnar wrote:
> i have released the 2.6.15-rt17 tree, which can be downloaded from the 
> usual place:
> 
>    http://redhat.com/~mingo/realtime-preempt/
> 
> lots of changes all across the map. There are several bigger changes:
> 
> the biggest change is the new PI code from Esben Nielsen, Thomas 
> Gleixner and Steven Rostedt. This big rework simplifies and streamlines 
> the PI code, and fixes a couple of bugs and races:
> 
>   - only the top priority waiter on a lock is enqueued into the pi_list
>     of the task which holds the lock. No more pi list walking in the
>     boost case.
> 
>   - simpler locking rules
> 
>   - fast Atomic acquire for the non contended case and atomic release 
>     for non waiter case is fully functional now
> 
>   - use task_t references instead of thread_info pointers
> 
>   - BKL handling for semaphore style locks changed so that BKL is
>     dropped before the scheduler is entered and reaquired in the return
>     path. This solves a possible deadlock situation in the BKL reacquire
>     path of the scheduler.
> 
> another change is the reworking of the SLAB code: it now closely matches 
> the upstream SLAB code, and it should now work on NUMA systems too 
> (untested though).
> 
> the tasklet code was reworked too to be PREEMPT_RT friendly: the new PI 
> code unearthed a fundamental livelock scenario with PREEMPT_RT, and the 
> fix was to rework the tasklet code to get rid of the 'retrigger 
> softirqs' approach.
> 
> other changes: various hrtimers fixes, latency tracer enhancements - and 
> more. (Robust-futexes are not expected to work in this release.)
> 
> please report any new breakages, and re-report any old breakages as 
> well.
> 
> to build a 2.6.15-rt17 tree, the following patches should be applied:
> 
>   http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.15.tar.bz2
>   http://redhat.com/~mingo/realtime-preempt/patch-2.6.15-rt17
> 
> 	Ingo
> -

Also would you apply the attached patch to fix the RTC_HISTOGRAM Kconfig
option. In it's current state "tristate" it allows building as a module,
which of course doesn't work.

-- 
   kr

[-- Attachment #2: rtc-Kconfig.patch --]
[-- Type: text/x-patch, Size: 360 bytes --]

--- linux-2.6.15/drivers/char/Kconfig.orig	2006-02-01 12:51:53.000000000 -0600
+++ linux-2.6.15/drivers/char/Kconfig	2006-02-01 12:52:12.000000000 -0600
@@ -712,7 +712,7 @@
 	  module will be called rtc.
 
 config RTC_HISTOGRAM
-	tristate "Real Time Clock Histogram Support"
+	bool "Real Time Clock Histogram Support"
 	default n
 	depends on RTC
 	---help---

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.15-rt17
  2006-02-21 15:55 2.6.15-rt17 Ingo Molnar
  2006-02-21 16:11 ` 2.6.15-rt17 K.R. Foley
  2006-02-21 16:21 ` 2.6.15-rt17 K.R. Foley
@ 2006-02-21 17:16 ` Michal Piotrowski
  2006-02-21 18:53   ` 2.6.15-rt17 Michal Piotrowski
  2006-02-22 12:22   ` 2.6.15-rt17 Steven Rostedt
  2006-02-21 17:23 ` 2.6.15-rt17 Timothy R. Chavez
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 29+ messages in thread
From: Michal Piotrowski @ 2006-02-21 17:16 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Esben Nielsen, Thomas Gleixner, Steven Rostedt

Hi,

On 21/02/06, Ingo Molnar <mingo@elte.hu> wrote:
> i have released the 2.6.15-rt17 tree, which can be downloaded from the
> usual place:
[snip]
> to build a 2.6.15-rt17 tree, the following patches should be applied:
>
>   http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.15.tar.bz2
>   http://redhat.com/~mingo/realtime-preempt/patch-2.6.15-rt17
>
>         Ingo

I have noticed some bugs

----------------------------->
| new stack-footprint maximum: swapper/0, 788 bytes (out of 4044 bytes).
------------|
{   24} [<c0135e7d>] debug_stackoverflow+0x80/0xb2
{   28} [<c01364e9>] __mcount+0x3b/0xb2
{   20} [<c010ebd8>] mcount+0x14/0x18
{  124} [<c01dc5cf>] number+0xe/0x204
{   76} [<c01dcbc3>] vsnprintf+0x3fe/0x438
{   28} [<c01dcc18>] vscnprintf+0x1b/0x2b
{  128} [<c011bd04>] vprintk+0x7a/0x23d
{   20} [<c011bc88>] printk+0x18/0x1a
{  172} [<c010ec2a>] MP_processor_info+0x4a/0x1b4
{   36} [<c010ee2e>] mp_register_lapic+0x9a/0xa0
{   24} [<c0482d5e>] acpi_parse_lapic+0x45/0x56
{   28} [<c048dda6>] acpi_table_parse_madt_family+0xb0/0x100
{   28} [<c048de10>] acpi_table_parse_madt+0x1a/0x1c
{   32} [<c048312a>] acpi_parse_madt_lapic_entries+0x49/0x9a
{    8} [<c048327d>] acpi_process_madt+0x23/0xb3
{   24} [<c0483540>] acpi_boot_init+0x3c/0x4c
{   20} [<c04805d4>] setup_arch+0x1af/0x1ed
{   24} [<c047c72f>] start_kernel+0x30/0x19a


----------------------------->
----------------------------->
| new stack-footprint maximum: rcS/332, 2224 bytes (out of 4044 bytes).
| new stack-footprint maximum: rcS/332, 2224 bytes (out of 4044 bytes).
------------|
------------|
{   24} {   24} [<c0135e7d>] [<c0135e7d>]
debug_stackoverflow+0x80/0xb2debug_stackoverflow+0x80/0xb2

{   28} {   28} [<c01364e9>] [<c01364e9>] __mcount+0x3b/0xb2__mcount+0x3b/0xb2

{   20} {   20} [<c010ebd8>] [<c010ebd8>] mcount+0x14/0x18mcount+0x14/0x18

{   12} {   12} [<c02c3404>] [<c02c3404>]
_raw_spin_lock+0x9/0x26_raw_spin_lock+0x9/0x26

{   16} {   16} [<c02c2a18>] [<c02c2a18>]
__lock_text_start+0x20/0x55__lock_text_start+0x20/0x55

{   32} {   32} [<c015ee47>] [<c015ee47>]
kmem_cache_alloc+0x50/0xb8kmem_cache_alloc+0x50/0xb8

{   16} {   16} [<c0148603>] [<c0148603>]
mempool_alloc_slab+0x13/0x15mempool_alloc_slab+0x13/0x15

{   52} {   52} [<c01484e3>] [<c01484e3>]
mempool_alloc+0x3f/0xdamempool_alloc+0x3f/0xda

{   24} {   24} [<c01d815c>] [<c01d815c>]
as_set_request+0x21/0x76as_set_request+0x21/0x76

{   24} {   24} [<c01cfea3>] [<c01cfea3>]
elv_set_request+0x24/0x34elv_set_request+0x24/0x34

{   44} {   44} [<c01d21b9>] get_request+0x173/0x279
{   60} [<c01d22de>] get_request_wait+0x1f/0xd2
{   76} [<c01d2eeb>] __make_request+0x2c2/0x42d[<c01d21b9>]
{   60} get_request+0x173/0x279[<c01d31e0>]
generic_make_request+0x108/0x11a{   60}
[<c01d22de>] {   60} get_request_wait+0x1f/0xd2[<c01d32cb>]
submit_bio+0xd9/0xe2{   76}
[<c01d2eeb>] {   36} __make_request+0x2c2/0x42d[<c0166a92>]
submit_bh+0x10f/0x131{   60}
[<c01d31e0>] {   20} generic_make_request+0x108/0x11a[<c0164a80>]
__bread_slow+0x51/0x8d{   60}
[<c01d32cb>] {   12} submit_bio+0xd9/0xe2
[<c0164d32>] {   36} __bread+0x2b/0x32[<c0166a92>]
submit_bh+0x10f/0x131{   48}
[<c019bc1b>] {   20} ext3_get_branch+0x6a/0xea[<c0164a80>]
__bread_slow+0x51/0x8d{  120}
[<c019c136>] {   12} ext3_get_block_handle+0xab/0x289[<c0164d32>]
__bread+0x2b/0x32{   40}
[<c019c389>] {   48} ext3_get_block+0x75/0x7c[<c019bc1b>]
ext3_get_branch+0x6a/0xea{  328}
[<c01834ca>] {  120} do_mpage_readpage+0x14d/0x383[<c019c136>]
ext3_get_block_handle+0xab/0x289{   84}
[<c0183779>] {   40} mpage_readpages+0x79/0xf4[<c019c389>]
ext3_get_block+0x75/0x7c{   24}
[<c019d0c4>] {  328} ext3_readpages+0x1b/0x1d[<c01834ca>]
do_mpage_readpage+0x14d/0x383{   76}
[<c014bb7a>] {   84} read_pages+0x2a/0xcd[<c0183779>]
mpage_readpages+0x79/0xf4{   56}
{   24} [<c019d0c4>] [<c014bd6d>]
ext3_readpages+0x1b/0x1d__do_page_cache_readahead+0x150/0x185

{   76} {   28} [<c014bb7a>] [<c014be56>]
read_pages+0x2a/0xcddo_page_cache_readahead+0x40/0x4c

{   56} {   60} [<c014bd6d>] [<c0146943>]
__do_page_cache_readahead+0x150/0x185filemap_nopage+0x148/0x2e8

{   28} {   60} [<c014be56>] [<c0151def>]
do_page_cache_readahead+0x40/0x4cdo_no_page+0x98/0x2a7

{   60} {   48} [<c0146943>] [<c015212f>]
filemap_nopage+0x148/0x2e8__handle_mm_fault+0xc5/0x17d

{   60} {   76} [<c0151def>] [<c02c4576>]
do_no_page+0x98/0x2a7do_page_fault+0x16b/0x494

{   48} {   80} [<c015212f>] [<c01037df>]
__handle_mm_fault+0xc5/0x17derror_code+0x4f/0x54

{   76} {   16} [<c02c4576>] [<c0188529>]
do_page_fault+0x16b/0x494padzero+0x23/0x34

{   80} {  140} [<c01037df>] [<c0189484>]
error_code+0x4f/0x54load_elf_binary+0x752/0xa9f

{   16} {   40} [<c0188529>] [<c016cd86>]
padzero+0x23/0x34search_binary_handler+0xcc/0x299

{  140} {  164} [<c0189484>] [<c0188461>]
load_elf_binary+0x752/0xa9fload_script+0x195/0x1a8

{   40} {   40} [<c016cd86>] [<c016cd86>]
search_binary_handler+0xcc/0x299search_binary_handler+0xcc/0x299

{  164} {   32} [<c0188461>] [<c016d0c4>]
load_script+0x195/0x1a8do_execve+0x171/0x21c

{   40} {   36} [<c016cd86>] [<c01019da>]
search_binary_handler+0xcc/0x299sys_execve+0x2f/0x74

{   32} {=2212} [<c016d0c4>] [<c0102c7e>]
do_execve+0x171/0x21csyscall_call+0x7/0xb

{   36} <---------------------------

[<c01019da>] sys_execve+0x2f/0x74
{=2212} [<c0102c7e>] syscall_call+0x7/0xb
<---------------------------

Here is config http://www.stardust.webpages.pl/files/rt/2.6.15-rt17/rt-config
Here is dmesg http://www.stardust.webpages.pl/files/rt/2.6.15-rt17/rt-dmesg

Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/wiki/)

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.15-rt17
  2006-02-21 15:55 2.6.15-rt17 Ingo Molnar
                   ` (2 preceding siblings ...)
  2006-02-21 17:16 ` 2.6.15-rt17 Michal Piotrowski
@ 2006-02-21 17:23 ` Timothy R. Chavez
  2006-02-21 18:24 ` 2.6.15-rt17 Daniel Walker
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 29+ messages in thread
From: Timothy R. Chavez @ 2006-02-21 17:23 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Esben Nielsen, Thomas Gleixner, Steven Rostedt

On Tue, 2006-02-21 at 16:55 +0100, Ingo Molnar wrote:
> i have released the 2.6.15-rt17 tree, which can be downloaded from the 
> usual place:
> 
>    http://redhat.com/~mingo/realtime-preempt/
> 
> lots of changes all across the map. There are several bigger changes:
> 
> the biggest change is the new PI code from Esben Nielsen, Thomas 
> Gleixner and Steven Rostedt. This big rework simplifies and streamlines 
> the PI code, and fixes a couple of bugs and races:
> 
>   - only the top priority waiter on a lock is enqueued into the pi_list
>     of the task which holds the lock. No more pi list walking in the
>     boost case.
> 
>   - simpler locking rules
> 
>   - fast Atomic acquire for the non contended case and atomic release 
>     for non waiter case is fully functional now
> 
>   - use task_t references instead of thread_info pointers
> 
>   - BKL handling for semaphore style locks changed so that BKL is
>     dropped before the scheduler is entered and reaquired in the return
>     path. This solves a possible deadlock situation in the BKL reacquire
>     path of the scheduler.
> 
> another change is the reworking of the SLAB code: it now closely matches 
> the upstream SLAB code, and it should now work on NUMA systems too 
> (untested though).
> 
> the tasklet code was reworked too to be PREEMPT_RT friendly: the new PI 
> code unearthed a fundamental livelock scenario with PREEMPT_RT, and the 
> fix was to rework the tasklet code to get rid of the 'retrigger 
> softirqs' approach.
> 
> other changes: various hrtimers fixes, latency tracer enhancements - and 
> more. (Robust-futexes are not expected to work in this release.)
> 
> please report any new breakages, and re-report any old breakages as 
> well.
> 
> to build a 2.6.15-rt17 tree, the following patches should be applied:
> 
>   http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.15.tar.bz2
>   http://redhat.com/~mingo/realtime-preempt/patch-2.6.15-rt17

http://people.redhat.com/mingo/realtime-preempt/patch-2.6.16-rt17 ???

-tim

> 
> 	Ingo
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.15-rt17
  2006-02-21 15:55 2.6.15-rt17 Ingo Molnar
                   ` (3 preceding siblings ...)
  2006-02-21 17:23 ` 2.6.15-rt17 Timothy R. Chavez
@ 2006-02-21 18:24 ` Daniel Walker
  2006-02-21 18:46   ` 2.6.15-rt17 Thomas Gleixner
  2006-02-22 16:50 ` 2.6.15-rt17 Esben Nielsen
  2006-02-23 13:49 ` 2.6.15-rt17 Bill Huey
  6 siblings, 1 reply; 29+ messages in thread
From: Daniel Walker @ 2006-02-21 18:24 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Esben Nielsen, Thomas Gleixner, Steven Rostedt

On Tue, 2006-02-21 at 16:55 +0100, Ingo Molnar wrote:
> i have released the 2.6.15-rt17 tree, which can be downloaded from the 
> usual place:
> 
>    http://redhat.com/~mingo/realtime-preempt/

The patch at thie URL is patch-2.6.16-rt17 . But it doesn't look like
it's for 2.6.16-rc4 .

Daniel


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.15-rt17
  2006-02-21 18:24 ` 2.6.15-rt17 Daniel Walker
@ 2006-02-21 18:46   ` Thomas Gleixner
  0 siblings, 0 replies; 29+ messages in thread
From: Thomas Gleixner @ 2006-02-21 18:46 UTC (permalink / raw)
  To: Daniel Walker; +Cc: Ingo Molnar, linux-kernel, Esben Nielsen, Steven Rostedt

On Tue, 2006-02-21 at 10:24 -0800, Daniel Walker wrote:
> On Tue, 2006-02-21 at 16:55 +0100, Ingo Molnar wrote:
> > i have released the 2.6.15-rt17 tree, which can be downloaded from the 
> > usual place:
> > 
> >    http://redhat.com/~mingo/realtime-preempt/
> 
> The patch at thie URL is patch-2.6.16-rt17 . But it doesn't look like
> it's for 2.6.16-rc4 .

It's agains 2.6.15

	tglx



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.15-rt17
  2006-02-21 17:16 ` 2.6.15-rt17 Michal Piotrowski
@ 2006-02-21 18:53   ` Michal Piotrowski
  2006-02-21 20:37     ` 2.6.15-rt17 Thomas Gleixner
  2006-02-22 12:22   ` 2.6.15-rt17 Steven Rostedt
  1 sibling, 1 reply; 29+ messages in thread
From: Michal Piotrowski @ 2006-02-21 18:53 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Esben Nielsen, Thomas Gleixner, Steven Rostedt

On 21/02/06, Michal Piotrowski <michal.k.k.piotrowski@gmail.com> wrote:
[snip]
> Here is config http://www.stardust.webpages.pl/files/rt/2.6.15-rt17/rt-config
> Here is dmesg http://www.stardust.webpages.pl/files/rt/2.6.15-rt17/rt-dmesg

Here is updated config
http://www.stardust.webpages.pl/files/rt/2.6.15-rt17/rt-config2
Here is updated dmesg
http://www.stardust.webpages.pl/files/rt/2.6.15-rt17/rt-dmesg2

BTW. thanks for fixing CONFIG_DEBUG_HIGHMEM=y

Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/wiki/)

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.15-rt17
  2006-02-21 18:53   ` 2.6.15-rt17 Michal Piotrowski
@ 2006-02-21 20:37     ` Thomas Gleixner
  2006-02-21 21:15       ` 2.6.15-rt17 Michal Piotrowski
  0 siblings, 1 reply; 29+ messages in thread
From: Thomas Gleixner @ 2006-02-21 20:37 UTC (permalink / raw)
  To: Michal Piotrowski
  Cc: Ingo Molnar, linux-kernel, Esben Nielsen, Steven Rostedt

[-- Attachment #1: Type: text/plain, Size: 1380 bytes --]

On Tue, 2006-02-21 at 19:53 +0100, Michal Piotrowski wrote:
> On 21/02/06, Michal Piotrowski <michal.k.k.piotrowski@gmail.com> wrote:
> [snip]
> > Here is config http://www.stardust.webpages.pl/files/rt/2.6.15-rt17/rt-config
> > Here is dmesg http://www.stardust.webpages.pl/files/rt/2.6.15-rt17/rt-dmesg
> 
> Here is updated config
> http://www.stardust.webpages.pl/files/rt/2.6.15-rt17/rt-config2
> Here is updated dmesg
> http://www.stardust.webpages.pl/files/rt/2.6.15-rt17/rt-dmesg2
> 
> BTW. thanks for fixing CONFIG_DEBUG_HIGHMEM=y

The stack maximum messages are harmless. They just report a new
watermark high quite verbose. You can turn them off by disabling
DEBUG_STACKOVERFLOW.

skge eth0: Link is up at 100 Mbps, full duplex, flow control tx and rx
WARNING: softirq-tasklet/8 changed soft IRQ-flags.
 [<c0103b33>] dump_stack+0x1b/0x1f (20)
 [<c0134035>] illegal_API_call+0x41/0x46 (20)
 [<c0134084>] local_irq_disable+0x1d/0x1f (8)
 [<f8839bf3>] skge_extirq+0x117/0x138 [skge] (32)
 [<c0120a73>] __tasklet_action+0xb8/0xfd (28)
 [<c0120afc>] tasklet_action+0x44/0x4e (28)
 [<c0120ce8>] ksoftirqd+0x10f/0x19e (32)
 [<c012dfa6>] kthread+0x7b/0xa9 (36)
 [<c01010dd>] kernel_thread_helper+0x5/0xb (1037991964)

This one is interesting. It brought my attention to that piece of code
which is a long standing problem on one of my boxen anyway. Patch
attached.

	tglx







[-- Attachment #2: skge-fix.patch --]
[-- Type: text/x-patch, Size: 1526 bytes --]

Index: linux-2.6.15/drivers/net/skge.c
===================================================================
--- linux-2.6.15.orig/drivers/net/skge.c
+++ linux-2.6.15/drivers/net/skge.c
@@ -2820,10 +2820,10 @@ static void skge_extirq(unsigned long da
 	}
 	spin_unlock(&hw->phy_lock);
 
-	local_irq_disable();
+	spin_lock_irq(&hw->hw_lock);
 	hw->intr_mask |= IS_EXT_REG;
 	skge_write32(hw, B0_IMSK, hw->intr_mask);
-	local_irq_enable();
+	spin_unlock_irq(&hw->hw_lock);
 }
 
 static inline void skge_wakeup(struct net_device *dev)
@@ -2842,6 +2842,8 @@ static irqreturn_t skge_intr(int irq, vo
 	if (status == 0 || status == ~0) /* hotplug or shared irq */
 		return IRQ_NONE;
 
+	spin_lock(&hw->hw_lock);
+
 	status &= hw->intr_mask;
 	if (status & IS_R1_F) {
 		hw->intr_mask &= ~IS_R1_F;
@@ -2893,6 +2895,8 @@ static irqreturn_t skge_intr(int irq, vo
 
 	skge_write32(hw, B0_IMSK, hw->intr_mask);
 
+	spin_unlock(&hw->hw_lock);
+
 	return IRQ_HANDLED;
 }
 
@@ -3249,6 +3253,7 @@ static int __devinit skge_probe(struct p
 	}
 
 	hw->pdev = pdev;
+	spin_lock_init(&hw->hw_lock);
 	spin_lock_init(&hw->phy_lock);
 	tasklet_init(&hw->ext_tasklet, skge_extirq, (unsigned long) hw);
 
Index: linux-2.6.15/drivers/net/skge.h
===================================================================
--- linux-2.6.15.orig/drivers/net/skge.h
+++ linux-2.6.15/drivers/net/skge.h
@@ -2472,6 +2472,7 @@ struct skge_hw {
 	u16		     phy_addr;
 
 	struct tasklet_struct ext_tasklet;
+	spinlock_t	     hw_lock;
 	spinlock_t	     phy_lock;
 };
 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.15-rt17
  2006-02-21 20:37     ` 2.6.15-rt17 Thomas Gleixner
@ 2006-02-21 21:15       ` Michal Piotrowski
  0 siblings, 0 replies; 29+ messages in thread
From: Michal Piotrowski @ 2006-02-21 21:15 UTC (permalink / raw)
  To: tglx; +Cc: Ingo Molnar, linux-kernel, Esben Nielsen, Steven Rostedt

Hi,

On 21/02/06, Thomas Gleixner <tglx@linutronix.de> wrote:
> The stack maximum messages are harmless. They just report a new
> watermark high quite verbose. You can turn them off by disabling
> DEBUG_STACKOVERFLOW.

Sorry, I forgot about this.

>
> skge eth0: Link is up at 100 Mbps, full duplex, flow control tx and rx
> WARNING: softirq-tasklet/8 changed soft IRQ-flags.
>  [<c0103b33>] dump_stack+0x1b/0x1f (20)
>  [<c0134035>] illegal_API_call+0x41/0x46 (20)
>  [<c0134084>] local_irq_disable+0x1d/0x1f (8)
>  [<f8839bf3>] skge_extirq+0x117/0x138 [skge] (32)
>  [<c0120a73>] __tasklet_action+0xb8/0xfd (28)
>  [<c0120afc>] tasklet_action+0x44/0x4e (28)
>  [<c0120ce8>] ksoftirqd+0x10f/0x19e (32)
>  [<c012dfa6>] kthread+0x7b/0xa9 (36)
>  [<c01010dd>] kernel_thread_helper+0x5/0xb (1037991964)
>
> This one is interesting. It brought my attention to that piece of code
> which is a long standing problem on one of my boxen anyway. Patch
> attached.
>
>         tglx
>

Thanks, problem solved.

Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/wiki/)

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.15-rt17
  2006-02-21 16:21 ` 2.6.15-rt17 K.R. Foley
@ 2006-02-22  7:51   ` Ingo Molnar
  0 siblings, 0 replies; 29+ messages in thread
From: Ingo Molnar @ 2006-02-22  7:51 UTC (permalink / raw)
  To: K.R. Foley; +Cc: linux-kernel, Esben Nielsen, Thomas Gleixner, Steven Rostedt


* K.R. Foley <kr@cybsft.com> wrote:

> Also would you apply the attached patch to fix the RTC_HISTOGRAM 
> Kconfig option. In it's current state "tristate" it allows building as 
> a module, which of course doesn't work.

thanks, applied - should show up in -rt18.

	Ingo

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.15-rt17
  2006-02-21 17:16 ` 2.6.15-rt17 Michal Piotrowski
  2006-02-21 18:53   ` 2.6.15-rt17 Michal Piotrowski
@ 2006-02-22 12:22   ` Steven Rostedt
  2006-02-22 13:47     ` 2.6.15-rt17 Michal Piotrowski
  2006-02-24  6:38     ` 2.6.15-rt17 Ingo Molnar
  1 sibling, 2 replies; 29+ messages in thread
From: Steven Rostedt @ 2006-02-22 12:22 UTC (permalink / raw)
  To: Michal Piotrowski
  Cc: Ingo Molnar, linux-kernel, Esben Nielsen, Thomas Gleixner


On Tue, 21 Feb 2006, Michal Piotrowski wrote:

> Hi,
>
> On 21/02/06, Ingo Molnar <mingo@elte.hu> wrote:
> > i have released the 2.6.15-rt17 tree, which can be downloaded from the
> > usual place:
> [snip]
> > to build a 2.6.15-rt17 tree, the following patches should be applied:
> >
> >   http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.15.tar.bz2
> >   http://redhat.com/~mingo/realtime-preempt/patch-2.6.15-rt17
> >
> >         Ingo
>
> I have noticed some bugs
>
> ----------------------------->
> | new stack-footprint maximum: swapper/0, 788 bytes (out of 4044 bytes).
> ------------|
> {   24} [<c0135e7d>] debug_stackoverflow+0x80/0xb2
> {   28} [<c01364e9>] __mcount+0x3b/0xb2
> {   20} [<c010ebd8>] mcount+0x14/0x18
> {  124} [<c01dc5cf>] number+0xe/0x204
> {   76} [<c01dcbc3>] vsnprintf+0x3fe/0x438
> {   28} [<c01dcc18>] vscnprintf+0x1b/0x2b
> {  128} [<c011bd04>] vprintk+0x7a/0x23d
> {   20} [<c011bc88>] printk+0x18/0x1a
> {  172} [<c010ec2a>] MP_processor_info+0x4a/0x1b4
> {   36} [<c010ee2e>] mp_register_lapic+0x9a/0xa0
> {   24} [<c0482d5e>] acpi_parse_lapic+0x45/0x56
> {   28} [<c048dda6>] acpi_table_parse_madt_family+0xb0/0x100
> {   28} [<c048de10>] acpi_table_parse_madt+0x1a/0x1c
> {   32} [<c048312a>] acpi_parse_madt_lapic_entries+0x49/0x9a
> {    8} [<c048327d>] acpi_process_madt+0x23/0xb3
> {   24} [<c0483540>] acpi_boot_init+0x3c/0x4c
> {   20} [<c04805d4>] setup_arch+0x1af/0x1ed
> {   24} [<c047c72f>] start_kernel+0x30/0x19a
>

Ingo,

Maybe the following patch is needed, so that people know that this is not
a bug.

-- Steve

Index: linux-2.6.15-rt17/kernel/latency.c
===================================================================
--- linux-2.6.15-rt17.orig/kernel/latency.c	2006-02-21 11:44:54.000000000 -0500
+++ linux-2.6.15-rt17/kernel/latency.c	2006-02-22 07:20:55.000000000 -0500
@@ -418,6 +418,8 @@ static notrace void __print_worst_stack(
 	printk("| new stack-footprint maximum: %s/%d, %ld bytes (out of %ld bytes).\n",
 		worst_stack_comm, worst_stack_pid,
 		MAX_STACK-worst_stack_left, (long)MAX_STACK);
+	printk("| This is not a BUG\n");
+	printk("| turn off CONFIG_DEBUG_STACK_OVERFLOW if you don't want this\n");
 	printk("------------|\n");

 	show_stackframe();

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.15-rt17
  2006-02-22 12:22   ` 2.6.15-rt17 Steven Rostedt
@ 2006-02-22 13:47     ` Michal Piotrowski
  2006-02-24  6:38     ` 2.6.15-rt17 Ingo Molnar
  1 sibling, 0 replies; 29+ messages in thread
From: Michal Piotrowski @ 2006-02-22 13:47 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Ingo Molnar, linux-kernel, Esben Nielsen, Thomas Gleixner

Hi,

On 22/02/06, Steven Rostedt <rostedt@goodmis.org> wrote:
>
> Ingo,
>
> Maybe the following patch is needed, so that people know that this is not
> a bug.
>
> -- Steve

It will be a very useful piece of information for kernel testers (aka.
not hackers ;).

Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/wiki/)

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.15-rt17
  2006-02-21 15:55 2.6.15-rt17 Ingo Molnar
                   ` (4 preceding siblings ...)
  2006-02-21 18:24 ` 2.6.15-rt17 Daniel Walker
@ 2006-02-22 16:50 ` Esben Nielsen
  2006-02-22 23:17   ` 2.6.15-rt17 Thomas Gleixner
  2006-02-23 13:49 ` 2.6.15-rt17 Bill Huey
  6 siblings, 1 reply; 29+ messages in thread
From: Esben Nielsen @ 2006-02-22 16:50 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Thomas Gleixner, Steven Rostedt



On Tue, 21 Feb 2006, Ingo Molnar wrote:

> i have released the 2.6.15-rt17 tree, which can be downloaded from the
> usual place:
>
>    http://redhat.com/~mingo/realtime-preempt/
>
> lots of changes all across the map. There are several bigger changes:
>
> the biggest change is the new PI code from Esben Nielsen, Thomas
> Gleixner and Steven Rostedt. This big rework simplifies and streamlines
> the PI code, and fixes a couple of bugs and races:
>
I didn't know anyone looked at my patch! I am ofcourse happy about it :-)

I switftly looked at 2.6.15-rt17 to see how it turned out. I found one
problem in get_blocked_on(task):

First the task->pi_lock is locked, lock = task->blocked_on->lock, then
task->pi_lock is unlocked. Now lock is used. This is not safe.
Once task->pi_lock is unlocked we can be preempted and wait for a long
while the lock can be freed (if forinstance a driver is unloaded). You
either have to hold task->pi_lock or lock->wait_lock while refering to
lock. Otherwise the lock can be deallocated behind your bag. I assume that
nobody deallocate a lock with waiters.

That was why I had _reversed_ the lock ordering relative to normal in the
original patch: First lock task->pi_lock. Assign lock. Lock
lock->wait_lock. Then unlock task->pi_lock. Now it is safe to refer to
lock. To avoid deadlocks I used _raw_spin_trylock() when locking the 2.
lock.

To make a long story short:

--- linux-2.6.15-rt17/kernel/rt.c.orig  2006-02-22 16:53:44.000000000
+0100
+++ linux-2.6.15-rt17/kernel/rt.c       2006-02-22 16:56:21.000000000
+0100
@@ -873,10 +873,17 @@
        }

        lock = task->blocked_on->lock;
+
+       /* Now we have to take lock->wait_lock _before_ releasing
+          task->pi_lock. Otherwise lock can be deallocated while we are
+          refering to it as the subsystem has no way of knowing about us
+          hanging around in here. */
+       if (!_raw_spin_trylock(&lock->wait_lock)) {
+               _raw_spin_unlock(&task->pi_lock);
+               goto try_again;
+        }
        _raw_spin_unlock(&task->pi_lock);

-       if (!_raw_spin_trylock(&lock->wait_lock))
-               goto try_again;

        owner = lock_owner(lock);
        if (owner)

It compiles, but I have not tested it. I can't see why it shouldn't work.

Esben


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.15-rt17
  2006-02-22 16:50 ` 2.6.15-rt17 Esben Nielsen
@ 2006-02-22 23:17   ` Thomas Gleixner
  2006-02-25 17:11     ` 2.6.15-rt17 Esben Nielsen
  0 siblings, 1 reply; 29+ messages in thread
From: Thomas Gleixner @ 2006-02-22 23:17 UTC (permalink / raw)
  To: Esben Nielsen; +Cc: Ingo Molnar, linux-kernel, Steven Rostedt

On Wed, 2006-02-22 at 17:50 +0100, Esben Nielsen wrote:
> I didn't know anyone looked at my patch! I am ofcourse happy about it :-)

Was just delayed due to other work in progress.

> That was why I had _reversed_ the lock ordering relative to normal in the
> original patch: First lock task->pi_lock. Assign lock. Lock
> lock->wait_lock. Then unlock task->pi_lock. Now it is safe to refer to
> lock. To avoid deadlocks I used _raw_spin_trylock() when locking the 2.
> lock.

Stupid me. I messed that one up. Should show up in the next -rt

Thanks

	tglx



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.15-rt17
  2006-02-21 15:55 2.6.15-rt17 Ingo Molnar
                   ` (5 preceding siblings ...)
  2006-02-22 16:50 ` 2.6.15-rt17 Esben Nielsen
@ 2006-02-23 13:49 ` Bill Huey
  2006-02-23 13:53   ` 2.6.15-rt17 Ingo Molnar
  6 siblings, 1 reply; 29+ messages in thread
From: Bill Huey @ 2006-02-23 13:49 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Esben Nielsen, Thomas Gleixner, Steven Rostedt,
	Bill Huey (hui)

On Tue, Feb 21, 2006 at 04:55:48PM +0100, Ingo Molnar wrote:
> another change is the reworking of the SLAB code: it now closely matches 
> the upstream SLAB code, and it should now work on NUMA systems too 
> (untested though).

SLAB you say ? What's going on with this error ?

----
mm/built-in.o: In function `drain_alien_cache':slab.c:(.text+0x1fb2c):
undefined reference to `slab_spin_unlock_irqrestore'
make: *** [.tmp_vmlinux1] Error 1
----

bill


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.15-rt17
  2006-02-23 13:49 ` 2.6.15-rt17 Bill Huey
@ 2006-02-23 13:53   ` Ingo Molnar
  2006-02-23 14:05     ` 2.6.15-rt17 Bill Huey
  0 siblings, 1 reply; 29+ messages in thread
From: Ingo Molnar @ 2006-02-23 13:53 UTC (permalink / raw)
  To: Bill Huey; +Cc: linux-kernel, Esben Nielsen, Thomas Gleixner, Steven Rostedt


* Bill Huey <billh@gnuppy.monkey.org> wrote:

> On Tue, Feb 21, 2006 at 04:55:48PM +0100, Ingo Molnar wrote:
> > another change is the reworking of the SLAB code: it now closely matches 
> > the upstream SLAB code, and it should now work on NUMA systems too 
> > (untested though).
> 
> SLAB you say ? What's going on with this error ?
> 
> ----
> mm/built-in.o: In function `drain_alien_cache':slab.c:(.text+0x1fb2c):
> undefined reference to `slab_spin_unlock_irqrestore'
> make: *** [.tmp_vmlinux1] Error 1
> ----

find the fix from S�bastien Dugu� below.

	Ingo

Index: linux-2.6.15-rt17-numa/mm/slab.c
===================================================================
--- linux-2.6.15-rt17-numa.orig/mm/slab.c	2006-02-22 10:38:26.000000000 +0100
+++ linux-2.6.15-rt17-numa/mm/slab.c	2006-02-22 11:12:42.000000000 +0100
@@ -163,8 +163,8 @@
 				 	spin_unlock_irq(lock)
 # define slab_spin_lock_irqsave(lock, flags, cpu) \
  	do { spin_lock_irqsave(lock, flags); (cpu) = smp_processor_id(); } while (0)
-# define slab_spin_lock_irqrestore(lock, flags, cpu) \
- 	do { spin_lock_irqrestore(lock, flags); } while (0)
+# define slab_spin_unlock_irqrestore(lock, flags, cpu) \
+ 	do { spin_unlock_irqrestore(lock, flags); } while (0)
 #else
 DEFINE_PER_CPU_LOCKED(int, slab_locks) = { 0, };
 # define slab_irq_disable(cpu)		get_cpu_var_locked(slab_locks, &(cpu))
@@ -183,8 +183,8 @@
 		do { spin_unlock(lock); slab_irq_enable(cpu); } while (0)
 # define slab_spin_lock_irqsave(lock, flags, cpu) \
  	do { slab_irq_disable(cpu); spin_lock_irqsave(lock, flags); } while (0)
-# define slab_spin_lock_irqrestore(lock, flags, cpu) \
- 	do { spin_lock_irqrestore(lock, flags); slab_irq_enable(cpu); } while (0)
+# define slab_spin_unlock_irqrestore(lock, flags, cpu) \
+ 	do { spin_unlock_irqrestore(lock, flags); slab_irq_enable(cpu); } while (0)
 #endif
 
 /* Shouldn't this be in a header file somewhere? */

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.15-rt17
  2006-02-23 13:53   ` 2.6.15-rt17 Ingo Molnar
@ 2006-02-23 14:05     ` Bill Huey
  0 siblings, 0 replies; 29+ messages in thread
From: Bill Huey @ 2006-02-23 14:05 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Esben Nielsen, Thomas Gleixner, Steven Rostedt,
	Bill Huey (hui)

On Thu, Feb 23, 2006 at 02:53:50PM +0100, Ingo Molnar wrote:
> * Bill Huey <billh@gnuppy.monkey.org> wrote:
> > SLAB you say ? What's going on with this error ?
> > undefined reference to `slab_spin_unlock_irqrestore'
> 
> find the fix from S???bastien Dugu??? below.

I recreate the fix since it was obvious and this new version fixes the previous
SLAB related crash in -rt16. Looking good.

bill


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.15-rt17
  2006-02-22 12:22   ` 2.6.15-rt17 Steven Rostedt
  2006-02-22 13:47     ` 2.6.15-rt17 Michal Piotrowski
@ 2006-02-24  6:38     ` Ingo Molnar
  2006-02-24  7:09       ` 2.6.15-rt17 Steven Rostedt
  1 sibling, 1 reply; 29+ messages in thread
From: Ingo Molnar @ 2006-02-24  6:38 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Michal Piotrowski, linux-kernel, Esben Nielsen, Thomas Gleixner


* Steven Rostedt <rostedt@goodmis.org> wrote:

> +	printk("| This is not a BUG\n");
> +	printk("| turn off CONFIG_DEBUG_STACK_OVERFLOW if you don't want this\n");

i added the patch below - we want to know about too high stack 
footprints.

	Ingo

Index: linux-rt.q/kernel/latency.c
===================================================================
--- linux-rt.q.orig/kernel/latency.c
+++ linux-rt.q/kernel/latency.c
@@ -414,10 +414,17 @@ static void show_stackframe(void)
 
 static notrace void __print_worst_stack(void)
 {
+	unsigned long fill_ratio;
 	printk("----------------------------->\n");
-	printk("| new stack-footprint maximum: %s/%d, %ld bytes (out of %ld bytes).\n",
+	printk("| new stack fill maximum: %s/%d, %ld bytes (out of %ld bytes).\n",
 		worst_stack_comm, worst_stack_pid,
 		MAX_STACK-worst_stack_left, (long)MAX_STACK);
+	fill_ratio = (MAX_STACK-worst_stack_left)*100/(long)MAX_STACK;
+	printk("| Stack fill ratio: %02ld%%", fill_ratio);
+	if (fill_ratio >= 90)
+		printk(" - BUG: that's quite high, please report this!\n");
+	else
+		printk(" - that's still OK, no need to report this.\n");
 	printk("------------|\n");
 
 	show_stackframe();

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.15-rt17
  2006-02-24  6:38     ` 2.6.15-rt17 Ingo Molnar
@ 2006-02-24  7:09       ` Steven Rostedt
  0 siblings, 0 replies; 29+ messages in thread
From: Steven Rostedt @ 2006-02-24  7:09 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Michal Piotrowski, linux-kernel, Esben Nielsen, Thomas Gleixner


On Fri, 24 Feb 2006, Ingo Molnar wrote:

>
> * Steven Rostedt <rostedt@goodmis.org> wrote:
>
> > +	printk("| This is not a BUG\n");
> > +	printk("| turn off CONFIG_DEBUG_STACK_OVERFLOW if you don't want this\n");
>
> i added the patch below - we want to know about too high stack
> footprints.

Yeah you're right, those should be reported.

>
>  		MAX_STACK-worst_stack_left, (long)MAX_STACK);
> +	fill_ratio = (MAX_STACK-worst_stack_left)*100/(long)MAX_STACK;
> +	printk("| Stack fill ratio: %02ld%%", fill_ratio);
> +	if (fill_ratio >= 90)
> +		printk(" - BUG: that's quite high, please report this!\n");
> +	else
> +		printk(" - that's still OK, no need to report this.\n");
>  	printk("------------|\n");

On the non bug case, I would still state the way to turn it off.  It
becomes quite annoying if there are no bugs, and a non kernel hacker may
not know how to disable it.

-- Steve



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.15-rt17
  2006-02-22 23:17   ` 2.6.15-rt17 Thomas Gleixner
@ 2006-02-25 17:11     ` Esben Nielsen
  2006-02-25 17:32       ` 2.6.15-rt17 Steven Rostedt
  0 siblings, 1 reply; 29+ messages in thread
From: Esben Nielsen @ 2006-02-25 17:11 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Ingo Molnar, linux-kernel, Steven Rostedt

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1763 bytes --]

On Thu, 23 Feb 2006, Thomas Gleixner wrote:

> On Wed, 2006-02-22 at 17:50 +0100, Esben Nielsen wrote:
> > I didn't know anyone looked at my patch! I am ofcourse happy about it :-)
>
> Was just delayed due to other work in progress.
>

You didn't use the "TestRTMutex" I send along with the patch :-(

Since I learned to use unittesting that way I have been a big fan. It does
catch a lot of stupid bugs without having to wait for the program to get
there while keeping the ability to debug with gdb and fix it right away.
And most important: you can keep the tests and check if your program still
parses them after a rewrite. Very usefull to prevent repeating bugs.

So here is the issues I have found:

1) grablockBKL.tst failes. Apparently you can "grab" the BKL now? Is this
intended? I have updated the test to accept this.

2) 2locksdeadlock.tst loops forever. It is a livelock: When two RT-tasks
"deadlocks" on two mutexes they keep waking up each other in other. I
quickly fixed that bug.

3) While fixing that I came to think about what happens when you signal a
task blocked on a task blocked on a task. I thus wrote
2lock3tasksBoostSignal.tst. Well, the priorities wasn't set back
correctly. I fixed that too.

I have attached the patch againt 2.6.17-rt17 (and therefore also
included the previous patch) along with the updated tester and tests.

Esben


> > That was why I had _reversed_ the lock ordering relative to normal in the
> > original patch: First lock task->pi_lock. Assign lock. Lock
> > lock->wait_lock. Then unlock task->pi_lock. Now it is safe to refer to
> > lock. To avoid deadlocks I used _raw_spin_trylock() when locking the 2.
> > lock.
>
> Stupid me. I messed that one up. Should show up in the next -rt
>
> Thanks
>
> 	tglx
>
>


[-- Attachment #2: Type: APPLICATION/x-gzip, Size: 24769 bytes --]

[-- Attachment #3: Type: TEXT/PLAIN, Size: 2845 bytes --]

--- linux-2.6.15-rt17/kernel/rt.c.orig	2006-02-22 16:53:44.000000000 +0100
+++ linux-2.6.15-rt17/kernel/rt.c	2006-02-25 17:49:53.000000000 +0100
@@ -873,10 +873,19 @@ pid_t get_blocked_on(task_t *task)
 	}
 
 	lock = task->blocked_on->lock;
+	
+	/* 
+	 * Now we have to take lock->wait_lock _before_ releasing
+	 * task->pi_lock. Otherwise lock can be deallocated while we are
+	 * refering to it as the subsystem has no way of knowing about us
+	 * hanging around in here.
+	 */
+	if (!_raw_spin_trylock(&lock->wait_lock)) {
+		_raw_spin_unlock(&task->pi_lock);
+		goto try_again;
+        }
 	_raw_spin_unlock(&task->pi_lock);
 
-	if (!_raw_spin_trylock(&lock->wait_lock))
-		goto try_again;
 
 	owner = lock_owner(lock);
 	if (owner)
@@ -964,14 +973,24 @@ static inline int calc_pi_prio(task_t *t
 }
 
 /*
- * Adjust priority of a task
+ * Adjust priority of a task. 
+ * If wakeup is set it will be woken when blocked on a lock when adjusted
  */
-static void adjust_prio(task_t *task)
+static void adjust_prio(task_t *task, int wakeup)
 {
 	int prio = calc_pi_prio(task);
 
-	if (task->prio != prio)
+	if (task->prio != prio) {
 		mutex_setprio(task, prio);
+		if (wakeup && task->blocked_on) {
+                        /*
+			 * The owner will have the blocked field set if it is
+			 * blocked on a lock. So in this case we want to wake
+			 * the owner up so it can boost who it is blocked on.
+			 */
+			wake_up_process_mutex(task);
+		}
+	}
 }
 
 /*
@@ -1034,16 +1053,7 @@ static long task_blocks_on_lock(struct r
 
 	/* Add the new top priority waiter to the owners waiter list */
 	plist_add(&waiter->pi_list, &owner->pi_waiters);
-	adjust_prio(owner);
-
-	/*
-	 * The owner will have the blocked field set if it is
-	 * blocked on a lock. So in this case we want to wake
-	 * the owner up so it can boost who it is blocked on.
-	 */
-	if (owner->blocked_on)
-		wake_up_process_mutex(owner);
-
+	adjust_prio(owner,1);
 	_raw_spin_unlock(&owner->pi_lock);
 	return ret;
 }
@@ -1139,7 +1149,7 @@ static void remove_waiter(struct rt_mute
 			next = lock_first_waiter(lock);
 			plist_add(&next->pi_list, &owner->pi_waiters);
 		}
-		adjust_prio(owner);
+		adjust_prio(owner,1);
 		_raw_spin_unlock(&owner->pi_lock);
 	}
 }
@@ -1201,7 +1211,7 @@ static void release_lock(struct rt_mutex
 
 	/*  Readjust priority, when necessary. */
 	_raw_spin_lock(&current->pi_lock);
-	adjust_prio(current);
+	adjust_prio(current,0);
 	_raw_spin_unlock(&current->pi_lock);
 }
 
@@ -1453,7 +1463,7 @@ static int __sched down_rtsem(struct rt_
 		 * PI boost has to go
 		 */
 		_raw_spin_lock(&current->pi_lock);
-		adjust_prio(current);
+		adjust_prio(current,0);
 		_raw_spin_unlock(&current->pi_lock);
 	}
 	trace_unlock_irqrestore(&tracelock, flags);

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.15-rt17
  2006-02-25 17:11     ` 2.6.15-rt17 Esben Nielsen
@ 2006-02-25 17:32       ` Steven Rostedt
  2006-02-25 21:29         ` 2.6.15-rt17 Esben Nielsen
  0 siblings, 1 reply; 29+ messages in thread
From: Steven Rostedt @ 2006-02-25 17:32 UTC (permalink / raw)
  To: Esben Nielsen; +Cc: Thomas Gleixner, Ingo Molnar, linux-kernel


On Sat, 25 Feb 2006, Esben Nielsen wrote:

> On Thu, 23 Feb 2006, Thomas Gleixner wrote:
>
> > On Wed, 2006-02-22 at 17:50 +0100, Esben Nielsen wrote:
> > > I didn't know anyone looked at my patch! I am ofcourse happy about it :-)
> >
> > Was just delayed due to other work in progress.
> >
>
> You didn't use the "TestRTMutex" I send along with the patch :-(
>
> Since I learned to use unittesting that way I have been a big fan. It does
> catch a lot of stupid bugs without having to wait for the program to get
> there while keeping the ability to debug with gdb and fix it right away.
> And most important: you can keep the tests and check if your program still
> parses them after a rewrite. Very usefull to prevent repeating bugs.
>
> So here is the issues I have found:
>
> 1) grablockBKL.tst failes. Apparently you can "grab" the BKL now? Is this
> intended? I have updated the test to accept this.

since the BKL is now released on semaphores, I guess this is ok.

>
> 2) 2locksdeadlock.tst loops forever. It is a livelock: When two RT-tasks
> "deadlocks" on two mutexes they keep waking up each other in other. I
> quickly fixed that bug.

I actually thought about that.  But it still is an improvement, since it
doesn't deadlock the kernel. Just all processes that are of lower
priority.  This still should be fixed, but it needs to not cause any
noticable overhead.

>
> 3) While fixing that I came to think about what happens when you signal a
> task blocked on a task blocked on a task. I thus wrote
> 2lock3tasksBoostSignal.tst. Well, the priorities wasn't set back
> correctly. I fixed that too.
>
> I have attached the patch againt 2.6.17-rt17 (and therefore also
> included the previous patch) along with the updated tester and tests.
>
> Esben

I'll take a look at this tomorrow.

-- Steve

>
>
> > > That was why I had _reversed_ the lock ordering relative to normal in the
> > > original patch: First lock task->pi_lock. Assign lock. Lock
> > > lock->wait_lock. Then unlock task->pi_lock. Now it is safe to refer to
> > > lock. To avoid deadlocks I used _raw_spin_trylock() when locking the 2.
> > > lock.
> >
> > Stupid me. I messed that one up. Should show up in the next -rt
> >
> > Thanks
> >
> > 	tglx
> >
> >
>
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.15-rt17
  2006-02-25 17:32       ` 2.6.15-rt17 Steven Rostedt
@ 2006-02-25 21:29         ` Esben Nielsen
  2006-02-25 22:25           ` 2.6.15-rt17 Esben Nielsen
  2006-02-26 15:14           ` 2.6.15-rt17 Steven Rostedt
  0 siblings, 2 replies; 29+ messages in thread
From: Esben Nielsen @ 2006-02-25 21:29 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Thomas Gleixner, Ingo Molnar, linux-kernel

On Sat, 25 Feb 2006, Steven Rostedt wrote:

>
> On Sat, 25 Feb 2006, Esben Nielsen wrote:
>
> > On Thu, 23 Feb 2006, Thomas Gleixner wrote:
> >
> > > On Wed, 2006-02-22 at 17:50 +0100, Esben Nielsen wrote:
> > > > I didn't know anyone looked at my patch! I am ofcourse happy about it :-)
> > >
> > > Was just delayed due to other work in progress.
> > >
> >
> > You didn't use the "TestRTMutex" I send along with the patch :-(
> >
> > Since I learned to use unittesting that way I have been a big fan. It does
> > catch a lot of stupid bugs without having to wait for the program to get
> > there while keeping the ability to debug with gdb and fix it right away.
> > And most important: you can keep the tests and check if your program still
> > parses them after a rewrite. Very usefull to prevent repeating bugs.
> >
> > So here is the issues I have found:
> >
> > 1) grablockBKL.tst failes. Apparently you can "grab" the BKL now? Is this
> > intended? I have updated the test to accept this.
>
> since the BKL is now released on semaphores, I guess this is ok.

When I started on the tester this was so as well. It is a basic feature of
CONFIG_PREEMPT_BKL that you release the BKL semaphore when you block. The
problem in PREEMT_RT is that if you block in spin_lock() you should _not_
give BKL away.
And now I remember why you couldn't grab BKL as any other lock before: The
pending owner flag was on the task. Now if you could grab BKL you could be
pending owner of both BKL and another mutex. With the new pending owner
implementation this is no longer a problem.

>
> >
> > 2) 2locksdeadlock.tst loops forever. It is a livelock: When two RT-tasks
> > "deadlocks" on two mutexes they keep waking up each other in other. I
> > quickly fixed that bug.
>
> I actually thought about that.  But it still is an improvement, since it
> doesn't deadlock the kernel. Just all processes that are of lower
> priority.

Which are almost all the processes since only RT tasks can do this....

>  This still should be fixed, but it needs to not cause any
> noticable overhead.

With this patch there is less overhead than before:
wake_up_process_mutex() is called less often.

>
> >
> > 3) While fixing that I came to think about what happens when you signal a
> > task blocked on a task blocked on a task. I thus wrote
> > 2lock3tasksBoostSignal.tst. Well, the priorities wasn't set back
> > correctly. I fixed that too.
> >

This time around wake_up_process_mutex() wasn't called when it ought to
be... Now that I think about it there still is a problem with the
patch I sent: First the priority is set down, then the task is woken up.
But then it can't continue to de-boost the next task... Let me write a
test with 4 tasks and 3 locks to demonstrate.


> > I have attached the patch againt 2.6.17-rt17 (and therefore also
> > included the previous patch) along with the updated tester and tests.
> >
> > Esben
>
> I'll take a look at this tomorrow.
>
> -- Steve
>
> >
> >
> > > > That was why I had _reversed_ the lock ordering relative to normal in the
> > > > original patch: First lock task->pi_lock. Assign lock. Lock
> > > > lock->wait_lock. Then unlock task->pi_lock. Now it is safe to refer to
> > > > lock. To avoid deadlocks I used _raw_spin_trylock() when locking the 2.
> > > > lock.
> > >
> > > Stupid me. I messed that one up. Should show up in the next -rt
> > >
> > > Thanks
> > >
> > > 	tglx
> > >
> > >
> >
> >
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.15-rt17
  2006-02-25 21:29         ` 2.6.15-rt17 Esben Nielsen
@ 2006-02-25 22:25           ` Esben Nielsen
  2006-02-26 15:14           ` 2.6.15-rt17 Steven Rostedt
  1 sibling, 0 replies; 29+ messages in thread
From: Esben Nielsen @ 2006-02-25 22:25 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Thomas Gleixner, Ingo Molnar, linux-kernel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1769 bytes --]

On Sat, 25 Feb 2006, Esben Nielsen wrote:

> On Sat, 25 Feb 2006, Steven Rostedt wrote:
> [...]
> This time around wake_up_process_mutex() wasn't called when it ought to
> be... Now that I think about it there still is a problem with the
> patch I sent: First the priority is set down, then the task is woken up.
> But then it can't continue to de-boost the next task... Let me write a
> test with 4 tasks and 3 locks to demonstrate.
>
Ok, attached is the test and a better patch which solves the problem.

Esben

>
> > > I have attached the patch againt 2.6.17-rt17 (and therefore also
> > > included the previous patch) along with the updated tester and tests.
> > >
> > > Esben
> >
> > I'll take a look at this tomorrow.
> >
> > -- Steve
> >
> > >
> > >
> > > > > That was why I had _reversed_ the lock ordering relative to normal in the
> > > > > original patch: First lock task->pi_lock. Assign lock. Lock
> > > > > lock->wait_lock. Then unlock task->pi_lock. Now it is safe to refer to
> > > > > lock. To avoid deadlocks I used _raw_spin_trylock() when locking the 2.
> > > > > lock.
> > > >
> > > > Stupid me. I messed that one up. Should show up in the next -rt
> > > >
> > > > Thanks
> > > >
> > > > 	tglx
> > > >
> > > >
> > >
> > >
> > -
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at  http://www.tux.org/lkml/
> >
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>

[-- Attachment #2: Type: TEXT/PLAIN, Size: 711 bytes --]

threads:    4       3         2             1
        lockintr  1     +         +             +
            +      lockintr 2     +             +
            +       +        lockintr 3         +
            +      lockintr 3     +             +
          lockintr 2    +         +             +
test:      prio 4  prio 3    prio 2       prio 1
test:       -       -         +             +
            +       +         +           lockintr 1
test:      prio 1  prio 1    prio 1       prio 1
test:       -       -         +             -

            +       +        signal 4       +
test:       -       -         +             +
test:      prio 4  prio 3    prio 2       prio 1
             

[-- Attachment #3: Type: TEXT/PLAIN, Size: 3849 bytes --]

--- linux-2.6.15-rt17/kernel/rt.c.orig	2006-02-22 16:53:44.000000000 +0100
+++ linux-2.6.15-rt17/kernel/rt.c	2006-02-25 23:20:34.000000000 +0100
@@ -873,10 +873,19 @@ pid_t get_blocked_on(task_t *task)
 	}
 
 	lock = task->blocked_on->lock;
+	
+	/* 
+	 * Now we have to take lock->wait_lock _before_ releasing
+	 * task->pi_lock. Otherwise lock can be deallocated while we are
+	 * refering to it as the subsystem has no way of knowing about us
+	 * hanging around in here.
+	 */
+	if (!_raw_spin_trylock(&lock->wait_lock)) {
+		_raw_spin_unlock(&task->pi_lock);
+		goto try_again;
+        }
 	_raw_spin_unlock(&task->pi_lock);
 
-	if (!_raw_spin_trylock(&lock->wait_lock))
-		goto try_again;
 
 	owner = lock_owner(lock);
 	if (owner)
@@ -964,14 +973,52 @@ static inline int calc_pi_prio(task_t *t
 }
 
 /*
- * Adjust priority of a task
+ * Adjust priority of a task. 
  */
-static void adjust_prio(task_t *task)
+static void adjust_prio_no_wakeup(task_t *task)
 {
 	int prio = calc_pi_prio(task);
 
-	if (task->prio != prio)
+	if (task->prio != prio) {
+		mutex_setprio(task, prio);
+	}
+}
+
+/*
+ * Adjust priority of a task and wake it up if the prio is changed 
+ * and it is blocked on a mutex
+ */
+static void adjust_prio_wakeup(task_t *task)
+{
+	int prio = calc_pi_prio(task);
+	
+	if (task->prio < prio) {
+		if (task->blocked_on) {
+                        /*
+			 * The owner will have the blocked field set if it is
+			 * blocked on a lock. So in this case we want to wake
+			 * the owner up so it can boost who it is blocked on.
+			 *
+			 * We have to wait with lowering it's priority until this is done
+			 * or we risk letting other high priority task hang around.
+			 */
+			wake_up_process_mutex(task);
+		}
+		else {
+			mutex_setprio(task, prio);
+		}
+	}
+	else  if (task->prio > prio) {
 		mutex_setprio(task, prio);
+		if (task->blocked_on) {
+                        /*
+			 * The owner will have the blocked field set if it is
+			 * blocked on a lock. So in this case we want to wake
+			 * the owner up so it can boost who it is blocked on.
+			 */
+			wake_up_process_mutex(task);
+		}
+	}
 }
 
 /*
@@ -1001,6 +1048,7 @@ static long task_blocks_on_lock(struct r
 
 	/* Enqueue the task into the lock waiter list */
 	_raw_spin_lock(&current->pi_lock);
+	adjust_prio_no_wakeup(current);
 	current->blocked_on = waiter;
 	waiter->lock = lock;
 	waiter->task = current;
@@ -1034,16 +1082,7 @@ static long task_blocks_on_lock(struct r
 
 	/* Add the new top priority waiter to the owners waiter list */
 	plist_add(&waiter->pi_list, &owner->pi_waiters);
-	adjust_prio(owner);
-
-	/*
-	 * The owner will have the blocked field set if it is
-	 * blocked on a lock. So in this case we want to wake
-	 * the owner up so it can boost who it is blocked on.
-	 */
-	if (owner->blocked_on)
-		wake_up_process_mutex(owner);
-
+	adjust_prio_wakeup(owner);
 	_raw_spin_unlock(&owner->pi_lock);
 	return ret;
 }
@@ -1139,7 +1178,7 @@ static void remove_waiter(struct rt_mute
 			next = lock_first_waiter(lock);
 			plist_add(&next->pi_list, &owner->pi_waiters);
 		}
-		adjust_prio(owner);
+		adjust_prio_wakeup(owner);
 		_raw_spin_unlock(&owner->pi_lock);
 	}
 }
@@ -1201,7 +1240,7 @@ static void release_lock(struct rt_mutex
 
 	/*  Readjust priority, when necessary. */
 	_raw_spin_lock(&current->pi_lock);
-	adjust_prio(current);
+	adjust_prio_no_wakeup(current);
 	_raw_spin_unlock(&current->pi_lock);
 }
 
@@ -1453,7 +1492,7 @@ static int __sched down_rtsem(struct rt_
 		 * PI boost has to go
 		 */
 		_raw_spin_lock(&current->pi_lock);
-		adjust_prio(current);
+		adjust_prio_no_wakeup(current);
 		_raw_spin_unlock(&current->pi_lock);
 	}
 	trace_unlock_irqrestore(&tracelock, flags);

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.15-rt17
  2006-02-25 21:29         ` 2.6.15-rt17 Esben Nielsen
  2006-02-25 22:25           ` 2.6.15-rt17 Esben Nielsen
@ 2006-02-26 15:14           ` Steven Rostedt
  2006-02-26 22:38             ` 2.6.15-rt17 Esben Nielsen
  1 sibling, 1 reply; 29+ messages in thread
From: Steven Rostedt @ 2006-02-26 15:14 UTC (permalink / raw)
  To: Esben Nielsen; +Cc: Thomas Gleixner, Ingo Molnar, linux-kernel


Please inline patches!  it makes it a lot easier to comment on.


On Sat, 25 Feb 2006, Esben Nielsen wrote:
> On Sat, 25 Feb 2006, Steven Rostedt wrote:
> > On Sat, 25 Feb 2006, Esben Nielsen wrote:
> >
> > >
> > > You didn't use the "TestRTMutex" I send along with the patch :-(

These are good to have, but it should not be included in a kernel patch.
You can always host this tool too.

> > >
> > > Since I learned to use unittesting that way I have been a big fan. It does
> > > catch a lot of stupid bugs without having to wait for the program to get
> > > there while keeping the ability to debug with gdb and fix it right away.
> > > And most important: you can keep the tests and check if your program still
> > > parses them after a rewrite. Very usefull to prevent repeating bugs.
> > >
> > > So here is the issues I have found:
> > >
> > > 1) grablockBKL.tst failes. Apparently you can "grab" the BKL now? Is this
> > > intended? I have updated the test to accept this.
> >
> > since the BKL is now released on semaphores, I guess this is ok.
>
> When I started on the tester this was so as well. It is a basic feature of
> CONFIG_PREEMPT_BKL that you release the BKL semaphore when you block. The
> problem in PREEMT_RT is that if you block in spin_lock() you should _not_
> give BKL away.

And we don't. The BKL is held, we fool the scheduler by lowering setting
the lock depth to below zero, so the scheduler doesn't think we own it.
Or did you test program not realize that?

here in ____down_mutex:


	if (waiter.task) {
		unsigned long saved_flags = current->flags & PF_NOSCHED;
		int saved_lock_depth = current->lock_depth;

		/*
		 * Prevent schedule() to drop BKL, while waiting for
		 * the lock ! We restore lock_depth when we come back.
		 */
		current->lock_depth = -1;
		current->flags &= ~PF_NOSCHED;
		raw_local_irq_enable();

		schedule();

		raw_local_irq_disable();
		current->flags |= saved_flags;
		current->lock_depth = saved_lock_depth;
		state = xchg(&current->state, TASK_UNINTERRUPTIBLE);
		if (unlikely(state == TASK_RUNNING))
			saved_state = TASK_RUNNING;
	}


> And now I remember why you couldn't grab BKL as any other lock before: The
> pending owner flag was on the task. Now if you could grab BKL you could be
> pending owner of both BKL and another mutex. With the new pending owner
> implementation this is no longer a problem.

Correct, but it still causes problems in the logic. Say task A blocks on
semaphore X, which is owned by task B (which had the BKL and released it
on schedule). But B is also blocked on semaphore Y.  Task A wakes B and
when B goes to grab the BKL it blocks. Thus you are boosting two lines of
pi. This is a really bad design, which was easily solved by releasing the
BKL outside, of the rt_mutex logic.  When B really does have semaphore Y
it then grabs the BKL.  This way if it fails, the pi logic is still sane.

 >
> >
> > >
> > > 2) 2locksdeadlock.tst loops forever. It is a livelock: When two RT-tasks
> > > "deadlocks" on two mutexes they keep waking up each other in other. I
> > > quickly fixed that bug.
> >
> > I actually thought about that.  But it still is an improvement, since it
> > doesn't deadlock the kernel. Just all processes that are of lower
> > priority.
>
> Which are almost all the processes since only RT tasks can do this....

And remember that this is also only a problem with futexes, since there
are no deadlocks in the kernel ;-)  So if you have an RT task higher than
all the others, it will still run. Allowing you to at least shutdown the
machine nicely :)

>
> >  This still should be fixed, but it needs to not cause any
> > noticable overhead.
>
> With this patch there is less overhead than before:
> wake_up_process_mutex() is called less often.
>

Does it detect deadlocks??

> >
> > >
> > > 3) While fixing that I came to think about what happens when you signal a
> > > task blocked on a task blocked on a task. I thus wrote
> > > 2lock3tasksBoostSignal.tst. Well, the priorities wasn't set back
> > > correctly. I fixed that too.
> > >
>
> This time around wake_up_process_mutex() wasn't called when it ought to
> be... Now that I think about it there still is a problem with the
> patch I sent: First the priority is set down, then the task is woken up.
> But then it can't continue to de-boost the next task... Let me write a
> test with 4 tasks and 3 locks to demonstrate.
>
>
> > > I have attached the patch againt 2.6.17-rt17 (and therefore also
> > > included the previous patch) along with the updated tester and tests.
> > >
> > > Esben
> >
> > I'll take a look at this tomorrow.
> >

I'm looking at the last patch you sent:

+static void adjust_prio_no_wakeup(task_t *task)
 {
        int prio = calc_pi_prio(task);

-       if (task->prio != prio)
+       if (task->prio != prio) {
+               mutex_setprio(task, prio);
+       }
+}

Remove the brackets.

+static void adjust_prio_wakeup(task_t *task)
+{
+       int prio = calc_pi_prio(task);
+
+       if (task->prio < prio) {

will always be false, since cal_pi_prio returns the highest priority of
either the task or one of it's waiters (the lower prio is, the higher the
priority).

-- Steve


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.15-rt17
  2006-02-26 15:14           ` 2.6.15-rt17 Steven Rostedt
@ 2006-02-26 22:38             ` Esben Nielsen
  2006-02-27  7:15               ` 2.6.15-rt17 Steven Rostedt
  0 siblings, 1 reply; 29+ messages in thread
From: Esben Nielsen @ 2006-02-26 22:38 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Thomas Gleixner, Ingo Molnar, linux-kernel



On Sun, 26 Feb 2006, Steven Rostedt wrote:

>
> Please inline patches!  it makes it a lot easier to comment on.
>
Ok, I set up the alternative editor in Pine now, so I can do it in the future.

>
> On Sat, 25 Feb 2006, Esben Nielsen wrote:
> > On Sat, 25 Feb 2006, Steven Rostedt wrote:
> > > On Sat, 25 Feb 2006, Esben Nielsen wrote:
> > >
> > > >
> > > > You didn't use the "TestRTMutex" I send along with the patch :-(
>
> These are good to have, but it should not be included in a kernel patch.
> You can always host this tool too.
>
On the contrary: The kernel should have a test/ directy at the top level and
a "make test" target which runs all sort of unit-tests. For all patches to be
accepted make test should be succesfull. Unit tests are best when maintained
along with the code it tests.

> > > >
> > > > Since I learned to use unittesting that way I have been a big fan. It does
> > > > catch a lot of stupid bugs without having to wait for the program to get
> > > > there while keeping the ability to debug with gdb and fix it right away.
> > > > And most important: you can keep the tests and check if your program still
> > > > parses them after a rewrite. Very usefull to prevent repeating bugs.
> > > >
> > > > So here is the issues I have found:
> > > >
> > > > 1) grablockBKL.tst failes. Apparently you can "grab" the BKL now? Is this
> > > > intended? I have updated the test to accept this.
> > >
> > > since the BKL is now released on semaphores, I guess this is ok.
> >
> > When I started on the tester this was so as well. It is a basic feature of
> > CONFIG_PREEMPT_BKL that you release the BKL semaphore when you block. The
> > problem in PREEMT_RT is that if you block in spin_lock() you should _not_
> > give BKL away.
>
> And we don't. The BKL is held, we fool the scheduler by lowering setting
> the lock depth to below zero, so the scheduler doesn't think we own it.
> Or did you test program not realize that?
>

Yes, it did. Here is teh test script (this time inlined):

#priorities:
threads:     101          102
#BKL is lock 0

           lock 0     spinlock 1
test:          +           +
test:      prio 101    prio 102

#Reverse the locks:
        spinlock 1       lock 0

#You do _not_ give away the BKL on a spin lock. Therefore we deadlock.
test:          -           -
test:     prio 101     prio 102


There is a test without spinlock, but just lock (meaning down). This will not
deadlock because BKL is given away.

> [...]
>
>
> > And now I remember why you couldn't grab BKL as any other lock before: The
> > pending owner flag was on the task. Now if you could grab BKL you could be
> > pending owner of both BKL and another mutex. With the new pending owner
> > implementation this is no longer a problem.
>
> Correct, but it still causes problems in the logic. Say task A blocks on
> semaphore X, which is owned by task B (which had the BKL and released it
> on schedule). But B is also blocked on semaphore Y.  Task A wakes B and
> when B goes to grab the BKL it blocks. Thus you are boosting two lines of
> pi. This is a really bad design, which was easily solved by releasing the
> BKL outside, of the rt_mutex logic.  When B really does have semaphore Y
> it then grabs the BKL.  This way if it fails, the pi logic is still sane.

*nod*

>
>  >
> > >
> > > >
> > > > 2) 2locksdeadlock.tst loops forever. It is a livelock: When two RT-tasks
> > > > "deadlocks" on two mutexes they keep waking up each other in other. I
> > > > quickly fixed that bug.
> > >
> > > I actually thought about that.  But it still is an improvement, since it
> > > doesn't deadlock the kernel. Just all processes that are of lower
> > > priority.
> >
> > Which are almost all the processes since only RT tasks can do this....
>
> And remember that this is also only a problem with futexes, since there
> are no deadlocks in the kernel ;-)  So if you have an RT task higher than
> all the others, it will still run. Allowing you to at least shutdown the
> machine nicely :)

Yeah, but futexes with PI will be something comming back in soonish I think.

>
> >
> > >  This still should be fixed, but it needs to not cause any
> > > noticable overhead.
> >
> > With this patch there is less overhead than before:
> > wake_up_process_mutex() is called less often.
> >
>
> Does it detect deadlocks??

No, it just only wake up the blocked owner after changing the priortu
when the priority is actually changed.

> [...]
> > >
> > > I'll take a look at this tomorrow.
> > >
>
> I'm looking at the last patch you sent:
>
> +static void adjust_prio_no_wakeup(task_t *task)
>  {
>         int prio = calc_pi_prio(task);
>
> -       if (task->prio != prio)
> +       if (task->prio != prio) {
> +               mutex_setprio(task, prio);
> +       }
> +}
>
> Remove the brackets.
>
> +static void adjust_prio_wakeup(task_t *task)
> +{
> +       int prio = calc_pi_prio(task);
> +
> +       if (task->prio < prio) {
>
> will always be false, since cal_pi_prio returns the highest priority of
> either the task or one of it's waiters (the lower prio is, the higher the
> priority).
>

No:

/*
 * Calculate task priority from the waiter list priority
 *
 * Return task->normal_prio when the waiter list is empty or when
 * the waiter is not allowed to do priority boosting
 */
static inline int calc_pi_prio(task_t *task)
{
	int prio = task->normal_prio;
...

When PI boosted task->prio < task->normal_prio and the condition can be true.
Notice that adjust_prio_wakeup() is called from remove_waiter().

> -- Steve
>

Esben


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.15-rt17
  2006-02-26 22:38             ` 2.6.15-rt17 Esben Nielsen
@ 2006-02-27  7:15               ` Steven Rostedt
  0 siblings, 0 replies; 29+ messages in thread
From: Steven Rostedt @ 2006-02-27  7:15 UTC (permalink / raw)
  To: Esben Nielsen; +Cc: Thomas Gleixner, Ingo Molnar, linux-kernel


On Sun, 26 Feb 2006, Esben Nielsen wrote:
>
> On Sun, 26 Feb 2006, Steven Rostedt wrote:
>
> >
> > Please inline patches!  it makes it a lot easier to comment on.
> >
> Ok, I set up the alternative editor in Pine now, so I can do it in the future.
>

Hmm, I'm using pine now remotely. Doesn't ^R read in a file? Or is that
just with the default editor?

> >
> > On Sat, 25 Feb 2006, Esben Nielsen wrote:
> > > On Sat, 25 Feb 2006, Steven Rostedt wrote:
> > > > On Sat, 25 Feb 2006, Esben Nielsen wrote:
> > > >
> > > > >
> > > > > You didn't use the "TestRTMutex" I send along with the patch :-(
> >
> > These are good to have, but it should not be included in a kernel patch.
> > You can always host this tool too.
> >
> On the contrary: The kernel should have a test/ directy at the top level and
> a "make test" target which runs all sort of unit-tests. For all patches to be
> accepted make test should be succesfull. Unit tests are best when maintained
> along with the code it tests.

I actually agree with you here, but this issue is with the kernel
mantainers, and not the -rt ones.  BTW, I wasn't able to get your test to
compile. But I don't have time to look at why.

>
> > > > >
> > > > > Since I learned to use unittesting that way I have been a big fan. It does
> > > > > catch a lot of stupid bugs without having to wait for the program to get
> > > > > there while keeping the ability to debug with gdb and fix it right away.
> > > > > And most important: you can keep the tests and check if your program still
> > > > > parses them after a rewrite. Very usefull to prevent repeating bugs.
> > > > >
> > > > > So here is the issues I have found:
> > > > >
> > > > > 1) grablockBKL.tst failes. Apparently you can "grab" the BKL now? Is this
> > > > > intended? I have updated the test to accept this.
> > > >
> > > > since the BKL is now released on semaphores, I guess this is ok.
> > >
> > > When I started on the tester this was so as well. It is a basic feature of
> > > CONFIG_PREEMPT_BKL that you release the BKL semaphore when you block. The
> > > problem in PREEMT_RT is that if you block in spin_lock() you should _not_
> > > give BKL away.
> >
> > And we don't. The BKL is held, we fool the scheduler by lowering setting
> > the lock depth to below zero, so the scheduler doesn't think we own it.
> > Or did you test program not realize that?
> >
>
> Yes, it did. Here is teh test script (this time inlined):
>
> #priorities:
> threads:     101          102
> #BKL is lock 0
>
>            lock 0     spinlock 1
> test:          +           +
> test:      prio 101    prio 102
>
> #Reverse the locks:
>         spinlock 1       lock 0
>
> #You do _not_ give away the BKL on a spin lock. Therefore we deadlock.

I'm confused here.  You're saying that we did not give away the BKL on
spin lock. But that _is_ the right behavior.

> test:          -           -
> test:     prio 101     prio 102
>
>
> There is a test without spinlock, but just lock (meaning down). This will not
> deadlock because BKL is given away.

And we give it away on semaphore, which is also the right behavior.

>
> > [...]
> >
> >
> > > And now I remember why you couldn't grab BKL as any other lock before: The
> > > pending owner flag was on the task. Now if you could grab BKL you could be
> > > pending owner of both BKL and another mutex. With the new pending owner
> > > implementation this is no longer a problem.
> >
> > Correct, but it still causes problems in the logic. Say task A blocks on
> > semaphore X, which is owned by task B (which had the BKL and released it
> > on schedule). But B is also blocked on semaphore Y.  Task A wakes B and
> > when B goes to grab the BKL it blocks. Thus you are boosting two lines of
> > pi. This is a really bad design, which was easily solved by releasing the
> > BKL outside, of the rt_mutex logic.  When B really does have semaphore Y
> > it then grabs the BKL.  This way if it fails, the pi logic is still sane.
>
> *nod*
>
> >
> >  >
> > > >
> > > > >
> > > > > 2) 2locksdeadlock.tst loops forever. It is a livelock: When two RT-tasks
> > > > > "deadlocks" on two mutexes they keep waking up each other in other. I
> > > > > quickly fixed that bug.
> > > >
> > > > I actually thought about that.  But it still is an improvement, since it
> > > > doesn't deadlock the kernel. Just all processes that are of lower
> > > > priority.
> > >
> > > Which are almost all the processes since only RT tasks can do this....
> >
> > And remember that this is also only a problem with futexes, since there
> > are no deadlocks in the kernel ;-)  So if you have an RT task higher than
> > all the others, it will still run. Allowing you to at least shutdown the
> > machine nicely :)
>
> Yeah, but futexes with PI will be something comming back in soonish I think.

It may still be a while, since there's still lots of bugs and issues to
solve.  I wasn't saying that the current solution is correct wrt the
spinning threads.  This is most certainly a bug. I was just pointing out
that it's better than the previous hard deadlock that we had.  But this
still needs a way to be fixed with low overhead.

> >
> > >
> > > >  This still should be fixed, but it needs to not cause any
> > > > noticable overhead.
> > >
> > > With this patch there is less overhead than before:
> > > wake_up_process_mutex() is called less often.
> > >
> >
> > Does it detect deadlocks??
>
> No, it just only wake up the blocked owner after changing the priortu
> when the priority is actually changed.

I was talking about a low overhead wrt detecting deadlocks.

>
> > [...]
> > > >
> > > > I'll take a look at this tomorrow.
> > > >
> >
> > I'm looking at the last patch you sent:
> >
> > +static void adjust_prio_no_wakeup(task_t *task)
> >  {
> >         int prio = calc_pi_prio(task);
> >
> > -       if (task->prio != prio)
> > +       if (task->prio != prio) {
> > +               mutex_setprio(task, prio);
> > +       }
> > +}
> >
> > Remove the brackets.
> >
> > +static void adjust_prio_wakeup(task_t *task)
> > +{
> > +       int prio = calc_pi_prio(task);
> > +
> > +       if (task->prio < prio) {
> >
> > will always be false, since cal_pi_prio returns the highest priority of
> > either the task or one of it's waiters (the lower prio is, the higher the
> > priority).
> >
>
> No:
>
> /*
>  * Calculate task priority from the waiter list priority
>  *
>  * Return task->normal_prio when the waiter list is empty or when
>  * the waiter is not allowed to do priority boosting
>  */
> static inline int calc_pi_prio(task_t *task)
> {
> 	int prio = task->normal_prio;
> ...
>
> When PI boosted task->prio < task->normal_prio and the condition can be true.
> Notice that adjust_prio_wakeup() is called from remove_waiter().
>

/me blushes

OK, I didn't notice the normal_prio part. You're right there.  When I get
some more time I'll look more into the rest of your patch.

-- Steve


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.15-rt17
  2006-02-28 19:21 2.6.15-rt17 Karsten Wiese
@ 2006-02-28 19:43 ` Ingo Molnar
  0 siblings, 0 replies; 29+ messages in thread
From: Ingo Molnar @ 2006-02-28 19:43 UTC (permalink / raw)
  To: Karsten Wiese; +Cc: linux-kernel


* Karsten Wiese <annabellesgarden@yahoo.de> wrote:

> > the tasklet code was reworked too to be PREEMPT_RT friendly: the new PI 
> > code unearthed a fundamental livelock scenario with PREEMPT_RT, and the 
> > fix was to rework the tasklet code to get rid of the 'retrigger 
> > softirqs' approach.
> 
> following patch re enables tasklet_hi.
> needed by ALSA MIDI.

ahh ... this indeed explains some of the bugreports! Applied.

	Ingo

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.15-rt17
@ 2006-02-28 19:21 Karsten Wiese
  2006-02-28 19:43 ` 2.6.15-rt17 Ingo Molnar
  0 siblings, 1 reply; 29+ messages in thread
From: Karsten Wiese @ 2006-02-28 19:21 UTC (permalink / raw)
  To: mingo, linux-kernel

> the tasklet code was reworked too to be PREEMPT_RT friendly: the new PI 
> code unearthed a fundamental livelock scenario with PREEMPT_RT, and the 
> fix was to rework the tasklet code to get rid of the 'retrigger 
> softirqs' approach.

following patch re enables tasklet_hi.
needed by ALSA MIDI.

      Karsten


--- /tmp/softirq.c~	2006-02-28 20:17:03.000000000 +0100
+++ /tmp/softirq.c	2006-02-28 20:17:03.000000000 +0100
@@ -351,13 +351,13 @@
 static DEFINE_PER_CPU(struct tasklet_head, tasklet_hi_vec) = { NULL };
 
 static void inline
-__tasklet_common_schedule(struct tasklet_struct *t, struct tasklet_head *head)
+__tasklet_common_schedule(struct tasklet_struct *t, struct tasklet_head *head, unsigned int nr)
 {
 	if (tasklet_trylock(t)) {
 		WARN_ON(t->next != NULL);
 		t->next = head->list;
 		head->list = t;
-		raise_softirq_irqoff(TASKLET_SOFTIRQ);
+		raise_softirq_irqoff(nr);
 		tasklet_unlock(t);
 	}
 }
@@ -367,7 +367,7 @@
 	unsigned long flags;
 
 	raw_local_irq_save(flags);
-	__tasklet_common_schedule(t, &__get_cpu_var(tasklet_vec));
+	__tasklet_common_schedule(t, &__get_cpu_var(tasklet_vec), TASKLET_SOFTIRQ);
 	raw_local_irq_restore(flags);
 }
 
@@ -378,7 +378,7 @@
 	unsigned long flags;
 
 	raw_local_irq_save(flags);
-	__tasklet_common_schedule(t, &__get_cpu_var(tasklet_hi_vec));
+	__tasklet_common_schedule(t, &__get_cpu_var(tasklet_hi_vec), HI_SOFTIRQ);
 	raw_local_irq_restore(flags);
 }
 
_

	

	
		
___________________________________________________________ 
Telefonate ohne weitere Kosten vom PC zum PC: http://messenger.yahoo.de

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2006-02-28 19:44 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-02-21 15:55 2.6.15-rt17 Ingo Molnar
2006-02-21 16:11 ` 2.6.15-rt17 K.R. Foley
2006-02-21 16:21 ` 2.6.15-rt17 K.R. Foley
2006-02-22  7:51   ` 2.6.15-rt17 Ingo Molnar
2006-02-21 17:16 ` 2.6.15-rt17 Michal Piotrowski
2006-02-21 18:53   ` 2.6.15-rt17 Michal Piotrowski
2006-02-21 20:37     ` 2.6.15-rt17 Thomas Gleixner
2006-02-21 21:15       ` 2.6.15-rt17 Michal Piotrowski
2006-02-22 12:22   ` 2.6.15-rt17 Steven Rostedt
2006-02-22 13:47     ` 2.6.15-rt17 Michal Piotrowski
2006-02-24  6:38     ` 2.6.15-rt17 Ingo Molnar
2006-02-24  7:09       ` 2.6.15-rt17 Steven Rostedt
2006-02-21 17:23 ` 2.6.15-rt17 Timothy R. Chavez
2006-02-21 18:24 ` 2.6.15-rt17 Daniel Walker
2006-02-21 18:46   ` 2.6.15-rt17 Thomas Gleixner
2006-02-22 16:50 ` 2.6.15-rt17 Esben Nielsen
2006-02-22 23:17   ` 2.6.15-rt17 Thomas Gleixner
2006-02-25 17:11     ` 2.6.15-rt17 Esben Nielsen
2006-02-25 17:32       ` 2.6.15-rt17 Steven Rostedt
2006-02-25 21:29         ` 2.6.15-rt17 Esben Nielsen
2006-02-25 22:25           ` 2.6.15-rt17 Esben Nielsen
2006-02-26 15:14           ` 2.6.15-rt17 Steven Rostedt
2006-02-26 22:38             ` 2.6.15-rt17 Esben Nielsen
2006-02-27  7:15               ` 2.6.15-rt17 Steven Rostedt
2006-02-23 13:49 ` 2.6.15-rt17 Bill Huey
2006-02-23 13:53   ` 2.6.15-rt17 Ingo Molnar
2006-02-23 14:05     ` 2.6.15-rt17 Bill Huey
2006-02-28 19:21 2.6.15-rt17 Karsten Wiese
2006-02-28 19:43 ` 2.6.15-rt17 Ingo Molnar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).