* [Xenomai-help] native: A 32k stack is not always a 'reasonable' size @ 2010-07-06 19:25 Peter Soetens 2010-07-07 9:06 ` Gilles Chanteperdrix 2010-07-11 13:15 ` Gilles Chanteperdrix 0 siblings, 2 replies; 21+ messages in thread From: Peter Soetens @ 2010-07-06 19:25 UTC (permalink / raw) To: xenomai-help At least, not for Orocos applications. We've had hard to debug application segfaults that used just a 'little' bit more than 32k. We had to raise the stack size to 128k to get reliably through our application startup. I stem from the old 'mlockall ate my RAM' generation where we typically reduced stack sizes in order to have some crumbles left for the heap. But 32k wasn't really what we were aiming for. Maybe we should explicitly document the 32k limit and its limitations for certain applications...? Just my 2ct, Peter PS: can anyone allow 'sspr' (=me) to edit/add stuff on the wiki ? ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Xenomai-help] native: A 32k stack is not always a 'reasonable' size 2010-07-06 19:25 [Xenomai-help] native: A 32k stack is not always a 'reasonable' size Peter Soetens @ 2010-07-07 9:06 ` Gilles Chanteperdrix 2010-07-07 20:57 ` Peter Soetens 2010-07-11 13:15 ` Gilles Chanteperdrix 1 sibling, 1 reply; 21+ messages in thread From: Gilles Chanteperdrix @ 2010-07-07 9:06 UTC (permalink / raw) To: Peter Soetens; +Cc: xenomai-help Peter Soetens wrote: > At least, not for Orocos applications. We've had hard to debug > application segfaults that used just a 'little' bit more than 32k. We > had to raise the stack size to 128k to get reliably through our > application startup. I stem from the old 'mlockall ate my RAM' > generation where we typically reduced stack sizes in order to have > some crumbles left for the heap. But 32k wasn't really what we were > aiming for. > > Maybe we should explicitly document the 32k limit and its limitations > for certain applications...? Again, things have been fixed in 2.5.3 with regard to stack sizes, could you check that you have the same behaviour? As for 32KiB, it is only a default stack size, it is only reasonable in the sense that 2MiB is unreasonable on a low-end system. 32KiB was picked because it allows printf to work. Now, whatever stack size we choose, there will be applications which need more, this does not really make the default unreasonable. > PS: can anyone allow 'sspr' (=me) to edit/add stuff on the wiki ? Looks like you passed an incorrect mail address for this account, so it could not be verified, did you fix this? -- Gilles. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Xenomai-help] native: A 32k stack is not always a 'reasonable' size 2010-07-07 9:06 ` Gilles Chanteperdrix @ 2010-07-07 20:57 ` Peter Soetens 2010-07-07 21:19 ` Gilles Chanteperdrix 0 siblings, 1 reply; 21+ messages in thread From: Peter Soetens @ 2010-07-07 20:57 UTC (permalink / raw) To: Gilles Chanteperdrix; +Cc: xenomai-help On Wed, Jul 7, 2010 at 11:06 AM, Gilles Chanteperdrix <gilles.chanteperdrix@xenomai.org> wrote: > Peter Soetens wrote: >> At least, not for Orocos applications. We've had hard to debug >> application segfaults that used just a 'little' bit more than 32k. We >> had to raise the stack size to 128k to get reliably through our >> application startup. I stem from the old 'mlockall ate my RAM' >> generation where we typically reduced stack sizes in order to have >> some crumbles left for the heap. But 32k wasn't really what we were >> aiming for. >> >> Maybe we should explicitly document the 32k limit and its limitations >> for certain applications...? > > Again, things have been fixed in 2.5.3 with regard to stack sizes, could > you check that you have the same behaviour? I think we had, but I'm uncertain right now. > > As for 32KiB, it is only a default stack size, it is only reasonable in > the sense that 2MiB is unreasonable on a low-end system. 32KiB was > picked because it allows printf to work. Now, whatever stack size we > choose, there will be applications which need more, this does not really > make the default unreasonable. I knew you would say that. It deserves an entry in the faq or some trouble shooting document though. > >> PS: can anyone allow 'sspr' (=me) to edit/add stuff on the wiki ? > > Looks like you passed an incorrect mail address for this account, so it > could not be verified, did you fix this? I did. Didn't realize there was a problem. Peter ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Xenomai-help] native: A 32k stack is not always a 'reasonable' size 2010-07-07 20:57 ` Peter Soetens @ 2010-07-07 21:19 ` Gilles Chanteperdrix 2010-07-07 22:31 ` Peter Soetens 0 siblings, 1 reply; 21+ messages in thread From: Gilles Chanteperdrix @ 2010-07-07 21:19 UTC (permalink / raw) To: Peter Soetens; +Cc: xenomai-help Peter Soetens wrote: > On Wed, Jul 7, 2010 at 11:06 AM, Gilles Chanteperdrix > <gilles.chanteperdrix@xenomai.org> wrote: >> Peter Soetens wrote: >>> At least, not for Orocos applications. We've had hard to debug >>> application segfaults that used just a 'little' bit more than 32k. We >>> had to raise the stack size to 128k to get reliably through our >>> application startup. I stem from the old 'mlockall ate my RAM' >>> generation where we typically reduced stack sizes in order to have >>> some crumbles left for the heap. But 32k wasn't really what we were >>> aiming for. >>> >>> Maybe we should explicitly document the 32k limit and its limitations >>> for certain applications...? >> Again, things have been fixed in 2.5.3 with regard to stack sizes, could >> you check that you have the same behaviour? > > I think we had, but I'm uncertain right now. > >> As for 32KiB, it is only a default stack size, it is only reasonable in >> the sense that 2MiB is unreasonable on a low-end system. 32KiB was >> picked because it allows printf to work. Now, whatever stack size we >> choose, there will be applications which need more, this does not really >> make the default unreasonable. > > I knew you would say that. It deserves an entry in the faq or some > trouble shooting document though. It is documented. For instance, rt_task_create says: stksize The size of the stack (in bytes) for the new task. If zero is passed, a reasonable pre-defined size will be substituted. What else can we say? Documenting that this size is 32 KiB would be wrong, because we do not want applications to rely on a particular value, in case we want to change it. 
And the fact that if your stack is too small, you will get problems is kind of obvious. For anyone having played with stack sizes with Linux or any proprietary RTOS, at least. -- Gilles. ^ permalink raw reply [flat|nested] 21+ messages in thread
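For readers who want to avoid depending on the default, the stack size can be requested explicitly. The sketch below uses plain POSIX threads rather than the Xenomai native skin, so that it builds without Xenomai installed; with the native API the same intent is expressed by passing a non-zero stksize to rt_task_create. The helper names (create_with_stack, report_stack_size) and the 128 KiB figure are illustrative assumptions, and pthread_getattr_np is a GNU extension.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <stddef.h>

/* Illustrative value only: the size Peter reports needing, not a recommendation. */
#define TASK_STACK_BYTES (128 * 1024)

/* Create a thread with an explicit stack size instead of the library default.
 * Returns 0 on success or a pthread error number. */
int create_with_stack(pthread_t *thread, size_t stack_bytes,
                      void *(*entry)(void *), void *arg)
{
    pthread_attr_t attr;
    int err = pthread_attr_init(&attr);

    if (err)
        return err;
    err = pthread_attr_setstacksize(&attr, stack_bytes);
    if (err == 0)
        err = pthread_create(thread, &attr, entry, arg);
    pthread_attr_destroy(&attr);
    return err;
}

/* Demo entry point: stores the calling thread's stack size in *arg.
 * Relies on the GNU extension pthread_getattr_np(). */
void *report_stack_size(void *arg)
{
    pthread_attr_t attr;
    void *addr = NULL;
    size_t size = 0;

    if (pthread_getattr_np(pthread_self(), &attr) == 0) {
        pthread_attr_getstack(&attr, &addr, &size);
        pthread_attr_destroy(&attr);
    }
    *(size_t *)arg = size;
    return NULL;
}
```

Calling create_with_stack(&t, TASK_STACK_BYTES, report_stack_size, &seen) and joining the thread leaves the size the thread actually received in seen, which is a cheap way to confirm that the request took effect.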
* Re: [Xenomai-help] native: A 32k stack is not always a 'reasonable' size 2010-07-07 21:19 ` Gilles Chanteperdrix @ 2010-07-07 22:31 ` Peter Soetens 2010-07-07 23:08 ` Gilles Chanteperdrix 0 siblings, 1 reply; 21+ messages in thread From: Peter Soetens @ 2010-07-07 22:31 UTC (permalink / raw) To: Gilles Chanteperdrix; +Cc: xenomai-help On Wed, Jul 7, 2010 at 11:19 PM, Gilles Chanteperdrix <gilles.chanteperdrix@xenomai.org> wrote: > Peter Soetens wrote: >> On Wed, Jul 7, 2010 at 11:06 AM, Gilles Chanteperdrix >> <gilles.chanteperdrix@xenomai.org> wrote: >>> Peter Soetens wrote: >>>> At least, not for Orocos applications. We've had hard to debug >>>> application segfaults that used just a 'little' bit more than 32k. We >>>> had to raise the stack size to 128k to get reliably through our >>>> application startup. I stem from the old 'mlockall ate my RAM' >>>> generation where we typically reduced stack sizes in order to have >>>> some crumbles left for the heap. But 32k wasn't really what we were >>>> aiming for. >>>> >>>> Maybe we should explicitly document the 32k limit and its limitations >>>> for certain applications...? >>> Again, things have been fixed in 2.5.3 with regard to stack sizes, could >>> you check that you have the same behaviour? >> >> I think we had, but I'm uncertain right now. >> >>> As for 32KiB, it is only a default stack size, it is only reasonable in >>> the sense that 2MiB is unreasonable on a low-end system. 32KiB was >>> picked because it allows printf to work. Now, whatever stack size we >>> choose, there will be applications which need more, this does not really >>> make the default unreasonable. >> >> I knew you would say that. It deserves an entry in the faq or some >> trouble shooting document though. > > It is documented. For instance, rt_task_create says: > stksize The size of the stack (in bytes) for the new task. If > zero is passed, a reasonable pre-defined size will be substituted. > > What else can we say? 
Documenting that this size is 32 KiB would be
> wrong, because we do not want applications to rely on a particular
> value, in case we want to change it. And the fact that if your stack is
> too small, you will get problems is kind of obvious. For anyone having
> played with stack sizes with Linux or any proprietary RTOS, at least.

And what about new RTOS/Xenomai users?

You have to take the user's perspective here. The problem with stack overflows is that they occur once development of a program has progressed for a while and the application has reached a certain level of complexity (otherwise the overflow wouldn't have happened in the first place). So it suddenly starts to segfault (from time to time). What the user does is this: he fires up the debugger to get a backtrace, sees trouble, wrongly assumes that gdb can't really handle these Xenomai threads, and tries to eliminate causes of the crashes. The user quickly comes to the conclusion that 'putting it all together' causes the crash (the individual unit tests pass) and looks for a software integration problem. In reality, it's the stack.

If you've been through all this and still came to the correct conclusion the same day, you've been burnt before, or you are the exception.

In my view, 32k is a premature optimization. At least, it shows the side effects of one.

Peter
* Re: [Xenomai-help] native: A 32k stack is not always a 'reasonable' size 2010-07-07 22:31 ` Peter Soetens @ 2010-07-07 23:08 ` Gilles Chanteperdrix 2010-07-08 8:37 ` Philippe Gerum 0 siblings, 1 reply; 21+ messages in thread From: Gilles Chanteperdrix @ 2010-07-07 23:08 UTC (permalink / raw) To: Peter Soetens; +Cc: xenomai-help Peter Soetens wrote: > On Wed, Jul 7, 2010 at 11:19 PM, Gilles Chanteperdrix > <gilles.chanteperdrix@xenomai.org> wrote: >> Peter Soetens wrote: >>> On Wed, Jul 7, 2010 at 11:06 AM, Gilles Chanteperdrix >>> <gilles.chanteperdrix@xenomai.org> wrote: >>>> Peter Soetens wrote: >>>>> At least, not for Orocos applications. We've had hard to debug >>>>> application segfaults that used just a 'little' bit more than 32k. We >>>>> had to raise the stack size to 128k to get reliably through our >>>>> application startup. I stem from the old 'mlockall ate my RAM' >>>>> generation where we typically reduced stack sizes in order to have >>>>> some crumbles left for the heap. But 32k wasn't really what we were >>>>> aiming for. >>>>> >>>>> Maybe we should explicitly document the 32k limit and its limitations >>>>> for certain applications...? >>>> Again, things have been fixed in 2.5.3 with regard to stack sizes, could >>>> you check that you have the same behaviour? >>> I think we had, but I'm uncertain right now. >>> >>>> As for 32KiB, it is only a default stack size, it is only reasonable in >>>> the sense that 2MiB is unreasonable on a low-end system. 32KiB was >>>> picked because it allows printf to work. Now, whatever stack size we >>>> choose, there will be applications which need more, this does not really >>>> make the default unreasonable. >>> I knew you would say that. It deserves an entry in the faq or some >>> trouble shooting document though. >> It is documented. For instance, rt_task_create says: >> stksize The size of the stack (in bytes) for the new task. If >> zero is passed, a reasonable pre-defined size will be substituted. 
>>
>> What else can we say? Documenting that this size is 32 KiB would be
>> wrong, because we do not want applications to rely on a particular
>> value, in case we want to change it. And the fact that if your stack is
>> too small, you will get problems is kind of obvious. For anyone having
>> played with stack sizes with Linux or any proprietary RTOS, at least.
>
> And what with new RTOS/Xenomai users ?
>
> You have to take the user perspective here. The problem with stack
> overflows is that they occur when the development of a program has
> progressed a while and applications reached a certain level of
> complexity (otherwise the overflow wouldn't have happened in the first
> place). So it suddenly starts to segfault (from time to time). What he
> does is this: he fires up the debugger to get a backtrace, sees
> trouble and wrongly assumes that gdb can't really handle these Xenomai
> threads and tries to eliminate causes of the crashes..

Last time I tried, debugging a stack overflow with gdb was possible. You can print the stack pointer and compare the value with the contents of /proc/pid/maps.

> The user comes
> quickly to the conclusion that 'putting it all together' causes the
> crash (the single unit tests pass) and is looking for a software
> integration problem. In reality, it's the stack.
>
> If you've been through all this and then came to the correct
> conclusion the same day, you've been burnt before, or are the
> exception.
>
> In my view, 32k is a premature optimization. At least, it shows the
> side effects of one.

I guess you run Xenomai on one of these big irons, do you? Because if you ran it on a low-end machine, you would understand why we cannot keep the 2MB default limit. 32 KiB already looks like a pretty large limit, so maybe there is a problem in your application?

The I-pipe patch for ARM detects stack overflows; I guess we can modify the kernel to do the same thing on all architectures.

-- Gilles.
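The check Gilles describes (compare the stack pointer with the mappings listed under /proc) can also be done from inside the process. The following is a minimal, Linux-only sketch: it approximates the stack pointer by the address of a local variable and looks that address up in /proc/self/maps. The function names find_mapping and stack_headroom are made up for this illustration. Since the kernel may merge adjacent mappings, the result should be read as an upper bound on the remaining room, not an exact figure.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Look up the mapping in /proc/self/maps that contains addr.
 * Returns 1 and fills *lo / *hi when found, 0 when no mapping matches,
 * and -1 when the maps file cannot be opened. */
int find_mapping(uintptr_t addr, uintptr_t *lo, uintptr_t *hi)
{
    FILE *f = fopen("/proc/self/maps", "r");
    char line[512];
    int found = 0;

    if (!f)
        return -1;
    while (fgets(line, sizeof line, f)) {
        uintptr_t start, end;

        /* Each entry starts with "<start>-<end>" in hexadecimal. */
        if (sscanf(line, "%" SCNxPTR "-%" SCNxPTR, &start, &end) != 2)
            continue;
        if (addr >= start && addr < end) {
            *lo = start;
            *hi = end;
            found = 1;
            break;
        }
    }
    fclose(f);
    return found;
}

/* Approximate number of bytes between the current stack pointer and the
 * low end of the mapping it lives in (stacks grow downward on x86 and ARM).
 * Returns -1 when the mapping cannot be determined. */
long stack_headroom(void)
{
    volatile char marker = 0;              /* lives near the current sp */
    uintptr_t sp = (uintptr_t)&marker, lo, hi;

    if (find_mapping(sp, &lo, &hi) != 1)
        return -1;
    return (long)(sp - lo);
}
```

In gdb the equivalent manual steps are printing $sp in the faulting thread and comparing it against the output of "info proc mappings" or the maps file of the debugged process.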
* Re: [Xenomai-help] native: A 32k stack is not always a 'reasonable' size 2010-07-07 23:08 ` Gilles Chanteperdrix @ 2010-07-08 8:37 ` Philippe Gerum 2010-07-08 8:58 ` Gilles Chanteperdrix 0 siblings, 1 reply; 21+ messages in thread From: Philippe Gerum @ 2010-07-08 8:37 UTC (permalink / raw) To: Gilles Chanteperdrix; +Cc: xenomai-help On Thu, 2010-07-08 at 01:08 +0200, Gilles Chanteperdrix wrote: > Peter Soetens wrote: > > On Wed, Jul 7, 2010 at 11:19 PM, Gilles Chanteperdrix > > <gilles.chanteperdrix@xenomai.org> wrote: > >> Peter Soetens wrote: > >>> On Wed, Jul 7, 2010 at 11:06 AM, Gilles Chanteperdrix > >>> <gilles.chanteperdrix@xenomai.org> wrote: > >>>> Peter Soetens wrote: > >>>>> At least, not for Orocos applications. We've had hard to debug > >>>>> application segfaults that used just a 'little' bit more than 32k. We > >>>>> had to raise the stack size to 128k to get reliably through our > >>>>> application startup. I stem from the old 'mlockall ate my RAM' > >>>>> generation where we typically reduced stack sizes in order to have > >>>>> some crumbles left for the heap. But 32k wasn't really what we were > >>>>> aiming for. > >>>>> > >>>>> Maybe we should explicitly document the 32k limit and its limitations > >>>>> for certain applications...? > >>>> Again, things have been fixed in 2.5.3 with regard to stack sizes, could > >>>> you check that you have the same behaviour? > >>> I think we had, but I'm uncertain right now. > >>> > >>>> As for 32KiB, it is only a default stack size, it is only reasonable in > >>>> the sense that 2MiB is unreasonable on a low-end system. 32KiB was > >>>> picked because it allows printf to work. Now, whatever stack size we > >>>> choose, there will be applications which need more, this does not really > >>>> make the default unreasonable. > >>> I knew you would say that. It deserves an entry in the faq or some > >>> trouble shooting document though. > >> It is documented. 
For instance, rt_task_create says: > >> stksize The size of the stack (in bytes) for the new task. If > >> zero is passed, a reasonable pre-defined size will be substituted. > >> > >> What else can we say? Documenting that this size is 32 KiB would be > >> wrong, because we do not want applications to rely on a particular > >> value, in case we want to change it. And the fact that if your stack is > >> too small, you will get problems is kind of obvious. For anyone having > >> played with stack sizes with Linux or any proprietary RTOS, at least. > > > > And what with new RTOS/Xenomai users ? > > > > You have to take the user perspective here. The problem with stack > > overflows is that they occur when the development of a program has > > progressed a while and applications reached a certain level of > > complexity (otherwise the overflow wouldn't have happend in the first > > place). So it suddenly starts to segfault (from time to time). What he > > does is this: he fires up the debugger to get a backtrace, sees > > trouble and wrongly assumes that gdb can't really handle these Xenomai > > threads and tries to eliminate causes of the crashes.. > > Last time I tried, debugging a stack overflow with gdb was possible. You > can print the stack pointer and compare the value with the contents of > /proc/pid/maps. > > The user comes > > quickly to the conclusion that 'putting it all together' causes the > > crash (the single unit tests pass) and is looking for a software > > integration problem. In reality, it's the stack. > > > > If you've been through all this and then came to the correct > > conclusion the same day, you've been burnt before, or are the > > exception. > > > > In my view, 32k is a premature optimization. At least, it shows the > > side effects of one. > > I guess you run Xenomai on one of these big irons, do you? Because if > you ran on a low-end machine, you would have understand why we can not > keep the 2MB default limit. 
32 KiB looks already like a pretty large > limit, so, maybe there is a problem in your application? > > The I-pipe patch for ARM detects stack overflows, I guess we can modify > the kernel on all architectures to do the same thing on all architectures. > Peter made a good point considering the various braindamage outcomes a stack smashing issue could trigger. I'm unsure whether anyone can immediately suspect a stack overflow to be the cause of any random application behavior; typically, that issue could cause a branch to any random IP value on x86 since the return address is living on the stack and could get trashed, but not necessarily on architectures with branch-and-link registers. In the former case, GDB is of little help, except for single-stepping until the offending statement is reached and we can observe the trashing live, which means that we actually did the work of spotting the issue manually. It turns out that people with large applications and lots of contexts often end up naked in the cold most of the time when facing those things, and the only option left to them is to go backward on the integration path, in order to find a possibly faulty component. Before people can reasonably compare %sp values, they need some help to narrow the search, otherwise, it's hopeless. To this end, maybe an option would be to enable gcc's -fstack-protector[-all] -fstack-check when the debug switch is given to the configure script, provided the compiler in use supports this. Granted, a stack overflow is not identical to a smashing, but quite often the stack memory unduly consumed by a thread belongs to some other memory object, and therefore usually gets trashed when that object is modified. At least, enabling some canary word checking in that case may help. -- Philippe. ^ permalink raw reply [flat|nested] 21+ messages in thread
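To make the canary idea concrete, here is a hand-rolled sketch of canary word checking on a thread stack. It is not what gcc's -fstack-protector does (that option guards each function frame); it only illustrates the principle: place known words at the low end of a caller-provided stack and verify them later. It assumes a downward-growing stack and POSIX threads, and all names (checked_stack, CANARY_WORD, small_task, and so on) are hypothetical.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define CANARY_WORD  UINT32_C(0xDEADBEEF)
#define CANARY_WORDS 64   /* words painted at the low end of the stack */

typedef struct {
    void *mem;     /* lowest address of the stack area */
    size_t size;   /* total size in bytes */
} checked_stack;

/* Allocate a page-aligned stack area and paint its low end with canaries. */
int checked_stack_init(checked_stack *s, size_t size)
{
    uint32_t *p;
    size_t i;

    if (posix_memalign(&s->mem, 4096, size) != 0)
        return -1;
    s->size = size;
    p = s->mem;
    for (i = 0; i < CANARY_WORDS; i++)
        p[i] = CANARY_WORD;
    return 0;
}

/* Returns 1 when every canary word is untouched, 0 otherwise. */
int checked_stack_intact(const checked_stack *s)
{
    const uint32_t *p = s->mem;
    size_t i;

    for (i = 0; i < CANARY_WORDS; i++)
        if (p[i] != CANARY_WORD)
            return 0;
    return 1;
}

/* Run entry(arg) to completion on the checked stack.
 * Returns 0 on success or a pthread error number. */
int run_on_checked_stack(checked_stack *s, void *(*entry)(void *), void *arg)
{
    pthread_attr_t attr;
    pthread_t t;
    int err = pthread_attr_init(&attr);

    if (err)
        return err;
    err = pthread_attr_setstack(&attr, s->mem, s->size);
    if (err == 0)
        err = pthread_create(&t, &attr, entry, arg);
    pthread_attr_destroy(&attr);
    if (err == 0)
        err = pthread_join(t, NULL);
    return err;
}

/* Demo task with a modest stack footprint. */
void *small_task(void *arg)
{
    volatile char buf[1024];

    memset((void *)buf, 0, sizeof buf);
    return arg;
}
```

A periodic call to checked_stack_intact() from a supervising thread reports that a stack came close to, or ran past, its low end, even on systems without an MMU where no guard page can fault.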
* Re: [Xenomai-help] native: A 32k stack is not always a 'reasonable' size 2010-07-08 8:37 ` Philippe Gerum @ 2010-07-08 8:58 ` Gilles Chanteperdrix 2010-07-08 9:31 ` Philippe Gerum 2010-07-08 9:50 ` Philippe Gerum 0 siblings, 2 replies; 21+ messages in thread From: Gilles Chanteperdrix @ 2010-07-08 8:58 UTC (permalink / raw) To: Philippe Gerum; +Cc: xenomai-help Philippe Gerum wrote: > On Thu, 2010-07-08 at 01:08 +0200, Gilles Chanteperdrix wrote: >> Peter Soetens wrote: >>> On Wed, Jul 7, 2010 at 11:19 PM, Gilles Chanteperdrix >>> <gilles.chanteperdrix@xenomai.org> wrote: >>>> Peter Soetens wrote: >>>>> On Wed, Jul 7, 2010 at 11:06 AM, Gilles Chanteperdrix >>>>> <gilles.chanteperdrix@xenomai.org> wrote: >>>>>> Peter Soetens wrote: >>>>>>> At least, not for Orocos applications. We've had hard to debug >>>>>>> application segfaults that used just a 'little' bit more than 32k. We >>>>>>> had to raise the stack size to 128k to get reliably through our >>>>>>> application startup. I stem from the old 'mlockall ate my RAM' >>>>>>> generation where we typically reduced stack sizes in order to have >>>>>>> some crumbles left for the heap. But 32k wasn't really what we were >>>>>>> aiming for. >>>>>>> >>>>>>> Maybe we should explicitly document the 32k limit and its limitations >>>>>>> for certain applications...? >>>>>> Again, things have been fixed in 2.5.3 with regard to stack sizes, could >>>>>> you check that you have the same behaviour? >>>>> I think we had, but I'm uncertain right now. >>>>> >>>>>> As for 32KiB, it is only a default stack size, it is only reasonable in >>>>>> the sense that 2MiB is unreasonable on a low-end system. 32KiB was >>>>>> picked because it allows printf to work. Now, whatever stack size we >>>>>> choose, there will be applications which need more, this does not really >>>>>> make the default unreasonable. >>>>> I knew you would say that. It deserves an entry in the faq or some >>>>> trouble shooting document though. >>>> It is documented. 
For instance, rt_task_create says: >>>> stksize The size of the stack (in bytes) for the new task. If >>>> zero is passed, a reasonable pre-defined size will be substituted. >>>> >>>> What else can we say? Documenting that this size is 32 KiB would be >>>> wrong, because we do not want applications to rely on a particular >>>> value, in case we want to change it. And the fact that if your stack is >>>> too small, you will get problems is kind of obvious. For anyone having >>>> played with stack sizes with Linux or any proprietary RTOS, at least. >>> And what with new RTOS/Xenomai users ? >>> >>> You have to take the user perspective here. The problem with stack >>> overflows is that they occur when the development of a program has >>> progressed a while and applications reached a certain level of >>> complexity (otherwise the overflow wouldn't have happend in the first >>> place). So it suddenly starts to segfault (from time to time). What he >>> does is this: he fires up the debugger to get a backtrace, sees >>> trouble and wrongly assumes that gdb can't really handle these Xenomai >>> threads and tries to eliminate causes of the crashes.. >> Last time I tried, debugging a stack overflow with gdb was possible. You >> can print the stack pointer and compare the value with the contents of >> /proc/pid/maps. >> >> The user comes >>> quickly to the conclusion that 'putting it all together' causes the >>> crash (the single unit tests pass) and is looking for a software >>> integration problem. In reality, it's the stack. >>> >>> If you've been through all this and then came to the correct >>> conclusion the same day, you've been burnt before, or are the >>> exception. >>> >>> In my view, 32k is a premature optimization. At least, it shows the >>> side effects of one. >> I guess you run Xenomai on one of these big irons, do you? Because if >> you ran on a low-end machine, you would have understand why we can not >> keep the 2MB default limit. 
32 KiB looks already like a pretty large >> limit, so, maybe there is a problem in your application? >> >> The I-pipe patch for ARM detects stack overflows, I guess we can modify >> the kernel on all architectures to do the same thing on all architectures. >> > > Peter made a good point considering the various braindamage outcomes a > stack smashing issue could trigger. I'm unsure whether anyone can > immediately suspect a stack overflow to be the cause of any random > application behavior; typically, that issue could cause a branch to any > random IP value on x86 since the return address is living on the stack > and could get trashed, but not necessarily on architectures with > branch-and-link registers. In the former case, GDB is of little help, > except for single-stepping until the offending statement is reached and > we can observe the trashing live, which means that we actually did the > work of spotting the issue manually. > > It turns out that people with large applications and lots of contexts > often end up naked in the cold most of the time when facing those > things, and the only option left to them is to go backward on the > integration path, in order to find a possibly faulty component. Before > people can reasonably compare %sp values, they need some help to narrow > the search, otherwise, it's hopeless. > > To this end, maybe an option would be to enable gcc's > -fstack-protector[-all] -fstack-check when the debug switch is given to > the configure script, provided the compiler in use supports this. > > Granted, a stack overflow is not identical to a smashing, but quite > often the stack memory unduly consumed by a thread belongs to some other > memory object, and therefore usually gets trashed when that object is > modified. At least, enabling some canary word checking in that case may > help. I do not think so. The glibc maps an unreadable/unwritable page below the stack. So, what you get is a segmentation fault. 
Unless, of course, you overflow more than one page. But we can map more than one page by using pthread_attr_setguardsize, if one page is not enough.

We can detect the stack overflow in kernel-space, where it is easy to detect. The problem is that x86 users, who are the ones most likely to be hit by a stack overflow, may not be watching the console, and so may not see the message.

Or we can install a handler for SIGSEGV which detects stack overflows (it will be a little harder than in kernel-space) and prints a clear message in that case, but we will have to use an alternate stack for the signal handler (obviously, the SIGSEGV handler cannot run on the very stack that has just overflowed).

Or we can increase the default stack size, but in my view, we will only be delaying the problem a bit further down the "new users" development process.

-- Gilles.
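As a small illustration of the guard-size option mentioned above, the following sketch prepares thread attributes with an explicit stack size and a guard area several pages wide. The helper name attr_with_guard and the page counts are assumptions for the example; only the pthread_attr_* and sysconf calls are standard.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <stddef.h>
#include <unistd.h>

/* Initialise *attr with the given stack size and a guard area of
 * `pages` pages below the stack. Returns 0 on success, a pthread error
 * number on failure, or -1 if the page size cannot be determined.
 * On success the caller owns *attr and must destroy it. */
int attr_with_guard(pthread_attr_t *attr, size_t stack_bytes, size_t pages)
{
    long page = sysconf(_SC_PAGESIZE);
    int err;

    if (page <= 0)
        return -1;
    err = pthread_attr_init(attr);
    if (err)
        return err;
    err = pthread_attr_setstacksize(attr, stack_bytes);
    if (err == 0)
        err = pthread_attr_setguardsize(attr, pages * (size_t)page);
    if (err)
        pthread_attr_destroy(attr);
    return err;
}
```

The resulting attribute object is passed to pthread_create() as usual. A wider guard only helps against overflows that touch memory within the guard area; a single large stack allocation can still jump over it.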
* Re: [Xenomai-help] native: A 32k stack is not always a 'reasonable' size 2010-07-08 8:58 ` Gilles Chanteperdrix @ 2010-07-08 9:31 ` Philippe Gerum 2010-07-08 9:35 ` Gilles Chanteperdrix 2010-07-08 9:50 ` Philippe Gerum 1 sibling, 1 reply; 21+ messages in thread From: Philippe Gerum @ 2010-07-08 9:31 UTC (permalink / raw) To: Gilles Chanteperdrix; +Cc: xenomai-help On Thu, 2010-07-08 at 10:58 +0200, Gilles Chanteperdrix wrote: > Philippe Gerum wrote: > > On Thu, 2010-07-08 at 01:08 +0200, Gilles Chanteperdrix wrote: > >> Peter Soetens wrote: > >>> On Wed, Jul 7, 2010 at 11:19 PM, Gilles Chanteperdrix > >>> <gilles.chanteperdrix@xenomai.org> wrote: > >>>> Peter Soetens wrote: > >>>>> On Wed, Jul 7, 2010 at 11:06 AM, Gilles Chanteperdrix > >>>>> <gilles.chanteperdrix@xenomai.org> wrote: > >>>>>> Peter Soetens wrote: > >>>>>>> At least, not for Orocos applications. We've had hard to debug > >>>>>>> application segfaults that used just a 'little' bit more than 32k. We > >>>>>>> had to raise the stack size to 128k to get reliably through our > >>>>>>> application startup. I stem from the old 'mlockall ate my RAM' > >>>>>>> generation where we typically reduced stack sizes in order to have > >>>>>>> some crumbles left for the heap. But 32k wasn't really what we were > >>>>>>> aiming for. > >>>>>>> > >>>>>>> Maybe we should explicitly document the 32k limit and its limitations > >>>>>>> for certain applications...? > >>>>>> Again, things have been fixed in 2.5.3 with regard to stack sizes, could > >>>>>> you check that you have the same behaviour? > >>>>> I think we had, but I'm uncertain right now. > >>>>> > >>>>>> As for 32KiB, it is only a default stack size, it is only reasonable in > >>>>>> the sense that 2MiB is unreasonable on a low-end system. 32KiB was > >>>>>> picked because it allows printf to work. Now, whatever stack size we > >>>>>> choose, there will be applications which need more, this does not really > >>>>>> make the default unreasonable. 
> >>>>> I knew you would say that. It deserves an entry in the faq or some > >>>>> trouble shooting document though. > >>>> It is documented. For instance, rt_task_create says: > >>>> stksize The size of the stack (in bytes) for the new task. If > >>>> zero is passed, a reasonable pre-defined size will be substituted. > >>>> > >>>> What else can we say? Documenting that this size is 32 KiB would be > >>>> wrong, because we do not want applications to rely on a particular > >>>> value, in case we want to change it. And the fact that if your stack is > >>>> too small, you will get problems is kind of obvious. For anyone having > >>>> played with stack sizes with Linux or any proprietary RTOS, at least. > >>> And what with new RTOS/Xenomai users ? > >>> > >>> You have to take the user perspective here. The problem with stack > >>> overflows is that they occur when the development of a program has > >>> progressed a while and applications reached a certain level of > >>> complexity (otherwise the overflow wouldn't have happend in the first > >>> place). So it suddenly starts to segfault (from time to time). What he > >>> does is this: he fires up the debugger to get a backtrace, sees > >>> trouble and wrongly assumes that gdb can't really handle these Xenomai > >>> threads and tries to eliminate causes of the crashes.. > >> Last time I tried, debugging a stack overflow with gdb was possible. You > >> can print the stack pointer and compare the value with the contents of > >> /proc/pid/maps. > >> > >> The user comes > >>> quickly to the conclusion that 'putting it all together' causes the > >>> crash (the single unit tests pass) and is looking for a software > >>> integration problem. In reality, it's the stack. > >>> > >>> If you've been through all this and then came to the correct > >>> conclusion the same day, you've been burnt before, or are the > >>> exception. > >>> > >>> In my view, 32k is a premature optimization. 
At least, it shows the > >>> side effects of one. > >> I guess you run Xenomai on one of these big irons, do you? Because if > >> you ran on a low-end machine, you would have understand why we can not > >> keep the 2MB default limit. 32 KiB looks already like a pretty large > >> limit, so, maybe there is a problem in your application? > >> > >> The I-pipe patch for ARM detects stack overflows, I guess we can modify > >> the kernel on all architectures to do the same thing on all architectures. > >> > > > > Peter made a good point considering the various braindamage outcomes a > > stack smashing issue could trigger. I'm unsure whether anyone can > > immediately suspect a stack overflow to be the cause of any random > > application behavior; typically, that issue could cause a branch to any > > random IP value on x86 since the return address is living on the stack > > and could get trashed, but not necessarily on architectures with > > branch-and-link registers. In the former case, GDB is of little help, > > except for single-stepping until the offending statement is reached and > > we can observe the trashing live, which means that we actually did the > > work of spotting the issue manually. > > > > It turns out that people with large applications and lots of contexts > > often end up naked in the cold most of the time when facing those > > things, and the only option left to them is to go backward on the > > integration path, in order to find a possibly faulty component. Before > > people can reasonably compare %sp values, they need some help to narrow > > the search, otherwise, it's hopeless. > > > > To this end, maybe an option would be to enable gcc's > > -fstack-protector[-all] -fstack-check when the debug switch is given to > > the configure script, provided the compiler in use supports this. 
> > > > Granted, a stack overflow is not identical to a smashing, but quite > > often the stack memory unduly consumed by a thread belongs to some other > > memory object, and therefore usually gets trashed when that object is > > modified. At least, enabling some canary word checking in that case may > > help. > > I do not think so. The glibc maps an unreadable/unwritable page below > the stack. So, what you get is a segmentation fault. Unless, of course, > you overflow more than one page. But we can map more than one page by > using pthread_attr_setguardsize, if one page is not enough.

The page guard is restricted to MMU-enabled systems, and two out of six of our architectures run without an MMU. In this case, the only option left that may work is the stack protector based on canary word checking.

Relying on pthread_attr_setguardsize() when available will trigger the same amount of uncertainty as we have now with setting the minimum stack size. Which guard value would be a sane default? One, two, four pages?

> > We can detect the stack overflow in kernel-space, there it is easy to > detect, the problem is that x86 users, which are the ones more likely to > be hit by a stack overflow, may not be watching the console, so may not > see the message. >

Kernel-space is another issue: people writing applications in kernel space are mostly on their own these days, and others implementing drivers are expected to always consider stack space as a scarce resource anyway. But helping with solving userland problems seems to be the most urgent thing to do, since common practices in that environment may conflict badly with real-time restrictions and requirements.

> Or we can install a handler for SIGSEGV which detects stack overflows > (it will be a litlle harder than in kernel-space) and prints a clear > message in that case but we will have to use an alternate stack for the > signal handler (obviously, the SIGSEGV handler can not be stacked over > the stack overflow).
> > Or we can increase the default stack size, but in my view, we will only > be delaying the problem a bit further down the "new users" development > process. >

I agree with your view here, but this also creates the requirement for helping people to detect stack trashing early enough.

--
Philippe.
* Re: [Xenomai-help] native: A 32k stack is not always a 'reasonable' size 2010-07-08 9:31 ` Philippe Gerum @ 2010-07-08 9:35 ` Gilles Chanteperdrix 2010-07-08 9:58 ` Philippe Gerum 0 siblings, 1 reply; 21+ messages in thread From: Gilles Chanteperdrix @ 2010-07-08 9:35 UTC (permalink / raw) To: Philippe Gerum; +Cc: xenomai-help Philippe Gerum wrote: > On Thu, 2010-07-08 at 10:58 +0200, Gilles Chanteperdrix wrote: >> Philippe Gerum wrote: >>> On Thu, 2010-07-08 at 01:08 +0200, Gilles Chanteperdrix wrote: >>>> Peter Soetens wrote: >>>>> On Wed, Jul 7, 2010 at 11:19 PM, Gilles Chanteperdrix >>>>> <gilles.chanteperdrix@xenomai.org> wrote: >>>>>> Peter Soetens wrote: >>>>>>> On Wed, Jul 7, 2010 at 11:06 AM, Gilles Chanteperdrix >>>>>>> <gilles.chanteperdrix@xenomai.org> wrote: >>>>>>>> Peter Soetens wrote: >>>>>>>>> At least, not for Orocos applications. We've had hard to debug >>>>>>>>> application segfaults that used just a 'little' bit more than 32k. We >>>>>>>>> had to raise the stack size to 128k to get reliably through our >>>>>>>>> application startup. I stem from the old 'mlockall ate my RAM' >>>>>>>>> generation where we typically reduced stack sizes in order to have >>>>>>>>> some crumbles left for the heap. But 32k wasn't really what we were >>>>>>>>> aiming for. >>>>>>>>> >>>>>>>>> Maybe we should explicitly document the 32k limit and its limitations >>>>>>>>> for certain applications...? >>>>>>>> Again, things have been fixed in 2.5.3 with regard to stack sizes, could >>>>>>>> you check that you have the same behaviour? >>>>>>> I think we had, but I'm uncertain right now. >>>>>>> >>>>>>>> As for 32KiB, it is only a default stack size, it is only reasonable in >>>>>>>> the sense that 2MiB is unreasonable on a low-end system. 32KiB was >>>>>>>> picked because it allows printf to work. Now, whatever stack size we >>>>>>>> choose, there will be applications which need more, this does not really >>>>>>>> make the default unreasonable. 
>>>>>>> I knew you would say that. It deserves an entry in the faq or some >>>>>>> trouble shooting document though. >>>>>> It is documented. For instance, rt_task_create says: >>>>>> stksize The size of the stack (in bytes) for the new task. If >>>>>> zero is passed, a reasonable pre-defined size will be substituted. >>>>>> >>>>>> What else can we say? Documenting that this size is 32 KiB would be >>>>>> wrong, because we do not want applications to rely on a particular >>>>>> value, in case we want to change it. And the fact that if your stack is >>>>>> too small, you will get problems is kind of obvious. For anyone having >>>>>> played with stack sizes with Linux or any proprietary RTOS, at least. >>>>> And what with new RTOS/Xenomai users ? >>>>> >>>>> You have to take the user perspective here. The problem with stack >>>>> overflows is that they occur when the development of a program has >>>>> progressed a while and applications reached a certain level of >>>>> complexity (otherwise the overflow wouldn't have happend in the first >>>>> place). So it suddenly starts to segfault (from time to time). What he >>>>> does is this: he fires up the debugger to get a backtrace, sees >>>>> trouble and wrongly assumes that gdb can't really handle these Xenomai >>>>> threads and tries to eliminate causes of the crashes.. >>>> Last time I tried, debugging a stack overflow with gdb was possible. You >>>> can print the stack pointer and compare the value with the contents of >>>> /proc/pid/maps. >>>> >>>> The user comes >>>>> quickly to the conclusion that 'putting it all together' causes the >>>>> crash (the single unit tests pass) and is looking for a software >>>>> integration problem. In reality, it's the stack. >>>>> >>>>> If you've been through all this and then came to the correct >>>>> conclusion the same day, you've been burnt before, or are the >>>>> exception. >>>>> >>>>> In my view, 32k is a premature optimization. 
At least, it shows the >>>>> side effects of one. >>>> I guess you run Xenomai on one of these big irons, do you? Because if >>>> you ran on a low-end machine, you would have understand why we can not >>>> keep the 2MB default limit. 32 KiB looks already like a pretty large >>>> limit, so, maybe there is a problem in your application? >>>> >>>> The I-pipe patch for ARM detects stack overflows, I guess we can modify >>>> the kernel on all architectures to do the same thing on all architectures. >>>> >>> Peter made a good point considering the various braindamage outcomes a >>> stack smashing issue could trigger. I'm unsure whether anyone can >>> immediately suspect a stack overflow to be the cause of any random >>> application behavior; typically, that issue could cause a branch to any >>> random IP value on x86 since the return address is living on the stack >>> and could get trashed, but not necessarily on architectures with >>> branch-and-link registers. In the former case, GDB is of little help, >>> except for single-stepping until the offending statement is reached and >>> we can observe the trashing live, which means that we actually did the >>> work of spotting the issue manually. >>> >>> It turns out that people with large applications and lots of contexts >>> often end up naked in the cold most of the time when facing those >>> things, and the only option left to them is to go backward on the >>> integration path, in order to find a possibly faulty component. Before >>> people can reasonably compare %sp values, they need some help to narrow >>> the search, otherwise, it's hopeless. >>> >>> To this end, maybe an option would be to enable gcc's >>> -fstack-protector[-all] -fstack-check when the debug switch is given to >>> the configure script, provided the compiler in use supports this. 
>>> >>> Granted, a stack overflow is not identical to a smashing, but quite >>> often the stack memory unduly consumed by a thread belongs to some other >>> memory object, and therefore usually gets trashed when that object is >>> modified. At least, enabling some canary word checking in that case may >>> help. >> I do not think so. The glibc maps an unreadable/unwritable page below >> the stack. So, what you get is a segmentation fault. Unless, of course, >> you overflow more than one page. But we can map more than one page by >> using pthread_attr_setguardsize, if one page is not enough. > > The page guard is restricted to MMU-enabled systems, we have two over > six of our architectures running without MMU. In this case, the only > option left that may work is the stack protector based on the canary > word checking. > > Relying on pthread_attr_setguardsize() when available will trigger the > same amount of uncertainty than we have now with setting the minimum > stack size. Which guard value would a sane default? one, two, four > pages? > >> We can detect the stack overflow in kernel-space, there it is easy to >> detect, the problem is that x86 users, which are the ones more likely to >> be hit by a stack overflow, may not be watching the console, so may not >> see the message. >> > > Kernel-space is another issue, people writing applications in kernel > space are mostly on their own these days, and others implementing > drivers are expected to always consider stack space as a scarce resource > anyway. But helping with solving userland problems seems to be the most > urgent thing to do, since common practices in that environment may > conflict badly with real-time restrictions and requirements. I mean detecting the user-space stack overflows when handling user-space page faults in kernel-space. But granted, that also only works for systems with an MMU. 
The following piece of code does it in the I-pipe patch for ARM with FCSE enabled:

+	down_read(&mm->mmap_sem);
+	if (find_vma(mm, addr) == find_vma(mm, regs->ARM_sp))
+		printk(KERN_INFO "FCSE: process %u(%s) probably overflowed stack at 0x%08lx.\n",
+		       current->pid, current->comm, regs->ARM_pc);
+	up_read(&mm->mmap_sem);

> >> Or we can install a handler for SIGSEGV which detects stack overflows >> (it will be a litlle harder than in kernel-space) and prints a clear >> message in that case but we will have to use an alternate stack for the >> signal handler (obviously, the SIGSEGV handler can not be stacked over >> the stack overflow). >> >> Or we can increase the default stack size, but in my view, we will only >> be delaying the problem a bit further down the "new users" development >> process. >> > > I agree with your view here, but this also creates the requirement for > helping people to detect stack trashing early enough. >

--
Gilles.
* Re: [Xenomai-help] native: A 32k stack is not always a 'reasonable' size 2010-07-08 9:35 ` Gilles Chanteperdrix @ 2010-07-08 9:58 ` Philippe Gerum 2010-07-08 10:04 ` Gilles Chanteperdrix 2010-07-08 11:52 ` Gilles Chanteperdrix 0 siblings, 2 replies; 21+ messages in thread From: Philippe Gerum @ 2010-07-08 9:58 UTC (permalink / raw) To: Gilles Chanteperdrix; +Cc: xenomai-help On Thu, 2010-07-08 at 11:35 +0200, Gilles Chanteperdrix wrote: > Philippe Gerum wrote: > > On Thu, 2010-07-08 at 10:58 +0200, Gilles Chanteperdrix wrote: > >> Philippe Gerum wrote: > >>> On Thu, 2010-07-08 at 01:08 +0200, Gilles Chanteperdrix wrote: > >>>> Peter Soetens wrote: > >>>>> On Wed, Jul 7, 2010 at 11:19 PM, Gilles Chanteperdrix > >>>>> <gilles.chanteperdrix@xenomai.org> wrote: > >>>>>> Peter Soetens wrote: > >>>>>>> On Wed, Jul 7, 2010 at 11:06 AM, Gilles Chanteperdrix > >>>>>>> <gilles.chanteperdrix@xenomai.org> wrote: > >>>>>>>> Peter Soetens wrote: > >>>>>>>>> At least, not for Orocos applications. We've had hard to debug > >>>>>>>>> application segfaults that used just a 'little' bit more than 32k. We > >>>>>>>>> had to raise the stack size to 128k to get reliably through our > >>>>>>>>> application startup. I stem from the old 'mlockall ate my RAM' > >>>>>>>>> generation where we typically reduced stack sizes in order to have > >>>>>>>>> some crumbles left for the heap. But 32k wasn't really what we were > >>>>>>>>> aiming for. > >>>>>>>>> > >>>>>>>>> Maybe we should explicitly document the 32k limit and its limitations > >>>>>>>>> for certain applications...? > >>>>>>>> Again, things have been fixed in 2.5.3 with regard to stack sizes, could > >>>>>>>> you check that you have the same behaviour? > >>>>>>> I think we had, but I'm uncertain right now. > >>>>>>> > >>>>>>>> As for 32KiB, it is only a default stack size, it is only reasonable in > >>>>>>>> the sense that 2MiB is unreasonable on a low-end system. 32KiB was > >>>>>>>> picked because it allows printf to work. 
Now, whatever stack size we > >>>>>>>> choose, there will be applications which need more, this does not really > >>>>>>>> make the default unreasonable. > >>>>>>> I knew you would say that. It deserves an entry in the faq or some > >>>>>>> trouble shooting document though. > >>>>>> It is documented. For instance, rt_task_create says: > >>>>>> stksize The size of the stack (in bytes) for the new task. If > >>>>>> zero is passed, a reasonable pre-defined size will be substituted. > >>>>>> > >>>>>> What else can we say? Documenting that this size is 32 KiB would be > >>>>>> wrong, because we do not want applications to rely on a particular > >>>>>> value, in case we want to change it. And the fact that if your stack is > >>>>>> too small, you will get problems is kind of obvious. For anyone having > >>>>>> played with stack sizes with Linux or any proprietary RTOS, at least. > >>>>> And what with new RTOS/Xenomai users ? > >>>>> > >>>>> You have to take the user perspective here. The problem with stack > >>>>> overflows is that they occur when the development of a program has > >>>>> progressed a while and applications reached a certain level of > >>>>> complexity (otherwise the overflow wouldn't have happend in the first > >>>>> place). So it suddenly starts to segfault (from time to time). What he > >>>>> does is this: he fires up the debugger to get a backtrace, sees > >>>>> trouble and wrongly assumes that gdb can't really handle these Xenomai > >>>>> threads and tries to eliminate causes of the crashes.. > >>>> Last time I tried, debugging a stack overflow with gdb was possible. You > >>>> can print the stack pointer and compare the value with the contents of > >>>> /proc/pid/maps. > >>>> > >>>> The user comes > >>>>> quickly to the conclusion that 'putting it all together' causes the > >>>>> crash (the single unit tests pass) and is looking for a software > >>>>> integration problem. In reality, it's the stack. 
> >>>>> > >>>>> If you've been through all this and then came to the correct > >>>>> conclusion the same day, you've been burnt before, or are the > >>>>> exception. > >>>>> > >>>>> In my view, 32k is a premature optimization. At least, it shows the > >>>>> side effects of one. > >>>> I guess you run Xenomai on one of these big irons, do you? Because if > >>>> you ran on a low-end machine, you would have understand why we can not > >>>> keep the 2MB default limit. 32 KiB looks already like a pretty large > >>>> limit, so, maybe there is a problem in your application? > >>>> > >>>> The I-pipe patch for ARM detects stack overflows, I guess we can modify > >>>> the kernel on all architectures to do the same thing on all architectures. > >>>> > >>> Peter made a good point considering the various braindamage outcomes a > >>> stack smashing issue could trigger. I'm unsure whether anyone can > >>> immediately suspect a stack overflow to be the cause of any random > >>> application behavior; typically, that issue could cause a branch to any > >>> random IP value on x86 since the return address is living on the stack > >>> and could get trashed, but not necessarily on architectures with > >>> branch-and-link registers. In the former case, GDB is of little help, > >>> except for single-stepping until the offending statement is reached and > >>> we can observe the trashing live, which means that we actually did the > >>> work of spotting the issue manually. > >>> > >>> It turns out that people with large applications and lots of contexts > >>> often end up naked in the cold most of the time when facing those > >>> things, and the only option left to them is to go backward on the > >>> integration path, in order to find a possibly faulty component. Before > >>> people can reasonably compare %sp values, they need some help to narrow > >>> the search, otherwise, it's hopeless. 
> >>> > >>> To this end, maybe an option would be to enable gcc's > >>> -fstack-protector[-all] -fstack-check when the debug switch is given to > >>> the configure script, provided the compiler in use supports this. > >>> > >>> Granted, a stack overflow is not identical to a smashing, but quite > >>> often the stack memory unduly consumed by a thread belongs to some other > >>> memory object, and therefore usually gets trashed when that object is > >>> modified. At least, enabling some canary word checking in that case may > >>> help. > >> I do not think so. The glibc maps an unreadable/unwritable page below > >> the stack. So, what you get is a segmentation fault. Unless, of course, > >> you overflow more than one page. But we can map more than one page by > >> using pthread_attr_setguardsize, if one page is not enough. > > > > The page guard is restricted to MMU-enabled systems, we have two over > > six of our architectures running without MMU. In this case, the only > > option left that may work is the stack protector based on the canary > > word checking. > > > > Relying on pthread_attr_setguardsize() when available will trigger the > > same amount of uncertainty than we have now with setting the minimum > > stack size. Which guard value would a sane default? one, two, four > > pages? > > > >> We can detect the stack overflow in kernel-space, there it is easy to > >> detect, the problem is that x86 users, which are the ones more likely to > >> be hit by a stack overflow, may not be watching the console, so may not > >> see the message. > >> > > > > Kernel-space is another issue, people writing applications in kernel > > space are mostly on their own these days, and others implementing > > drivers are expected to always consider stack space as a scarce resource > > anyway. 
But helping with solving userland problems seems to be the most > > urgent thing to do, since common practices in that environment may > > conflict badly with real-time restrictions and requirements. > > I mean detecting the user-space stack overflows when handling user-space > page faults in kernel-space. But granted, that also only works for > systems with an MMU. The following piece of code does it in the I-pipe > patch for ARM with FCSE enabled: > > + down_read(&mm->mmap_sem); > + if (find_vma(mm, addr) == find_vma(mm, regs->ARM_sp)) > + printk(KERN_INFO "FCSE: process %u(%s) probably overflowed stack > at 0x%08lx.\n", > + current->pid, current->comm, regs->ARM_pc); > + up_read(&mm->mmap_sem); >

My understanding is that such code detects faulty references within the _valid_ address space, typically when hitting a page guard area. But I guess that this won't work when treading on stack memory outside of the address space, e.g. below the red zone, will it? AFAIU, those things may happen when the head of preposterously large stack-based objects is addressed.

> > > > >> Or we can install a handler for SIGSEGV which detects stack overflows > >> (it will be a litlle harder than in kernel-space) and prints a clear > >> message in that case but we will have to use an alternate stack for the > >> signal handler (obviously, the SIGSEGV handler can not be stacked over > >> the stack overflow). > >> > >> Or we can increase the default stack size, but in my view, we will only > >> be delaying the problem a bit further down the "new users" development > >> process. > >> > > > > I agree with your view here, but this also creates the requirement for > > helping people to detect stack trashing early enough. > > > >

--
Philippe.
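The concern about very large stack objects can be made concrete with a little address arithmetic. The sketch below is only a model with made-up names (classify_frame_bottom(), enum touch): it assumes a downward-growing stack whose lowest valid address is stack_lo, with guard_bytes of guard area directly beneath it, and classifies the lowest address of a frame of frame_bytes allocated at sp.

```c
#include <stdint.h>

enum touch {
    TOUCH_STACK,   /* still inside the thread's stack */
    TOUCH_GUARD,   /* inside the guard area: clean fault */
    TOUCH_BEYOND   /* below the guard: whatever is mapped there */
};

/*
 * Model only. The caller guarantees sp >= frame_bytes and
 * stack_lo >= guard_bytes, so the subtractions cannot wrap.
 */
enum touch classify_frame_bottom(uintptr_t sp, uintptr_t frame_bytes,
                                 uintptr_t stack_lo, uintptr_t guard_bytes)
{
    uintptr_t bottom = sp - frame_bytes;

    if (bottom >= stack_lo)
        return TOUCH_STACK;
    if (bottom >= stack_lo - guard_bytes)
        return TOUCH_GUARD;
    return TOUCH_BEYOND;
}
```

With 1 KiB of stack left and a 4 KiB guard, a 2 KiB frame ends in the guard area, while a 64 KiB frame ends far below it, which is the case a guard-page based check cannot see.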
* Re: [Xenomai-help] native: A 32k stack is not always a 'reasonable' size 2010-07-08 9:58 ` Philippe Gerum @ 2010-07-08 10:04 ` Gilles Chanteperdrix 2010-07-08 10:09 ` Gilles Chanteperdrix 2010-07-08 11:52 ` Gilles Chanteperdrix 1 sibling, 1 reply; 21+ messages in thread From: Gilles Chanteperdrix @ 2010-07-08 10:04 UTC (permalink / raw) To: Philippe Gerum; +Cc: xenomai-help Philippe Gerum wrote: > On Thu, 2010-07-08 at 11:35 +0200, Gilles Chanteperdrix wrote: >> Philippe Gerum wrote: >>> On Thu, 2010-07-08 at 10:58 +0200, Gilles Chanteperdrix wrote: >>>> Philippe Gerum wrote: >>>>> On Thu, 2010-07-08 at 01:08 +0200, Gilles Chanteperdrix wrote: >>>>>> Peter Soetens wrote: >>>>>>> On Wed, Jul 7, 2010 at 11:19 PM, Gilles Chanteperdrix >>>>>>> <gilles.chanteperdrix@xenomai.org> wrote: >>>>>>>> Peter Soetens wrote: >>>>>>>>> On Wed, Jul 7, 2010 at 11:06 AM, Gilles Chanteperdrix >>>>>>>>> <gilles.chanteperdrix@xenomai.org> wrote: >>>>>>>>>> Peter Soetens wrote: >>>>>>>>>>> At least, not for Orocos applications. We've had hard to debug >>>>>>>>>>> application segfaults that used just a 'little' bit more than 32k. We >>>>>>>>>>> had to raise the stack size to 128k to get reliably through our >>>>>>>>>>> application startup. I stem from the old 'mlockall ate my RAM' >>>>>>>>>>> generation where we typically reduced stack sizes in order to have >>>>>>>>>>> some crumbles left for the heap. But 32k wasn't really what we were >>>>>>>>>>> aiming for. >>>>>>>>>>> >>>>>>>>>>> Maybe we should explicitly document the 32k limit and its limitations >>>>>>>>>>> for certain applications...? >>>>>>>>>> Again, things have been fixed in 2.5.3 with regard to stack sizes, could >>>>>>>>>> you check that you have the same behaviour? >>>>>>>>> I think we had, but I'm uncertain right now. >>>>>>>>> >>>>>>>>>> As for 32KiB, it is only a default stack size, it is only reasonable in >>>>>>>>>> the sense that 2MiB is unreasonable on a low-end system. 
32KiB was >>>>>>>>>> picked because it allows printf to work. Now, whatever stack size we >>>>>>>>>> choose, there will be applications which need more, this does not really >>>>>>>>>> make the default unreasonable. >>>>>>>>> I knew you would say that. It deserves an entry in the faq or some >>>>>>>>> trouble shooting document though. >>>>>>>> It is documented. For instance, rt_task_create says: >>>>>>>> stksize The size of the stack (in bytes) for the new task. If >>>>>>>> zero is passed, a reasonable pre-defined size will be substituted. >>>>>>>> >>>>>>>> What else can we say? Documenting that this size is 32 KiB would be >>>>>>>> wrong, because we do not want applications to rely on a particular >>>>>>>> value, in case we want to change it. And the fact that if your stack is >>>>>>>> too small, you will get problems is kind of obvious. For anyone having >>>>>>>> played with stack sizes with Linux or any proprietary RTOS, at least. >>>>>>> And what with new RTOS/Xenomai users ? >>>>>>> >>>>>>> You have to take the user perspective here. The problem with stack >>>>>>> overflows is that they occur when the development of a program has >>>>>>> progressed a while and applications reached a certain level of >>>>>>> complexity (otherwise the overflow wouldn't have happend in the first >>>>>>> place). So it suddenly starts to segfault (from time to time). What he >>>>>>> does is this: he fires up the debugger to get a backtrace, sees >>>>>>> trouble and wrongly assumes that gdb can't really handle these Xenomai >>>>>>> threads and tries to eliminate causes of the crashes.. >>>>>> Last time I tried, debugging a stack overflow with gdb was possible. You >>>>>> can print the stack pointer and compare the value with the contents of >>>>>> /proc/pid/maps. >>>>>> >>>>>> The user comes >>>>>>> quickly to the conclusion that 'putting it all together' causes the >>>>>>> crash (the single unit tests pass) and is looking for a software >>>>>>> integration problem. 
In reality, it's the stack. >>>>>>> >>>>>>> If you've been through all this and then came to the correct >>>>>>> conclusion the same day, you've been burnt before, or are the >>>>>>> exception. >>>>>>> >>>>>>> In my view, 32k is a premature optimization. At least, it shows the >>>>>>> side effects of one. >>>>>> I guess you run Xenomai on one of these big irons, do you? Because if >>>>>> you ran on a low-end machine, you would have understand why we can not >>>>>> keep the 2MB default limit. 32 KiB looks already like a pretty large >>>>>> limit, so, maybe there is a problem in your application? >>>>>> >>>>>> The I-pipe patch for ARM detects stack overflows, I guess we can modify >>>>>> the kernel on all architectures to do the same thing on all architectures. >>>>>> >>>>> Peter made a good point considering the various braindamage outcomes a >>>>> stack smashing issue could trigger. I'm unsure whether anyone can >>>>> immediately suspect a stack overflow to be the cause of any random >>>>> application behavior; typically, that issue could cause a branch to any >>>>> random IP value on x86 since the return address is living on the stack >>>>> and could get trashed, but not necessarily on architectures with >>>>> branch-and-link registers. In the former case, GDB is of little help, >>>>> except for single-stepping until the offending statement is reached and >>>>> we can observe the trashing live, which means that we actually did the >>>>> work of spotting the issue manually. >>>>> >>>>> It turns out that people with large applications and lots of contexts >>>>> often end up naked in the cold most of the time when facing those >>>>> things, and the only option left to them is to go backward on the >>>>> integration path, in order to find a possibly faulty component. Before >>>>> people can reasonably compare %sp values, they need some help to narrow >>>>> the search, otherwise, it's hopeless. 
>>>>> >>>>> To this end, maybe an option would be to enable gcc's >>>>> -fstack-protector[-all] -fstack-check when the debug switch is given to >>>>> the configure script, provided the compiler in use supports this. >>>>> >>>>> Granted, a stack overflow is not identical to a smashing, but quite >>>>> often the stack memory unduly consumed by a thread belongs to some other >>>>> memory object, and therefore usually gets trashed when that object is >>>>> modified. At least, enabling some canary word checking in that case may >>>>> help. >>>> I do not think so. The glibc maps an unreadable/unwritable page below >>>> the stack. So, what you get is a segmentation fault. Unless, of course, >>>> you overflow more than one page. But we can map more than one page by >>>> using pthread_attr_setguardsize, if one page is not enough. >>> The page guard is restricted to MMU-enabled systems, we have two over >>> six of our architectures running without MMU. In this case, the only >>> option left that may work is the stack protector based on the canary >>> word checking. >>> >>> Relying on pthread_attr_setguardsize() when available will trigger the >>> same amount of uncertainty than we have now with setting the minimum >>> stack size. Which guard value would a sane default? one, two, four >>> pages? >>> >>>> We can detect the stack overflow in kernel-space, there it is easy to >>>> detect, the problem is that x86 users, which are the ones more likely to >>>> be hit by a stack overflow, may not be watching the console, so may not >>>> see the message. >>>> >>> Kernel-space is another issue, people writing applications in kernel >>> space are mostly on their own these days, and others implementing >>> drivers are expected to always consider stack space as a scarce resource >>> anyway. But helping with solving userland problems seems to be the most >>> urgent thing to do, since common practices in that environment may >>> conflict badly with real-time restrictions and requirements. 
>> I mean detecting the user-space stack overflows when handling user-space >> page faults in kernel-space. But granted, that also only works for >> systems with an MMU. The following piece of code does it in the I-pipe >> patch for ARM with FCSE enabled: >> >> + down_read(&mm->mmap_sem); >> + if (find_vma(mm, addr) == find_vma(mm, regs->ARM_sp)) >> + printk(KERN_INFO "FCSE: process %u(%s) probably overflowed stack >> at 0x%08lx.\n", >> + current->pid, current->comm, regs->ARM_pc); >> + up_read(&mm->mmap_sem); >> > > My understanding is that such code detects faulty references within the > _valid_ address space, typically when hitting a page guard area. But I > guess that this won't work when treading on stack memory outside of the > address space, e.g. below the red zone for instance, isn't it? AFAIU, > those things may happen when the heading space of preposterously large > stack-based objects are addressed.

Yes, exactly, but that would have been enough to detect Peter's problem. I thought gcc had an option to yell when the objects on the stack grow beyond some size, but I cannot find it.

--
Gilles.
* Re: [Xenomai-help] native: A 32k stack is not always a 'reasonable' size 2010-07-08 10:04 ` Gilles Chanteperdrix @ 2010-07-08 10:09 ` Gilles Chanteperdrix 0 siblings, 0 replies; 21+ messages in thread From: Gilles Chanteperdrix @ 2010-07-08 10:09 UTC (permalink / raw) To: Philippe Gerum; +Cc: xenomai-help

Gilles Chanteperdrix wrote: > Philippe Gerum wrote: >> On Thu, 2010-07-08 at 11:35 +0200, Gilles Chanteperdrix wrote: >>> + down_read(&mm->mmap_sem); >>> + if (find_vma(mm, addr) == find_vma(mm, regs->ARM_sp)) >>> + printk(KERN_INFO "FCSE: process %u(%s) probably overflowed stack >>> at 0x%08lx.\n", >>> + current->pid, current->comm, regs->ARM_pc); >>> + up_read(&mm->mmap_sem); >>> >> My understanding is that such code detects faulty references within the >> _valid_ address space, typically when hitting a page guard area. But I >> guess that this won't work when treading on stack memory outside of the >> address space, e.g. below the red zone for instance, isn't it? AFAIU, >> those things may happen when the heading space of preposterously large >> stack-based objects are addressed. > > Yes, exactly, but that would have been enough to detect Peter's problem. > I thought gcc had an option to yell when the objects on stack grow > beyond some size, but I can not find it.

The option is -Wframe-larger-than=<bytes>; the kernel enables it through CONFIG_FRAME_WARN. It seems to require gcc 4.4 though.

--
Gilles.
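As a quick way to see the option at work, the function below deliberately keeps an 8 KiB buffer on the stack. The function name and the 4096-byte threshold are arbitrary choices for this sketch. Building the file with something like `gcc -O0 -Wframe-larger-than=4096 -c frame.c` should make the compiler report the oversized frame; with optimization enabled the buffer may be laid out differently, so the result can vary.

```c
#include <stddef.h>
#include <string.h>

/*
 * Deliberately large stack frame: 8 KiB of scratch space, i.e. a quarter
 * of a 32 KiB task stack consumed by a single call.
 */
unsigned checksum_big_frame(unsigned char seed)
{
    unsigned char scratch[8192];
    unsigned sum = 0;
    size_t i;

    memset(scratch, seed, sizeof scratch);
    for (i = 0; i < sizeof scratch; i++)
        sum += scratch[i];
    return sum;
}
```

The warning is purely static: it flags individual frames, not the total depth of a call chain, so it narrows the search rather than proving a stack is large enough.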
* Re: [Xenomai-help] native: A 32k stack is not always a 'reasonable' size 2010-07-08 9:58 ` Philippe Gerum 2010-07-08 10:04 ` Gilles Chanteperdrix @ 2010-07-08 11:52 ` Gilles Chanteperdrix 1 sibling, 0 replies; 21+ messages in thread From: Gilles Chanteperdrix @ 2010-07-08 11:52 UTC (permalink / raw) To: Philippe Gerum; +Cc: xenomai-help

Philippe Gerum wrote: > On Thu, 2010-07-08 at 11:35 +0200, Gilles Chanteperdrix wrote: >> + down_read(&mm->mmap_sem); >> + if (find_vma(mm, addr) == find_vma(mm, regs->ARM_sp)) >> + printk(KERN_INFO "FCSE: process %u(%s) probably overflowed stack >> at 0x%08lx.\n", >> + current->pid, current->comm, regs->ARM_pc); >> + up_read(&mm->mmap_sem); >> > > My understanding is that such code detects faulty references within the > _valid_ address space, typically when hitting a page guard area. But I > guess that this won't work when treading on stack memory outside of the > address space, e.g. below the red zone for instance, isn't it? AFAIU, > those things may happen when the heading space of preposterously large > stack-based objects are addressed.

We only get the case where addr and sp are both in the guard page, or both in a memory mapping hole. We can improve a bit by trying:

	if (!find_vma(mm, regs->ARM_sp) ||
	    find_vma(mm, addr) == find_vma(mm, regs->ARM_sp))

We will also catch the case where addr is in the guard page, whereas sp is in a memory mapping hole. But as I said in the other mail I just sent, this will only work on machines with holes between thread stacks.

--
Gilles.
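The difference between the two conditions can be checked with a small user-space model. Everything here is hypothetical: struct region and region_of() stand in for the kernel's memory map, and region_of() follows the description in this message by returning NULL for an address in a hole. That is a simplification; the kernel's real find_vma() returns the first area ending above the address, so the model illustrates the reasoning and is not kernel behaviour.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct region {
    uintptr_t start, end;   /* covers [start, end) */
};

/* Simplified lookup: the region containing addr, or NULL for a hole. */
const struct region *region_of(const struct region *map, size_t n,
                               uintptr_t addr)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (addr >= map[i].start && addr < map[i].end)
            return &map[i];
    return NULL;
}

/* Original test: fault address and stack pointer resolve to the same place. */
bool overflow_original(const struct region *map, size_t n,
                       uintptr_t addr, uintptr_t sp)
{
    return region_of(map, n, addr) == region_of(map, n, sp);
}

/* Proposed test: additionally accept a stack pointer that sits in a hole. */
bool overflow_improved(const struct region *map, size_t n,
                       uintptr_t addr, uintptr_t sp)
{
    const struct region *s = region_of(map, n, sp);

    return s == NULL || region_of(map, n, addr) == s;
}
```

With a guard region followed by a stack region, both tests fire when addr and sp share the guard or share a hole; only the improved one fires when addr is in the guard and sp has already dropped into a hole.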
* Re: [Xenomai-help] native: A 32k stack is not always a 'reasonable' size 2010-07-08 8:58 ` Gilles Chanteperdrix 2010-07-08 9:31 ` Philippe Gerum @ 2010-07-08 9:50 ` Philippe Gerum 2010-07-08 9:55 ` Gilles Chanteperdrix 1 sibling, 1 reply; 21+ messages in thread From: Philippe Gerum @ 2010-07-08 9:50 UTC (permalink / raw) To: Gilles Chanteperdrix; +Cc: xenomai-help On Thu, 2010-07-08 at 10:58 +0200, Gilles Chanteperdrix wrote: > Philippe Gerum wrote: > > On Thu, 2010-07-08 at 01:08 +0200, Gilles Chanteperdrix wrote: > >> Peter Soetens wrote: > >>> On Wed, Jul 7, 2010 at 11:19 PM, Gilles Chanteperdrix > >>> <gilles.chanteperdrix@xenomai.org> wrote: > >>>> Peter Soetens wrote: > >>>>> On Wed, Jul 7, 2010 at 11:06 AM, Gilles Chanteperdrix > >>>>> <gilles.chanteperdrix@xenomai.org> wrote: > >>>>>> Peter Soetens wrote: > >>>>>>> At least, not for Orocos applications. We've had hard to debug > >>>>>>> application segfaults that used just a 'little' bit more than 32k. We > >>>>>>> had to raise the stack size to 128k to get reliably through our > >>>>>>> application startup. I stem from the old 'mlockall ate my RAM' > >>>>>>> generation where we typically reduced stack sizes in order to have > >>>>>>> some crumbles left for the heap. But 32k wasn't really what we were > >>>>>>> aiming for. > >>>>>>> > >>>>>>> Maybe we should explicitly document the 32k limit and its limitations > >>>>>>> for certain applications...? > >>>>>> Again, things have been fixed in 2.5.3 with regard to stack sizes, could > >>>>>> you check that you have the same behaviour? > >>>>> I think we had, but I'm uncertain right now. > >>>>> > >>>>>> As for 32KiB, it is only a default stack size, it is only reasonable in > >>>>>> the sense that 2MiB is unreasonable on a low-end system. 32KiB was > >>>>>> picked because it allows printf to work. Now, whatever stack size we > >>>>>> choose, there will be applications which need more, this does not really > >>>>>> make the default unreasonable. 
> >>>>> I knew you would say that. It deserves an entry in the faq or some > >>>>> trouble shooting document though. > >>>> It is documented. For instance, rt_task_create says: > >>>> stksize The size of the stack (in bytes) for the new task. If > >>>> zero is passed, a reasonable pre-defined size will be substituted. > >>>> > >>>> What else can we say? Documenting that this size is 32 KiB would be > >>>> wrong, because we do not want applications to rely on a particular > >>>> value, in case we want to change it. And the fact that if your stack is > >>>> too small, you will get problems is kind of obvious. For anyone having > >>>> played with stack sizes with Linux or any proprietary RTOS, at least. > >>> And what with new RTOS/Xenomai users ? > >>> > >>> You have to take the user perspective here. The problem with stack > >>> overflows is that they occur when the development of a program has > >>> progressed a while and applications reached a certain level of > >>> complexity (otherwise the overflow wouldn't have happend in the first > >>> place). So it suddenly starts to segfault (from time to time). What he > >>> does is this: he fires up the debugger to get a backtrace, sees > >>> trouble and wrongly assumes that gdb can't really handle these Xenomai > >>> threads and tries to eliminate causes of the crashes.. > >> Last time I tried, debugging a stack overflow with gdb was possible. You > >> can print the stack pointer and compare the value with the contents of > >> /proc/pid/maps. > >> > >> The user comes > >>> quickly to the conclusion that 'putting it all together' causes the > >>> crash (the single unit tests pass) and is looking for a software > >>> integration problem. In reality, it's the stack. > >>> > >>> If you've been through all this and then came to the correct > >>> conclusion the same day, you've been burnt before, or are the > >>> exception. > >>> > >>> In my view, 32k is a premature optimization. 
At least, it shows the > >>> side effects of one. > >> I guess you run Xenomai on one of these big irons, do you? Because if > >> you ran on a low-end machine, you would have understand why we can not > >> keep the 2MB default limit. 32 KiB looks already like a pretty large > >> limit, so, maybe there is a problem in your application? > >> > >> The I-pipe patch for ARM detects stack overflows, I guess we can modify > >> the kernel on all architectures to do the same thing on all architectures. > >> > > > > Peter made a good point considering the various braindamage outcomes a > > stack smashing issue could trigger. I'm unsure whether anyone can > > immediately suspect a stack overflow to be the cause of any random > > application behavior; typically, that issue could cause a branch to any > > random IP value on x86 since the return address is living on the stack > > and could get trashed, but not necessarily on architectures with > > branch-and-link registers. In the former case, GDB is of little help, > > except for single-stepping until the offending statement is reached and > > we can observe the trashing live, which means that we actually did the > > work of spotting the issue manually. > > > > It turns out that people with large applications and lots of contexts > > often end up naked in the cold most of the time when facing those > > things, and the only option left to them is to go backward on the > > integration path, in order to find a possibly faulty component. Before > > people can reasonably compare %sp values, they need some help to narrow > > the search, otherwise, it's hopeless. > > > > To this end, maybe an option would be to enable gcc's > > -fstack-protector[-all] -fstack-check when the debug switch is given to > > the configure script, provided the compiler in use supports this. 
> >
> > Granted, a stack overflow is not identical to a smashing, but quite
> > often the stack memory unduly consumed by a thread belongs to some other
> > memory object, and therefore usually gets trashed when that object is
> > modified. At least, enabling some canary word checking in that case may
> > help.
>
> I do not think so. The glibc maps an unreadable/unwritable page below
> the stack. So, what you get is a segmentation fault. Unless, of course,
> you overflow more than one page. But we can map more than one page by
> using pthread_attr_setguardsize, if one page is not enough.

Actually, I guess that the stack guard area will not be contiguous to
any valid page in most cases, so the size of that area should not be the
main issue; i.e. at worst, the code would write to an unmapped address
and raise a fault the same way. But despite this, identifying whether we
had a stack overflow is still a pain, because that situation sometimes
deeply confuses GDB. Or confuses the developer because function
prologues and other hidden code do refer to stack memory, so unless we
trace the program at instruction level, in single-stepping mode, we are
toast.

In short, I'd say that the issue is not that much about pulling the
brake when a stack overflow is detected (which happens one way or
another anyway), but rather about obtaining a reasonably precise hint as
to _where_ the problem occurs.

--
Philippe.

^ permalink raw reply [flat|nested] 21+ messages in thread
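For plain POSIX threads, the stack size and the guard area discussed above can both be set explicitly instead of relying on a default. The following is a minimal sketch under that assumption; spawn_with_stack() and echo_arg() are hypothetical names, not Xenomai or glibc APIs. For native-skin tasks, the corresponding knob is the stksize argument of rt_task_create quoted earlier in the thread.

```c
#include <pthread.h>
#include <stddef.h>

/*
 * Start a thread with an explicit stack size and guard area.
 * Returns 0 on success or a pthread error number.
 * spawn_with_stack() is a hypothetical helper, not a library API.
 */
int spawn_with_stack(pthread_t *tid, void *(*fn)(void *), void *arg,
                     size_t stack_size, size_t guard_size)
{
    pthread_attr_t attr;
    int err;

    err = pthread_attr_init(&attr);
    if (err)
        return err;

    /* Fails with EINVAL if stack_size is below PTHREAD_STACK_MIN. */
    err = pthread_attr_setstacksize(&attr, stack_size);
    if (!err)
        /* The implementation rounds this up to a multiple of the page size. */
        err = pthread_attr_setguardsize(&attr, guard_size);
    if (!err)
        err = pthread_create(tid, &attr, fn, arg);

    pthread_attr_destroy(&attr);
    return err;
}

/* Trivial thread body used to exercise the helper. */
void *echo_arg(void *arg)
{
    return arg;
}
```

A call such as spawn_with_stack(&tid, echo_arg, &value, 256 * 1024, 4 * 4096) asks for a 256 KiB stack with a 16 KiB guard area. A larger guard only widens the window in which an overflow faults immediately; an access that jumps past it can still land in a neighbouring mapping.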
* Re: [Xenomai-help] native: A 32k stack is not always a 'reasonable' size 2010-07-08 9:50 ` Philippe Gerum @ 2010-07-08 9:55 ` Gilles Chanteperdrix 2010-07-08 10:19 ` Philippe Gerum 0 siblings, 1 reply; 21+ messages in thread From: Gilles Chanteperdrix @ 2010-07-08 9:55 UTC (permalink / raw) To: Philippe Gerum; +Cc: xenomai-help Philippe Gerum wrote: > On Thu, 2010-07-08 at 10:58 +0200, Gilles Chanteperdrix wrote: >> Philippe Gerum wrote: >>> On Thu, 2010-07-08 at 01:08 +0200, Gilles Chanteperdrix wrote: >>>> Peter Soetens wrote: >>>>> On Wed, Jul 7, 2010 at 11:19 PM, Gilles Chanteperdrix >>>>> <gilles.chanteperdrix@xenomai.org> wrote: >>>>>> Peter Soetens wrote: >>>>>>> On Wed, Jul 7, 2010 at 11:06 AM, Gilles Chanteperdrix >>>>>>> <gilles.chanteperdrix@xenomai.org> wrote: >>>>>>>> Peter Soetens wrote: >>>>>>>>> At least, not for Orocos applications. We've had hard to debug >>>>>>>>> application segfaults that used just a 'little' bit more than 32k. We >>>>>>>>> had to raise the stack size to 128k to get reliably through our >>>>>>>>> application startup. I stem from the old 'mlockall ate my RAM' >>>>>>>>> generation where we typically reduced stack sizes in order to have >>>>>>>>> some crumbles left for the heap. But 32k wasn't really what we were >>>>>>>>> aiming for. >>>>>>>>> >>>>>>>>> Maybe we should explicitly document the 32k limit and its limitations >>>>>>>>> for certain applications...? >>>>>>>> Again, things have been fixed in 2.5.3 with regard to stack sizes, could >>>>>>>> you check that you have the same behaviour? >>>>>>> I think we had, but I'm uncertain right now. >>>>>>> >>>>>>>> As for 32KiB, it is only a default stack size, it is only reasonable in >>>>>>>> the sense that 2MiB is unreasonable on a low-end system. 32KiB was >>>>>>>> picked because it allows printf to work. Now, whatever stack size we >>>>>>>> choose, there will be applications which need more, this does not really >>>>>>>> make the default unreasonable. 
>>>>>>> I knew you would say that. It deserves an entry in the faq or some >>>>>>> trouble shooting document though. >>>>>> It is documented. For instance, rt_task_create says: >>>>>> stksize The size of the stack (in bytes) for the new task. If >>>>>> zero is passed, a reasonable pre-defined size will be substituted. >>>>>> >>>>>> What else can we say? Documenting that this size is 32 KiB would be >>>>>> wrong, because we do not want applications to rely on a particular >>>>>> value, in case we want to change it. And the fact that if your stack is >>>>>> too small, you will get problems is kind of obvious. For anyone having >>>>>> played with stack sizes with Linux or any proprietary RTOS, at least. >>>>> And what with new RTOS/Xenomai users ? >>>>> >>>>> You have to take the user perspective here. The problem with stack >>>>> overflows is that they occur when the development of a program has >>>>> progressed a while and applications reached a certain level of >>>>> complexity (otherwise the overflow wouldn't have happend in the first >>>>> place). So it suddenly starts to segfault (from time to time). What he >>>>> does is this: he fires up the debugger to get a backtrace, sees >>>>> trouble and wrongly assumes that gdb can't really handle these Xenomai >>>>> threads and tries to eliminate causes of the crashes.. >>>> Last time I tried, debugging a stack overflow with gdb was possible. You >>>> can print the stack pointer and compare the value with the contents of >>>> /proc/pid/maps. >>>> >>>> The user comes >>>>> quickly to the conclusion that 'putting it all together' causes the >>>>> crash (the single unit tests pass) and is looking for a software >>>>> integration problem. In reality, it's the stack. >>>>> >>>>> If you've been through all this and then came to the correct >>>>> conclusion the same day, you've been burnt before, or are the >>>>> exception. >>>>> >>>>> In my view, 32k is a premature optimization. 
At least, it shows the >>>>> side effects of one. >>>> I guess you run Xenomai on one of these big irons, do you? Because if >>>> you ran on a low-end machine, you would have understand why we can not >>>> keep the 2MB default limit. 32 KiB looks already like a pretty large >>>> limit, so, maybe there is a problem in your application? >>>> >>>> The I-pipe patch for ARM detects stack overflows, I guess we can modify >>>> the kernel on all architectures to do the same thing on all architectures. >>>> >>> Peter made a good point considering the various braindamage outcomes a >>> stack smashing issue could trigger. I'm unsure whether anyone can >>> immediately suspect a stack overflow to be the cause of any random >>> application behavior; typically, that issue could cause a branch to any >>> random IP value on x86 since the return address is living on the stack >>> and could get trashed, but not necessarily on architectures with >>> branch-and-link registers. In the former case, GDB is of little help, >>> except for single-stepping until the offending statement is reached and >>> we can observe the trashing live, which means that we actually did the >>> work of spotting the issue manually. >>> >>> It turns out that people with large applications and lots of contexts >>> often end up naked in the cold most of the time when facing those >>> things, and the only option left to them is to go backward on the >>> integration path, in order to find a possibly faulty component. Before >>> people can reasonably compare %sp values, they need some help to narrow >>> the search, otherwise, it's hopeless. >>> >>> To this end, maybe an option would be to enable gcc's >>> -fstack-protector[-all] -fstack-check when the debug switch is given to >>> the configure script, provided the compiler in use supports this. 
>>>
>>> Granted, a stack overflow is not identical to a smashing, but quite
>>> often the stack memory unduly consumed by a thread belongs to some other
>>> memory object, and therefore usually gets trashed when that object is
>>> modified. At least, enabling some canary word checking in that case may
>>> help.
>> I do not think so. The glibc maps an unreadable/unwritable page below
>> the stack. So, what you get is a segmentation fault. Unless, of course,
>> you overflow more than one page. But we can map more than one page by
>> using pthread_attr_setguardsize, if one page is not enough.
>
> Actually, I guess that the stack guard area will not be contiguous to
> any valid page in most cases, so the size of that area should not be the
> main issue; i.e. at worst, the code would write to an unmapped address
> and raise a fault the same way. But despite this, identifying whether we
> had a stack overflow is still a pain, because that situation sometimes
> deeply confuses GDB. Or confuses the developer because function
> prologues and other hidden code do refer to stack memory, so unless we
> trace the program at instruction level, in single-stepping mode, we are
> toast.

Unfortunately, the thread stacks get allocated with mmap, so they all
get "stacked", no pun intended. They are only separated by the guard
pages, so, yes, if you overflow badly, you may overwrite another
thread's stack. And since you will tend to overflow into the top of the
other thread's stack, it will take some time before you detect the
overrun.

>
> In short, I'd say that the issue is not that much about pulling the
> brake when a stack overflow is detected (which happens one way or
> another anyway), but rather about obtaining a reasonably precise hint as
> to _where_ the problem occurs.
>

Hence the proposal of kernel instrumentation, or of a SIGSEGV handler.

--
Gilles.

^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Xenomai-help] native: A 32k stack is not always a 'reasonable' size 2010-07-08 9:55 ` Gilles Chanteperdrix @ 2010-07-08 10:19 ` Philippe Gerum 2010-07-08 11:47 ` Gilles Chanteperdrix 0 siblings, 1 reply; 21+ messages in thread From: Philippe Gerum @ 2010-07-08 10:19 UTC (permalink / raw) To: Gilles Chanteperdrix; +Cc: xenomai-help On Thu, 2010-07-08 at 11:55 +0200, Gilles Chanteperdrix wrote: > Philippe Gerum wrote: > > On Thu, 2010-07-08 at 10:58 +0200, Gilles Chanteperdrix wrote: > >> Philippe Gerum wrote: > >>> On Thu, 2010-07-08 at 01:08 +0200, Gilles Chanteperdrix wrote: > >>>> Peter Soetens wrote: > >>>>> On Wed, Jul 7, 2010 at 11:19 PM, Gilles Chanteperdrix > >>>>> <gilles.chanteperdrix@xenomai.org> wrote: > >>>>>> Peter Soetens wrote: > >>>>>>> On Wed, Jul 7, 2010 at 11:06 AM, Gilles Chanteperdrix > >>>>>>> <gilles.chanteperdrix@xenomai.org> wrote: > >>>>>>>> Peter Soetens wrote: > >>>>>>>>> At least, not for Orocos applications. We've had hard to debug > >>>>>>>>> application segfaults that used just a 'little' bit more than 32k. We > >>>>>>>>> had to raise the stack size to 128k to get reliably through our > >>>>>>>>> application startup. I stem from the old 'mlockall ate my RAM' > >>>>>>>>> generation where we typically reduced stack sizes in order to have > >>>>>>>>> some crumbles left for the heap. But 32k wasn't really what we were > >>>>>>>>> aiming for. > >>>>>>>>> > >>>>>>>>> Maybe we should explicitly document the 32k limit and its limitations > >>>>>>>>> for certain applications...? > >>>>>>>> Again, things have been fixed in 2.5.3 with regard to stack sizes, could > >>>>>>>> you check that you have the same behaviour? > >>>>>>> I think we had, but I'm uncertain right now. > >>>>>>> > >>>>>>>> As for 32KiB, it is only a default stack size, it is only reasonable in > >>>>>>>> the sense that 2MiB is unreasonable on a low-end system. 32KiB was > >>>>>>>> picked because it allows printf to work. 
Now, whatever stack size we > >>>>>>>> choose, there will be applications which need more, this does not really > >>>>>>>> make the default unreasonable. > >>>>>>> I knew you would say that. It deserves an entry in the faq or some > >>>>>>> trouble shooting document though. > >>>>>> It is documented. For instance, rt_task_create says: > >>>>>> stksize The size of the stack (in bytes) for the new task. If > >>>>>> zero is passed, a reasonable pre-defined size will be substituted. > >>>>>> > >>>>>> What else can we say? Documenting that this size is 32 KiB would be > >>>>>> wrong, because we do not want applications to rely on a particular > >>>>>> value, in case we want to change it. And the fact that if your stack is > >>>>>> too small, you will get problems is kind of obvious. For anyone having > >>>>>> played with stack sizes with Linux or any proprietary RTOS, at least. > >>>>> And what with new RTOS/Xenomai users ? > >>>>> > >>>>> You have to take the user perspective here. The problem with stack > >>>>> overflows is that they occur when the development of a program has > >>>>> progressed a while and applications reached a certain level of > >>>>> complexity (otherwise the overflow wouldn't have happend in the first > >>>>> place). So it suddenly starts to segfault (from time to time). What he > >>>>> does is this: he fires up the debugger to get a backtrace, sees > >>>>> trouble and wrongly assumes that gdb can't really handle these Xenomai > >>>>> threads and tries to eliminate causes of the crashes.. > >>>> Last time I tried, debugging a stack overflow with gdb was possible. You > >>>> can print the stack pointer and compare the value with the contents of > >>>> /proc/pid/maps. > >>>> > >>>> The user comes > >>>>> quickly to the conclusion that 'putting it all together' causes the > >>>>> crash (the single unit tests pass) and is looking for a software > >>>>> integration problem. In reality, it's the stack. 
> >>>>> > >>>>> If you've been through all this and then came to the correct > >>>>> conclusion the same day, you've been burnt before, or are the > >>>>> exception. > >>>>> > >>>>> In my view, 32k is a premature optimization. At least, it shows the > >>>>> side effects of one. > >>>> I guess you run Xenomai on one of these big irons, do you? Because if > >>>> you ran on a low-end machine, you would have understand why we can not > >>>> keep the 2MB default limit. 32 KiB looks already like a pretty large > >>>> limit, so, maybe there is a problem in your application? > >>>> > >>>> The I-pipe patch for ARM detects stack overflows, I guess we can modify > >>>> the kernel on all architectures to do the same thing on all architectures. > >>>> > >>> Peter made a good point considering the various braindamage outcomes a > >>> stack smashing issue could trigger. I'm unsure whether anyone can > >>> immediately suspect a stack overflow to be the cause of any random > >>> application behavior; typically, that issue could cause a branch to any > >>> random IP value on x86 since the return address is living on the stack > >>> and could get trashed, but not necessarily on architectures with > >>> branch-and-link registers. In the former case, GDB is of little help, > >>> except for single-stepping until the offending statement is reached and > >>> we can observe the trashing live, which means that we actually did the > >>> work of spotting the issue manually. > >>> > >>> It turns out that people with large applications and lots of contexts > >>> often end up naked in the cold most of the time when facing those > >>> things, and the only option left to them is to go backward on the > >>> integration path, in order to find a possibly faulty component. Before > >>> people can reasonably compare %sp values, they need some help to narrow > >>> the search, otherwise, it's hopeless. 
> >>> > >>> To this end, maybe an option would be to enable gcc's > >>> -fstack-protector[-all] -fstack-check when the debug switch is given to > >>> the configure script, provided the compiler in use supports this. > >>> > >>> Granted, a stack overflow is not identical to a smashing, but quite > >>> often the stack memory unduly consumed by a thread belongs to some other > >>> memory object, and therefore usually gets trashed when that object is > >>> modified. At least, enabling some canary word checking in that case may > >>> help. > >> I do not think so. The glibc maps an unreadable/unwritable page below > >> the stack. So, what you get is a segmentation fault. Unless, of course, > >> you overflow more than one page. But we can map more than one page by > >> using pthread_attr_setguardsize, if one page is not enough. > > > > Actually, I guess that the stack guard area will not be contiguous to > > any valid page in most cases, so the size of that area should not be the > > main issue; i.e. at worst, the code would write to an unmapped address > > and raise a fault the same way. But despite this, identifying whether we > > had a stack overflow is still a pain, because that situation sometimes > > deeply confuses GDB. Or confuses the developer because function > > prologues and other hidden code do refer to stack memory, so unless we > > trace the program at instruction level, in single-stepping mode, we are > > toast. > > Unfortunately, the thread stacks get allocated with mmap, so, they all > get "stacked", no pun intended. They are only separated with the guard > pages, so, yes, if you overflow badly, you may override an other > thread's stack. And since you will have a tendency to overflow the other > thread's stack top, it will take some time before you detect the overrun. > If I understand the glibc code properly, the stack cache is not pre-filled, but merely serves to recycle old stacks from terminated stacks. 
So, at least until a stack area could actually be reused from that cache, fresh new stack space for new threads is always obtained via mmap(), which means that we may have non-contiguous stack spaces most of the time. It seems that things would start to hit the crapper when some recycling takes place, in which case an overflow situation could cause a stack to overflow on its neighbor. -- Philippe. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Xenomai-help] native: A 32k stack is not always a 'reasonable' size
  2010-07-08 10:19 ` Philippe Gerum
@ 2010-07-08 11:47 ` Gilles Chanteperdrix
  2010-07-08 15:01 ` Philippe Gerum
  0 siblings, 1 reply; 21+ messages in thread
From: Gilles Chanteperdrix @ 2010-07-08 11:47 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai-help

Philippe Gerum wrote:
> If I understand the glibc code properly, the stack cache is not
> pre-filled, but merely serves to recycle old stacks from terminated
> stacks. So, at least until a stack area could actually be reused from
> that cache, fresh new stack space for new threads is always obtained via
> mmap(), which means that we may have non-contiguous stack spaces most of
> the time. It seems that things would start to hit the crapper when some
> recycling takes place, in which case an overflow situation could cause a
> stack to overflow on its neighbor.

I am not sure I understand what you mean. So, I am going to try and show
you what I mean. I run the following program:

#include <stdio.h>

#include <pthread.h>
#include <unistd.h>

void *thread(void *cookie)
{
	int x;
	/* The address of a local variable approximates the stack pointer. */
	printf("sp: %p\n", &x);
	/* Keep the thread alive so that its stack stays mapped. */
	pause();
	return cookie;
}

int main(void)
{
	pthread_t ida, idb;
	pthread_create(&ida, NULL, thread, NULL);
	pthread_create(&idb, NULL, thread, NULL);
	pthread_join(ida, NULL);
	return 0;
}

On an ARMv7 (no FCSE involved) platform. It prints:
sp: 0x411a2ddc
sp: 0x409a2ddc

I then dump the process mappings, and I get everything contiguous:
401a4000-401a5000 ---p 00000000 00:00 0
401a5000-409a4000 rw-p 00000000 00:00 0
409a4000-409a5000 ---p 00000000 00:00 0
409a5000-411a4000 rw-p 00000000 00:00 0

So, it looks to me as if, when the thread with the highest stack address
goes below its guard page, it will overrun the other thread's stack.

On x86, this is a different story. I guess because the kernel or glibc
has a stack top randomization strategy.

--
Gilles.

^ permalink raw reply [flat|nested] 21+ messages in thread
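The manual comparison between a stack address and the process mappings can also be done from inside the program. This is a minimal sketch, assuming a Linux /proc filesystem; find_mapping() is a hypothetical helper name, and only the first three fields of each /proc/self/maps line are parsed.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Find the mapping of the current process that contains 'addr'.
 * On success, store its bounds and permission string and return 0;
 * return -1 if no mapping contains the address or the file is unreadable.
 * find_mapping() is a hypothetical helper name.
 */
int find_mapping(const void *addr, uintptr_t *lo, uintptr_t *hi, char perms[5])
{
    FILE *f = fopen("/proc/self/maps", "r");
    char line[512];
    char p[5];
    unsigned long start, end;
    int found = -1;

    if (!f)
        return -1;

    while (fgets(line, sizeof(line), f)) {
        /* Each line starts with "start-end perms", e.g. "401a5000-409a4000 rw-p". */
        if (sscanf(line, "%lx-%lx %4s", &start, &end, p) != 3)
            continue;
        if ((uintptr_t)addr >= start && (uintptr_t)addr < end) {
            *lo = start;
            *hi = end;
            memcpy(perms, p, sizeof(p));
            found = 0;
            break;
        }
    }
    fclose(f);
    return found;
}
```

Called with the address of a local variable, it returns the bounds of the calling thread's stack mapping, so the distance between that address and the lower bound gives a rough idea of the remaining headroom. A return value of -1 for an address just below a stack suggests a hole or an unlisted range, while a "---p" permission string indicates a guard page.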
* Re: [Xenomai-help] native: A 32k stack is not always a 'reasonable' size 2010-07-08 11:47 ` Gilles Chanteperdrix @ 2010-07-08 15:01 ` Philippe Gerum 2010-07-08 16:33 ` Gilles Chanteperdrix 0 siblings, 1 reply; 21+ messages in thread From: Philippe Gerum @ 2010-07-08 15:01 UTC (permalink / raw) To: Gilles Chanteperdrix; +Cc: xenomai-help On Thu, 2010-07-08 at 13:47 +0200, Gilles Chanteperdrix wrote: > Philippe Gerum wrote: > > If I understand the glibc code properly, the stack cache is not > > pre-filled, but merely serves to recycle old stacks from terminated > > stacks. So, at least until a stack area could actually be reused from > > that cache, fresh new stack space for new threads is always obtained via > > mmap(), which means that we may have non-contiguous stack spaces most of > > the time. It seems that things would start to hit the crapper when some > > recycling takes place, in which case an overflow situation could cause a > > stack to overflow on its neighbor. > > I am not sure I understand what you mean. So, I am going to try and show > you what I mean. I run the following program: > > #include <stdio.h> > > #include <pthread.h> > #include <unistd.h> > > void *thread(void *cookie) > { > int x; > printf("sp: %p\n", &x); > pause(); > return cookie; > } > > int main(void) > { > pthread_t ida, idb; > pthread_create(&ida, NULL, thread, NULL); > pthread_create(&idb, NULL, thread, NULL); > pthread_join(ida, NULL); > return 0; > } > > On an ARMv7 (no FCSE involved) platform. It prints: > sp: 0x411a2ddc > sp: 0x409a2ddc > > I then dump the process mappings, and I get everything contiguous: > 401a4000-401a5000 ---p 00000000 00:00 0 > 401a5000-409a4000 rw-p 00000000 00:00 0 > 409a4000-409a5000 ---p 00000000 00:00 0 > 409a5000-411a4000 rw-p 00000000 00:00 0 > > So, it looks to me like if the thread with the highest stack address go > past below the guard page limit, it will overrun the other thread's stack. 
I mean that glibc does not pre-allocate pieces of anon memory to honor
requests for stack chunks, it gets them on the fly from an internal
cache if one matches, or mmaps it. Besides, the cache itself is only
fed with recycled stacks from terminated threads it seems, so we can't
predict whether all stacks there would be contiguous.

For instance, I'm assuming that tweaking your code like below would
likely prevent the stack segments from being contiguous:

	pthread_create(&ida, NULL, thread, NULL);
+	mmap(NULL, 8*1024*1024, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	pthread_create(&idb, NULL, thread, NULL);
	pthread_join(ida, NULL);

If so, it is indeed likely that segments would be contiguous if threads
are started the way you did; on the other hand, it is possible that a
more complex application does not suffer this. Granted, this does not
help us that much anyway.

My point is that nothing guarantees us either contiguous or sparse stack
address ranges, so we probably should not rely on those assumptions.

>
> On x86, this is a different story. I guess because the kernel or glibc
> has a stack top randomization strategy.
>

--
Philippe.

^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Xenomai-help] native: A 32k stack is not always a 'reasonable' size
  2010-07-08 15:01 ` Philippe Gerum
@ 2010-07-08 16:33 ` Gilles Chanteperdrix
  0 siblings, 0 replies; 21+ messages in thread
From: Gilles Chanteperdrix @ 2010-07-08 16:33 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai-help

Philippe Gerum wrote:
> I mean that glibc does not pre-allocate pieces of anon memory to honor
> requests for stack chunks, it gets them on the fly from an internal
> cache if one matches, or mmaps it. Besides, the cache itself is only
> fed with recycled stacks from terminated threads it seems, so we can't
> predict whether all stacks there would be contiguous.
>
> For instance, I'm assuming that tweaking your code like below would
> likely prevent the stack segments from being contiguous:
>
> 	pthread_create(&ida, NULL, thread, NULL);
> +	mmap(NULL, 8*1024*1024, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> 	pthread_create(&idb, NULL, thread, NULL);
> 	pthread_join(ida, NULL);
>
> If so, it is indeed likely that segments would be contiguous if threads
> are started the way you did; on the other hand, it is possible that a
> more complex application does not suffer this. Granted, this does not
> help us that much anyway.
>
> My point is that nothing guarantees us either contiguous or sparse stack
> address ranges, so we probably should not rely on those assumptions.

So the worst case, with a massive stack overflow or on a system without
an MMU, is silent corruption of unrelated data. I am not sure what we
can do about that. Not sure -fstack-protector/-fstack-check is a
solution.

--
Gilles.

^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Xenomai-help] native: A 32k stack is not always a 'reasonable' size
  2010-07-06 19:25 [Xenomai-help] native: A 32k stack is not always a 'reasonable' size Peter Soetens
  2010-07-07 9:06 ` Gilles Chanteperdrix
@ 2010-07-11 13:15 ` Gilles Chanteperdrix
  1 sibling, 0 replies; 21+ messages in thread
From: Gilles Chanteperdrix @ 2010-07-11 13:15 UTC (permalink / raw)
  To: Peter Soetens; +Cc: xenomai-help

Peter Soetens wrote:
> PS: can anyone allow 'sspr' (=me) to edit/add stuff on the wiki ?

Should be done now.

--
Gilles.

^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2010-07-11 13:15 UTC | newest]

Thread overview: 21+ messages
2010-07-06 19:25 [Xenomai-help] native: A 32k stack is not always a 'reasonable' size Peter Soetens
2010-07-07 9:06 ` Gilles Chanteperdrix
2010-07-07 20:57 ` Peter Soetens
2010-07-07 21:19 ` Gilles Chanteperdrix
2010-07-07 22:31 ` Peter Soetens
2010-07-07 23:08 ` Gilles Chanteperdrix
2010-07-08 8:37 ` Philippe Gerum
2010-07-08 8:58 ` Gilles Chanteperdrix
2010-07-08 9:31 ` Philippe Gerum
2010-07-08 9:35 ` Gilles Chanteperdrix
2010-07-08 9:58 ` Philippe Gerum
2010-07-08 10:04 ` Gilles Chanteperdrix
2010-07-08 10:09 ` Gilles Chanteperdrix
2010-07-08 11:52 ` Gilles Chanteperdrix
2010-07-08 9:50 ` Philippe Gerum
2010-07-08 9:55 ` Gilles Chanteperdrix
2010-07-08 10:19 ` Philippe Gerum
2010-07-08 11:47 ` Gilles Chanteperdrix
2010-07-08 15:01 ` Philippe Gerum
2010-07-08 16:33 ` Gilles Chanteperdrix
2010-07-11 13:15 ` Gilles Chanteperdrix