From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: Fwd: Debugging system freeze, SIGXCPU
References: <0509ec7d-20b3-bc38-7a04-7516f24249a1@xenomai.org>
From: Jan Kiszka
Message-ID: <22ae4956-2359-6fda-02ed-c174df3dd015@siemens.com>
Date: Mon, 25 Feb 2019 18:28:12 +0100
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 7bit
List-Id: Discussions about the Xenomai project
To: Ari Mozes , xenomai@xenomai.org

On 25.02.19 17:57, Ari Mozes via Xenomai wrote:
> Philippe,
> Thank you for the information and the URL.
> I read through the thread, and I agree with comments that it would be
> helpful to be able to identify/blacklist/etc problematic calls when
> porting over existing code to a true RT scenario. In our case the
> original code was written with "RT-like" behavior in mind, but as
> there is a lot of code already in place, approaches to identify
> existing problematic calls would be helpful.

You could wrap such calls like we do for malloc/free in libcobalt. But
wrapping only works if the direct caller is processed that way and is
not some pre-built external library.

Therefore: In time-sensitive code paths, do not use libraries that you
have not validated. libstdc++ may also contain more surprises.

Jan

> I will continue to familiarize myself with the nitty-gritty details,
> but anything that makes the process easier is always welcome :-)
>
> Ari
>
>
> On Mon, Feb 25, 2019 at 11:08 AM Philippe Gerum wrote:
>>
>> On 2/25/19 2:32 PM, Ari Mozes via Xenomai wrote:
>>> Resending this question with testcase.
>>> Can someone give the testcase a try to see if it reproduces the problem I
>>> am seeing? Is more information needed?
>>> It takes a couple of minutes before I see the issue occur.
>>
>> The random lockup is due to std::chrono::high_resolution_clock::now()
>> invoking the vDSO form of clock_gettime().
>>
>> SIGXCPU aka Xenomai's SIGDEBUG may be sent by the core in various
>> situations, but since the code does not set T_WARNSW for any task,
>> the only explanation is receiving a Xenomai watchdog notification. See
>> the help information about CONFIG_XENO_OPT_WATCHDOG in your kernel
>> configuration.
>>
>> After a few secs spinning in the vDSO code, which may not be called from
>> real-time context, the Xenomai core pulls the brake and sends SIGXCPU to
>> the offending process, unless the system locks up before the watchdog
>> could even trigger.
>>
>> Solution: use clock_gettime(CLOCK_HOST_REALTIME) instead of
>> std::chrono::high_resolution_clock::now() for getting timestamps.
>>
>> A related discussion is available at this URL:
>> https://www.xenomai.org/pipermail/xenomai/2018-December/040133.html
>>
>> --
>> Philippe.
>
>
>
-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux
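
For reference, the timestamp change suggested in the quoted reply might look roughly like the sketch below. It is only an illustration, not code from the testcase: the helper name timestamp_ns() is hypothetical, and the fallback to CLOCK_REALTIME is an assumption added so the snippet also builds on a host without the Xenomai headers. On a Cobalt system the program would presumably have to be built with the Xenomai POSIX wrappers so that clock_gettime() is served by the real-time core.

```cpp
#include <time.h>
#include <cstdint>

// CLOCK_HOST_REALTIME comes from the Xenomai/Cobalt headers. The fallback
// below is an assumption so that this sketch compiles on a plain Linux host;
// it is not what a real-time build should rely on.
#ifndef CLOCK_HOST_REALTIME
#define CLOCK_HOST_REALTIME CLOCK_REALTIME
#endif

// Hypothetical helper: returns the current timestamp in nanoseconds,
// or -1 if clock_gettime() fails. Intended as a stand-in for
// std::chrono::high_resolution_clock::now() in time-sensitive code.
static std::int64_t timestamp_ns()
{
    struct timespec ts;
    if (clock_gettime(CLOCK_HOST_REALTIME, &ts) != 0)
        return -1;
    return static_cast<std::int64_t>(ts.tv_sec) * 1000000000LL + ts.tv_nsec;
}
```

A caller would take two such timestamps and subtract them to get an elapsed time, checking for the -1 error value first.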