PR# 19000 [RJ] No thread local storage for eiffel_signal_handler (et.al.).
Problem Report Summary
Submitter: axarosenberg
Category: Runtime
Priority: Medium
Date: 2014/12/11
Class: Bug
Severity: Serious
Number: 19000
Release: 14.11.9.6209
Confidential: No
Status: Closed
Responsible:
Environment: Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko
Synopsis: [RJ] No thread local storage for eiffel_signal_handler (et.al.).
Description
We have been having problems with a program that mysteriously exits with a segmentation violation. It had never been reproduced, so today I ran 4 copies, each in the C debugger. As my machine ran low on memory I received a segmentation violation. Why?

This program was threaded and the error happens in 'eiffel_signal_handler' at:

    if (esigblk)

However, the current thread of execution was not created by Eiffel and no thread local storage had been allocated. The variable 'rt_globals' was NULL so you should be able to test for this case (I don't know how NULL is guaranteed).

But how did I get to 'eiffel_signal_handler'? This was called from the C run-time routine 'ctrlevent_capture' (in winsig.c, the source is available). The event was CTRL_CLOSE_EVENT. Apparently, Windows was trying to shut down my program. This same code is also called when you close the console window so it is very easy to reproduce. I used a do_nothing loop and simply closing the window causes the same problem.

So, I turned off threads and tested the do_nothing loop. Still errors! Usually segvio in expop, sometimes stack overflow, unwind errors. I'll leave that one to you.

I still don't know what my original problem is and this is very serious. We have been getting this error on machines that I'm fairly certain are not running out of memory.

To test this you must run in the C debugger and trap "Access violation" when it is THROWN, not just when unhandled.

Randy
To Reproduce
Problem Report Interactions
I've added a fix similar to what you suggested, except that the non-MT case only applies to Windows. On non-Windows platforms we would have to link against the thread library to get the thread ID, which we do not do at the moment. It is fixed in rev#96465.
It's a little more complicated. I've been checking to see what happens to a single-threaded program when you press ctrl-c. Since this happens in a new thread, the Eiffel thread continues to run while the new thread runs 'failure'. This causes the Eiffel thread to crash. In the old days when hardware had only one processor this wasn't really a problem since 'failure' exits. I think you can use the same approach we already discussed but you need to look at the thread id. If 'eiffel_signal_handler' is running on a thread whose id isn't the root's thread-id, take the fast exit.

    #ifdef EIF_THREADS
    if (rt_globals == NULL)
    #else
    if (root_thread_id != current_thread_id())
    #endif
    {
        if (sig != SIGINT && sig != SIGBREAK)
            printf("\nSignal %d while in a non-Eiffel thread.\n", sig);
        exit(2);
    }

There are actually 5 different control signals for which Windows creates a new thread. They are described here: http://msdn.microsoft.com/en-us/library/windows/desktop/ms683242(v=vs.85).aspx .... (output truncated)
I don't want a mysterious exit. I would like a message on the console indicating the nature of the problem. The message should include the signal number and the fact that we are not in an Eiffel thread. Then, instead of exiting, I have read that it is better to restore the saved signal handler (I believe you have that) and resignal. Since the saved handler was probably SIG_DFL - it will probably exit. If you are uncomfortable with printing a message, you could make it optional and I would call a feature to turn it on. Since SIGINT and SIGBREAK will usually be in a new thread you could eliminate the message in those cases.

    #ifdef EIF_THREADS
    if (rt_globals == NULL) {
        if (sig != SIGINT && sig != SIGBREAK)
            printf("\nSignal %d while in non-Eiffel thread.\n", sig);
        exit(2);
    }
    #endif
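For illustration, the "restore the saved handler and resignal" idea could look roughly like the sketch below. This is not the runtime's code: 'saved_handler', 'guard_handler', 'install_guard' and 'guard_hits' are hypothetical names, and the sketch assumes the handler already knows it is running in a non-Eiffel thread.

```c
#include <signal.h>
#include <stdio.h>

typedef void (*handler_fn)(int);

/* Hypothetical stand-ins; the real runtime keeps its own saved handlers. */
static handler_fn saved_handler = SIG_DFL;
static volatile sig_atomic_t guard_hits = 0;

/* SIGINT and SIGBREAK usually arrive on a new thread on Windows,
 * so the message is skipped for them. SIGBREAK only exists there. */
static int is_console_interrupt(int sig)
{
#ifdef SIGBREAK
    if (sig == SIGBREAK)
        return 1;
#endif
    return sig == SIGINT;
}

/* Assumed to be reached only when we are NOT in an Eiffel thread. */
static void guard_handler(int sig)
{
    guard_hits++;
    if (!is_console_interrupt(sig)) {
        /* printf is not async-signal-safe; kept to match the report. */
        printf("\nSignal %d while in non-Eiffel thread.\n", sig);
    }
    signal(sig, saved_handler); /* restore whatever was there before */
    raise(sig);                 /* let the previous handler decide */
}

/* Install the guard and remember the handler it replaces. */
static void install_guard(int sig)
{
    handler_fn previous = signal(sig, guard_handler);
    saved_handler = (previous == SIG_ERR) ? SIG_DFL : previous;
}
```

If the previous handler was SIG_DFL, the re-raised signal normally terminates the process, which gives the exit without hiding where the signal came from.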
We can indeed handle the case where the signal handler is not executed from an EiffelThread. However, what is not clear is what we should do in this case. This holds true for all platforms. So far, we can set a flag in the other running EiffelThreads, but it is not clear when the running threads should check that flag so that you get a proper response time without penalizing performance too much. In the meantime, it seems that exiting the process might be the best thing to do. Is this something you feel comfortable with?
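To make the flag idea concrete, here is a minimal sketch. The names 'pending_signal', 'record_signal' and 'take_pending_signal' are made up for illustration; it uses one process-wide flag rather than one per thread, and it leaves open where the running threads would poll it, which is exactly the unresolved question above.

```c
#include <signal.h>

/* One process-wide flag; sig_atomic_t is the only object type the
 * C standard lets a handler write portably. */
static volatile sig_atomic_t pending_signal = 0;

/* Handler: only record the signal number, do no other work here. */
static void record_signal(int sig)
{
    pending_signal = sig;
}

/* Called from a running thread at whatever check point the runtime
 * chooses. Returns the recorded signal and clears it, or 0 if none.
 * A signal arriving between the read and the clear would be lost;
 * a real implementation would need to close that window. */
static int take_pending_signal(void)
{
    int sig = pending_signal;
    if (sig != 0)
        pending_signal = 0;
    return sig;
}
```

The cost per check is a single volatile read, so the trade-off is mostly about how often the check is made: more often gives faster response, less often costs less.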
I think you are concentrating on the case where a new thread is created. But don't forget, any time a signal is delivered (either synchronously or asynchronously), 'eiffel_signal_handler' can be called; EVEN IF THE THREAD IS NOT EIFFEL. In that case you should check 'rt_globals' to see if it is NULL. At that point I'm not sure what to do. I guess it would be nice to know that the problem was in a non-Eiffel thread.
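A small sketch of the kind of NULL test meant here, using a compiler-level thread-local pointer. I do not know how the runtime actually stores 'rt_globals', so 'runtime_context', 'thread_context', 'register_runtime_thread' and 'in_runtime_thread' are hypothetical stand-ins.

```c
#include <stddef.h>

/* Hypothetical per-thread runtime data. */
struct runtime_context {
    int thread_index;
};

#if defined(_MSC_VER)
#define THREAD_LOCAL __declspec(thread)
#elif defined(__GNUC__)
#define THREAD_LOCAL __thread
#else
#define THREAD_LOCAL _Thread_local
#endif

/* Each thread gets its own copy, initialized to NULL. Only threads
 * that call register_runtime_thread ever see a non-NULL value. */
static THREAD_LOCAL struct runtime_context *thread_context = NULL;

/* Called once by every thread the runtime itself creates. */
static void register_runtime_thread(struct runtime_context *ctx)
{
    thread_context = ctx;
}

/* Safe to call from a signal handler: a single pointer comparison. */
static int in_runtime_thread(void)
{
    return thread_context != NULL;
}
```

With this layout a thread created by Windows or by a C library never registers itself, so a handler running there sees NULL and can take the special path before touching anything per-thread.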
There is indeed an issue as Windows creates a new thread to handle this event. Even if we add a check to see whether this was an Eiffel-created thread or not, it does not solve the issue of what should be done. Several outcomes are possible:
1 - kill the application without raising an exception.
2 - do nothing but print a message on the console.
3 - find a way to throw an exception in one of the running Eiffel threads, most likely the main thread of the application.
We can easily do 1 & 2. As for 3, last time I checked I could not make this work properly, but I'll have a second look.
Manu posted this question in 2007: http://www.adras.com/Using-SetConsoleCtrlHandler.t2814-192.html