PR# 13816 Allocating large ARRAY [DOUBLE] causes floating point exception
Problem Report Summary
Submitter: prestoat2000
Category: Runtime
Priority: Medium
Date: 2007/12/21
Class: Bug
Severity: Serious
Number: 13816
Release: 6.1.71477
Confidential: No
Status: Analyzed
Responsible: alexk_es
Environment: Mozilla/5.0 (X11; U; SunOS i86pc; en-US; rv:1.8.1.9) Gecko/20071111 Firefox/2.0.0.9
Solaris 10 on x86
Synopsis: Allocating large ARRAY [DOUBLE] causes floating point exception
Description
Allocating a large ARRAY [DOUBLE] (16777210 to 16777214 elements) causes a floating point exception instead of a "no more memory" or "object too big" exception. This seems to indicate a bug in the runtime.
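The report does not say why this particular range of sizes fails. As a purely arithmetic observation, and assuming a DOUBLE element occupies 8 bytes, the payload of the failing arrays falls just short of 2^27 bytes. The helper names below are hypothetical and are not taken from the runtime; the sketch only makes the numbers easy to check and does not claim to identify the cause.

```c
#include <stdint.h>

/* Assumption: each DOUBLE element occupies 8 bytes. */
#define DOUBLE_BYTES UINT64_C(8)

/* Payload size in bytes of an array of `count` DOUBLE elements
 * (element storage only, no object header). */
static uint64_t double_array_bytes(uint64_t count)
{
    return count * DOUBLE_BYTES;
}

/* How many bytes the payload falls short of the 2^27 boundary
 * (negative if the payload exceeds it). */
static int64_t bytes_below_2_27(uint64_t count)
{
    return (int64_t)(UINT64_C(1) << 27) - (int64_t)double_array_bytes(count);
}
```

For the reported range, 16777210 elements give 134217680 bytes (48 bytes below 2^27) and 16777214 elements give 134217712 bytes (16 bytes below 2^27).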
To Reproduce
Compile with the attached class and config file. Execute the system with argument 16777210 (or any bigger number <= 16777214). System execution terminates with:

    -------------------------------------------------------------------------------
    RUN-TIME root's set-up Floating point exception:
    <0000000000000000> Floating point exception. Exit
    -------------------------------------------------------------------------------
Problem Report Interactions
I extended test runtime015 to show that there are still some array sizes that cause a seg fault instead of "no more memory".
The incorrect tcf instructions (using execute_work instead of execute_final when system had just been finalized) were actually in test runtime012. I fixed them. But the other issue pointed out previously remains.
This problem does not seem to be completely fixed. If I melt with the class and config file from the original report (not freeze) and then execute

    marten 44% EIFGENs/test/W_code/test 16777210
    Segmentation fault

Execution of frozen code ends normally, with no seg fault or exception trace.

Also, I noticed in the tcf file for eweasel test runtime015 several places where the system was finalized but then the workbench version was executed (lines 41 and 54 of the tcf). Unless I am missing something, these need to be execute_final instructions instead of execute_work. So I think the tcf file needs to be extended to cover the melted case, and the incorrect "execute" instructions need to be fixed.
Fixed in rev#80953 of EiffelStudio 6.5 intermediate release.
Extended test#runtime015 to cover this issue.
Ran system under dbx and got the following call stack when FPE occurs:

    signal FPE (integer divide by zero) in scollect at 0x8cc07be
    0x08cc07be: scollect+0x0352: idivl %ecx,%eax
    (dbx) where
    =>[1] scollect(0x8cc16c4, 0x0), at 0x8cc07be
      [2] plsc(0x80470c4, 0x8cfd78c, 0x80470d8, 0x8cb0510, 0x80470d8, 0x8cb0518), at 0x8cc16bc
      [3] reclaim(0xfeffa7d8, 0x8cb0479, 0x8cfd78c, 0x80470c4, 0x8047218, 0x80470d8), at 0x8cc09a6
      [4] main(0x2, 0x8047108, 0x8047114, 0x80470fc), at 0x8cb0518

In `scollect', the only divisions by a non-constant are division by `nbstat' or `nbstat - 1'. So apparently `nbstat' is 0 or 1. `nbstat' comes from

    nbstat = ++nb_stats[i];

where `i' is an argument to `scollect'. Since the call is from `plsc', it looks like `i' is GST_PART (which is 0). I'll let you figure out the rest.
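The reasoning in the dbx analysis can be illustrated with a small C sketch of a statistics update that refuses to divide by an unusable counter. This is not the runtime's actual `scollect' code and not the fix applied in rev#80953: `nb_stats' and `GST_PART' are names from the report, while `GST_NBR', `mem_avg' and `update_average' are hypothetical stand-ins chosen for illustration.

```c
#include <limits.h>

#define GST_PART 0   /* value taken from the report */
#define GST_NBR  2   /* assumed table size, for illustration only */

static long nb_stats[GST_NBR];   /* per-kind cycle counters, named as in the report */
static long mem_avg[GST_NBR];    /* hypothetical running average per kind */

/* Fold `sample` into the running average for kind `i`.
 * Returns 0 on success, -1 if the counter cannot serve as a divisor. */
static int update_average(int i, long sample)
{
    long nbstat;

    if (i < 0 || i >= GST_NBR || nb_stats[i] == LONG_MAX)
        return -1;               /* bad index, or the increment would overflow */

    nbstat = ++nb_stats[i];
    if (nbstat <= 0)
        return -1;               /* zero or negative divisor: skip the update */

    /* Incremental mean (integer arithmetic, so the result is truncated). */
    mem_avg[i] += (sample - mem_avg[i]) / nbstat;
    return 0;
}
```

With a counter that has been left at -1, the increment yields 0 and the update is skipped instead of trapping. A division by `nbstat - 1' would need the analogous check `nbstat > 1'.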