Bug creation and email sending has been disabled, file new bugs at gcc.gnu.org/bugzilla
Bug 10 - ARM: runnable/opover2.d fails: xopEquals fallback aborts instead of throwing Exception
Summary: ARM: runnable/opover2.d fails: xopEquals fallback aborts instead of throwing ...
Status: RESOLVED FIXED
Alias: None
Product: GDC
Classification: Unclassified
Component: gdc (show other bugs)
Version: development
Hardware: ARM Linux
: --- normal
Assignee: Johannes Pfau
URL:
Depends on:
Blocks:
 
Reported: 2012-09-25 10:22 CEST by Johannes Pfau
Modified: 2014-02-26 19:12 CET (History)
4 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Johannes Pfau 2012-09-25 10:22:44 CEST
Reduced example:
--------------
void main()
{
    // xopEquals NG
    struct S2a
    {
        bool opEquals(S2a rhs) //Add const here and everything OK
        {return false; }
    }
    S2a s;
    //Should throw exception, instead aborts
    typeid(S2a).equals(&s, &s);
}
---------------

Compile:
gdc opover2.d -o opover2.exe
./opover2.exe
Aborted

Incomplete gdb backtrace:
---------------
Program received signal SIGABRT, Aborted.
0x40106bfc in raise () from /lib/arm-linux-gnueabihf/libc.so.6
(gdb) bt
#0  0x40106bfc in raise () from /lib/arm-linux-gnueabihf/libc.so.6
#1  0x4010a97c in abort () from /lib/arm-linux-gnueabihf/libc.so.6
#2  0x00008bd8 in ?? ()
#3  0x00008bd8 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
---------------

I started looking into this issue, this is what I found:
The TypeInfoStruct.equals function is set in TypeInfoStructDeclaration::toDt (typinf.c):
---------------
    if (sd->xeq)
        dtxoff(pdt, sd->xeq->toSymbol(), 0, TYnptr);
    else
        dtsize_t(pdt, 0);
---------------

xeq is built in clone.c, StructDeclaration::buildXopEquals.
* If no opEquals function is defined, it return NULL. This is working.
* If an opEquals function was found, it tries to produce code to call this function. If this succeeds, everything is working. If this fails, for example because opEquals isn't const, it returns the object._xopEquals function. This is what happens in this test case and isn't working.

object._xopEquals is defined in druntime:
bool _xopEquals(in void*, in void*)
{
    throw new Error("TypeInfo.equals is not implemented");
}

calling it directly is working:
object._xopEquals(null, null); //Throws as expected.

I'm not sure if this is the root cause, but this is fishy: (TypeInfo_Struct)
override equals_t equals(in void* p1, in void* p2) @trusted pure nothrow const
but:
bool _xopEquals(in void*, in void*)

Could be another issue of the recent commit, fixing pure/nothrow attributes on functions with asserts:
d-decls.cc(FuncDeclaration::toSymbol): Don't set any pure/nothrow attributes if asserts are generated in code.

This can't detect the thrown Error in this case, as it's thown via a function pointer.
Comment 1 Johannes Pfau 2012-09-25 10:47:18 CEST
OK, this was indeed the cause. Removing pure nothrow from "equals_t equals" and "equals_t function(in void*, in void*)   xopEquals;" fixes this issue. We could hack this in druntime, but I think we need a more general solution.

Imho we shouldn't mark functions as nothrow in the backend if the complete source code is not available: So: invoking a function via pointer / delegate => not nothrow. Invoking a nothrow function where we only have a definition in a .di header => not nothrow. And if a function invokes other nothrow functions these have to be checked for assertion / throw Error as well.

The same may apply to pure. Maybe there's a better solution, but it seems that the dmd schizophrenia "throwing Errors in nothrow functions is OK" get's us into trouble here, as GCC probably doesn't set up exception handling for NOTHROW functions at all.
Comment 2 Iain Buclaw 2012-09-25 20:22:03 CEST
Although I would have thought it to have the same behaviour, I can't seem to reproduce it on x86/x86_64.
Comment 3 Andrej 2012-09-25 20:47:25 CEST
Can someone please explain to me why the gdc bugzilla keeps emailing me all of a sudden? I have all email notifications disabled and I'm not on any CC list.
Comment 4 Iain Buclaw 2012-09-25 21:06:03 CEST
I guess it's because it's attached to the D.gnu ML, has always been that way... D's bugzilla is attached to D.gnu too (for those few bugs still assigned to GDC).
Comment 5 Andrej 2012-09-25 21:14:59 CEST
(In reply to comment #4)
> I guess it's because it's attached to the D.gnu ML, has always been that way...
> D's bugzilla is attached to D.gnu too (for those few bugs still assigned to
> GDC).

Oh wait, I'm sorry, I thought it was mailing me directly. Sorry for the noise. :)
Comment 6 Johannes Pfau 2012-09-26 09:20:06 CEST
@Iain: I forgot to explicitly state that, but the test case does indeed not fail on x86/x86-64, it only fails on ARM. It probably depends on the unwinder and/or exception ABI.
Comment 8 Johannes Pfau 2014-02-12 07:35:00 CET
@Andrei I already fixed that in the ARM branch. Sorry for not mentioning this before. The GDC master branch was broken for some time and I couldn't merge the ARM changes.
As the 2.065 changes on the GDC branch have been reverted I'll start merging the ARM stuff later today.

https://github.com/jpf91/GDC/commit/9bb8b803f9e25ce98b555e8eaf765f0e73605796
Comment 9 Iain Buclaw 2014-02-12 08:19:44 CET
(In reply to comment #8)
> @Andrei I already fixed that in the ARM branch. Sorry for not mentioning this
> before. The GDC master branch was broken for some time and I couldn't merge the
> ARM changes.
> As the 2.065 changes on the GDC branch have been reverted I'll start merging
> the ARM stuff later today.
> 
> https://github.com/jpf91/GDC/commit/9bb8b803f9e25ce98b555e8eaf765f0e73605796

Congrats, you've won! :o)
Comment 10 Johannes Pfau 2014-02-12 12:39:42 CET
Indeed, somehow feels like cheating though ;-)

@Iain do you agree with the solution or should I open a pull request for further discussion?
Comment 11 Andrei Alexandrescu 2014-02-12 16:37:46 CET
I'm fine with paying for a solution that was almost done, unless you guys would like me to reallocate the monies to another, more important bug. Please advise.
Comment 12 Johannes Pfau 2014-02-13 08:06:11 CET
I had a look at the other bugs but it's difficult to estimate how much work is needed for those that qualify as 'more important' (NRVO, non-POD types). NRVO needs some work/clarifications on the D spec first and then it's possible that we can't even implement it in a nice way with the GCC backend. Non-POD types is a simple change in GDC but it'll require lots of frontend work. Overall I don't think a bounty on these would help.

The rest are either easier to solve or we don't have the hardware to test them (OSX) or unsupported (windows). I'd propose we keep the bounty on this bug but I'll promise to fix #62 and #65 as well. These are probably easy to fix debug-info bugs that are kinda low priority so this is some incentive to actually fix them.
Comment 13 Iain Buclaw 2014-02-14 09:50:14 CET
(In reply to comment #12)
> I had a look at the other bugs but it's difficult to estimate how much work is
> needed for those that qualify as 'more important' (NRVO, non-POD types). NRVO
> needs some work/clarifications on the D spec first and then it's possible that
> we can't even implement it in a nice way with the GCC backend. Non-POD types is
> a simple change in GDC but it'll require lots of frontend work. Overall I don't
> think a bounty on these would help.
> 
> The rest are either easier to solve or we don't have the hardware to test them
> (OSX)

I was actually donated an OSX machine... I should really turn it on sometime rather than let it collect dust.  :o)
Comment 14 Iain Buclaw 2014-02-14 10:09:13 CET
(In reply to comment #8)
> @Andrei I already fixed that in the ARM branch. Sorry for not mentioning this
> before. The GDC master branch was broken for some time and I couldn't merge the
> ARM changes.
> As the 2.065 changes on the GDC branch have been reverted I'll start merging
> the ARM stuff later today.
> 
> https://github.com/jpf91/GDC/commit/9bb8b803f9e25ce98b555e8eaf765f0e73605796

I've just seen the check-in this morning.  It's quite sad that we can't actually apply any of these optimisations that Walter seems to bang on about all the time in reality.

Out of curiosity, is XopEquals marked as pure nothrow?
Comment 15 Johannes Pfau 2014-02-14 19:15:01 CET
Yes, it's kinda sad. However I don't think any other solution will work. TREE_NOTHROW means 'wont ever throw anything' and the backend seems to rely on this so we just can't map that to the 'nothrow' keyword.


* TypeInfo_Struct.xopEquals in object_d. is marked as '@safe pure nothrow'
* it's called from TypeInfo_Struct.equals '@trusted pure nothrow const'
* _xopEquals in object.d is not marked as nothrow (but it throws an Error so it could be)
* the compiler generates a stub for TypeInfo_Struct.xopEquals which is not nothrow. This stub then calls _xopEquals

So the autogenerated stub and _xopEquals could be marked as nothrow as well but it wouldn't change anything.

This reported bug is not the only cases were this causes problems. These test cases also fail as long as we set TREE_NOTHROW:

FAIL: runnable/gdc.d execution test
FAIL: runnable/opover2.d execution test
FAIL: runnable/test41.d execution test
FAIL: runnable/testcontracts.d execution test
FAIL: runnable/testdstress.d execution test
FAIL: runnable/testline.d execution test
FAIL: runnable/xtest46.d execution test

Here's the GCC commit which introduced removing unwind tables for TREE_NOTHROW functions, for reference: http://gcc.gnu.org/ml/gcc-patches/2008-03/msg00958.html
Comment 16 Johannes Pfau 2014-02-15 11:17:08 CET
Here's also a reduced test case:
(compile with -frelease)
-------------------------
nothrow void main()
{
    throw new Error("Fail");
}
-------------------------
./test
[1]    654 abort (core dumped)  ./test

This really happens in all cases where a nothrow function throws something.


Fixed in https://github.com/D-Programming-GDC/GDC/commit/e25e4d0fba7b97d8ba6ec333d1cea003987d06d6

BTW: @Iain if you want to run the test suite on your ARM box it should pass now! And all unit tests should pass as well except for std.math.internal.gammafunction.
Comment 17 Iain Buclaw 2014-02-15 12:53:46 CET
(In reply to comment #16)
> Here's also a reduced test case:
> (compile with -frelease)
> -------------------------
> nothrow void main()
> {
>     throw new Error("Fail");
> }
> -------------------------
> ./test
> [1]    654 abort (core dumped)  ./test
> 
> This really happens in all cases where a nothrow function throws something.
> 
> 

It's called unrecoverable for a reason. :o)

I do wonder if there's perhaps something more we can do in our deh implementation rather than just abort... This is something that we need to think about along with some other details that need attending in gcc.deh (the OOM loop problem, exception chaining).

> Fixed in
> https://github.com/D-Programming-GDC/GDC/commit/e25e4d0fba7b97d8ba6ec333d1cea003987d06d6
> 
> BTW: @Iain if you want to run the test suite on your ARM box it should pass
> now! And all unit tests should pass as well except for
> std.math.internal.gammafunction.


Thanks!
Comment 18 Johannes Pfau 2014-02-15 19:55:53 CET
> It's called unrecoverable for a reason. :o)

I actually thought about that as well but I think people will just blame GDC if we're the only compiler were Errors are really unrecoverable....

> I do wonder if there's perhaps something more we can do in our deh implementation rather than just abort..

I'm not sure if we can change it in the deh implementation but we could rewrite all throw statements in nothrow code. The only cases where this wouldn't help is when the users casts a 'throw' delegate to a 'nothrow' one. However, casts are undefined behavior anyway...
Comment 19 Andrei Alexandrescu 2014-02-26 19:12:23 CET
Payment made. We consider it a vote of confidence for the great work on this project. Congratulations and keep up the good work!