\o/ At last it works! The code now builds with gcc and even runs!
I was migrating some of the assembly code to C (as we should be doing anyway), and then it started to work. I am not sure exactly what went wrong before. My guess is that a wrongly set up function call tried to switch from Thumb mode to ARM mode. The Cortex-M3 only supports Thumb, so the attempt simply ends in a hard fault. This happens when the target address of an indirect jump does not have its LSB set, so there was probably a single wrong bit in the old code.
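
To illustrate the LSB rule, here is a minimal sketch in C. It is not taken from our code: the helper names and the example address are made up, and it only does the bit arithmetic, so it can be compiled on a PC as well.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of an indirect branch target selects the instruction set:
 * 1 = Thumb, 0 = ARM. The Cortex-M3 only executes Thumb code. */
#define THUMB_BIT ((uintptr_t)1u)

/* Hypothetical helper: true if branching to this address would
 * stay in Thumb state. */
static bool is_thumb_target(uintptr_t addr)
{
    return (addr & THUMB_BIT) != 0;
}

/* Hypothetical helper: set bit 0 on a raw code address, e.g. for
 * an entry in a hand-built jump table. */
static uintptr_t to_thumb_target(uintptr_t addr)
{
    return addr | THUMB_BIT;
}
```

For example, with a made-up address, `is_thumb_target(0x08000100u)` is false, while `to_thumb_target(0x08000100u)` gives `0x08000101u`, which is a valid target. For functions written in C the toolchain normally sets this bit by itself, which would fit with the problem going away after the migration. In GNU assembler code a label usually has to be marked with `.thumb_func` to get the same treatment.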