I've had a bit of a poke through the [AD] archives ... the key dates seem to be end of 2007/early 2008, but I didn't find anything really clear on how the decision was made. Key people were Evert, Peter Wang and Milan Mimica. I don't know if they are still active.
I don't remember the C versions being faster, just that the ASM code was not enough faster to justify the complexity of maintaining it (it was a big mass of macros to generate all the possible variants).
It was only ever written for i386, and only using GAS syntax, so it was no good for ARM, PPC, AMD64 or anyone using MSVC.
Also, I do think that if removing the ASM code had reduced performance by 50%, people would have noticed and complained at the time.
So, obviously there is a problem somewhere, but I am not convinced it will be solved by reinstating the ASM code.