While working on a job, I've discovered that the garbage collector HUGELY impacts the performance of the system. This was not an issue when I first made this thing, since the garbage collector didn't exist back then (at least, not in the same way).
It's not super noticeable with low-bone-count armatures on higher-end hardware, but it gets pretty bad pretty fast, resulting in stutters of up to half a second because the garbage collector has to clean up a lot of mess.
Unfortunately, there's no easy quick fix I can suggest for now. matrix_multiply, which is quite essential for this thing, allocates a new array each time it is called, and the old array it replaces becomes garbage. So for any non-native targets, there is a terrible performance impact. I can't just remove this function, and replacing it with a GML-based byRef version results in significantly worse performance.
The solution: in a later update, animations will be driven by native C++ on as many platforms as possible. Other platforms will still have the GML version as a fallback (and HTML5, if it ever works, won't suffer as badly due to its different GC pipeline).
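To give a rough idea of why native code sidesteps the problem, here is a minimal C++ sketch. It is not the planned implementation: the names `Mat4`, `mat4_multiply_into` and `compute_world_matrices` are made up for illustration, and the layout (16 floats, row-major, child matrix times parent matrix) is an assumption. The point is only that the result is written into a buffer that already exists, so the per-bone multiply allocates nothing and leaves nothing for a garbage collector to clean up.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 4x4 matrix type: 16 floats, assumed row-major.
using Mat4 = std::array<float, 16>;

// Writes a * b into `out`. Nothing is allocated here; the caller owns
// all three matrices. `out` must not be the same object as `a` or `b`.
inline void mat4_multiply_into(const Mat4& a, const Mat4& b, Mat4& out) {
    assert(&out != &a && &out != &b);
    for (std::size_t row = 0; row < 4; ++row) {
        for (std::size_t col = 0; col < 4; ++col) {
            float sum = 0.0f;
            for (std::size_t k = 0; k < 4; ++k) {
                sum += a[row * 4 + k] * b[k * 4 + col];
            }
            out[row * 4 + col] = sum;
        }
    }
}

// Composes each bone's local matrix with its parent's world matrix.
// Assumes bones are sorted so that parent[i] < i, with -1 marking a root.
// `world` is sized once by the caller and reused every frame.
inline void compute_world_matrices(const std::vector<Mat4>& local,
                                   const std::vector<int>& parent,
                                   std::vector<Mat4>& world) {
    assert(local.size() == parent.size() && local.size() == world.size());
    for (std::size_t i = 0; i < local.size(); ++i) {
        if (parent[i] < 0) {
            world[i] = local[i];  // plain copy of 16 floats, no allocation
        } else {
            mat4_multiply_into(local[i],
                               world[static_cast<std::size_t>(parent[i])],
                               world[i]);
        }
    }
}
```

The `world` vector would be created once when the armature is loaded and then passed in every frame, so the per-frame cost is arithmetic only.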
It might take me a while to get this sorted since I have less free time now, but I'm hoping to get stuff updated soon in a much better way. It'll be pretty much a complete rewrite, but I won't be selling it as a separate thing, so don't worry about needing to pay twice.