I currently use a program called HONK for my PNGTUBER. It doesn't move my PNG as well as I think this program would.
BUT, what it does have is around 14 different mouth shapes that actually react and replicate mouth movements accurately enough on the PNG model. Is this something that you could add to PNGTUBER PLUS?
My guess is that it picks up individual frequencies or volume levels from the mic to work out the vowel and letter sounds. I would swap over to this program instantly if this feature were added here.
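
To show roughly what I mean, here is a minimal Python sketch of that guess. I don't know how HONK actually does it, and this is not PNGTUBER PLUS code. The probe frequencies, the mouth shape names ("oo", "ah", "ee", "closed") and the silence threshold are all made up for illustration. It measures how loud the mic chunk is, and if it isn't silent, picks the mouth shape whose probe frequency is strongest.

```python
import math

def goertzel_power(samples, sample_rate, freq):
    """Estimate signal power near `freq` (Hz) using the Goertzel algorithm."""
    n = len(samples)
    k = round(n * freq / sample_rate)            # nearest DFT bin to freq
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2

# Hypothetical probe frequencies (Hz) and the mouth shape each one selects.
# Real vowel detection would need more bands and tuning per voice.
PROBES = [(300, "oo"), (700, "ah"), (2300, "ee")]

def pick_mouth_shape(samples, sample_rate, silence_rms=0.01):
    """Return a mouth shape name for one chunk of mic samples (floats in -1..1)."""
    rms = math.sqrt(sum(x * x for x in samples) / len(samples))
    if rms < silence_rms:                        # too quiet: keep the mouth shut
        return "closed"
    powers = [(goertzel_power(samples, sample_rate, f), shape) for f, shape in PROBES]
    return max(powers)[1]                        # shape with the strongest probe
```

A real version would run this on every short chunk of mic audio and swap the mouth sprite to whatever shape comes back, probably with some smoothing so the mouth doesn't flicker.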