Well, as I said, there is no easy way to superpose PNG and JSON, especially in the case of cities. Maybe it would be more straightforward with SVG, but I can't guarantee it without checking the code, since this is implemented differently in each of my generators. I have no idea how easy it is to load, manipulate, or render SVG in Unity, though.
Feel free to ignore the following if you think I'm being slow. To be honest, I still don't understand why it's needed, so I might be missing something. This is how I see it: the DM selects a map by browsing PNGs, and after that the corresponding JSON is loaded. The JSON data (and only the JSON data!) is used to build a 3D scene, and that scene is used both for presenting the map to players in all its 3D glory AND as a workspace where the DM can add, edit, or remove stuff. In this scenario the PNGs are only used as previews and no superposing is needed. Or am I wrong?
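To make that workflow concrete, here is a rough sketch in Python (just for illustration, not Unity code). It assumes each PNG preview is stored next to a JSON file with the same base name; that naming convention and the helper names `json_for_preview` and `load_map_data` are hypothetical, not something the generators guarantee.

```python
import json
from pathlib import Path


def json_for_preview(png_path: Path) -> Path:
    """Return the JSON path assumed to sit next to a PNG preview.

    Assumption: both files share a base name, e.g. town.png / town.json.
    """
    return png_path.with_suffix(".json")


def load_map_data(png_path: Path) -> dict:
    """Load the map data for the preview the DM picked.

    The PNG is only used to choose a map; its pixels are never read here.
    """
    with json_for_preview(png_path).open(encoding="utf-8") as f:
        return json.load(f)
```

Whatever `load_map_data` returns would then be the only input to the scene builder, so the preview image never has to be aligned with the geometry.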