First, thanks for providing a description in the comment, as I forgot to provide a description field in the submission form :D
I see what you mean about the difficulty with mixing: there are some abrupt changes due to the layer-driven approach. Maybe even just having very slow volume fades between sections could be a good compromise? But apart from this very specific mixing thing, I think the result is great, and I can see how the evolution of the city matches that of the track. And the song is good in general! I'm wondering if the complete lack of low end in certain parts could be an issue? You'd be playing those sections for a long time until the city evolves, so you'd be missing the bass for a long time.
Kudos for the environmental awareness focus :D