I have an issue with the response time: it feels like you always have to anticipate the beat as opposed to being on the beat. The sword clinks have some natural resonance that makes the note sound a little longer than the input window actually allows. Make the input window a little more lenient to compensate for that. I would also add more drag to the sword animations to show the anticipation for people who are more visually inclined when it comes to rhythm cues. Right now the swords basically go from neutral to crossed; there is no wind-up.
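To make the leniency suggestion concrete, here is a rough sketch of how a press could be judged against the nearest beat. I have not seen the game's code, so everything here is an assumption: the function name `judge_press`, the 0.12 second default window, and the `offset` parameter (a guess at how you might shift the window later to cover the clink's ring-out) are all hypothetical.

```python
def judge_press(press_time, bpm, window=0.12, offset=0.0):
    """Judge a key press against the nearest beat.

    press_time: seconds since the song started.
    window: half-width of the accepted window in seconds (hypothetical default).
    offset: seconds to shift every beat's target time later, e.g. to cover
            the ring-out of the sword clink (hypothetical parameter).

    Returns (beat_index, error_seconds, hit). A negative error means the
    press was early, a positive error means it was late.
    """
    beat_len = 60.0 / bpm
    # Index of the beat whose (shifted) target time is closest to the press.
    beat_index = round((press_time - offset) / beat_len)
    target_time = beat_index * beat_len + offset
    error = press_time - target_time
    return beat_index, error, abs(error) <= window


if __name__ == "__main__":
    # At 120 BPM a beat lands every 0.5 s; a press at 1.05 s is 50 ms late.
    print(judge_press(1.05, bpm=120))               # lenient window
    print(judge_press(1.05, bpm=120, window=0.03))  # strict window
```

Widening `window` is the simple fix; `offset` is only worth trying if presses turn out to be consistently late rather than scattered on both sides of the beat.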
The player is likely to feel the sword clash rhythm tracker in 4/4, but they still need to be taught that. The sword cue and gameplay start as soon as you hit space, at the very first measure. You should let them start at the 3rd or 4th measure instead. Have the character step on screen in rhythm during the first two measures, with a space bar icon above teaching the player the rhythm, then let player control start at the 3rd measure. I would also use a hopping sound or the Minotaur's footsteps as the rhythm tracker over the swords. Give an option to skip that small tutorial bit at any time and jump to a 1-2-3-GO notification and beep counter for people doing retries.
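Here is a minimal sketch of the count-in idea, assuming 4/4 and a fixed tempo. The names `beat_schedule` and `first_playable_time`, the two-measure intro, and the four-beat "1-2-3-GO" retry count-in are my own placeholders, not anything taken from the game.

```python
def beat_schedule(bpm, total_measures, intro_measures=2, beats_per_measure=4):
    """List (time_seconds, measure_number, phase) for every beat.

    Beats in the first `intro_measures` measures are tagged "demo" (the
    character steps in with the space bar icon shown); later beats are
    tagged "play" (the player has control).
    """
    beat_len = 60.0 / bpm
    schedule = []
    for i in range(total_measures * beats_per_measure):
        measure = i // beats_per_measure + 1  # measures are numbered from 1
        phase = "demo" if measure <= intro_measures else "play"
        schedule.append((i * beat_len, measure, phase))
    return schedule


def first_playable_time(bpm, intro_measures=2, beats_per_measure=4,
                        skip_intro=False, count_in_beats=4):
    """Seconds until the player takes control.

    With skip_intro=True the demo is replaced by a short count-in
    ("1-2-3-GO" is assumed to be four beats long).
    """
    beat_len = 60.0 / bpm
    if skip_intro:
        return count_in_beats * beat_len
    return intro_measures * beats_per_measure * beat_len


if __name__ == "__main__":
    for entry in beat_schedule(bpm=120, total_measures=3):
        print(entry)
    print(first_playable_time(120))                   # full tutorial intro
    print(first_playable_time(120, skip_intro=True))  # retry count-in
```

At 120 BPM this gives the player four seconds of demonstration on a first attempt and two seconds of count-in on a retry, which seems short enough not to annoy people who restart often.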
Set a more exact timing for the statue breaks. I managed to break them 3 ways:
1) Quickly mashing space twice (this often lost the rhythm but would always successfully break them, and worked out best if I had a lot of distance)
2) Doing a double quarter note, keeping rhythm but losing a hop (natural feeling, but was the worst option)
3) Hopping up to it and doing two 1/8th taps (this felt good rhythm-wise but didn't give a consistent result; I would sometimes get the break and not a hop afterwards)
I think you should put a timing indicator above the gargoyle statues on the hop up to them to teach the timing you want the player to actually use.
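As a rough illustration of what "more exact" could mean, the check below only accepts two taps that are one eighth note apart. That the intended pattern is option 3 is purely my guess, and `is_statue_break` and its 0.06 second tolerance are hypothetical; swap in whichever pattern you actually want to teach.

```python
def is_statue_break(first_tap, second_tap, bpm, tolerance=0.06):
    """Return True if two taps are roughly one eighth note apart.

    first_tap, second_tap: tap times in seconds.
    tolerance: allowed deviation from an exact eighth note, in seconds
               (hypothetical default).
    Assumes the intended break pattern is two eighth-note taps.
    """
    eighth_note = 60.0 / bpm / 2
    gap = second_tap - first_tap
    return abs(gap - eighth_note) <= tolerance


if __name__ == "__main__":
    # At 120 BPM an eighth note lasts 0.25 s.
    print(is_statue_break(2.0, 2.25, bpm=120))  # two eighth-note taps
    print(is_statue_break(2.0, 2.08, bpm=120))  # quick mash
    print(is_statue_break(2.0, 2.50, bpm=120))  # two quarter notes
```

With a rule like this, mashing and the double quarter note would stop working, so the indicator above the statue would have to make the expected pattern obvious.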