I don't know if this is possible with the way the game is set up, but what I would recommend is simply restarting the players solution on the second set of tests. The way I abused the system was simply hard coding the correct output for the first test above the actual solution, if the program where to run from the start for the second set of tests that hard coded output would cause the solution to fail.
I haven't looked at the code in a while, but it's likely that that is exactly what I'm doing on the server side. It would make sense to do the same thing on the client side (I don't recall if there was a reason for the discrepancy), although that could potentially break some existing solutions. Regardless, thanks for flagging this for me!