Post by Aztec_man in Deep Down jam comments

AI and Games Jam 2021 » Entries » Deep Down

Viewing post in Deep Down jam comments

The music is hecka good. Nice job overall!

Definitely curious to know more about the learner. You mention ML-Agents on your game page: did you base your work on any example in particular? I know that Proximal Policy Optimization is the default algorithm for Unity ML-Agents, so I'm guessing that is the overall model type, but I'm curious to hear more about the process.

Doriens, 3 years ago (+1)

Nice to see that you liked the music and the game :)

Exactly, I used PPO for the training. I had seen it mentioned numerous times in the OpenAI article and was curious about it. I am used to working in AI, but more on the supervised/unsupervised side, not reinforcement learning, so I thought this jam was the perfect opportunity to try it. I had already worked on a small project that aimed to move an object from point A to point B using Q-learning, but that's pretty much it.
So I watched this CodeMonkey tutorial to see how it can be applied in Unity (I recommend it to anybody who wants to learn ML-Agents, it is wonderfully done!) and managed to move my cube from A to B ^^ (How to use Machine Learning AI in Unity! (ML-Agents) - YouTube)

The rest was a lot (and I mean A LOT) of experimentation with the reward system and the hyperparameters of the model.

I also chose to do a sort of "incremental training". I know from experience that one big problem in reinforcement learning is the agent never managing to find the positive reward (you can check this CodeBullet video where he faces this: A.I. Learns to DRIVE - YouTube). In that video, he decided to create checkpoints at set distances so that the agent could make progress toward the target.
In my case, I didn't want to do that, as in my opinion it would make the character's movement too linear (i.e., just going in a straight line to the target, without trying to avoid or hit monsters along the way).
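
To make the sparse-reward problem and the checkpoint workaround concrete, here is a minimal Python sketch. It is not code from the game or from the video: the 1D corridor, the function names, the checkpoint spacing and the 0.1 bonus are all assumptions chosen for illustration.

```python
def sparse_reward(position: int, target: int) -> float:
    """Reward only when the agent stands exactly on the target."""
    return 1.0 if position == target else 0.0


def checkpoint_reward(previous: int, position: int, target: int,
                      spacing: int = 5) -> float:
    """Sparse reward plus a small bonus for each newly crossed checkpoint.

    Assumes a 1D corridor starting at 0 with the target to the right and
    a checkpoint every `spacing` cells (hypothetical values).
    """
    reward = sparse_reward(position, target)
    # Compare how many checkpoints lie behind the agent before and after
    # the move, and pay only for the ones crossed during this step.
    newly_crossed = position // spacing - previous // spacing
    if newly_crossed > 0:
        reward += 0.1 * newly_crossed
    return reward
```

With the sparse version, an agent wandering at random far from the target only ever sees 0.0, so it has nothing to learn from. The checkpoint version gives it a signal along the way, but that signal also pulls it along the direct path, which is the "too linear" behaviour described above.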

What I did was start with a small training room without any monsters. Then I increased the size of the training room and randomized the target and character positions. Then I added monsters, then lava, then the "breaking the rules" rewards on the agent.
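
The staged progression above could be sketched roughly as follows. This is only an illustration in Python, not the actual Unity setup: the `Stage` fields, the room sizes, and the rule of advancing once a mean-reward threshold is reached are all assumptions, since the post does not say how the switch between stages was decided.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    room_size: int               # side length of the room (hypothetical units)
    random_positions: bool       # randomize target and character positions
    monsters: bool
    lava: bool
    rule_breaking_rewards: bool  # the "breaking the rules" rewards


# Each stage adds one difficulty on top of the previous one.
CURRICULUM = [
    Stage("small empty room", 5, False, False, False, False),
    Stage("larger room, random positions", 15, True, False, False, False),
    Stage("add monsters", 15, True, True, False, False),
    Stage("add lava", 15, True, True, True, False),
    Stage("add rule-breaking rewards", 15, True, True, True, True),
]


def next_stage_index(current: int, mean_reward: float,
                     threshold: float = 0.8) -> int:
    """Move to the next stage once the mean reward clears the threshold.

    The threshold rule is an assumption; stages could just as well be
    switched by hand after watching the agent.
    """
    if mean_reward >= threshold and current < len(CURRICULUM) - 1:
        return current + 1
    return current
```

The point of the ordering is that the agent first learns to reach the target at all, so the positive reward is already within reach by the time monsters and lava show up.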

But as this blog post puts it well (https://blog.unity.com/community/using-machine-learning-agents-in-a-real-game-a-...), finding good rewards can be very complicated, and a lot of tries were needed to get acceptable behaviour. As said in the post, the first builds had some interesting results, but they were pretty much useless for a game meant to be controlled by a player (the tank was only good when there were monsters around; the healer tried to heal the tank more, so it was lost along with him; and the DPS tried to kill the monsters while disregarding the leader). Also, the model architecture changed quite a lot during the week, and with each modification we had to restart the whole training process from the beginning (so much fun :p)!

Aztec_man, 3 years ago

Thank you for the detailed reply!!!!
Fascinating overall.
I appreciate the link to the article as well.
