Basically, you cast a ray from the main camera to detect the game objects in the scene that are meant to be interactable. Each of those objects has a script with a function that gets called when the player presses the interact input. There are plenty of tutorials on YouTube that walk through this :)
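To make the idea concrete, here is a rough, engine-agnostic sketch in Python. It is not tied to any engine's API: the `Interactable` class, the sphere-shaped colliders, and the `raycast` / `try_interact` helpers are all made up for illustration. In a real engine you would use its built-in physics raycast instead (if you are on Unity, that would be something like `Physics.Raycast` with a ray starting at `Camera.main`).

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Vec3 = Tuple[float, float, float]


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


@dataclass
class Interactable:
    """Hypothetical interactable object with a sphere as its collider."""
    name: str
    center: Vec3
    radius: float
    on_interact: Callable[[], str]  # the "script function" run on interact


def ray_sphere_distance(origin: Vec3, direction: Vec3,
                        center: Vec3, radius: float) -> Optional[float]:
    """Distance along the ray to the sphere, or None on a miss.

    Assumes `direction` is normalized.
    """
    oc = sub(origin, center)
    b = dot(oc, direction)
    c = dot(oc, oc) - radius * radius
    disc = b * b - c
    if disc < 0:
        return None  # the ray's line never touches the sphere
    root = math.sqrt(disc)
    t = -b - root
    if t < 0:
        t = -b + root  # ray starts inside the sphere
    return t if t >= 0 else None


def raycast(origin: Vec3, direction: Vec3, objects: List[Interactable],
            max_distance: float) -> Optional[Interactable]:
    """Return the closest object hit within max_distance, if any."""
    closest, closest_t = None, max_distance
    for obj in objects:
        t = ray_sphere_distance(origin, direction, obj.center, obj.radius)
        if t is not None and t <= closest_t:
            closest, closest_t = obj, t
    return closest


def try_interact(origin: Vec3, direction: Vec3, objects: List[Interactable],
                 max_distance: float = 3.0) -> Optional[str]:
    """Call this when the interact input is pressed."""
    target = raycast(origin, direction, objects, max_distance)
    return target.on_interact() if target is not None else None


# Example scene: the camera sits at the origin and looks along +z.
scene = [
    Interactable("door", (0.0, 0.0, 2.0), 0.5, lambda: "door opened"),
    Interactable("lever", (0.0, 0.0, 5.0), 0.5, lambda: "lever pulled"),
    Interactable("chest", (3.0, 0.0, 2.0), 0.5, lambda: "chest opened"),
]
result = try_interact((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), scene)
```

The `max_distance` limit is a design choice so the player can only interact with things within reach. Here the lever is directly in the line of sight but too far away, so only the door responds.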