Hey! To make it cross-platform, I write a separate backend for each platform.
Each backend has to handle a few platform-specific things:
1. Making a window that I can draw an image on
2. Getting keyboard and mouse input
3. Audio
On macOS I use X11 and AudioToolbox, on Linux I use X11 and ALSA, and on Windows I use Win32, GDI, and WinMM.
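
Roughly, the backend idea can be sketched like this. This is only a minimal illustration, not the actual code from the project: `Backend`, `null_backend`, `run_one_frame`, and `platform_apis` are made-up names, and the "null" backend is a headless stand-in so the sketch compiles without X11 or Win32 headers. A real backend would fill the same function pointers with calls into the platform APIs listed above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical interface: one table of function pointers per platform. */
typedef struct Backend {
    const char *name;
    int  (*open_window)(int width, int height);            /* 1. window to draw an image on */
    void (*present)(const uint32_t *pixels, int w, int h); /*    copy the image to the window */
    int  (*poll_key)(void);                                /* 2. input: key code, or -1 if none */
    void (*play_audio)(const int16_t *samples, size_t n);  /* 3. audio */
} Backend;

/* Which native APIs a real backend would wrap on this platform. */
static const char *platform_apis(void) {
#if defined(_WIN32)
    return "Win32 + GDI + WinMM";
#elif defined(__APPLE__)
    return "X11 + AudioToolbox";
#elif defined(__linux__)
    return "X11 + ALSA";
#else
    return "unsupported";
#endif
}

/* Headless stand-in backend: it only counts what it was asked to do. */
static int    null_frames_presented = 0;
static size_t null_samples_played   = 0;

static int null_open_window(int width, int height) {
    return (width > 0 && height > 0) ? 0 : -1; /* 0 means success */
}

static void null_present(const uint32_t *pixels, int w, int h) {
    assert(pixels != NULL);
    (void)w; (void)h;
    null_frames_presented++;
}

static int null_poll_key(void) {
    return -1; /* never reports a key press */
}

static void null_play_audio(const int16_t *samples, size_t n) {
    (void)samples;
    null_samples_played += n;
}

static const Backend null_backend = {
    "null", null_open_window, null_present, null_poll_key, null_play_audio
};

/* Platform-independent code only ever talks to the Backend table. */
static int run_one_frame(const Backend *b, uint32_t *fb, int w, int h) {
    int key = b->poll_key();
    fb[0] = (key < 0) ? 0xFF000000u : 0xFFFFFFFFu; /* black if idle, white on key */
    b->present(fb, w, h);
    return key;
}
```

The point of this layout is that everything outside the backend stays identical across platforms. Porting to a new platform means writing one more `Backend` table.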
I have some code on GitHub that demonstrates some of these things (minus the audio):