To learn a new programming language, I like to port a code that I already know well. To learn the Go language, I chose my game as the base code. Go is a nice language for doing most of the code, but is currently too slow for key kernels, so I wrote a Julia script that generates SSE code for the kernels. I'm maintaining the port on GitHub. I've tested it on Windows 8 and would like to hear if it works on any other platforms.
Frequon Invaders is a small program, on the order of 3,000 lines, so my remarks are in the context of a small program. I have not worked on a large project with Go to test its claims of more scalable development. Except for an interrupt handler for sound, the project did not exercise Go's elegant support for concurrency.
Low pause garbage collector: Go 1.5 introduced a low-pause garbage collector. Indeed, hearing about this a GopherCon this summer is what motivated to start the port project. Without the low pause, a video game in Go would be impractical.
Fast compilation: Compilation seemed quick, albeit Frequon Invaders was a small project.
Syntax: The syntax is easy to learn and light on ink. The simplified and saner operator precedence is a relief compared to C. Go's left to right declaration syntax is easier to read/write than C's, and I'm no novice with C syntax -- I have International Obfuscated C Code Contest winners on my resume.
Standard formatting: Go has a standard formatting convention and a tool to automatically do the formatting. This makes reading other Go code easier. The standard format is fairly close to how I format C++ code, so I'm happy, with the guilty pleasure that it makes other people format code the way I like it.
Multiple return values: Go has a straightforward syntax for returning multiple values from a function.
Bounds checking on array/slice accesses: This saved a lot of debugging time. The slice feature of Go provides much of the power of pointers for common cases, but with bounds checking that quickly catches errors.
Interfaces: I developed a liking for Go's "interface" style of abstraction. It works well for writing strongly-typed groups of related callbacks without having to introduce dependencies on base classes as in C++.
Channels: Sound generation runs as an interrupt-driven component. The rest of the code communicates with it over a channel, which must be non-blocking since both sender and receiver might be running on the same thread. Having non-blocking channels built into Go saved me the trouble of coding my own. This file has the code, and it's delightfully simple.
Fast compilation: Frequon Invaders is a small program, but nonetheless fast compilation is appreciated.
Test + benchmarking framework: Go comes with a test and benchmark framework that worked well for me.
Haves and Have Nots: No matter how much I like the good points above, there is a big sore point for me: Go splits programmers into the "haves" and the "have nots". The "haves", who built the compiler, get nice syntax for their features. The "have notes" get clumsy syntax. For example, C++ lets users define subscripting for their own types; Go does not. Go's complex numbers get +, -, *. User-defined 2D vectors do not. The Go map is nice when it works, but cannot be extended to key types that need user-defined comparison. The set of operations allowed on constants is unextensible -- there's no way to define your own operations or new structured constants like C++'s constexpr enables. The "haves" imbued their built-in types and functions with parametric polymorphism. For example, map types are parametric over their key and value types. Users cannot define parametric types and functions. I would like for Go to put user-defined constructs on same footing as the built-ins, even if it took more care on the user's part to define those constructs.
No support for assertions: Go has no assertions. While I appreciate the argument that assertions are sometimes abused, I've worked on other codes where assertions are essential for development, and essential to leave out of production code because they incurred significant overhead. I ended up using conditional compilation, which felt clumsy.
No variable-size multi-dimensional arrays: In code where a 2D array is the obvious simple solution, it pains me to linearize array subscripts or use array-of-array structures. The latter is asymptotically impractical for constructing a view of existing data since it will take O(number of rows) time and space to construct. And because of the lack of parametric polymorphism and operator overloading, there's no way for me to hide the obfuscation of the linearization approach.
Numerical performance: The Go compiler does not (yet?) vectorize loops or offer other SIMD support, so a pure Go version of Frequon Invaders fell about 3x short of the desired 60 Hz frame rate. To be fair, previous versions of Frequon Invaders have relied on run-time code generation or hand-coded assembly, but by this decade I'd like to have a portable solution. For example, having +, -, and * defined for a vector type (which could be a short array) might be a solution. The next section describes what I did instead.
A lot of games can relegate the heavy computation to graphics hardware, via interfaces such as SDL2 or OpenGL. But Frequon Invaders has the peculiar property that the main display is not of a 2D or 3D space, but of a Fourier transform of a sparse dataset, so sparse that a DFT outperforms an FFT. The transform needs to be computed efficiently to maintain an acceptable framerate. Processors are faster now, but screens have more pixels to compute, and players expect higher frame rates. The original 2001 Frequon Invaders, developed for a 75 MHz Mac Performa, used run-time code-generation to customize a DFT to the sparsity of the dataset, and nonetheless had a slow frame rate. The 2006 version dropped run-time code generation in favor of hand-written SSE assembly code and careful attention to instruction pipelining. As explained below, the latest version returns to computing assembly code, though no longer at run-time.
The DFT has been reformulated to exploit some complex-conjugate symmetry, reducing floating-point operations to 70% of what the 2006 version required. Nonetheless, the DFT needs about 8 GFlops to deliver 60 frames/sec on a HDTV display. The underlying computational kernels are mathematically simple, but involve complex arithmetic and are most efficiently implemented with "array of structure of array" data layouts and SIMD instructions. Since Go has no support for SIMD, I wrote a Julia script that generates Go assembly code. It essentially does the work of expanding patterns and allocating registers. Julia's multiple dispatch (function overloading) was a good fit for this. For example, it overloads a function mov for various kinds of operands, including complex values. The generated kernels are admittedly overkill for a desktop machine -- on my i7-4770S processor I get more than 24 GFlops from a single core. Though on a more slowly clocked laptop with more than HDTV resolution, the kernels are probably about the right speed.
I developed a liking for the short names X0, X1, ... etc. that the Go assember uses instead of xmm0, xmm1, ...etc for x86 SIMD registers. One other benefit of Go assembler: it has the same calling convention across different OSes. So the x86 assembly code in my project, though developed on Windows, ran correctly the first time I ran Frequon Invaders on Linux.
So though most of the Frequon Invaders code is Go, far more time is spent in code computed by Julia. If Julia had a low-pause garbage collector, non-blocking channels, and a binding to SDL2, it might be practical to rewrite Frequon Invaders in Julia, since Julia does have SIMD support
Go seems to be suitable for writing video games. The syntax is easy to learn and the environment productive. The "haves" and "have nots" design of the language is irksome.
Thanks goes to the creators of SDL2 and the creators of these Go bindings for SDL2. The Go team has delivered an intriguing language with a reliable implementation.
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804