Games, Tests and GitLab CI


We are reaching the midpoint of the GNOME 3.30 development cycle and many things have already happened in the Games world. I will save the user-facing news for later, as today I want to tell you about development features we desperately needed as maintainers: tests and continuous integration.

TL;DR: GLib, Meson, Flatpak and GitLab CI make writing and running tests super easy! This will allow Games to be more stable and to have more features.

The More the Buggier

Not only are Games and retro-gtk slowly becoming bigger and more complex, but to handle many platforms Games has to come flatpaked with Libretro cores. Games and retro-gtk are currently only tested manually and, as far as I know, this is also true for the vast majority of the Libretro cores we distribute. That's quite a large number of untested lines of code: it is already impossible to test all of them manually and the test matrix is not getting smaller. We are not immune to introducing new bugs or to accidentally reintroducing bugs we already fixed. We also caught some bugs in Libretro cores which caused a loss of save data, which is absolutely unacceptable (luckily, it was only in the unstable Flatpak and we mitigated the issue as soon as possible).

The Libretro API offers an interface for cores to define variables with a non-empty list of possible values that the frontend can set to the desired value; these variables can be seen as core-specific runtime APIs that Games could use to implement platform-specific features. Unfortunately, for Games to rely on such unstable APIs we would need to react quickly to their changes, changes which can't be detected at compilation time. Because of the lack of a good way to detect such changes and for the sake of maintainability, Games doesn't use the variables at all to implement platform-specific features.
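To make the idea of such a variable concrete, here is a minimal Python sketch (retro-gtk itself is written in C, Python is only used for illustration). It assumes the `Description; value1|value2` string form used by the Libretro variables interface and treats the first listed value as the default; the `CoreVariable` class, the `parse_variable` helper and the `dummy_palette` key are hypothetical names, not part of retro-gtk or of any core.

```python
from dataclasses import dataclass


@dataclass
class CoreVariable:
    """A core-defined variable: a key plus a non-empty list of allowed values."""
    key: str
    description: str
    values: list[str]
    current: str

    def set(self, value: str) -> None:
        # The frontend may only pick one of the values the core advertised.
        if value not in self.values:
            raise ValueError(f"{self.key}: {value!r} is not one of {self.values}")
        self.current = value


def parse_variable(key: str, spec: str) -> CoreVariable:
    """Parse a 'Description; value1|value2' string into a CoreVariable."""
    description, _, raw_values = spec.partition("; ")
    values = [v for v in raw_values.split("|") if v]
    if not values:
        raise ValueError(f"{key}: the list of possible values must not be empty")
    # Assumption: the first listed value is treated as the default.
    return CoreVariable(key, description, values, current=values[0])


# Hypothetical variable, for illustration only.
palette = parse_variable("dummy_palette", "Palette; grayscale|green|color")
palette.set("green")
print(palette.current)
```

Nothing here is checked at compilation time: if a core renames `green` or drops the variable entirely, the frontend only finds out when `set` fails at runtime, which is exactly the maintainability problem described above.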

These problems have simple answers: unit tests, behavioral tests and continuous integration.

Mandatory meme.

Unit Tests

Meson and the GLib testing framework make implementing unit tests and code coverage super easy. retro-gtk was ported to Meson a few versions ago, so adding unit tests for its central RetroCore class was a breeze! Games has a Meson port by Alice Mikhaylenko and Denis Ollier (thanks a lot!) that is waiting for changes in the master GNOME runtime to be mergeable; once it is merged, unit tests will be written.

RetroCore requires a path to a Libretro core to be constructed and to run, so to test it correctly I wrote the retro-dummy Libretro core, which implements the strict minimum required by the Libretro API.

Tests are particularly convenient for reproducing bugs and for ensuring we don't reintroduce them by accident; for example, until recently Games crashed when loading a UTF-16 encoded cue sheet, so I want to write tests checking that various file formats are parsed correctly.
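As an illustration of what such a regression test could look like, here is a small Python sketch. It is not the parser used by Games: `decode_cue_sheet` and `cue_file_names` are hypothetical helpers, the decoding relies on the assumption that a UTF-16 cue sheet starts with a byte order mark, and only the `FILE "name" TYPE` lines are looked at.

```python
import codecs
import unittest


def decode_cue_sheet(data: bytes) -> str:
    """Decode cue sheet bytes, honouring a UTF-8 or UTF-16 byte order mark."""
    if data.startswith(codecs.BOM_UTF8):
        return data.decode("utf-8-sig")
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the BOM to pick the byte order and drops it.
        return data.decode("utf-16")
    return data.decode("utf-8")


def cue_file_names(text: str) -> list[str]:
    """Return the file names referenced by FILE "name" TYPE lines."""
    names = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith('FILE "'):
            names.append(line.split('"')[1])
    return names


CUE = 'FILE "track01.bin" BINARY\n  TRACK 01 MODE2/2352\n    INDEX 01 00:00:00\n'


class CueSheetEncodingTest(unittest.TestCase):
    def test_utf8(self):
        text = decode_cue_sheet(CUE.encode("utf-8"))
        self.assertEqual(cue_file_names(text), ["track01.bin"])

    def test_utf16_with_bom(self):
        # str.encode("utf-16") prepends a BOM in the platform's byte order.
        text = decode_cue_sheet(CUE.encode("utf-16"))
        self.assertEqual(cue_file_names(text), ["track01.bin"])
```

Saved to a file, the two cases can be run with `python -m unittest`. Once the UTF-16 case exists, the crash cannot come back without a test turning red.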

Our unit testing guidelines recommend installing tests system-wide; by doing so we have been able to flatpak the retro-gtk tests in org.gnome.Retro.UnitTests and we can now easily run them in a sandboxed environment.

Reftests

Unit tests are nice to check that small bits of code behave as expected, but retro-gtk being very dynamic by nature, testing the behavior at the application level is needed. To complement the unit tests, I wrote retro-reftest to run reference tests on Libretro cores via retro-gtk, allowing us to compare the actual reactions of retro-gtk when running special test Libretro cores with the expected reactions.
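The general principle of a reference test can be sketched in a few lines of Python: a known-good output is recorded once, and later runs are compared against it. This is only an illustration of the technique, not how retro-reftest is implemented; `run_reftest` and `render_frame` are hypothetical names, and a flat grey buffer stands in for the video output of a core.

```python
import tempfile
from pathlib import Path


def run_reftest(actual: bytes, reference: Path, generate: bool = False) -> bool:
    """Record `actual` as the reference, or compare it to the stored reference."""
    if generate:
        reference.write_bytes(actual)
        return True
    return reference.exists() and reference.read_bytes() == actual


def render_frame(width: int, height: int, shade: int) -> bytes:
    # Hypothetical stand-in for the video output of a core: a flat grey image.
    return bytes([shade]) * (width * height)


with tempfile.TemporaryDirectory() as tmp:
    ref = Path(tmp) / "frame0.raw"
    # First run: record the reference output.
    generated = run_reftest(render_frame(4, 3, 128), ref, generate=True)
    # Later runs: identical output passes, any difference fails.
    same = run_reftest(render_frame(4, 3, 128), ref)
    regressed = run_reftest(render_frame(4, 3, 127), ref)

print(generated, same, regressed)  # True True False
```

The comparison is deliberately strict: a single differing byte makes the test fail, so a recorded reference has to be reviewed by a human before it is trusted.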

First, you write a reference test file which looks like this:

[Retro Reftest]
Path=/test
Core=/app/lib/libretro/test_libretro.so

[Options]
test_aspect=4:3;16:9;
test_samplerate=30000;20000;
test_opt0=false;true;
test_opt1=0;
test_opt2=0;1;foo;3;

[Frame 0]
State=Refresh
Video=test.png

Then you run retro-reftest with this file and the --generate option to generate the test outputs (in this case there is only test.png). Finally, you pass the reftest descriptor file to retro-reftest again and, if all goes as expected, you should get this output:

/test/Boot: OK
/test/Options: OK
/test/0/State Refresh: OK
/test/0/Run: OK
/test/0/Video: OK
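The descriptor shown earlier follows the key file syntax, so it is easy to inspect with other tools. As a rough sketch, Python's standard configparser can read it, assuming list values are separated by `;` with an optional trailing `;`. The `split_list` helper is hypothetical, and this only reads the file; it does not reproduce what retro-reftest does with it.

```python
import configparser

DESCRIPTOR = """\
[Retro Reftest]
Path=/test
Core=/app/lib/libretro/test_libretro.so

[Options]
test_aspect=4:3;16:9;
test_samplerate=30000;20000;
test_opt0=false;true;
test_opt1=0;
test_opt2=0;1;foo;3;

[Frame 0]
State=Refresh
Video=test.png
"""


def split_list(value: str) -> list[str]:
    # Assumption: lists are ';'-separated and a trailing ';' adds no empty item.
    return [item for item in value.split(";") if item]


# Only '=' separates keys from values, so "4:3" is not split on the colon.
parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
parser.optionxform = str  # keep the case of keys such as "Path" and "State"
parser.read_string(DESCRIPTOR)

options = {key: split_list(value) for key, value in parser["Options"].items()}
frames = {
    int(name.split()[1]): dict(parser[name])
    for name in parser.sections()
    if name.startswith("Frame ")
}

print(parser["Retro Reftest"]["Path"])
print(options["test_opt2"])
print(frames)
```

Each `[Frame N]` group ends up keyed by its frame number, which mirrors the `/test/0/...` entries in the output above.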

If you want to know more about how retro-reftest works and how to write reference test files, check the Retro Reference Test Case Specification page. The tests are still quite simple and they are expected to grow the ability to send various inputs, to set the options (called variables by Libretro), to test the rumble, to load the state of the core, to set the number of frames to run ahead of time…

Libretro offers a set of test cores which have been added to org.gnome.Retro.UnitTests with corresponding reference tests to test retro-gtk and to ensure it behaves correctly.

Continuous Integration

Tests are useless if they are not run. Thankfully, Games and retro-gtk moved to GitLab, so we can benefit from its continuous integration tool to run org.gnome.Retro.UnitTests on a regular basis, but we can go way further.

By frequently rebuilding the flatpak of Games we can check that the bundled Libretro cores compile successfully, and thanks to retro-reftest we can also ensure they work as expected with retro-gtk! This is very important because, as far as I know, Libretro cores are not tested much by upstream and they are certainly not tested with retro-gtk, which is what matters the most for us (if you know of cores which are tested beyond simply being built, please let me know!). This allows us to catch regressions or improvements in the Libretro cores early and to react quickly.

These Libretro core integration tests are shipped in org.gnome.Games.UnitTests and are regularly run via GitLab CI. Later, this flatpak will also receive and run the unit tests of Games and I would love to have code coverage support like GLib does. Thanks to all who allowed the CI to work in GNOME's GitLab as well as in Games and retro-gtk!

Digression About Test Data

Finding freely redistributable test data (understand: test programs) for many platforms is super hard! Most people producing them just make the tests freely available without specifying a license, which makes them proprietary software by default and hence unsafe to redistribute. I managed to find some test suites with free software licenses, but it was mostly luck. Please, test and demo producers, consider distributing your work under a free license rather than none!

Alternatively, if you know how to write free software for any exotic platform please consider writing some simple test programs: even a Hello, world! is extremely valuable to us!

With Tests and CI Come Features

Now that we have tests running very often and with retro-gtk being able to check the variables offered by a core, we will be rapidly notified of any breakage in these unstable APIs, which makes using them more sustainable. With these core-specific APIs we could implement platform-specific features or set sensible defaults, for example:

  • you could select the palette of your Game Boy Color game,
  • you could choose which kind of Game Boy your game runs on (some Game Boy Color games unlock bonuses when running on a Game Boy Advance),
  • we could force a Nintendo DS emulator to not mimic a mouse but to instead use the actual mouse pointer,
  • we could force the anaglyph mode of a Virtual Boy emulator to something Games can build upon,
  • …
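The kind of check that makes these features sustainable can be sketched as follows: the frontend keeps a list of the variables and values it relies on, and a test compares that list with what the core actually offers. This is a hypothetical Python illustration, not retro-gtk code; `check_variables` and all the `dummy_*` variable names and values are made up.

```python
def check_variables(
    expected: dict[str, set[str]], offered: dict[str, set[str]]
) -> list[str]:
    """List every relied-upon variable or value the core no longer offers."""
    problems = []
    for key, values in sorted(expected.items()):
        if key not in offered:
            problems.append(f"missing variable: {key}")
            continue
        for value in sorted(values - offered[key]):
            problems.append(f"{key}: missing value {value!r}")
    return problems


# Hypothetical variable names and values, for illustration only.
expected = {
    "dummy_gb_model": {"Auto", "Game Boy Advance"},
    "dummy_pointer": {"mouse"},
}
offered = {
    "dummy_gb_model": {"Auto", "Game Boy", "Game Boy Color"},
    "dummy_screen": {"top", "bottom"},
}

for problem in check_variables(expected, offered):
    print(problem)
```

Extra variables offered by the core are ignored on purpose: only the removal or renaming of something the frontend depends on counts as a breakage.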

Not being in constant fire monitoring mode and fearing breakages less will allow us to make Games grow!