Create new track for GDScript

That’s not good :( I will continue working on tests, hoping that the issue with Apple will be solved. If I reach the end of my work with the test runner and the issue is still there, I’ll see if I can do anything about it.

Hey there @pfertyk, I hope everything is going well.
I saw you mentioned that you are an indie dev, so I’m sure you must have seen the recent atrocious Unity announcement. I’m sure that move will drive a lot of people to Godot, and having a track on Exercism would be really awesome.
Whenever the track is ready, if you need people to help port over exercises, I would be glad to help whenever I can.

Hi @glaxxie ! Thank you very much for your offer ;) I agree that Unity’s decision might push a lot of people toward other engines, including (hopefully) Godot. I also agree that it would be great to have an Exercism track ready to help them on their new path :) I was a bit overwhelmed recently, but given the circumstances I’ll put in some effort to focus on this task. I think the runner looks promising already (with some refactoring and extra tests coming up soon), and the initial exercises could be mostly ported from the Python track (the syntax is very similar). There might be some other issues (like the one found by @BNAndras), but hopefully we’ll find a workaround.

Let me know if you need any help porting exercises. I’m almost finished with a backlog of exercise PR reviews for Haxe so I’ll have some time.

@glaxxie @BNAndras @Meatball I’ve realized that the test runner also needs to support checking variables (not only methods), so I’ve started adding this functionality. I’ll also add the ability to test multiple methods (both of these requirements come from the “lasagna” exercise that I keep seeing in multiple tracks ;) ).

Meanwhile, is it possible for you to start adding the exercises? The exact format of test files will be known once I finish with the runner (there is no unit testing in Godot, so I’m making this up from scratch), but perhaps the initial 20 exercises can already be selected? Assuming, of course, that the list is not fixed for new tracks ;) Maybe you could already copy the descriptions from existing tracks (using Python as an example), add the proper .json configuration required by Exercism, etc.? Just to save some time and start this track faster ;)

What do you think? ;)

Lasagna is a concept exercise and those can always be added later. The general recommendation is to focus on practice exercises, which are easier to implement since we have canonical test cases and data for most of them. You’ll also need the test runner to be v2- or v3-compliant before you need to worry about concept exercises. Practice exercises work fine at v1.

Having gone through a lot of exercise PRs for Haxe and Pyret recently, I don’t recall a practice exercise that tests a variable. They tend to test methods, properties, functions, and those kinds of things. Being able to test multiple methods, though, will be necessary for several exercises (clock, grade-school, list-ops, etc.), so that’d be a good feature to add.

As to which exercises to port, there are a few suggestions here. leap, resistor-color series, space-age, and two-fer are good ones to add (likely without much difficulty) and lend themselves nicely to mentoring opportunities for students.

If the CI uses the Docker test runner image to test PRs before allowing a merge, then we could do partial implementations and, once the test runner is ready, add the test files. Then we can merge the PRs when the CI is eventually happy. However, if Godot doesn’t have a unit testing framework, that’ll make it harder to check our exercise solutions while we work on them.

@BNAndras Thank you for the feedback! Today I’ve added a test for multiple methods (I just moved the const METHOD_NAME to individual test cases). The code is already pushed ;)

With the info you provided, I now know what to focus on :) Tests for variables will be added later, multiple methods work already, and I will add some other test cases (e.g. a method that raises an exception, some of the multiple methods not being found, etc.). But I think the format of test files will not change a lot (especially with variables out of the way). You can check the examples in my PR, like this one:

const TEST_CASES = [
	{"test_name": "Test Add 2", "method_name": "add_2_numbers", "args": [1, 2], "expected": 3},
	{"test_name": "Test Add 3", "method_name": "add_3_numbers", "args": [1, 2, 4], "expected": 7},
	{"test_name": "Test Hello", "method_name": "return_hello", "args": [], "expected": "Hello!"},
]

Please bear in mind that GDScript doesn’t have any built-in testing frameworks, so this is created from scratch. But, in an earlier discussion, we decided that a simple format like this would be enough for GDScript exercises ;)
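
To give an idea of how a file like this is consumed, the core of the runner boils down to something roughly like the sketch below (simplified, not the exact code from the PR; solution_path is just a placeholder):

# Simplified sketch: load the student's script and run each case against it.
func run_tests(solution_path: String) -> void:
	var solution = load(solution_path).new()
	for case in TEST_CASES:
		var status = "fail"
		if solution.has_method(case["method_name"]):
			var result = solution.callv(case["method_name"], case["args"])
			if result == case["expected"]:
				status = "pass"
		print("%s: %s" % [case["test_name"], status])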

Knowing that format, is it OK to start adding the docs, configuration, and metadata for the initial exercises? From my side, I would really like to see “Grains”, “Leap”, “Protein Translation”, and “Hamming” (but I won’t insist on these ones ;P ).

The canonical data for Hamming, Grains, and Protein Translation all include error handling to various levels. Does the test runner have a way of testing whether an error occurred? Then, you could run a test case with an expected error message and see if the method causes an error with the same message. If you can’t compare the messages, you could just check if an error occurred for those tests expecting one. If an error occurred for tests not expecting one, they’d error out normally so I don’t think you need to assert that those won’t raise an error.

In a pinch, the tests expecting errors in Grains and Protein Translation can be excluded since they’re for input validation. For Grains, they make sure you can’t pass in a number too high or too low. For Protein Translation, they make sure incomplete / gibberish sequences can’t be translated, and the readme makes it clear the tests are skippable. Leap on the other hand doesn’t have any tests expecting an error so that’d be an easy one to add.

I made draft PRs for leap, pangram, resistor-color, resistor-color-duo, and triangle since those are fairly straightforward and don’t have any error handling. Once the test runner is working nicely, we should be able to use that Docker image in the CI to test the exercises as they’re being merged in. That’ll speed up the exercise porting.

Thanks for the PRs, I’ll review them soon ;) As for error handling in Godot, today I’ve added a new example to the test runner that shows a function validating its input and returning an error in some cases. Unfortunately, Error in Godot is just an enum, so effectively my test checks if the function returns a specific integer (in this case, 31). I’ll see if it makes sense to change the format of test runner files to check if the return value is an actual error, or if we can keep it as it is.
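
To illustrate why this is awkward, consider a hypothetical method like the one below (illustrative only, not the code from the PR): the test can only say “expect 31”, and 31 might just as well be a legitimate result.

# Hypothetical method: validates its input, returns an Error otherwise.
func day_of_month(date_string: String):
	if date_string.length() != 10:
		return ERR_INVALID_PARAMETER  # under the hood this is just the integer 31
	return int(date_string.substr(8, 2))

# The test file can only express the error case as an integer comparison,
# and 31 is also a perfectly valid day of the month:
# {"test_name": "Test invalid date", "method_name": "day_of_month", "args": ["oops"], "expected": 31},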

I’ve added more tests, including some that check for “invalid” scenarios (e.g. the method is called with a wrong number of arguments, or the method throws an exception when it’s executed). Because of GDScript’s limited error handling mechanisms, I ended up wrapping the actual execution of the user-defined method in several levels of functions/assertions, and if something goes wrong, that execution returns “null”. So, I came up with the following rule: if a tested method returns “null”, it means that something went wrong and the test runner shows a generic message (without any details).
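
To make it concrete, the rule boils down to roughly this (a heavily simplified sketch; the real runner wraps the call in more layers of checks):

# Heavily simplified sketch of the "null means failure" rule.
func run_case(solution: Object, case: Dictionary) -> Dictionary:
	var result = null
	if solution.has_method(case["method_name"]):
		result = solution.callv(case["method_name"], case["args"])
	if result == null:
		# anything that went wrong inside the wrappers ends up here
		return {"status": "error", "message": "Execution of the method '%s' has failed." % case["method_name"]}
	if result == case["expected"]:
		return {"status": "pass"}
	return {"status": "fail", "message": "Expected '%s' but got '%s'" % [case["expected"], result]}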

Is this OK? There is no “try … catch …” in GDScript, so this seems like the best thing I can achieve for our purpose. It is possible to filter out the error output from Godot’s execution, and maybe put this output somewhere in the test runner’s output files. However, that would require mixing up the GDScript code that saves the “results.json” files with the bash code that executes our test runner script, which might get dirty … Also, my solution doesn’t make sense if there are any exercises in Exercism that expect a method to return “null” in valid scenarios (are there any?). So, I was wondering if this part of the test runner meets the standards?

Nice work!
As far as returning “null”, I can’t really recall any exercises that would require it.
There are a lot that return an empty array, 0, “” or False, though.
If this is the acceptable way for the test runner, and somehow you get an exercise or function that wants to return null, I think it would be appropriate to tweak the test itself a bit.

So, we can agree that all exercises added to this track will return a non-null value? Any “falsy” values are completely OK with the current implementation, but I’ll add a test just to be sure :smiley: In this case, if the test runner encounters a “null” it will assume that something went wrong and will print a message.

I have only 2 questions at this point:

  • Is it OK for now to print a generic message: “Execution of the method ‘…’ has failed. Please make sure that it accepts a correct number of arguments and has a correct implementation.” ? To print more, I’ll have to filter the error output in bash and put it somehow into the results.json files (as mentioned above). That’s the only option I see for providing more information about failed tests. It is complicated, but I also think it would be helpful for our users to know more about what went wrong … On the other hand, we should start this track ASAP, since Unity users are moving to Godot now …
  • It is possible for GDScript methods to return an Error, but Error is just an enum, so e.g. returning ERR_INVALID_PARAMETER is the same as returning an integer 31, which might be a correct value for some executions. We cannot return “null” (because of the point above), we might return “false”, but that will not work for methods that return a boolean (where “false” might be a correct value). This seems like a rather dirty workaround anyway, so I’d prefer to avoid it. I know there are some exercises that involve raising an error, but could we agree not to add them initially, and instead add them later when I figure out a proper solution?

I think if the test runner can point out which test fails, and print a generic statement like: expecting “this predefined result” but got “user’s result”, that should be plenty good, because that’s how a lot of the tracks operate.

The current implementation does that, but for “failed” tests only. What I have in mind are tests that end with an error (e.g. someone tries to divide by 0 inside the tested method). In that case, Godot will log some details to stderr, but there is no try … catch … in GDScript, so we will end up with a “null” as the returned value. Because of that, the runner will put a generic “something went wrong” message for each test that causes an execution error.

I am currently checking if maybe there is a relatively easy way to put the full value of stderr after each test into the corresponding message in results.json file. If I find a way, I’ll add it to the initial version of the test runner, as I believe providing more details will be very useful for GDScript newbies ;)

Wow, that would give much more detail. You did mention that

It is possible for GDScript methods to return an Error, but Error is just an enum, so e.g. returning ERR_INVALID_PARAMETER is the same as returning an integer 31

Is this enum of errors fixed (built in / predefined), or can you change its values? Because if it can be changed to custom values, we could come up with a list of error codes for them and act according to which one is returned.

Unfortunately, Error is a fixed enum.

So, I am thinking of removing the test that checks for an Error being returned (as it is the same as checking whether a specific integer is returned), and possibly choosing exercises that don’t involve error handling. Or, as an alternative, making a decision about error handling case by case (e.g. if a method returns a number in normal conditions, then an incorrect input parameter will cause it to return a string with an error message instead).
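
As an illustration of the case-by-case approach (a hypothetical hamming-style method, not actual exercise code):

# Hypothetical: returns an int normally, a string message when the input is invalid.
func distance(strand_a: String, strand_b: String):
	if strand_a.length() != strand_b.length():
		return "strands must be of equal length"
	var count = 0
	for i in range(strand_a.length()):
		if strand_a[i] != strand_b[i]:
			count += 1
	return count

# The corresponding test case would then simply expect the message string:
# {"test_name": "Test unequal strands", "method_name": "distance", "args": ["AB", "ABC"], "expected": "strands must be of equal length"},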

On the bright side, I think I’ve figured out how to include the full error output from Godot Engine in our results.json files. That will provide a lot of useful details to our users :) The code is already pushed, and I think this (and error handling) are the last items on my TODO list for the test runner’s core functionality. So, I will now proceed with refactoring :tada:

You don’t have to implement all the tests for an exercise. Just mark the test(s) you’re skipping in the tests.toml file to document that for future maintainers. There’s a test for, I think, gigasecond where the test runner checks if the date object being passed in is mutated. For languages with immutable date objects, the test is redundant and can be skipped.
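
For reference, skipping a case in an exercise’s tests.toml looks roughly like this (the UUID and description here are made up):

[00000000-0000-0000-0000-000000000000]
description = "some canonical case the track cannot support"
include = false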

Oh, that sounds like a good idea! I will check the configuration to ignore tests in other tracks and will try to add something similar for GDScript. Thanks!

So, I realized that skipping tests works for exercises, not for the test runner itself (unless I’ve missed something). Considering that, I decided to temporarily remove the test for “returning an error on purpose” (in my case, it was grains, and it was returning an error if the input value was out of range). With GDScript’s current architecture, I’m afraid that error handling will have to be done case by case when adding exercises (e.g. functions that return an integer can return a string message if they want to raise an error). Or, we can skip such exercises altogether, considering that there are enough exercises without error handling to get the track up and running. Please let me know what you think about this.

I’ve also added a test for returning a correct value, but incorrect type (e.g. returning an integer “3” instead of string “3”). That test works fine, except the error message is kinda useless (Expected '3' but got '3'). Did we have similar issues before? Is there a standard way of fixing them, like wrapping the values in something other than quotes, or explicitly stating the type of the value?
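
One idea I’m considering (just a sketch, assuming recent Godot 4 where type_string() is available): include the variant type in the mismatch message, e.g.:

# Sketch: make the type part of the mismatch message.
func format_mismatch(expected, result) -> String:
	return "Expected '%s' (%s) but got '%s' (%s)" % [
		str(expected), type_string(typeof(expected)),
		str(result), type_string(typeof(result)),
	]

# format_mismatch("3", 3) -> "Expected '3' (String) but got '3' (int)"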

I am now working on refactoring the test runner and adding some docs (so far so good). I think soon we will be ready to merge :)