Tweak stub and test files

ErikSchierboom · January 5, 2025, 5:40pm

I’ve been taking a look at the Clojure track, and some things stood out to me:

The stubs often contain [] ;; <- arglist goes here, which means that the student has to specify the arguments themselves. This goes slightly against our stub recommendations, which suggest providing stubs that students can jump into and start coding. What do you all think about adding the parameters to the stub files?
The stubs contain ;; your code goes here for where the student needs to put their code. This is clear enough, but has one disadvantage: some of the tests might pass without the student coding anything (I think it will return nil by default?). Maybe there is a better way to handle this?
The test files’ structure is not very consistent:
- A deftest expression per test case, even when only one function is being tested (example)
- A deftest expression per “category” of outcomes (for example true vs false)
- A deftest per tested function (example)
  Maybe we can use a consistent pattern? Personally, I quite like the third option, where all tests related to a single function are grouped together under one deftest.
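For readers comparing the three layouts listed above, here is a minimal sketch of each. The leap-year? function, the namespace and all test names are hypothetical and exist only to keep the example self-contained; they are not taken from the track.

```clojure
(ns example.leap-test
  (:require [clojure.test :refer [deftest is testing]]))

;; Hypothetical function under test, included so the file is self-contained.
(defn leap-year? [year]
  (and (zero? (mod year 4))
       (or (pos? (mod year 100))
           (zero? (mod year 400)))))

;; Style 1: one deftest per test case.
(deftest not-divisible-by-4
  (is (false? (leap-year? 2015))))

(deftest divisible-by-400
  (is (true? (leap-year? 2000))))

;; Style 2: one deftest per category of outcome.
(deftest leap-years
  (is (true? (leap-year? 1996)))
  (is (true? (leap-year? 2000))))

(deftest common-years
  (is (false? (leap-year? 2015)))
  (is (false? (leap-year? 1900))))

;; Style 3: one deftest per tested function, cases separated by `testing`.
(deftest leap-year?-test
  (testing "year not divisible by 4 is a common year"
    (is (false? (leap-year? 2015))))
  (testing "year divisible by 400 is a leap year"
    (is (true? (leap-year? 2000)))))
```

All three styles check the same behaviour; they differ only in how many deftest entries a runner or UI has to list.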

ErikSchierboom · January 5, 2025, 7:39pm

Oh and we should also discuss whether having the GUIDs as test names makes sense (I missed that when merging a couple of PRs)

tasx · January 5, 2025, 8:20pm

Known issue. I’m gradually fixing this whenever I get the chance (usually when syncing the tests). I’m following the template from [v3] Template for Exercises · Issue #305 · exercism/clojure · GitHub

tasx · January 5, 2025, 8:21pm

Yes, it will return nil. I guess we’ll have to think this through. Perhaps switching to an alternative approach would be beneficial.
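A minimal sketch of the problem, assuming a hypothetical first-even exercise (the function and test names are made up for illustration): a stub whose body is only a comment returns nil, so any test that expects nil passes before the student writes anything.

```clojure
(ns example.stub
  (:require [clojure.test :refer [deftest is]]))

;; Hypothetical stub in the current style: the body is only a comment,
;; so calling the function returns nil.
(defn first-even
  [numbers]
  ;; your code goes here
  )

;; A test whose expected value happens to be nil passes against the stub.
(deftest no-even-number
  (is (nil? (first-even [1 3 5]))))

(first-even [1 3 5]) ;; => nil
```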

tasx · January 5, 2025, 8:22pm

Known issue: Template for tests · Issue #673 · exercism/clojure · GitHub

I’m currently adhering to the template I proposed on GitHub, which specifies one deftest per test case. This approach clearly indicates all failing cases each time we run the tests.

Combining multiple tests within a single deftest is another option, but I’ve found that it can create a confusing user experience. When multiple tests under a single deftest fail, only the first failing test is displayed. For example:

Test 4 failed. Clicking to expand reveals the specific failing test case.

This method doesn’t show that other tests have also failed. As a result, after fixing the identified failing test and rerunning the tests, it might still display:

Test 4 failed, but now with a different test case.

Since it’s easier to remember the number of the failing test rather than the test details, this could confuse people as to why the “same” test appears to fail repeatedly.

I haven’t conducted extensive research on how failed tests are presented, but this seems to be the default behavior. Testing with lein locally shows a similar presentation when multiple test cases are included within a single deftest.

Edit: does not let me post a 4th consecutive reply. Can someone please post after this post?

ErikSchierboom · January 5, 2025, 8:31pm

Ah that is confusing! Let’s use a deftest per test case then!

ErikSchierboom · January 5, 2025, 8:32pm

Cool! I might also do a couple when I feel like it.

ErikSchierboom · January 5, 2025, 8:32pm

Some languages have a todo or pending construct. Others throw an error.
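As a rough illustration of the "throw an error" variant in Clojure, a stub could raise an exception until it is implemented. The function name and message below are hypothetical, and the exception class is just one possible choice.

```clojure
(ns example.throwing-stub)

;; Hypothetical stub: fails loudly until the student replaces the body.
(defn first-even
  [numbers]
  (throw (UnsupportedOperationException.
          "Please implement the first-even function")))
```

With clojure.test, an exception thrown inside an `is` form is reported as an error, so no test can pass against this stub by accident.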

tasx · January 5, 2025, 8:33pm

Unfortunately, I don’t have a perfect solution for this. This was a deliberate choice I made in an attempt to semi-automate the process of writing tests. My reasoning was that, since we write the tests manually, using a name for the deftest derived from a description can easily lead to duplicate function names due to copy-paste errors. This issue won’t be flagged during test runs; instead, some tests will be silently skipped. I’ve encountered this problem multiple times, and it has also affected the other maintainer.
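The silent skipping described above can be reproduced with a small sketch (namespace and test names are hypothetical): deftest defines a var, so a second deftest with the same name replaces the first instead of adding another test.

```clojure
(ns example.duplicate-names
  (:require [clojure.test :refer [deftest is]]))

(deftest empty-input
  (is (= 0 (count []))))

;; Copy-paste mistake: same name as above. This redefines the var, so the
;; first test is replaced rather than run alongside this one.
(deftest empty-input
  (is (= 1 (count [:a]))))

;; Count the vars in this namespace that carry a test.
(defn test-var-count []
  (count (filter (comp :test meta)
                 (vals (ns-interns 'example.duplicate-names)))))

(test-var-count) ;; => 1
```

Two deftest forms were written, but only one test var remains, and running the tests reports no problem.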

Additionally, I wasn’t fond of forcing users to read function names that are often overly long and nonsensical. By using UUIDs as function names, users are encouraged to skip over the function name and focus directly on the test description in the (testing ...) form, which is derived from the specs.

Given that I’ve chosen to follow the one deftest per test case template, I ended up considering the function names irrelevant. If every test case includes a (testing ...) form that clearly describes the test, and if all exercises are updated, we could eventually remove the deftest test-uuid altogether from the presentation, leaving only the (testing ...) forms visible when someone clicks the test case.
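A sketch of what a test in this style might look like. The UUID is a made-up placeholder rather than one from the problem specifications, and two-fer is a hypothetical function defined here only so the example runs on its own.

```clojure
(ns example.uuid-template
  (:require [clojure.test :refer [deftest is testing]]))

;; Hypothetical function under test.
(defn two-fer
  ([] "One for you, one for me.")
  ([name] (str "One for " name ", one for me.")))

;; The test name carries no meaning; the `testing` string describes the case.
(deftest test-00000000-0000-4000-8000-000000000001
  (testing "no name given"
    (is (= "One for you, one for me." (two-fer)))))
```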

ErikSchierboom · January 6, 2025, 7:29am

I didn’t mind that, but that’s personal.

Yes, although that only helps users that use the online editor.

The way I approached this in my tracks (C# and F#) was to have test generators generate the tests. That way you never have to deal with duplicate tests, because the logic for selecting which tests to include can be automated.

I’ve looked at the existing generators and to me they’re not ideal, mostly because I can’t really see the code well enough. What I would much prefer to see is a text template that is rendered using a template engine (e.g. handlebars), which would make it much more obvious how the tests are structured. That has the benefit that it also makes writing them and contributing to them easier.
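To make the template idea concrete, here is a minimal sketch that uses plain string replacement as a stand-in for a real template engine. The template text, the render function and the shape of sample-case are all assumptions made for illustration; none of them come from the existing generators.

```clojure
(ns example.generator
  (:require [clojure.string :as str]))

;; Hypothetical text template; {{...}} placeholders are filled per test case.
(def test-template
  (str "(deftest {{name}}\n"
       "  (testing \"{{description}}\"\n"
       "    (is (= {{expected}} ({{function}} {{input}})))))\n"))

;; Replace every {{key}} in the template with the matching value from `data`.
(defn render [template data]
  (reduce-kv (fn [text k v]
               (str/replace text (str "{{" (name k) "}}") (str v)))
             template
             data))

;; Hypothetical test case, shaped loosely like an entry of canonical data.
(def sample-case
  {:name "test-1"
   :description "year divisible by 400 is a leap year"
   :function "leap-year?"
   :input 2000
   :expected true})

(println (render test-template sample-case))
```

The template reads like the test it produces, which is the readability benefit mentioned above. A real generator would also need to escape quotes in descriptions and format inputs per exercise.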

ErikSchierboom · January 6, 2025, 9:30am

I could look at building a sample generator

tasx · January 6, 2025, 9:30am

Mostly no. My main argument here was that the function name is displayed next to the test case, and given the weird function names, it’s often difficult to see what the test is about without clicking on the test case. So instead of spending the user’s time to comprehend the function name, my intention is to encourage them to click on that test to see a proper description of the test, and the test code.

Additionally, we might end up with other issues even if we automate this. For example, if we use the test case description for the function name, we might run into duplicated names in case the descriptions are not unique for some reason, or very long names, or cases where there’s a word that we don’t want to appear in the function name. For example, sometimes the word “list” might appear in the name (because it came from the canonical-data) whereas the implemented tests are using vectors. That’s just very confusing overall.

After weighing all options, I concluded that using a function name that comes from the test case description was worse than using the uuid as a function name. Neither of them is actually any good, because we can’t really tell what the test is about given a function name.

Well, yes, that’s the intention. To improve the online user experience. If someone is confident enough to work locally, I’m pretty sure they can easily figure things out.

ErikSchierboom · January 6, 2025, 9:32am

They are guaranteed to be unique, taking nesting into account.

tasx · January 6, 2025, 9:49am

Are we still considering using function names that come from the test case description? If so, how are we going to generate the names for a nested test?

tasx · January 6, 2025, 11:01am

Another option would be to just return a simple string, like so:

(defn f
   [x]
   "Remove this string and implement the function body")

In retrospect, I don’t believe it really matters whether the stub returns nil, throws an error, or returns something else. A student can easily remove the initial code provided and disregard the case where the function returns nil. This means they can come up with a solution that passes a test checking for nil without explicitly coding against it.

ErikSchierboom · January 6, 2025, 1:41pm

I think we’re not fully aligned on several things, so I’ll happily defer to you for track maintenance. Feel free to use anything you think is useful.

tasx · January 6, 2025, 1:44pm

Whatever you do, please don’t make the script generate this kind of monstrosity

(deftest testing-for-pollen-allergy->allergic-to-something-but-not-pollen)

It can get even worse if the description is much longer and nested. I just don’t have a way to deal with those cases besides using a test-uuid template.

Another thing you should probably be aware of is that the handling of errors is also inconsistent across exercises. There’s an ongoing topic here: How to handle input validation and other exceptions in exercises? · Issue #670 · exercism/clojure · GitHub

and another issue here: Conflicting messages on the website when a solution uses `assert` · Issue #57 · exercism/clojure-test-runner · GitHub
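For context, here is a sketch of how a description-based generator could arrive at a name like the one above. The slug and test-name functions and the "->" separator are assumptions made for illustration; the point is only that each nesting level adds to the length.

```clojure
(ns example.test-names
  (:require [clojure.string :as str]))

;; Turn one description into a lowercase, hyphen-separated slug.
(defn slug [description]
  (-> description
      str/lower-case
      (str/replace #"[^a-z0-9]+" "-")
      (str/replace #"^-|-$" "")))

;; Join the slugs of each nesting level with "->" (an assumed separator,
;; chosen to match the example name above).
(defn test-name [path]
  (str/join "->" (map slug path)))

(test-name ["testing for pollen allergy"
            "allergic to something, but not pollen"])
;; => "testing-for-pollen-allergy->allergic-to-something-but-not-pollen"
```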

ErikSchierboom · January 6, 2025, 2:34pm

Don’t worry, I won’t be building one after all

tasx · January 6, 2025, 3:27pm

To clarify, the test-uuid template is simply a choice at this point. Since there is no agreed-upon convention for function names, I would never reject PRs that do not align with my approach.

I am also open to reconsidering and removing the test-uuid template. I initially expected a discussion on this topic with the co-maintainer, but since that did not happen, I chose what I felt was best at the time.

ErikSchierboom · January 6, 2025, 4:58pm

Feel free to choose whatever you feel is best!