I’m excited to share that I’ve submitted a pull request for a new practice exercise called “Mazy Mice”!
This exercise challenges students to generate “perfect” mazes with a single solution path, using box-drawing characters for the output. It involves algorithm design for maze generation and handling seeded randomness.
I’ve included a canonical-data.json file with various test cases, including specific maze outputs for given seeds and error handling.
I’d love to get your feedback on the exercise description, instructions, and the canonical data. Please let me know if you have any thoughts or suggestions!
While Mazy Mice is on your mind for a future review, I wanted to raise a point that’s come up: how best to accommodate different implementation approaches across tracks. Specifically, for some languages, it might be more idiomatic for students to return a data structure (e.g., List[List[CellType]] in Scala) rather than a fully formatted string maze. This would allow them to focus on the generation logic itself.
For instance, a Scala 3 approach might look something like this:
// Enum representing elements in the maze's character grid
// (illustrative: a full version would include all specific wall/corner/T-junction types)
enum MazeElement:
  case PATH_SPACE      // e.g., ' '
  case WALL_HORIZONTAL // e.g., '─'
  case WALL_VERTICAL   // e.g., '│'
  case WALL_CORNER_TL  // top-left corner, e.g., '┌'
  // ... other specific wall types like WALL_CORNER_TR, WALL_T_LEFT, etc.
  case ENTRANCE        // represents the entrance marker
  case EXIT            // represents the exit marker

// Function signature for generating the maze structure.
// 'rows' and 'cols' describe the room grid; the output grid is (2*rows+1) x (2*cols+1).
def generateMaze(rows: Int, cols: Int, seed: Option[Int] = None): List[List[MazeElement]] =
  // ... logic to return the structured maze ...
  ???
(A complete MazeElement enum would detail all components corresponding to the box-drawing characters and path/entry/exit points, allowing a track to map these to characters if needed or test the structure directly.)
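To make that mapping concrete, a minimal rendering sketch over just the cases listed above might look like this (illustrative only; the entrance/exit marker characters are a per-track choice, not something defined by the PR):

// Illustrative only: render the partial MazeElement sketch back to characters.
// A complete enum would make this match cover every wall/corner/T-junction case.
def render(maze: List[List[MazeElement]]): String =
  maze
    .map(_.map {
      case MazeElement.PATH_SPACE      => ' '
      case MazeElement.WALL_HORIZONTAL => '─'
      case MazeElement.WALL_VERTICAL   => '│'
      case MazeElement.WALL_CORNER_TL  => '┌'
      case MazeElement.ENTRANCE        => '>' // marker character is a track decision
      case MazeElement.EXIT            => '>' // could differ from the entrance marker
    }.mkString)
    .mkString("\n")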
My main questions, when you have a moment to consider, are:
Should the canonical description.md somehow acknowledge that some tracks might expect a data structure return, even though the description itself will visually showcase the string representation?
Would it be useful to include hints or guidelines for track maintainers on adapting the exercise for a data structure approach (e.g., suggestions for Enum design, how property-based tests apply)? Our current canonical-data.json is property-based and seems compatible.
This also relates to the ongoing thought about ensuring the visual examples in description.md are as robust as possible (e.g., using > for the entrance/exit arrows). Even if some tracks don’t output strings, the description.md is the primary visual reference for the maze’s structure and components.
It’s worth noting that, if I recall correctly, UTF-8 didn’t work very well on the AWK track until the locale settings in the Docker container got updated. There is potential for UTF-8 not working “out of the box” on other tracks.
Batch, Cairo, and probably the assembly languages would have some level of difficulty, but I think it’ll be mostly fine. Batch could swap out UTF-8 for similar ASCII characters. Cairo could do a two-dimensional array of enum variants as well. Both could be surfaced in instructions appends.
The Mazy Mice PR has been progressing nicely, but one thing we’ve struggled with is how random the generator should be. In Mazy Mice by rabestro · Pull Request #2312 · exercism/problem-specifications · GitHub, senekor argues that allowing students to generate fully random mazes would make it harder for tracks to implement the exercise. They suggest instead picking a maze-generation algorithm so that the canonical tests can be much more specific. I’d be curious to hear everyone’s thoughts, but mostly the maintainers’.
Testing the suggested properties (“there is a single path from the entrance to the exit” and the like) makes writing the tests a good TDD exercise in its own right, one that requires a load of concrete test cases. Generating such property tests automatically is impossible, I’d say.
Testing for “uses randomness” (“if the seed parameter is omitted, random mazes should be generated”) is quite impossible, as we often discuss in the forum. Suggesting doing so in problem specifications is a bit weird…
So, if we can’t have concrete test cases provided by problem specifications, we could simply provide a textual description with no canonical-data.json at all (we have a few such exercises, like dot-dsl).
Testing for “uses randomness” (“if the seed parameter is omitted, random mazes should be generated”) is quite impossible…
I use this pattern:
a = generateMaze(x, y)
b = generateMaze(x, y)

asserts:
  a should not be equal to b
  a is a perfect maze of size (x, y)
  b is a perfect maze of size (x, y)
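In Scala, for example, that pattern might look roughly like the following sketch; generateMaze and isPerfectMaze are hypothetical names used only for illustration, not part of the canonical data:

// Hypothetical track-side API; the signatures are assumptions for this sketch.
def generateMaze(rows: Int, cols: Int, seed: Option[Long] = None): String = ???
def isPerfectMaze(maze: String, rows: Int, cols: Int): Boolean = ???

// The pattern above, expressed with plain assertions (any test framework works).
def checkUnseededRandomness(rows: Int, cols: Int): Unit =
  val a = generateMaze(rows, cols)
  val b = generateMaze(rows, cols)
  assert(a != b, "two unseeded mazes should (almost always) differ")
  assert(isPerfectMaze(a, rows, cols), "a must be a perfect maze of the requested size")
  assert(isPerfectMaze(b, rows, cols), "b must be a perfect maze of the requested size")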
I have written such tests for Java and AWK.
I am quite sure it is possible for other languages.
It is an option. Meanwhile, I’m using canonical-data.json to feed an AI to generate test cases for the AWK track (we don’t have a dedicated generator), and I can confirm the AI is smart enough to produce good results. For this particular exercise it may be a little bit tricky, though.
Just in case, here is how I used AI to generate tests for the Scala track: after migrating to Scala 3, the old generator broke and no longer works, so I just ask the AI. I can feed it canonical-data.json or, failing that, the Java tests, and it generates 100% valid tests.
So, we could keep canonical-data.json just as input for AI, or fall back to a textual description.
Note, sometimes a randomizer does return the same value twice in a row ;) If the user does something like randomizer = new Random(seed=getRandom()) then there’s a chance that this would fail.
Anything is possible. That doesn’t mean it’s a reasonable expectation or ask for maintainers to do this.
If an AI is needed to generate tests, that is a bit of a red flag with regard to the test complexity.
I haven’t seen a convincing argument against my idea of making the whole thing deterministic:
pick an interesting maze-generation algorithm
define the set of edge cases we need to test against for implementing that particular algorithm
find mazes where generating them with that algorithm isolates the edge cases
take the list of “random” decisions the algorithm needs to make to produce a maze as input to the test case
(optional) sprinkle randomness-vibes on top by including a test case where the decisions of the algorithm are actually generated randomly
benefits:
higher-quality tests
possible to use existing test generators (instead of AI)
Students are taught the importance of separating the functional core of their software from any randomness / IO to improve testability (see the sketch after this list).
Tracks are more likely to actually implement the exercise, as there is no hurdle of writing tests manually or reviewing AI generated ones.
mix of benefit and downside:
Students cannot cheese the exercise by trivially generating boring-to-solve mazes, but they also cannot choose to implement different maze generation algorithms they might be interested in.
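As a rough sketch of that testability point (the names and types below are mine, not anything from the PR): the pure core takes the decision sequence as data, and a thin wrapper supplies it from a seeded generator.

import scala.util.Random

// Pure core: every "random" decision arrives as data, so a test case can pass
// a fixed list of decisions and get a fully deterministic maze.
def generateMaze(rows: Int, cols: Int, decisions: Iterator[Int]): List[List[Char]] = ???

// Thin shell: ad-hoc use feeds the core from a seeded random number generator.
def generateRandomMaze(rows: Int, cols: Int, seed: Long): List[List[Char]] =
  val rng = new Random(seed)
  generateMaze(rows, cols, Iterator.continually(rng.nextInt(4)))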
I like the determinism in this approach. However, would that result in an overly prescriptive requirement? Most exercises are pretty agnostic to how you go about solving them.
That’s true, but I think it makes sense for this exercise. Firstly, there is already the exercise about the Sieve of Eratosthenes, so it’s not unprecedented to have exercises about specific algorithms. But more importantly, coming up with an interesting maze-generation algorithm is difficult. If users come up with their own algorithm, there’s a good chance it’ll be boring and/or produce boring puzzles. The instructions in the PR even link to Wikipedia, nudging users to pick an interesting algorithm to implement. So, if we want users to pick an existing, well-documented algorithm anyway to make the exercise fun, why not make that decision for them and massively improve the tests in the process?
X_0 is the supplied seed (it appears as an input field in canonical-data.json).
Each X_n specifies a direction, using X_n * 4 div 134456:
  0: right
  1: up
  2: left
  3: down
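In code, that mapping is just an integer division (a sketch; the recurrence that produces the X_n sequence is not repeated here):

// Maps a value X_n, assumed to lie in [0, 134456), to a direction index.
def directionIndex(xn: Int): Int =
  xn * 4 / 134456 // 0 = right, 1 = up, 2 = left, 3 = down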
Maze Generation
The maze is generated using randomized depth-first search (backtracking algorithm).
The width and height of the grid of cells is specified in the input.
We start at the top left corner and choose the next cell to visit by generating a random direction. Whenever the selected direction is not valid (because it would take us outside the grid or to a previously visited cell), we generate another random direction. If all neighbouring cells have been visited, we go back to the most recently visited cell having an unvisited neighbour. We repeat until no cells have an unvisited neighbour.
Note something that should be made clearer: we don’t generate a random direction when a cell has no unvisited neighbours, but otherwise we always do generate one, even when there is only one unvisited neighbour, in which case it might take several attempts to find the only valid direction.
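For reference, here is a rough sketch of that procedure (my own illustration, not the PR's reference implementation). It consumes directions from an iterator, so the same code works with a seeded generator or with a fixed list of decisions:

import scala.collection.mutable

enum Direction:
  case Right, Up, Left, Down

// Sketch of the randomized depth-first search described above. The iterator is
// assumed to supply enough directions (an endless seeded stream, or a
// sufficiently long fixed list). The result maps each cell to the neighbouring
// cells its passages connect to.
def carvePassages(rows: Int, cols: Int,
                  decisions: Iterator[Direction]): Map[(Int, Int), Set[(Int, Int)]] =
  val visited  = mutable.Set((0, 0)) // start in the top-left cell
  val passages = mutable.Map.empty[(Int, Int), Set[(Int, Int)]].withDefaultValue(Set.empty)
  val stack    = mutable.Stack((0, 0))

  def neighbour(cell: (Int, Int), dir: Direction): (Int, Int) =
    val (r, c) = cell
    dir match
      case Direction.Right => (r, c + 1)
      case Direction.Up    => (r - 1, c)
      case Direction.Left  => (r, c - 1)
      case Direction.Down  => (r + 1, c)

  def inGrid(cell: (Int, Int)): Boolean =
    cell._1 >= 0 && cell._1 < rows && cell._2 >= 0 && cell._2 < cols

  while stack.nonEmpty do
    val current = stack.top
    val hasUnvisitedNeighbour = Direction.values.exists { d =>
      val n = neighbour(current, d)
      inGrid(n) && !visited(n)
    }
    if !hasUnvisitedNeighbour then
      stack.pop() // backtrack: no random direction is drawn here
    else
      // Keep drawing directions until a valid one comes up, even if only a
      // single neighbour is still unvisited (this matches the note above).
      val next = Iterator
        .continually(decisions.next())
        .map(neighbour(current, _))
        .find(n => inGrid(n) && !visited(n))
        .get
      passages(current) += next
      passages(next)    += current
      visited += next
      stack.push(next)

  passages.toMap

This sketch only builds the passage graph between cells; rendering it into the (2*rows+1) x (2*cols+1) character grid with the entrance and exit markers would be a separate step.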
Always using a fixed scheme to generate the decisions makes it hard to find a seed that produces a maze that’s well suited for testing a specific edge case. I think it would be much easier to simply store the decisions as an explicit list, at least for the bulk of tests that check for known edge cases on relatively small puzzles.
It would be reasonable to use the random number generator for later tests that involve larger mazes. The two can be mixed.
The list of decisions could also be a string where each character is a decision, if a smaller encoding is desired.
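For example, a test case could spell each decision as one character and decode it back to the 0-3 direction indices used above; the letters here are an arbitrary illustration, not anything agreed in the PR:

// Hypothetical encoding: one character per decision, decoded to the
// direction indices 0 = right, 1 = up, 2 = left, 3 = down.
def decodeDecisions(encoded: String): List[Int] =
  encoded.iterator.map {
    case 'R' => 0
    case 'U' => 1
    case 'L' => 2
    case 'D' => 3
    case other => throw new IllegalArgumentException(s"unknown decision: $other")
  }.toList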
The maze is generated using randomized depth-first search (backtracking algorithm).
I have no strong objections to that, but I think it’s worth discussing a few options for the maze-generation algorithm. We want the algorithm to be as interesting as possible to implement. The difficulty should also be considered.
For example, I think Wilson’s algorithm would be more interesting to implement. It also has the neat property that it “generates an unbiased sample from the uniform distribution over all mazes” (if the decisions made are actually random).