[Linked List] Exercise instructions were broken by new tests

New tests introduced by [Maintenance]: Update linked-list by BethanyG · Pull Request #3256 · exercism/python · GitHub run counter to the original problem statement. The separate additional instructions (which were unfortunately misnamed in that PR and needed to be addressed here) conflict with the content of the current instructions.

First item
To keep your implementation simple, the tests will not cover error conditions. Specifically: pop or shift will never be called on an empty list. (See markdown here)

The new tests explicitly check that a specific exception with a specific message is raised. Either the tests need to be rolled back or the problem statement needs to be updated so that it aligns with the additional instructions notes and the tests.
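
For illustration, the new expectation is roughly of this shape (the test name and the exact message below are my own paraphrase, not copied from the test file):

```python
import unittest

from linked_list import LinkedList  # module and class name assumed for illustration


class LinkedListErrorTest(unittest.TestCase):
    def test_pop_on_empty_list_raises(self):
        lst = LinkedList()
        # The new tests assert a specific exception type *and* message,
        # even though the instructions say empty-list pops won't be tested.
        with self.assertRaises(IndexError) as err:
            lst.pop()
        self.assertEqual(err.exception.args[0], "List is empty")
```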

Second item
The delete() and __len__() methods added by the tests increase the exercise difficulty and are not accounted for in instructions.md. The programmer now needs to implement traversal through all nodes to provide the expected behavior. This is a departure from the previous exercise solution, which used only the head and tail nodes.
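
For context, here is a minimal sketch of the traversal the new tests effectively require (the node layout and names are my own, not taken from the exercise stub):

```python
class Node:
    def __init__(self, value, prev=None, next=None):
        self.value = value
        self.prev = prev
        self.next = next


class LinkedList:
    def __init__(self):
        self.head = None
        self.tail = None

    def __len__(self):
        # Requires walking every node, not just looking at head and tail.
        count = 0
        node = self.head
        while node is not None:
            count += 1
            node = node.next
        return count

    def delete(self, value):
        # Traverse to the first node holding `value`, then unlink it.
        node = self.head
        while node is not None:
            if node.value == value:
                if node.prev is not None:
                    node.prev.next = node.next
                else:
                    self.head = node.next
                if node.next is not None:
                    node.next.prev = node.prev
                else:
                    self.tail = node.prev
                return
            node = node.next
```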

So?
This isn’t a nitpicky complaint. Both of these were flagged by another member in their iteration 2 solution:

# XXX The tests now expect there to be a delete method, though the problem
# description makes no mention of it.  Also, in direct contradiction to the
# description, they expect IndexError if you try to pop or shift from an
# empty list.

In an effort to improve the user experience, I think maintainers should step back, clearly identify the intent of this exercise, and bring the instructions and tests in sync with that intent.

Conclusion
While the overwhelming majority of exercise updates are beneficial and appreciated, that is doubtful in this recent case.

While updating the notes in instructions.md would be the simple “fix”, there is an argument to be made that the additions were not trivial and increase the exercise complexity past its currently listed “medium” difficulty level.

It may be a good idea to roll back the new tests to preserve the original exercise solutions and create a new exercise which goes deeper into Linked List so programmers can build upon these valuable concepts.

Hi, thanks for your comment.

There was a bug where some instructions were missing, as I think the other member pointed out. That has been fixed, though, and error handling is now mentioned in the instructions as well.

To give some background:

The body of work done in that PR added Jinja templates to the exercise. For the user, that means the exercise can stay better up to date with exercism/problem-specifications: Shared metadata for exercism exercises (github.com).
The problem specification is a shared space for tests and instructions for the different exercises.

When we did this “new” update of the exercise, that meant adding the “len” and “delete” methods, since the shared test data included them. The people who update the exercises always have the option to decline changes, but we thought these changes were fitting.

We also decided to include error handling; this was to keep linked-list better in line with simple-linked-list. The problem description isn’t written by the Python track contributors but by the authors of the problem specification documents.

Although every track has the option to make its own interpretation of each exercise, it is generally thought that changes to the main instructions should be avoided. We have decided that the best solution is to have an instructions append that covers the error handling bit.

Regarding the difficulty: the Python track currently has 2 exercises with the difficulty set to “hard”. This exercise isn’t as hard as those (in my opinion) and probably not as hard as some other “medium” exercises (in my opinion). That said, if more people report that this exercise feels like a “hard” exercise, it will be looked at.

As a final note, changes are very rarely rolled back. Your feedback is noted, and if I see more people with the same kind of feedback, we will reconsider it.

Documentation is one of those few hard problems in programming! Any discrepancies between the instructions and expectations should definitely be resolved. Thank you for flagging this. Thankfully, fixing documentation typically isn’t that big of a deal.

Generally speaking, Exercism exercise requirements are driven by the tests more so than by the documentation. It’s good to avoid discrepancies, but we tend to rely on the tests as an indication of what is expected and as the final say on what the exercise does and doesn’t expect. Just because something isn’t called out in the docs doesn’t mean it’s wrong or should not be included. There’s an ongoing balancing act between laying out every last detail in the instructions and letting the tests be the guide (TDD).

One (or two) users out of thousands complaining about an exercise detail isn’t exactly the gold standard for “is this a nitpick or not”. I appreciate that you recognize that this is a bit nitpicky … but your justification feels like motivated reasoning and not well thought out. :slight_smile: Falling victim to logical fallacies, and spotting them, can be tricky; we’ve all done it. As someone who has studied formal logic, I try to flag poorly constructed arguments so that people can make sure they’re making a logical case for something. This point may be worth a revisit, assuming it’s relevant to fixing the issue at hand.

I believe that’s exactly what this is about! Exercism put a hold on community contributions specifically so maintainers can decide how to best serve the community. Python maintainers have decided that aligning tests with the problem specs and more templating is the best way to serve the community. It’s a big effort and I applaud their work! I also recognize that improvements occasionally come with “breaking changes”. Thankfully, some mismatch between documentation and tests is quite minor, all things considered.

This has the bearing of a logical argument but I don’t think the premises support this conclusion. This change moves the exercise in the direction of more templating and closer alignment to the problem spec which the maintainers (and I!) think is a really good goal! It seems like there’s more than one side to consider here.

Have you completed the “hard” exercises? Are there other “hard” exercises you believe are on par with this one? A list traversal is relatively easy. There are plenty of other medium exercises with greater complexity than walking a linked list. Ranking exercise difficulties is challenging; the best way to rank exercises is relative to other specific exercises. Drafting a list of all the exercises and their difficulties, taking into account the full range of exercises, would be really helpful and would add data points the maintainers can use to update difficulties.

Usually rollbacks happen when something is broken pretty badly and leaves things in an unacceptable state. A bit of a mismatch in the instructions doesn’t seem that big of a deal. Ultimately, the choice to roll back, ignore, or roll forward lies with the track maintainers and/or the Exercism team.

It might be more helpful and productive to suggest specific changes to the instructions which would bring them in line with the tests and clarify anything that needs fixing. That would help maintainers gauge the severity of the issue and identify the best path forward, be it a doc fix, a rollback, or accepting that the documentation has (and always will have) some limitations.

Hey Meatball, thanks for your response.

That makes sense, and I think it is a good goal. I guess I’m lost, since the new tests now contradict the original instructions. Does that mean the Python exercise, while sharing the same instructions, now diverges from the tests in other languages due to an effort to standardize? If I’m not mistaken on that, it seems like standardizing the test cases would be better.

I realize the levels are subjective. At least one of the “hard” exercises was trivial for me, and I had to wrestle with more than one “medium” exercise. I was primarily pointing out that if the exercise was previously “medium” and the changes increased its complexity, perhaps the level should change.

Hey Isaac, thanks for taking the time to write a detailed response. It’s been a while since I provided you mentorship on a solution. Glad to hear you’re doing well.

Correct, if the documentation is wrong. I attempted to make the case that the new tests were wrong. They oppose the original (and still documented) instructions, which are fine as-is.

Yes, agreed. However, this post wasn’t prompted by missing content but by additional tests running counter to previous implementations and explicit instructions. It is one thing to leave expected behavior unmentioned so the failing tests can communicate expectations; it is quite another to explicitly state that certain scenarios will not be tested and then suddenly test them.

At the time of this response there are only five users with a published solution satisfying the latest tests for that exercise (out of 330 total, a far cry from “thousands”). Additionally, I wasn’t looking for confirmation; I stumbled upon it in the only solution I looked at. At minimum that is 2/6 (including my yet-to-be-published solution and the one published after my original post) → 33% of users negatively remarking on the changes. What percentage of users would prompt rethinking a course of action?

Also, it might be worth considering how humans act: if even one person has a problem with something to the point of taking the time to document it, it is safe to assume others share the same sentiment. As a thought experiment, do we know how many users decided not to update the exercise because they disagreed with the changes?

Very noble.

Breaking changes are one thing. Adding new tests that are 180 degrees off from the instructions is another.

Yes, I completed every Python exercise.

Considering I found the “Rest API” exercise very easy, yes. I’ve also wrestled with more than a few “medium” exercises. I realize difficulty is subjective. Logically, making a medium problem more complex could increase the difficulty level.

I’m sorry my original post was unclear. I was not advocating or asking for changes to the instructions. I was recommending rolling back the new tests so the exercise stayed true to the original problem.

Well, as I have already said, a full rollback is very unlikely to happen.

As of now, you are the only one who has complained about this specific issue since yesterday, when a fix was merged. That kind of makes all feedback before that point “obsolete”.

For now, we have to wait for more feedback, which can take quite a long time, at least for this kind of exercise. For example, the number concept exercise is on our schedule to redo, since there has been feedback and the completion rate is disappointing, although that exercise has been out for a year or perhaps two.

At the same time, I am a bit unsure what your point is. Is it that we added delete and len, which are part of the canonical data, or that we added custom cases for error handling? Or is it that we added error handling while a part of the description said there won’t be error handling? Or is it all of them?

At least in my opinion, error handling does not make this exercise harder. It is just a few extra steps.
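
To show what I mean by “a few extra steps”, the guard is roughly this small (the exact message here is only an illustration):

```python
def pop(self):
    # The extra step: refuse to pop from an empty list.
    if self.tail is None:
        raise IndexError("List is empty")  # message shown for illustration only
    ...  # the existing unlink-and-return logic stays the same
```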

Sure, we went against the “main” instructions, but at the same time, as far as I have seen, you have to add custom test cases for all error handling, since error handling isn’t covered in the canonical data.

Therefore, if we never made custom cases for exercises, no exercise would cover error handling.

Also, we were aware of what these changes meant when the update was released. We also shared these changes with the community in our blog post:
Python blog post 12/13 - Exercism - Exercism

Might I suggest we focus on what should be changed in the documentation to bring the docs in line with the tests going forward?