ISBN tests doesn't catch incorrect handling of X in some cases

I have observed solutions that could be tricked to accept invalid ISBN numbers by cleverly placing X. The proposed tests catch some mishandling of X in input.

Namely:

  • trailing X disables checking of ISBN length
  • X is treated as 10 anywhere inside ISBN (current test X is only valid as a check digit doesn’t have a correct checksum in this case)
  • X is skipped anywhere except the last digit

Attached PR Add isbn-verifier tests for X related bugs by pichi · Pull Request #2168 · exercism/problem-specifications · GitHub

What solution/case does this code aim to cover?

  • "81X" should be covered by the length check. Wouldn’t this fail any solution which does length checking? Shouldn’t this be covered by existing tests that have too-short inputs?
  • "3-598-2X507-9" is a duplicate test. This already exists on line 67.
  • "3-598-21X508-8": What is this meant to catch that isn’t covered by the existing “3-598-2X507-9”? Is this a likely case or one of many infinite possible corner cases which are unlikely to show up in most implementations?
  • “81X” should be covered by the length check. Wouldn’t this fail any solution which does length checking? Shouldn’t this be covered by existing tests that have too-short inputs?

No, it definitely doesn’t. This solution passed all tests but will accept "81X" as a valid ISBN. It is exactly the reason why I have added this test. The reason is described by this sentence trailing X disables checking of ISBN length which is written in the original post after the first bullet.

  • “3-598-2X507-9” is a duplicate test. This already exists on line 67

Oh, you are right. It seems that in the Elixir track is a typo which is why I was mistaken that this test isn’t there. I will remove it and send PR directly into the Elixir track.

  • “3-598-21X508-8”: What is this meant to catch that isn’t covered by the existing “3-598-2X507-9”? Is this a likely case or one of many infinite possible corner cases which are unlikely to show up in most implementations?

It is meant to catch implementations implemented in a way that X is skipped anywhere except the last digit. It’s the third bullet.

What if someone checks for other chars in all positions … except the last position? Should we include a "123P" test? What if they special case the first or third position?

Tests should be general and catch general approaches. They are not tailored (nor should be they tailored) to specific implementations and solutions.

Tests should be designed to nudge solutions to implement the expected implicit requirements. They should be designed to catch common pitfalls and incorrect approaches. They should not be tailored to specific solutions.

If someone wrote a solution that hard coded all the test inputs and outputs, would you suggest adding a new test case to cause that solution to fail?

New test cases should be applicable to a large number of solutions, not individual solutions.

What if someone checks for other chars in all positions … except the last position? Should we include a “123P” test? What if they special case the first or third position?

I don’t know which one you just argue with. Be more specific. The first case is the real solution which was not made by me. It is a perfectly working and very efficient solution in the language which is written in. It can be fixed by literally changing one character in the source code. It passes all current tests. There is a huge number of invalid inputs that this real code not written by me will erroneously accept as valid ISBN. I have made the shortest possible input which is obviously not a valid ISBN but this real code will fail to recognize it as invalid.

I don’t understand your argument here. It is not my code. It is not tailored to intentionally fail this test. It is the most stared solution in the track. Explain to me why it is not a good idea to make a test that will catch this real bug in real code made by a real person and unnoticed by tens of other fellow developers. It is literally the perfect example of a situation when you are supposed to add a new test to the test suite by the book.

BTW this bug went unnoticed for two years.

New tests in the problem spec should either extend the exercise with additional requirements or address a common issue. If there were multiple solutions in the top 100 community solutions that would fail that test, then that would suggest a shortcoming in the tests that leads to a common pitfall. If there is one specific input that one specific solution fails to handle properly… that’s not surprising.

With the amazing diversity of approaches that can be used to solve exercises, it would be surprising if there wasn’t a selection of solutions that had flaws. Tests shouldn’t be tailored to specific solutions that have bugs. They should be tailored to leading students towards a general solution. If they fail to generally do that and students often end up with a common bug, that is definitely worth fixing.

Specific bugs in specific solutions ideally would be worked out in mentoring sessions, not with problem specification changes.

Again, this solution is the most starred solution in the Elixir track with more than twice of stars as the second one.

  def isbn?(isbn), do: isbn?(isbn, 10, 0)
  def isbn?(<<c, rest::binary>>, n, sum) when c in ?0 .. ?9, do: isbn?(rest, n - 1, sum + n * (c - ?0))
  def isbn?(<<c, rest::binary>>, n, sum) when c == ?-, do: isbn?(rest, n, sum)
  def isbn?("X", _, sum), do: isbn?(<<>>, 0, sum + 10)
  def isbn?(<<>>, 0, sum), do: rem(sum, 11) == 0
  def isbn?(_, _, _), do: false

This solution will pass all tests but will accept “81X” as a valid ISBN. Yet this solution

  def isbn?(isbn), do: isbn?(isbn, 10, 0)
  def isbn?(<<c, rest::binary>>, n, sum) when c in ?0 .. ?9, do: isbn?(rest, n - 1, sum + n * (c - ?0))
  def isbn?(<<c, rest::binary>>, n, sum) when c == ?-, do: isbn?(rest, n, sum)
  def isbn?("X", 1, sum), do: isbn?(<<>>, 0, sum + 10)
  def isbn?(<<>>, 0, sum), do: rem(sum, 11) == 0
  def isbn?(_, _, _), do: false

Will work correctly. Try spotting the difference. It’s obvious there is a bug in the first one. It’s obvious that the test suite is insufficient which caused this bug to go unnoticed for two years. I’m professionally in the IT industry for 25 years and was coding for more than ten years prior. I must admit the first time I came across rules which prevent extending the test suite in cases like this. I’m baffled. Sorry. I’m perplexed.

Who makes those rules? When I recount all hoops I had to go thru to even get here just trying to help and fix the issue. All those welcoming messages and videos and then your PR can’t even keep open for seconds. Do you want help? If not, I can just go. I have never come across a community that was that vehemently rejecting any attempt to help.

Other track maintainers may very well have a different opinion than me and think this is worth adding a test.

Personally, I don’t believe a test case is worth adding simply because a single solution has a bug. I don’t believe the number of stars on the solution is relevant in any way. I don’t believe it matters how subtle or obvious the bug is. I don’t believe it matters how old the solution is. I feel like we’re going in circles.

Per the auto-responder on the repo, we prefer having discussions and agreeing on the value of PRs prior to opening PRs. The bar for updating the problem specifications is consensus of three track maintainers that the fix is worthwhile. There may very well be other maintainers who think this test is worth adding, which is entirely fine. My opinion and contribution to the discussion is the criteria I’ve already laid out. My goal here is to make my reasoning and criteria clear, and my opinion on this test.

I understand you believe that specific solution should fail and that more tests which catch more bugs is a good thing. I believe more is not necessarily better.

If you do not understand my reasoning or criteria, I’m happy to try to explain myself better. Otherwise, I feel like we’re just going in circles about what tests are or are not worth adding.

I see you are not interested in contributions. Bye.

To be clear, I do not in any way speak for any of the other maintainers nor for the Exercism team.

We also do want contributions. But we are trying to reset how things are done and are asking that people engage in a conversation and that we first come to an agreement about what should or shouldn’t change.

I hope that you do continue bringing up topics for discussion and do contribute to Exercism going forward!

I have my job and I’m here for fun. I found the bug. I found a missing test case. I made PR in track. They said you have to put it in the problem-specification. I made PR in problem-specification. I have installed all the required tools to check validity. I made this topic.

The code with the bug was not mentored. The code uses a valid approach. It is actually an idiomatic way how to solve this task in Elixir if you are concerned about efficiency. It is by the way an idiomatic way how to solve this task in Erlang generally. I can assure you it is from the position of over ten years of professional career development in Erlang. It’s why I was able to spot the bug in seconds. And no, I’m not going to open the Erlang track and check how many of the solution has the same bug and I’m not going to do it for Elixir solutions. Simply because I don’t have a means ho to do it in a timely manner. For example, when I copy from the web it ends up like this

1
defmodule IsbnVerifier do
2
  @doc """
3
    Checks if a string is a valid ISBN-10 identifier
4
 
5
    ## Examples
6
 
7
      iex> ISBNVerifier.isbn?("3-598-21507-X")
8
      true
9
 
10
      iex> ISBNVerifier.isbn?("3-598-2K507-0")
11
      false
12
 
13
  """
14
  
15
  @spec isbn?(String.t()) :: boolean
16
  def isbn?(isbn), do: isbn?(isbn, 10, 0)
17
  def isbn?(<<c, rest::binary>>, n, sum) when c in ?0 .. ?9, do: isbn?(rest, n - 1, sum + n * (c - ?0))
18
  def isbn?(<<c, rest::binary>>, n, sum) when c == ?-, do: isbn?(rest, n, sum)
19
  def isbn?("X", _, sum), do: isbn?(<<>>, 0, sum + 10)
20
  def isbn?(<<>>, 0, sum), do: rem(sum, 11) == 0
21
  def isbn?(_, _, _), do: false
22
  
23
end
24

And I don’t wanna waste more of my time figuring out how to circumvent around. It is what automated testing is for. It is why we professionals write tests. It is why I would like to add a missing test case, to help fellow developers.

I have already wasted orders of magnitude more time trying to help you than I had to do in any other open-source project. And it was vastly wasted my time to figure out how to jump through the hoops of your broken process. I find a bug, fix the code, and add a test case to prevent repeating. It’s what I have done for living. Now I tell people what to do and what not to do. For living. I can tell you.

@pichi-42 For what it’s worth, I’m with you on this one. It seems to me this discussion about saving effort has become way too expensive. I know of no clear rules about what kind of fixes are welcome; maybe there should be :person_shrugging:

2 Likes