Error in test for Card Games exercise, 6. More averaging techniques (Clojure)

Hi y’all,

After coding up answers to the Card Games Clojure exercise, testing and submitting them, and then checking community solutions for ideas, I noticed I actually had an error in my implementation, but the current tests don’t catch it.

If you go to StephanieKemna's solution for Card Games in Clojure on Exercism (if you can? if not, let me know and I can copy-paste the implementations - I was trying to avoid that in case people search for answers)
and compare iteration 1’s average-even-odd? with iteration 2’s average-even-odd?, you can see that the first iteration was in fact not correct.

I have been thinking about whether it is possible to construct a test case that would fail for iteration 1 but pass for iteration 2 under the current test condition, but I believe that is not possible:
If the averages over the even- and odd-indexed cards are the same, then the average over all cards is the same as well, since the overall average is a weighted mean of those two.
If the averages over the even- and odd-indexed cards differ, then the overall average lies strictly between them and so also differs from each of them.
So you would have to check the intermediate values people are calculating, but that would change the sub-exercise. So I don’t have a good solution yet, heh :sweat_smile:, but wanted to flag this in any case.
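
To spell out the algebra behind those two statements (my own quick derivation, not part of the exercise): write n_e and n_o for the number of even- and odd-indexed cards and x̄_e, x̄_o for their averages; the overall average is then a weighted mean of the two:

```latex
\bar{x}_{\mathrm{all}} \;=\; \frac{n_e\,\bar{x}_e + n_o\,\bar{x}_o}{n_e + n_o}
\qquad\text{and therefore, for } n_o > 0,
\qquad
\bar{x}_{\mathrm{all}} = \bar{x}_e
\;\Longleftrightarrow\;
n_o\,\bar{x}_o = n_o\,\bar{x}_e
\;\Longleftrightarrow\;
\bar{x}_o = \bar{x}_e .
```

So a test that only inspects the boolean result can never distinguish “even vs odd” from “even vs all”.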

Thanks in any case for the Exercism platform and exercises and all :smiling_face:

What makes that first iteration incorrect? It sounds like you simply found there are multiple ways to check the same thing!

The first iteration is indeed incorrect. I’m not surprised, though: the lists used in the tests are a bit too “predictable”, and it’s easy to pass them with an incorrect solution. Even though this won’t happen very often, I still think it’s worth adding a few checks, and it’s trivial to do so.

The only problem is that the repo has no active maintainers. My last PR is still open; I guess nobody has even bothered to look at it, so… :slight_smile:

Thanks @IsaacG, but that is not the case this time :)
The goal is to compare even versus odd, but in the first iteration I calculated avg_odd as the average over all items instead of over only the odd-indexed items.
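
As an illustration (a minimal sketch with my own helper names, not a copy of either iteration), the difference between the two versions boils down to something like this:

```clojure
(defn average [cards]
  (/ (reduce + cards) (count cards)))

;; Iteration 1 (the buggy intent): the "odd" average is accidentally
;; the average over *all* cards.
(defn average-even-odd?-v1 [cards]
  (= (average (take-nth 2 cards))   ; cards at even indices 0, 2, 4, ...
     (average cards)))              ; all cards - this was the mistake

;; Iteration 2 (what I meant to do): compare even-indexed vs odd-indexed.
(defn average-even-odd?-v2 [cards]
  (= (average (take-nth 2 cards))          ; even-indexed cards
     (average (take-nth 2 (rest cards))))) ; odd-indexed cards
```

Both versions return the same boolean for every hand, which is exactly why the current tests cannot tell them apart.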

@tasx thanks, yes, I tried to come up with some extra test cases as a solution to the error I am pointing out, because then it would indeed be trivial to add another test case (leaving aside whether that would actually happen, given the maintenance situation).
However, when I started analyzing the existing test cases and the math behind them, I came to believe this is not possible. I would be happy to be proven wrong, though, if you can find such test cases.
If we think about the math of the problem, it seems impossible to catch this the way the function is being tested now: if the averages over the odd- and even-indexed cards differ, then the averages over all cards and over the even-indexed cards will also differ. I don’t see a way to test that things are calculated correctly other than actually testing that the calculated averages themselves are correct. Then we’d need extra functions to expose those intermediate values, and the exercise would change more drastically.
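
Just to back that up empirically (a throwaway REPL check, not something I’m proposing for the test suite), comparing “even vs odd” against “even vs all” on random hands never disagrees:

```clojure
(defn average [cards]
  (/ (reduce + cards) (count cards)))

(defn same-verdict? [cards]
  (let [evens (take-nth 2 cards)
        odds  (take-nth 2 (rest cards))]
    (= (= (average evens) (average odds))     ; even vs odd (intended check)
       (= (average evens) (average cards))))) ; even vs all (the buggy check)

;; 1000 random hands of 2-11 cards with values 0-12
(every? same-verdict?
        (repeatedly 1000
                    #(vec (repeatedly (+ 2 (rand-int 10))
                                      (fn [] (rand-int 13))))))
;; => true
```

Because the card values are integers, Clojure’s / produces exact ratios, so these equality checks are not affected by floating-point rounding.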

The goal is to determine whether the averages of the even and odd cards match. You found that the result can be determined by comparing even vs all. That is slightly more work, but it does correctly fulfill the goal.

The tests check whether the correct value is computed, not how it is computed.
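
To illustrate, a test in that style only asserts the returned boolean. A hypothetical sketch using clojure.test (the namespace and example hands here are made up; only average-even-odd? comes from the exercise):

```clojure
(ns card-games-test
  (:require [clojure.test :refer [deftest is]]
            [card-games :refer [average-even-odd?]]))

(deftest average-even-odd?-test
  ;; evens [1 3] average 2, odds [2] average 2 -> equal
  (is (true?  (average-even-odd? [1 2 3])))
  ;; evens [1 3] average 2, odds [2 4] average 3 -> not equal
  (is (false? (average-even-odd? [1 2 3 4]))))
```

Only the returned boolean is asserted; the intermediate averages never appear in the assertions.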

I see what you mean, and thanks for taking the time to respond!
I would agree if I had actually thought this through before implementing, but that was not the case. I made an error in my logic and in my implementation - I was trying to calculate the average over the odd-indexed cards (as evidenced by the comments and naming).
So even though, having looked into it more, I can now say that the result ends up being the same whether you compare all to even or odd to even, that is not what the exercise asks for, and it was not the intention of my implementation either. So I would still say that the tests fail to catch such implementation errors.

Yes, I believe you are right. I did some math and this appears to be the case.

That’s true, but it’s also perfectly fine if an “incorrect” implementation always passes, which appears to be the case here. Your logic would only be incorrect if it actually failed for some inputs. So your solution is acceptable even though it doesn’t match the problem description.

Correct. The tests don’t care about the implementation at all; they only care about the result. Since both implementations are functionally identical, the tests are satisfied. This is by design, and it is not an error in the tests.

The tests only check the results. If you want feedback on your implementation, you can request a code review, or look at other solutions and/or the Dig Deeper docs.

The exercise asks you to determine whether the averages of the even and odd cards are the same. Both implementations do determine whether the averages are the same or not. The exercise does not tell you how to implement that check. It does not say “compute the even average and the odd average”; it says “figure out whether they are the same or not”. Both implementations are correct.

Ok, thanks for the feedback!

I understand what you are both saying, and I appreciate the perspectives, though I still cannot say I like it :sweat_smile: because it does not provide the best learning experience…

I had hoped there might be a way to change the tests to fix that, but it seems the only way to catch errors like these would be to start a mentoring session. If one has to create a mentoring session for every exercise, that’s a lot of (unnecessary?) burden on the mentors.

But we can close this case then, since this leads into bigger discussions about the design of the platform and the learning experience, and that is beyond the scope of what I started this topic for.

Version 2 of the platform required students to go through a mentoring session on every exercise to mark it completed. V3 makes it optional, but it is still highly recommended. Mentoring sessions are a key feature of the platform, and students are strongly encouraged to use them for every exercise, not just the ones they don’t understand or realize they are solving poorly.

Often students learn a huge amount from being mentored on solutions they thought were already perfect when they submitted them.

That’s interesting; I had no idea about the history here. As a new user of the platform, that was not clear to me either. By default, I would only use mentoring if I was unsure about something, not if I thought I’d implemented things fine and had no specific questions. Since this all relies on volunteers, one is a bit hesitant to send in too many requests when going through the exercises fairly quickly.
Good to know :slight_smile: