"Hamming" Exercise allows solutions that produce incorrect results

The following code passes the test cases but is not correct.

hamming_distance_chars([], [], 0).
hamming_distance_chars([C|Cs1], [C|Cs2], Dist) :- 
    hamming_distance_chars(Cs1, Cs2, Dist).
hamming_distance_chars([C1|Cs1], [C2|Cs2], Dist) :- 
    hamming_distance_chars(Cs1, Cs2, SubDist),
    Dist is SubDist + 1.

hamming_distance(Str1, Str2, Dist) :-
    string_length(Str1, Len1),
    string_length(Str2, Len2),
    Len1 == Len2,
    string_chars(Str1, Chars1),
    string_chars(Str2, Chars2),
    hamming_distance_chars(Chars1, Chars2, Dist).

The issue lies in the hamming_distance_chars/3 predicate. The third clause never checks that C1 and C2 are different, so it also applies when the characters match.
Therefore, for the call
?- hamming_distance("AAA", "AAA", Result).
We get the following answers:

Result = 0
Result = 1
Result = 1
Result = 2
Result = 1
Result = 2
Result = 2
Result = 3

However, the only valid answer should be 0.
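For reference, a minimal sketch of a fix (assuming SWI-Prolog): add a disequality guard to the third clause so that it only applies when the characters actually differ. dif/2 would be the more declarative alternative to \==/2.

```prolog
hamming_distance_chars([], [], 0).
hamming_distance_chars([C|Cs1], [C|Cs2], Dist) :-
    % heads are equal: this position contributes nothing to the distance
    hamming_distance_chars(Cs1, Cs2, Dist).
hamming_distance_chars([C1|Cs1], [C2|Cs2], Dist) :-
    C1 \== C2,  % guard: this clause only counts positions that differ
    hamming_distance_chars(Cs1, Cs2, SubDist),
    Dist is SubDist + 1.
```

With the guard in place, a query such as ?- hamming_distance_chars([a,a,a], [a,a,a], D). yields only D = 0 (possibly still with a harmless choicepoint, since clause indexing cannot distinguish the last two clauses).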
The root cause seems to stem from the attempt to fix the non-exhaustive test cases raised in issue #81, since the test cases behave no differently than Prolog itself:
test(long_identical_strands, condition(pending)) :- hamming_distance("GGACTGA", "GGACTGA", Result), Result == 0.
As long as hamming_distance/3 is able to find a Result which is equal to 0, the test passes.

In my opinion, the better method for writing test cases in Prolog is to first consider whether the exercise expects a deterministic solution (exactly one solution, or none, for bound inputs) or a non-deterministic solution (possibly multiple solutions for bound inputs).

Therefore, instead of binding Result and then checking whether a found answer matches our expectation, I suggest using the Options argument of test/2. That should make the intention of the tests clear and avoid the issues in GitHub issue #81:

:- begin_tests(hamming).
    % problematic as pointed out in issue #81 and does not invalidate wrong solutions
    test(identical_strands, condition(true)) :-
        hamming_distance("A", "A", 0).

    % accepts only one solution
    test(det_long_identical_strands, true(Result =:= 0)) :-
        hamming_distance("GGACTGA", "GGACTGA", Result).

% if multiple solutions were considered valid for this exercise, given the code provided at the start of this issue
    test(nondet_long_identical_strands, set(Result == [0,1,2,3,4,5,6,7])) :-
        hamming_distance("GGACTGA", "GGACTGA", Result).

:- end_tests(hamming).

While I am sure this is a general problem on this Exercism track, I noticed it during a mentoring session for this exercise in particular.

Edit: 2024-01-28:
I created a PR with my suggestion for how to prevent students from submitting solutions that produce incorrect answers:

I understand the issue, but on my machine there is no apparent difference between the current test and your suggestion.

If I use your implementation and change the tests to:

:- begin_tests(hamming).

    test(regular, condition(true)) :-
        hamming_distance("A", "A", 0).

    test(using_true, true(Result =:= 0)) :-
        hamming_distance("GGACTGA", "GGACTGA", Result).

:- end_tests(hamming).

The output is:

[1/2] hamming:regular ..
Warning: /Users/erik/solutions/prolog/hamming/hamming_tests.plt:9:
Warning:     PL-Unit: Test regular: Test succeeded with choicepoint
[2/2] hamming:using_true ..
Warning: /Users/erik/solutions/prolog/hamming/hamming_tests.plt:12:
Warning:     PL-Unit: Test using_true: Test succeeded with choicepoint
% All 2 tests passed in 0.014 seconds (0.013 cpu)

Both tests output the exact same warning. Am I missing something?

You are correct, I misunderstood the test framework's documentation regarding deterministic and non-deterministic test cases.
My point still stands, though.
Here is a single file with different test methods for this exercise and using the implementation I provided in the first post.

hamming_distance_chars([], [], 0).
hamming_distance_chars([C|Cs1], [C|Cs2], Dist) :- 
    hamming_distance_chars(Cs1, Cs2, Dist).
hamming_distance_chars([C1|Cs1], [C2|Cs2], Dist) :- 
    % depending on if C1 \== C2 is commented out or not, different test cases fail
    % if not commented, test cases D and E fail
    % if it is commented out, test case C fails
    %C1 \== C2,
    hamming_distance_chars(Cs1, Cs2, SubDist),
    Dist is SubDist + 1.

hamming_distance(Str1, Str2, Dist) :-
    string_length(Str1, Len1),
    string_length(Str2, Len2),
    Len1 == Len2,
    string_chars(Str1, Chars1),
    string_chars(Str2, Chars2),
    hamming_distance_chars(Chars1, Chars2, Dist).

%%%%%%%%%%% test
pending :-
    current_prolog_flag(argv, ['--all'|_]).
pending :-
    write('\nA TEST IS PENDING!\n'),
    fail.

% different methods to write unit tests in prolog
:- begin_tests(hamming).
    % Test A
    % deterministic check but binds the output before the predicate is called
    % problematic if output is supposed to be generated by the query
    test(determinate_form_single_answer_long_identical_strands) :-
        hamming_distance("GGACTGA", "GGACTGA", 0).

    % Test B
    % binds the variable Result in true() later such that hamming_distance/3 initializes it
    % basically the same as 
    % ?- hamming_distance("GGACTGA", "GGACTGA", Result), Result =:= 0.
    test(single_answer_true_form_long_identical_strands, [true(Result =:= 0)]) :-
        hamming_distance("GGACTGA", "GGACTGA", Result).

    % Test C
    % accepts only one solution because internally hamming_distance/3 is called in setof/3 with sort/2
    test(single_answer_set_long_identical_strands, set(Result == [0])) :-
        hamming_distance("GGACTGA", "GGACTGA", Result).

    % Test D
    % passes if it finds all unique solutions, uses setof/3 again
    test(many_answers_set_long_identical_strands, set(Result == [0,1,2,3,4,5,6,7])) :-
        hamming_distance("GGACTGA", "GGACTGA", Result).

    % Test E
    % passes if all answers are found in the given order. for simplicity, uses hamming_distance("AAA", "AAA", Result).
    % internally it calls hamming_distance/3 with findall/3
    test(many_answers_using_all_long_identical_strands, all(Result == [0,1,1,2,1,2,2,3])) :-
        hamming_distance("AAA", "AAA", Result).

:- end_tests(hamming).

I hope that you can reproduce the results I pointed out at the start for the code in lines 5 to 7.
With that said, I guess the more correct way for this exercise would be to use the form from Test C, an answer set containing only one answer, because if a student wrote incorrect code, the test would respond with:

test_file.pl:44:
        test single_answer_set_long_identical_strands: wrong "set" answer:
ERROR:     Expected: [0]
ERROR:        Found: [0,1,2,3,4,5,6,7]

EDIT 5th Jan:
I want to point out that the following nonsense program can also pass the current test cases, which would not be possible if Test C's structure were used:

% Proof that the current test structure is insufficient
hamming_distance(Str1, Str2, Dist):-
    string_chars(Str1, CharsStr1),
    string_chars(Str2, CharsStr2),
    same_length(CharsStr1, CharsStr2),
    between(0, 9, Dist). 
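Under the structure of Test C, this program would be rejected: the set option collects every solution, so the comparison is against the full answer set rather than any single answer. A self-contained sketch, assuming SWI-Prolog's plunit (the test name and module name are just illustrative):

```prolog
:- use_module(library(plunit)).

% the nonsense program from above, repeated so this file stands alone
hamming_distance(Str1, Str2, Dist) :-
    string_chars(Str1, CharsStr1),
    string_chars(Str2, CharsStr2),
    same_length(CharsStr1, CharsStr2),
    between(0, 9, Dist).

:- begin_tests(hamming_set).

    % fails: the collected answer set is [0,1,2,3,4,5,6,7,8,9], not [0]
    test(single_answer_set_long_identical_strands, set(Result == [0])) :-
        hamming_distance("GGACTGA", "GGACTGA", Result).

:- end_tests(hamming_set).
```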

I'd like someone else to also take a peek. @neenjaw, what are your thoughts?

As you know, this isn't a new problem for this track, but I'm not sure what the best route to take is.

To your point, many of the problems don't fit Prolog very well, and they often also don't have tests that guide students toward learning how to control how Prolog explores the solution space. Your examples are fair and show that this can lead students down a path where they create incorrect solutions without feedback.

I'd be okay with your approach to proving single-answer deterministic solutions using test/2's options. The format for the multiple-answer deterministic solutions feels a bit kludgy, putting them all inside a list, but currently we have nothing.

I went ahead and created a PR based on my proposal.


Maybe I am misreading, but is that not doing what @neenjaw suggested? I thought the preference was for true(Result =:= 0)?

Hello @ErikSchierboom, I understood this part of @neenjaw's comment as saying that my suggestion of using set in the Options list is OK for single-answer predicates.
The usage for multiple solutions is a bit cumbersome.

At the end of the day, it is also not the end of the issues regarding Prolog's tests, as it is always possible to provide the input/output pairs as facts.

Example:

hamming_distance("", "", 0).
hamming_distance("AAA", "BBB", 3).
hamming_distance("AAA", "BBA", 2).
% ...

But it is a step in the right direction.

Let's have @neenjaw confirm this. I feel like the set notation makes it a bit harder for students to read.

Yeah, I definitely meant this format as my preference. The list-literal notation is the confusing part of the other suggested style.

This is where we need to remember that people get out of Exercism what they put in. If they choose to take shortcuts, then they do, and they are the only ones who actually suffer, because they lack the skill to solve the problem. Trying to stop this behaviour is futile work to me, and I don't think we should devote our scarce resources to solving this problem.


Maybe not for putting in the specific answer as in the

hamming_distance("", "", 0).
hamming_distance("AAA", "BBB", 3).
hamming_distance("AAA", "BBA", 2).

but what does it say about an exercise if a student can submit wrong code that solves the exercise by accident?

I still think that providing the answers as sets has two distinct advantages.

First of all, students won't receive the warning about multiple choicepoints, which is a very common point of confusion in mentoring sessions.
Second, it gives students feedback that their code contains errors, or confidence that it is correct.

It is not as if there aren't already exercises that make use of the Options argument. You can find it, for example, in the Satellite exercise.

I'm still not in favor of the set-based notation, sorry. So either we use the format I mentioned before, or we leave things as is.