"Hamming" Exercise allows solutions that produce incorrect results

The following code passes the test cases but is not correct.

hamming_distance_chars([], [], 0).
hamming_distance_chars([C|Cs1], [C|Cs2], Dist) :- 
    hamming_distance_chars(Cs1, Cs2, Dist).
hamming_distance_chars([C1|Cs1], [C2|Cs2], Dist) :- 
    hamming_distance_chars(Cs1, Cs2, SubDist),
    Dist is SubDist + 1.

hamming_distance(Str1, Str2, Dist) :-
    string_length(Str1, Len1),
    string_length(Str2, Len2),
    Len1 == Len2,
    string_chars(Str1, Chars1),
    string_chars(Str2, Chars2),
    hamming_distance_chars(Chars1, Chars2, Dist).

The issue lies in the hamming_distance_chars/3 predicate. The third clause never checks that C1 and C2 are different, so it also applies when the characters match.
Therefore, for the call
?- hamming_distance("AAA", "AAA", Result).
We get the following answers:

Result = 0
Result = 1
Result = 1
Result = 2
Result = 1
Result = 2
Result = 2
Result = 3

However, the only valid answer should be 0.
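For reference, a minimal sketch of a fix (assuming SWI-Prolog): add a disequality guard to the third clause so that it only applies when the characters actually differ. dif/2 would be the more declarative alternative to \==/2.

```prolog
hamming_distance_chars([], [], 0).
hamming_distance_chars([C|Cs1], [C|Cs2], Dist) :-
    % heads are equal: this position contributes nothing to the distance
    hamming_distance_chars(Cs1, Cs2, Dist).
hamming_distance_chars([C1|Cs1], [C2|Cs2], Dist) :-
    C1 \== C2,  % guard: this clause only counts positions that differ
    hamming_distance_chars(Cs1, Cs2, SubDist),
    Dist is SubDist + 1.
```

With the guard in place, a query such as ?- hamming_distance_chars([a,a,a], [a,a,a], D). yields only D = 0 (possibly still with a harmless choicepoint, since clause indexing cannot distinguish the last two clauses).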
The root cause seems to stem from the attempt to fix the non-exhaustive test cases raised in issue #81, since the test cases behave no differently than Prolog itself:
test(long_identical_strands, condition(pending)) :- hamming_distance("GGACTGA", "GGACTGA", Result), Result == 0.
As long as hamming_distance/3 is able to find a Result which is equal to 0, the test passes.

In my opinion, the better method for writing test cases in Prolog is to first consider whether the exercise expects a deterministic solution (exactly one solution, or none, for bound inputs) or a non-deterministic solution (possibly multiple solutions for bound inputs).

Therefore, instead of binding Result and then checking whether a found answer matches our expectation, I suggest using the Options argument of test/2. That should make the intention of the tests clear and avoid the issues in GitHub issue #81:

:- begin_tests(hamming).
    % problematic as pointed out in issue #81 and does not invalidate wrong solutions
    test(identical_strands, condition(true)) :-
        hamming_distance("A", "A", 0).

    % accepts only one solution
    test(det_long_identical_strands, true(Result =:= 0)) :-
        hamming_distance("GGACTGA", "GGACTGA", Result).

% if multiple solutions were considered valid for this exercise, given the code provided at the start of this issue
    test(nondet_long_identical_strands, set(Result == [0,1,2,3,4,5,6,7])) :-
        hamming_distance("GGACTGA", "GGACTGA", Result).

:- end_tests(hamming).

While I am sure this is a general problem on this Exercism track, I noticed it during a mentoring session for this exercise in particular.

Edit: 2024-01-28:
I created a PR with my suggestion for how to prevent students from submitting solutions that produce incorrect answers:

I understand the issue, but on my machine there is no apparent difference between the current test and your suggestion.

If I use your implementation and change the tests to:

:- begin_tests(hamming).

    test(regular, condition(true)) :-
        hamming_distance("A", "A", 0).

    test(using_true, true(Result =:= 0)) :-
        hamming_distance("GGACTGA", "GGACTGA", Result).

:- end_tests(hamming).

The output is:

[1/2] hamming:regular ..
Warning: /Users/erik/solutions/prolog/hamming/hamming_tests.plt:9:
Warning:     PL-Unit: Test regular: Test succeeded with choicepoint
[2/2] hamming:using_true ..
Warning: /Users/erik/solutions/prolog/hamming/hamming_tests.plt:12:
Warning:     PL-Unit: Test using_true: Test succeeded with choicepoint
% All 2 tests passed in 0.014 seconds (0.013 cpu)

Both tests output the exact same warning. Am I missing something?

You are correct, I misunderstood the test framework's documentation regarding deterministic and non-deterministic test cases.
My point still stands, though.
Here is a single file with different test methods for this exercise and using the implementation I provided in the first post.

hamming_distance_chars([], [], 0).
hamming_distance_chars([C|Cs1], [C|Cs2], Dist) :- 
    hamming_distance_chars(Cs1, Cs2, Dist).
hamming_distance_chars([C1|Cs1], [C2|Cs2], Dist) :- 
    % depending on if C1 \== C2 is commented out or not, different test cases fail
    % if not commented, test cases D and E fail
    % if it is commented out, test case C fails
    %C1 \== C2,
    hamming_distance_chars(Cs1, Cs2, SubDist),
    Dist is SubDist + 1.

hamming_distance(Str1, Str2, Dist) :-
    string_length(Str1, Len1),
    string_length(Str2, Len2),
    Len1 == Len2,
    string_chars(Str1, Chars1),
    string_chars(Str2, Chars2),
    hamming_distance_chars(Chars1, Chars2, Dist).

%%%%%%%%%%% test
pending :-
    current_prolog_flag(argv, ['--all'|_]).
pending :-
    write('\nA TEST IS PENDING!\n'),
    fail.

% different methods to write unit tests in prolog
:- begin_tests(hamming).
    % Test A
    % deterministic check but binds the output before the predicate is called
    % problematic if output is supposed to be generated by the query
    test(determinate_form_single_answer_long_identical_strands) :-
        hamming_distance("GGACTGA", "GGACTGA", 0).

    % Test B
    % binds the variable Result in true() later such that hamming_distance/3 initializes it
    % basically the same as 
    % ?- hamming_distance("GGACTGA", "GGACTGA", Result), Result =:= 0.
    test(single_answer_true_form_long_identical_strands, [true(Result =:= 0)]) :-
        hamming_distance("GGACTGA", "GGACTGA", Result).

    % Test C
    % accepts only one solution because internally hamming_distance/3 is called in setof/3 with sort/2
    test(single_answer_set_long_identical_strands, set(Result == [0])) :-
        hamming_distance("GGACTGA", "GGACTGA", Result).

    % Test D
    % passes if it finds all unique solutions, uses setof/3 again
    test(many_answers_set_long_identical_strands, set(Result == [0,1,2,3,4,5,6,7])) :-
        hamming_distance("GGACTGA", "GGACTGA", Result).

    % Test E
    % passes if all answers are found in the given order. for simplicity, uses hamming_distance("AAA", "AAA", Result).
    % internally it calls hamming_distance/3 with findall/3
    test(many_answers_using_all_long_identical_strands, all(Result == [0,1,1,2,1,2,2,3])) :-
        hamming_distance("AAA", "AAA", Result).

:- end_tests(hamming).

I hope that you can reproduce the results I pointed out at the start for the code in lines 5 to 7.
With that said, I guess the more correct way for this exercise would be to use the form from Test C, an answer set containing only one answer, because if a student wrote incorrect code, the test would respond with:

test_file.pl:44:
        test single_answer_set_long_identical_strands: wrong "set" answer:
ERROR:     Expected: [0]
ERROR:        Found: [0,1,2,3,4,5,6,7]

EDIT 5th Jan:
I want to point out that the following nonsense program can also pass the current test cases, which would not be possible if Test C's structure were used:

% Proof that the current test structure is insufficient
hamming_distance(Str1, Str2, Dist):-
    string_chars(Str1, CharsStr1),
    string_chars(Str2, CharsStr2),
    same_length(CharsStr1, CharsStr2),
    between(0, 9, Dist). 
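Under the structure of Test C, this program would be rejected: the set option collects every solution, so the comparison is against the full answer set rather than any single answer. A self-contained sketch, assuming SWI-Prolog's plunit (the test name and module name are just illustrative):

```prolog
:- use_module(library(plunit)).

% the nonsense program from above, repeated so this file stands alone
hamming_distance(Str1, Str2, Dist) :-
    string_chars(Str1, CharsStr1),
    string_chars(Str2, CharsStr2),
    same_length(CharsStr1, CharsStr2),
    between(0, 9, Dist).

:- begin_tests(hamming_set).

    % fails: the collected answer set is [0,1,2,3,4,5,6,7,8,9], not [0]
    test(single_answer_set_long_identical_strands, set(Result == [0])) :-
        hamming_distance("GGACTGA", "GGACTGA", Result).

:- end_tests(hamming_set).
```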

I'd like someone else to also take a peek. @neenjaw, what are your thoughts?

As you know, this isn't a new problem for this track, but I'm not sure what the best route to take is.

To your point, many of the problems don't fit Prolog very well, and they often also don't have tests that guide students toward learning how to control how Prolog explores the solution space. Your examples are fair and show that this can lead students down a path where they create incorrect solutions without feedback.

I'd be okay with your approach to proving single-answer deterministic solutions using test/2's options. The format for the multiple-answer deterministic solutions feels a bit kludgy, putting them all inside a list, but currently we have nothing.

I went ahead and created a PR based on my proposal.


Maybe I am misreading, but is that not doing what @neenjaw suggested? I thought the preference was for true(Result =:= 0)?

Hello @ErikSchierboom, I understood this part of @neenjaw's comment as saying that my suggestion of using set in the Options list is OK for single-answer predicates.
The usage for multiple solutions is a bit cumbersome.

At the end of the day, it is also not the end of the issues regarding Prolog's tests, as it is always possible to provide the input/output pairs as facts.

Example:

hamming_distance("", "", 0).
hamming_distance("AAA", "BBB", 3).
hamming_distance("AAA", "BBA", 2).
% ...

But it is a step in the right direction.

Let's have @neenjaw confirm this. I feel like the set notation makes it a bit harder for students to read.

Yeah, I definitely meant this format as my preference. The list-literal notation is the confusing part of the other suggested style.

This is where we need to remember that people get out of Exercism what they put in. If they choose to take shortcuts, then they do, and they are the only ones who actually suffer, because they lack the skill to solve the problem. Trying to stop this behaviour is futile work to me, and I don't think we should devote our scarce resources to solving this problem.


Maybe not for putting in the specific answer as in the

hamming_distance("", "", 0).
hamming_distance("AAA", "BBB", 3).
hamming_distance("AAA", "BBA", 2).

but what does it say about an exercise if a student can submit wrong code that solves the exercise by accident?

I still think that providing the answers as sets has two distinct advantages.

First of all, students won't receive the warning about multiple choicepoints, which is a very common point of confusion in mentoring sessions.
Second, it gives students feedback that their code contains errors, or confidence that it is correct.

It is not as if there aren't already exercises that make use of the Options argument. You can find it, for example, in the Satellite exercise.

I'm still not in favor of the set-based notation, sorry. So either we use the format I mentioned before, or we leave things as is.