[Sublist] make instructions more precise

The words “sublist” and “subsequence” mean different things.
They should not be used interchangeably.

By definition, [1, 3] is a subsequence of [1, 2, 3].
However, a test requires these to be Unequal.
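
To make the difference concrete, here is a minimal Python sketch. The helper names `is_subsequence` and `is_contiguous_run` are made up for illustration and are not part of the exercise or its tests.

```python
def is_subsequence(a, b):
    """True if a can be obtained from b by deleting zero or more elements."""
    it = iter(b)
    # `x in it` consumes the iterator up to the first match, so order is kept.
    return all(x in it for x in a)


def is_contiguous_run(a, b):
    """True if a appears in b as an unbroken run of elements."""
    n = len(a)
    return any(b[i:i + n] == a for i in range(len(b) - n + 1))


print(is_subsequence([1, 3], [1, 2, 3]))     # True
print(is_contiguous_run([1, 3], [1, 2, 3]))  # False
```

Under the "subsequence" reading, [1, 3] matches; under the contiguous reading, it does not, which is what the test expects.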

The test contradicts the instructions because the instructions use the word “sub-sequence”.

Link to article that defines subsequence:

Link to the test that’s in contradiction with the word “sub-sequence” in the instructions:
https://github.com/exercism/problem-specifications/blob/9c864c448fc1cc85cf2959ccffe9c768001a7a39/exercises/sublist/canonical-data.json#L147-L156

Working link to the above (correct) test. The exercise is absolutely supposed to be checking for sublists and not “sub-non-contiguous-sequences”.

This thread is forked from Wording issue in Sublist instructions.md - #5 by mkovaxx

Suggestion from earlier thread: replacing sub-sequence with contiguous sub-sequence in lines 11 and 12

That sounds reasonable to me!

The current PR uses the term sublist to define a sublist, and both changes are more complicated than they need to be.

Both definitions can be simplified:

  • List A is a superlist of B if A contains a contiguous sequence of elements that is equal to B.
  • List A is a sublist of B if B contains a contiguous sequence of elements that is equal to A.

Note that I’m not using the prefix sub because it isn’t really needed.

Note that the definition I’m proposing will classify A as a superlist of B when A == B. This isn’t incorrect in the mathematical sense, but it’s probably not what we want here. To avoid those cases, we can change the definitions to:

  • List A is a superlist of B if A contains a contiguous sequence of elements that is equal to B and A is longer than B.
  • List A is a sublist of B if B contains a contiguous sequence of elements that is equal to A and B is longer than A.
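
As a rough illustration of these two stricter definitions, here is a minimal Python sketch. The function names are hypothetical and chosen only for this example; the length check is what excludes the A == B case.

```python
def contains_run(outer, inner):
    """True if inner appears in outer as a contiguous run of elements."""
    n = len(inner)
    return any(outer[i:i + n] == inner for i in range(len(outer) - n + 1))


def is_superlist(a, b):
    """A is a superlist of B: A contains B contiguously and A is longer."""
    return len(a) > len(b) and contains_run(a, b)


def is_sublist(a, b):
    """A is a sublist of B: B contains A contiguously and B is longer."""
    return is_superlist(b, a)


print(is_superlist([1, 2, 3], [2, 3]))     # True
print(is_superlist([1, 2, 3], [1, 2, 3]))  # False (equal lists are excluded)
print(is_sublist([1, 3], [1, 2, 3]))       # False (not contiguous)
```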

Note that I’m still not using the prefix sub, to avoid any kind of repetition (sublist, sub-sequence). Plain English is probably much better, since people are not expected to know what sub implies.

Sounds good. Another phrasing that might also work:

  • List A is a superlist of shorter list B if A contains a contiguous sequence of elements that is equal to B.
  • List A is a sublist of longer list B if B contains a contiguous sequence of elements that is equal to A.

This makes sense to me, so I’ve updated the branch.

FWIW, I think it’s OK to choose a definition for the sublist relation such that it is reflexive (i.e. X sublist X is true for any list X).

Doing so is in line with mathematical tradition, where subset and subsequence are also reflexive. When reflexivity is to be disallowed, the word proper is added.

The problem statement makes it clear that A == B is a case to be handled separately, even though in that case A sublist B and A superlist B are also true.

Now that the discussion is also about simplifying the problem statement, I would propose the following.

The problem statement can be reworded to refer only to == (equality) and the sublist relation, so that the superlist relation becomes an unnecessary complication.

Something along the lines of:

  • A is equal to B; or
  • A is a sublist of B; or
  • B is a sublist of A; or
  • None of the above is true, thus lists A and B are unequal

The exercise requires you to return one of four values, one of which is superlist. If we’re asking students to return the superlist value, we should be defining superlist.

Many (most?) students likely are not familiar with these terms. We try to explain the concepts using “everyday” English and generally avoid technical terminology (unless that terminology is really helpful to understand, in which case we would need to define it).

I’m not suggesting to use the terms ‘reflexive’ or ‘proper’ in the description of the exercise itself. I’m using these terms to explain why I support tasx’s proposal.

Does that distinction make sense?

If we’re asking students to return the superlist value, we should be defining superlist.

How about wording it as follows?

  • If A is equal to B, then return the value equal.
  • If A is a sublist of B, then return the value sublist.
  • If B is a sublist of A, then return the value superlist.
  • If none of the above are true, then return the value unequal.
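
A minimal Python sketch of this four-way wording, assuming the result is returned as a string (some tracks report it differently). The names `classify` and `is_sublist` are illustrative only.

```python
def is_sublist(a, b):
    """True if a appears in b as a contiguous run (reflexive: a == b counts)."""
    n = len(a)
    return any(b[i:i + n] == a for i in range(len(b) - n + 1))


def classify(a, b):
    """Return one of "equal", "sublist", "superlist", "unequal"."""
    if a == b:
        return "equal"
    if is_sublist(a, b):
        return "sublist"
    if is_sublist(b, a):
        return "superlist"
    return "unequal"


print(classify([1, 2, 3], [1, 2, 3]))  # equal
print(classify([2, 3], [1, 2, 3]))     # sublist
print(classify([1, 2, 3], [1, 2]))     # superlist
print(classify([1, 3], [1, 2, 3]))     # unequal
```

Because equality is checked first, the reflexive sublist check never gets to report equal lists as sublists of each other, so no separate length condition is needed in this ordering.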

I think it makes more sense to define/explain the four terms. We avoid implementation details in the prose where possible. (Not all tracks ask that you return a value; some tracks have you print values or update database tables).

I would stick with just inserting “contiguous” into the existing prose.

Specifically, list `A` is equal to list `B` if both lists have the same values in the same order.
- List `A` is a superlist of `B` if `A` contains a sub-sequence of values equal to `B`.
+ List `A` is a superlist of `B` if `A` contains a contiguous sub-sequence of values equal to `B`.
- List `A` is a sublist of `B` if `B` contains a sub-sequence of values equal to `A`.
+ List `A` is a sublist of `B` if `B` contains a contiguous sub-sequence of values equal to `A`.

The problem with using just sub-sequence is that if A == B, then A is a sub-sequence of B and B is also a sub-sequence of A, because a sub-sequence can be formed by deleting zero or more elements. That’s primarily why I left the sub prefix out and changed it to my current suggestion:

  • List A is a superlist of B if A contains a contiguous sequence of elements that is equal to B and A is longer than B.
  • List A is a sublist of B if B contains a contiguous sequence of elements that is equal to A and B is longer than A.

What @keiraville proposed is also good, with the addition of “the” before the words “longer” and “shorter”.

I didn’t see any issue with equal lists also being sublists of each other. We could also add “and are not equal” to the clause if we want a strict sublist, but I’m not sure that is needed.