Unicode or just digits?

The description for the Series exercise explicitly states that it’s about digits, hence the need for the clarification at the end: "Note that these series are only required to occupy adjacent positions in the input; the digits need not be numerically consecutive."

That sentence could be cut if it were about letters, as this test implies:

  test "Unicode characters count as a single character" do
    assert StringSeries.slices("José", 1) == ["J", "o", "s", "é"]
    assert StringSeries.slices("José", 2) == ["Jo", "os", "sé"]
  end

However, I see that it’s not part of the canonical test data.
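
For context, here is a minimal sketch of the kind of solution that test calls for. It assumes the input is split into graphemes rather than bytes; the module name `GraphemeSeriesSketch` is hypothetical and this is not the track's reference solution.

```elixir
defmodule GraphemeSeriesSketch do
  # Hypothetical module for illustration only.
  # Splits the input into graphemes, then takes sliding windows of `size`.
  def slices(input, size) when is_binary(input) and is_integer(size) and size > 0 do
    input
    |> String.graphemes()
    # Step of 1 gives overlapping windows; :discard drops incomplete ones.
    |> Enum.chunk_every(size, 1, :discard)
    |> Enum.map(&Enum.join/1)
  end

  # Assumption: a non-positive size yields no slices.
  def slices(_input, _size), do: []
end
```

With this sketch, `GraphemeSeriesSketch.slices("José", 2)` returns `["Jo", "os", "sé"]` and `GraphemeSeriesSketch.slices("49142", 3)` returns `["491", "914", "142"]`, so digits and letters are handled identically.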

For consistency, could I either change the wording of the description (remove the “digits” part and make it use letters in its example, so there’s no need for the clarifying sentence at the end) or cut the extra test that contradicts the description?

I think it makes sense to replace “digits” with “letters” or “characters” in the description.

You wouldn’t also swap the example to make it quicker to grasp? (Or just substitute a letter for one of the numbers?)

I wouldn’t be opposed to that but I don’t think it’s necessary. But I’d love to see what other maintainers think, too!

I’m not sure the clarifying sentence does much as-is. The “49142” example already shows we’re grabbing adjacent characters, and those outputs aren’t numerically consecutive. So maybe we can move the clarification into the example and drop the sentence about input validation.

Is this an update to the canonical exercise instructions? Then, we’ll want to add tests using alphabetic inputs and maybe even alphanumeric ones. Otherwise, the tests are incomplete compared to what the instructions say.

Up to that track’s maintainers since it’s not from problem specifications.


I’d be for expanding the tests. Not all tracks support Unicode (Cairo for example) so it’d be nice to have both ASCII and Unicode tests though.

I’d be opposed to adding non-ASCII tests. We have those in other exercises already if we want exercises that explore that, and it would detract from the focus on this exercise.

This exercise is about digits, not letters. Wouldn’t adding extra tests with letters break all solutions?

Since this extra test is not in the canonical data, I’d prefer if the Elixir maintainers find a way to deal with it. Maybe with an append file?

Yeah, I’d likely only include the ASCII tests myself, but it could be an optional scenario for other tracks.

Yes, this is about the Elixir track only.

To me the Elixir track went beyond the scope and point of the exercise (maybe it could even be solved numerically, but Unicode takes that option away).
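
As an illustration of what a numeric approach might look like, here is a rough sketch that parses the whole input as an integer first. The module name `NumericSeriesSketch` is hypothetical, and the approach has known limits: leading zeros in the input are lost, and any non-digit input (such as "José") raises.

```elixir
defmodule NumericSeriesSketch do
  # Hypothetical module for illustration only.
  # Caveats: leading zeros in the input are dropped by the integer parse,
  # and non-digit input makes String.to_integer/1 raise ArgumentError.
  def slices(input, size) when is_binary(input) and is_integer(size) and size > 0 do
    input
    |> String.to_integer()
    |> Integer.digits()
    |> Enum.chunk_every(size, 1, :discard)
    # Join the digits of each window back into a string.
    |> Enum.map(&Enum.join/1)
  end
end
```

Here `NumericSeriesSketch.slices("49142", 3)` returns `["491", "914", "142"]`, while a call with "José" raises an `ArgumentError`, which is the sense in which a Unicode test rules this option out.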

So, what’s the verdict? :)

No? The tests use digits, not numbers. Digits and letters are both subsets of ASCII characters. I expect most existing solutions treat the input as an ASCII string and don’t check if the string contains digits vs other chars.
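
A sketch of that kind of solution, assuming it slices the string byte by byte with no validation (the module name `ByteSeriesSketch` is hypothetical). ASCII letters pass through it unchanged, but a character like "é" occupies more than one byte in UTF-8 and gets split.

```elixir
defmodule ByteSeriesSketch do
  # Hypothetical module for illustration only.
  # Treats the input as raw bytes, which is only safe for ASCII input.
  def slices(input, size) when is_binary(input) and is_integer(size) and size > 0 do
    last_start = byte_size(input) - size

    # The explicit step of 1 makes the range empty when the input is too short.
    for start <- 0..last_start//1 do
      binary_part(input, start, size)
    end
  end
end
```

`ByteSeriesSketch.slices("abcde", 2)` returns `["ab", "bc", "cd", "de"]`, the same shape as for digits. With "José" and a size of 1, the result has more than four entries and some of them are not valid UTF-8 strings.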

Depends, they could’ve used guards or regexes. I’m not sure we should lean into the scope breakage to cause user solution breakage while we’re at it.
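
To make the concern concrete, here is a hypothetical solution that validates its input with a regex before slicing. The module name `DigitsOnlySeriesSketch` and the choice to raise are assumptions for illustration; real submissions may validate differently.

```elixir
defmodule DigitsOnlySeriesSketch do
  # Hypothetical module for illustration only.
  # Accepts only ASCII digits, so a new test with letters would fail here.
  def slices(input, size) when is_binary(input) and is_integer(size) and size > 0 do
    if Regex.match?(~r/\A[0-9]*\z/, input) do
      input
      |> String.graphemes()
      |> Enum.chunk_every(size, 1, :discard)
      |> Enum.map(&Enum.join/1)
    else
      raise ArgumentError, "expected a string of digits, got: #{inspect(input)}"
    end
  end
end
```

A solution like this passes every digit-based test but raises as soon as it is given "abcde" or "José".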

Would you mind sharing how Elixir differs from the problem spec and why it would differ from other tracks? At a glance, it seems to be pretty in line with the specs.

Yes, as mentioned at the top, they added a test for Unicode to the canonical tests, which only test using digits (as the description says will happen).

No prob if you want to keep this special Unicode test in the Elixir track, but the description should reflect it, that’s all I mean. :)

Oh, then I’m bowing out. I thought we were discussing upstreaming this to problem-specifications. Since this isn’t in the Elixir subforum, the Elixir maintainers might not be seeing this. In that case, I’m CCing @angelikatyborska.

Ah. Joy. I don’t think the problem spec should mention non-ASCII chars, but also tracks generally use the problem spec description without modifications (though sometimes there are appends). Balancing both of those can make things … tricky.

Ah, thanks, sorry about that. I hit New Topic and picked the one the PR bot had always directed me to, then added the Elixir tag to it, but I didn’t realize the tag is insufficient. :sweat_smile:

I agree, it doesn’t need to mention non-ASCII chars, but it shouldn’t explicitly say to only expect digits.

Let’s remove this non-canonical test. My approach for the Elixir track is to be as close to the canonical data as possible, and this extra test is clearly unnecessary. @codingthat thanks for bringing this up. Would you want to open a PR that removes the test?

PS: I get email notifications for all posts in the Elixir category and usually reply within 24h.

Glad to help! Done: "Remove Unicode test from `series` exercise" (Pull Request #1567, exercism/elixir). Thanks!

Merged! I believe the thread can be marked as solved now.
