Unicode or just digits?

codingthat · April 24, 2025, 10:23am

The description for the Series exercise explicitly states that it’s about digits, hence the need for the clarification at the end: "Note that these series are only required to occupy adjacent positions in the input; the digits need not be numerically consecutive."

That sentence could be cut if it were about letters, as this test implies:

  test "Unicode characters count as a single character" do
    assert StringSeries.slices("José", 1) == ["J", "o", "s", "é"]
    assert StringSeries.slices("José", 2) == ["Jo", "os", "sé"]
  end

However, I see that it’s not part of the canonical test data.

For consistency, could I either change the wording of the description (remove the “digits” part and make it use letters in its example, so there’s no need for the clarifying sentence at the end) or cut the extra test that contradicts the description?

IsaacG · April 24, 2025, 3:28pm

I think it makes sense to replace “digits” with “letters” or “characters” in the description.

codingthat · April 24, 2025, 3:45pm

You wouldn’t also swap the example to make it quicker to grasp? (Or just substitute a letter for one of the numbers?)

IsaacG · April 24, 2025, 3:51pm

I wouldn’t be opposed to that but I don’t think it’s necessary. But I’d love to see what other maintainers think, too!

BNAndras · April 24, 2025, 4:07pm

I’m not sure the clarifying sentence does much as-is. The “49142” example already shows we’re grabbing adjacent characters, and those outputs aren’t numerically consecutive. So maybe we can move the clarification into the example and drop the sentence about input validation.

Is this an update to the canonical exercise instructions? Then, we’ll want to add tests using alphabetic inputs and maybe even alphanumeric ones. Otherwise, the tests are incomplete compared to what the instructions say.

Up to that track’s maintainers since it’s not from problem specifications.

I’d be for expanding the tests. Not all tracks support Unicode (Cairo for example) so it’d be nice to have both ASCII and Unicode tests though.

IsaacG · April 24, 2025, 4:34pm

I’d be opposed to adding non-ASCII tests. We have those in other exercises already if we want exercises that explore that, and it would detract from the focus of this exercise.

tasx · April 24, 2025, 4:39pm

This exercise is about digits, not letters. Wouldn’t adding extra tests with letters break all solutions?

Since this extra test is not in the canonical data, I’d prefer if the Elixir maintainers find a way to deal with it. Maybe with an append file?

BNAndras · April 24, 2025, 4:48pm

Yeah, I’d likely only include the ASCII tests myself, but it could be an optional scenario for other tracks.

codingthat · April 24, 2025, 5:28pm

Yes, this is about the Elixir track only.

To me the Elixir track went beyond the scope and point of the exercise (maybe it could even be solved numerically, but Unicode takes that option away).

So, what’s the verdict? :)

IsaacG · April 24, 2025, 5:29pm

No? The tests use digits, not numbers. Digits and letters are both a subset of ASCII characters. I expect most existing solutions treat the input as an ASCII string and don’t check if the string contains digits vs other chars.
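To illustrate the point, here is a minimal sketch of a solution that never inspects what kind of characters it receives. The module name `SeriesSketch` is hypothetical, and returning `[]` for a non-positive size is an assumption made for this sketch, not something taken from the track's test suite.

```elixir
defmodule SeriesSketch do
  # Hypothetical module; not the track's exemplar solution.
  # Splits the input into graphemes and slides a window of `size` over them.
  def slices(s, size) when is_binary(s) and is_integer(size) and size > 0 do
    s
    |> String.graphemes()
    # Step of 1 gives overlapping windows; :discard drops short trailing chunks.
    |> Enum.chunk_every(size, 1, :discard)
    |> Enum.map(&Enum.join/1)
  end

  # Assumption for this sketch: any other size yields an empty list.
  def slices(_s, _size), do: []
end
```

Because nothing here checks for digits, `SeriesSketch.slices("49142", 3)` and `SeriesSketch.slices("José", 2)` go through the same code path.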

codingthat · April 24, 2025, 5:31pm

Depends, they could’ve used guards or regexes. I’m not sure we should lean into the scope breakage to cause user solution breakage while we’re at it.
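For example, a solution along these lines would pass the digit-only tests but fail as soon as letters show up. This is only a sketch: the module name `DigitsOnlySeries` is hypothetical, and raising `ArgumentError` on non-digit input is an assumed design choice for illustration, not something the exercise asks for.

```elixir
defmodule DigitsOnlySeries do
  # Hypothetical solution shape; the digit check is an assumption for illustration.
  def slices(s, size) when is_binary(s) and is_integer(size) and size > 0 do
    # \A and \z anchor the whole string, so any non-digit character is rejected.
    if Regex.match?(~r/\A[0-9]*\z/, s) do
      s
      |> String.graphemes()
      |> Enum.chunk_every(size, 1, :discard)
      |> Enum.map(&Enum.join/1)
    else
      raise ArgumentError, "expected only ASCII digits, got: #{inspect(s)}"
    end
  end

  # Assumption for this sketch: any other size yields an empty list.
  def slices(_s, _size), do: []
end
```

With this version, `DigitsOnlySeries.slices("491", 2)` works as before, while `DigitsOnlySeries.slices("José", 2)` raises.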

IsaacG · April 24, 2025, 5:32pm

Would you mind sharing how Elixir differs from the problem spec and why it would differ from other tracks? At a glance, it seems to be pretty in line with the specs.

codingthat · April 24, 2025, 5:47pm

Yes, as mentioned at the top, the Elixir track added a Unicode test on top of the canonical tests, which only use digits (as the description says they will).

No prob if you want to keep this special Unicode test in the Elixir track, but the description should reflect it, that’s all I mean. :)

BNAndras · April 24, 2025, 5:49pm

Oh, then I’m bowing out. I thought we were discussing upstreaming this to problem-specifications. Since this isn’t in the Elixir subforum, the Elixir maintainers might not be seeing this. In that case, I’m CCing @angelikatyborska.

IsaacG · April 24, 2025, 5:51pm

Ah. Joy. I don’t think the problem spec should mention non-ASCII chars, but also tracks generally use the problem spec description without modifications (though sometimes there are appends). Balancing both of those can make things … tricky.

codingthat · April 24, 2025, 5:52pm

Ah, thanks, sorry about that. I hit New Topic and picked the one the PR bot had always directed me to, then added the Elixir tag to it, but I didn’t realize the tag is insufficient.

codingthat · April 25, 2025, 7:56am

I agree, it doesn’t need to mention non-ASCII chars, but it shouldn’t explicitly say to only expect digits.

angelikatyborska · April 25, 2025, 3:01pm

Let’s remove this non-canonical test. My approach for the Elixir track is to be as close to the canonical data as possible, and this extra test is clearly unnecessary. @codingthat thanks for bringing this up. Would you want to open a PR that removes the test?

PS: I get email notifications for all posts in the Elixir category and usually reply within 24h.

codingthat · April 25, 2025, 8:19pm

Glad to help! Done: "Remove Unicode test from `series` exercise" (Pull Request #1567, exercism/elixir). Thanks!

angelikatyborska · April 26, 2025, 5:52am

Merged! I believe the thread can be marked as solved now.