More unicode tests in problem-specifications

senekor · January 22, 2024, 10:21am

On the Rust track, several exercises have Unicode-related test cases. Most of these are custom test cases that do not come from problem-specifications. I thought it had to be this way, because many languages make it hard to handle Unicode well out of the box.

But now I see that problem-specifications has a unicode scenario (only used by parallel-letter-frequency for now), meaning such test cases could easily be excluded by the test generators of languages where Unicode is tedious to deal with. So I think it would be a good idea to upstream these test cases, tagged with that scenario.
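To illustrate how a test generator might skip such cases, here is a minimal sketch. It assumes each canonical test case carries an optional list of scenario names; the `TestCase` struct and the `exclude_scenarios` helper are hypothetical and not taken from any track's actual generator.

```rust
/// Hypothetical in-memory form of a canonical test case.
/// Assumption: a case lists zero or more scenario names, e.g. "unicode".
#[derive(Debug, Clone, PartialEq)]
struct TestCase {
    description: String,
    scenarios: Vec<String>,
}

/// Keep only the cases that carry none of the excluded scenarios.
fn exclude_scenarios(cases: &[TestCase], excluded: &[&str]) -> Vec<TestCase> {
    cases
        .iter()
        .filter(|case| {
            !case
                .scenarios
                .iter()
                .any(|scenario| excluded.contains(&scenario.as_str()))
        })
        .cloned()
        .collect()
}

fn main() {
    let cases = vec![
        TestCase {
            description: "an empty string".to_string(),
            scenarios: vec![],
        },
        TestCase {
            description: "wide characters".to_string(),
            scenarios: vec!["unicode".to_string()],
        },
    ];

    // A track where Unicode is tedious would drop the tagged case.
    for case in exclude_scenarios(&cases, &["unicode"]) {
        println!("{}", case.description);
    }
}
```

A track that wants the Unicode cases would simply pass an empty exclusion list.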

One disadvantage that comes to mind is that any language track that incorporates these test cases will slightly increase the difficulty of the exercise and risk invalidating many community solutions. It may be considered a breaking change.

Here’s the list of exercises on the Rust track where we have unicode tests that I consider suitable to be upstreamed. (There are others where I don’t quite see the added value, e.g. a test that unicode characters are simply ignored in scrabble-score.)

anagram
grep
rail-fence-cipher
reverse-string
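As a rough illustration of why such tests can trip up existing solutions, consider reverse-string. The sketch below uses an illustrative input, not necessarily the Rust track's actual test case: a solution that reverses raw bytes breaks on multi-byte characters, while one that reverses by `char` handles them.

```rust
/// Reverse by Unicode scalar values (Rust `char`s).
/// Note: this does not keep grapheme clusters (e.g. a letter followed by a
/// combining mark) together; that needs more than the standard library.
fn reverse(input: &str) -> String {
    input.chars().rev().collect()
}

/// Naive byte-wise reversal, shown only to demonstrate the failure mode.
fn reverse_bytes(input: &str) -> Result<String, std::string::FromUtf8Error> {
    let mut bytes = input.as_bytes().to_vec();
    bytes.reverse();
    String::from_utf8(bytes)
}

fn main() {
    println!("{}", reverse("子猫"));
    // Reversing the bytes of multi-byte characters yields invalid UTF-8.
    println!("{}", reverse_bytes("子猫").is_err());
}
```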

What do you think? Should I work on a couple of PRs?

iHiD · January 22, 2024, 10:43am

I’m +1 on this. I see no real disadvantage to having the tests upstreamed. Tracks should just bear in mind the “risks” of adding them to their track, as you mention. Thanks!

ErikSchierboom · January 22, 2024, 12:25pm

Same here. If you add them with the unicode scenario, we should be good.