The current tests don't actually test that the application can work with grapheme clusters. They only test whether the system can handle Unicode characters, not whether a character made of multiple Unicode code points is treated as a single character.
Given the following assertion, we would expect it to pass, since the two strings have no letters in common:
Anagram.find("üy", ["uÿ"]).should eq([] of String)
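But if we split these two strings into their individual Unicode code points (assuming the decomposed form, where the diaeresis is a separate combining code point), both leave us with: ["u", "y", "<two dots>"], so the two arrays will actually be the same and a code-point-based solution reports an anagram.
The current test cases don't cover a scenario like this.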
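To make the failure mode concrete, here is a minimal sketch in Python (chosen only because its standard library is enough to show it; it is not taken from any track's solution). The helper names `code_points` and `clusters` are made up for this example, and `clusters` is a rough approximation that attaches combining marks to the preceding base character, not full Unicode grapheme segmentation.

```python
import unicodedata


def code_points(text):
    """Sorted code points of the decomposed (NFD) form of text."""
    return sorted(unicodedata.normalize("NFD", text))


def clusters(text):
    """Sorted approximate grapheme clusters of text.

    Simplification: a cluster is a base character plus any combining
    marks that follow it. Real segmentation rules cover more cases.
    """
    result = []
    for ch in unicodedata.normalize("NFD", text):
        if result and unicodedata.combining(ch):
            result[-1] += ch  # attach the mark to the previous base character
        else:
            result.append(ch)
    return sorted(result)


word = "u\u0308y"       # "üy": u + combining diaeresis, then y
candidate = "uy\u0308"  # "uÿ": u, then y + combining diaeresis

print(code_points(word) == code_points(candidate))  # True: looks like an anagram
print(clusters(word) == clusters(candidate))        # False: no shared letters
```

Comparing by code points treats the diaeresis as a free-floating letter, which is why the two strings look like anagrams; comparing by clusters keeps it attached to its base letter.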
I noticed this when writing up the implementation for Microblog on the JS track. That’s why I also wrote a bunch of approaches explaining that some solutions will fail in some cases.
I think the case for such a test is excellently made, but for Microblog, for example, it’ll also invalidate almost all current solutions, since almost none of them take grapheme clusters into account.
Well, it’ll only invalidate them if tracks choose to implement it. That’s why we have the tests.toml file, where tracks can opt out of individual test cases.
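For anyone unfamiliar with the mechanism, an opt-out entry in a track's tests.toml looks roughly like this. The UUID and description here are placeholders for illustration, not a real canonical test case:

```toml
# Placeholder UUID and description; a real entry uses the UUID
# of the test case from the canonical data.
[00000000-0000-0000-0000-000000000000]
description = "handles grapheme clusters"
include = false
```

A track that leaves out `include = false` would pick the test case up as usual.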
On the Rust track, some test cases are made optional by hiding them behind feature flags. The test runner ignores those, but people can run them locally. Optional bonus challenges are implemented this way, including handling grapheme clusters. Adding such tests would be backwards-compatible. Maybe a similar approach is possible in other languages.