Bash function that finds multiple instances of char in a string

So I was doing the bash exercise anagram and all tests except one are passing.

For all the bash exercises I have done so far, I have avoided external, non-builtin commands, just to explore what is possible with bash alone. Going back to anagram, the one failing test is this:

user@debian:~/programming/$ cat -n anagram.bats
    83  @test "anagrams must use all letters exactly once" {
    84    #[[ $BATS_RUN_SKIPPED == "true" ]] || skip
    85    run bash "tapper" "patter"
    86    assert_success
    87    refute_output
    88  }

And the test report:

user@debian:~/programming/$ bats anagram.bats 
 ✓ no matches
 ✓ detects two anagrams
 ✓ does not detect anagram subsets
 ✓ detects anagram
 ✓ detects three anagrams
 ✓ detects multiple anagrams with different case
 ✓ does not detect non-anagrams with identical checksum
 ✓ detects anagrams case-insensitively
 ✓ detects anagrams using case-insensitive subject
 ✓ detects anagrams using case-insensitive possible matches
 ✓ does not detect a anagram if the original word is repeated
 ✗ anagrams must use all letters exactly once
   (from function `refute_output' in file bats-extra.bash, line 611,
    in test file anagram.bats, line 87)
     `refute_output' failed
   -- output non-empty, but expected no output --
   output : patter
 ✓ words are not anagrams of themselves
 ✓ words are not anagrams of themselves even if letter case is partially different
 ✓ words are not anagrams of themselves even if letter case is completely different
 ✓ words other than themselves can be anagrams

16 tests, 1 failure

The failure is (I think) due to the character t being repeated in the test word patter. To resolve that, I’m thinking of writing a sort function that walks through the entire string and checks for duplicates. Since I want to implement the sort as a pure bash function, are there any guides or tips you can provide? I understand that it may be a bit silly to avoid the external sort command, but that’s just me.

I did not include my code as it is still messy; I am only looking for some pointers on how to write a bash sort function.

As always, thanks a lot!

Do you need to sort chars to track if you’ve seen the char and/or if it shows up multiple times?

If I can find any char that shows up multiple times without sorting, that would be optimal. But I am not sure whether I need to sort first and then check.

This is the structure of my idea:

    for ((i=0,x=$((i+1));i<${#mystring};i++)) {
        # if mystring is sorted, I can try?
        if [[ ${mystring:i:1} -lt ${mystring:x:1} ]]; then ....

That is where I ran out of ideas on how to compare char by char.

I'll change the title to be more specific to what I want.

Sure. If you first sorted, you could find dupes by comparing neighboring letters.
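To illustrate the sort-then-compare idea, here is a minimal sketch that uses only builtins. The function names `sort_chars` and `has_adjacent_dupe` are made up for this example, and the insertion sort is just one simple choice, not necessarily what anyone in this thread used. Note that `[[ a > b ]]` compares strings, whereas `-lt` and `-gt` only compare integers.

```shell
#!/usr/bin/env bash

# sort_chars (hypothetical helper): prints the characters of $1 in sorted order.
# Uses an insertion sort over an array of single characters, builtins only.
sort_chars() {
    local input=$1 i j tmp
    local -a chars=()
    for ((i = 0; i < ${#input}; i++)); do
        chars+=("${input:i:1}")
    done
    for ((i = 1; i < ${#chars[@]}; i++)); do
        tmp=${chars[i]}
        for ((j = i - 1; j >= 0; j--)); do
            # String comparison; stop shifting once chars[j] is not greater than tmp.
            [[ ${chars[j]} > "$tmp" ]] || break
            chars[j+1]=${chars[j]}
        done
        chars[j+1]=$tmp
    done
    # With an empty IFS, "${chars[*]}" joins the elements with no separator.
    local IFS=
    printf '%s\n' "${chars[*]}"
}

# has_adjacent_dupe (hypothetical helper): succeeds if the already sorted
# string $1 contains two equal neighbouring characters.
has_adjacent_dupe() {
    local sorted=$1 i
    for ((i = 1; i < ${#sorted}; i++)); do
        [[ ${sorted:i:1} == "${sorted:i-1:1}" ]] && return 0
    done
    return 1
}

sorted=$(sort_chars "patter")
has_adjacent_dupe "$sorted" && echo "repeated letter in: $sorted"
```

The command substitution around `sort_chars` forks a subshell but still runs no external program.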

What if you wanted to count the occurrences of each letter? Could you build a mapping of letter to occurrences? Could you use that for detecting dupes?
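A rough sketch of such a mapping, assuming bash 4 or later for associative arrays and `${1,,}`. The function name `repeated_letters` is invented for this example; it prints every letter that occurs more than once, in unspecified order.

```shell
#!/usr/bin/env bash

# repeated_letters (hypothetical helper): prints each character of $1
# that occurs more than once, one per line, ignoring case.
repeated_letters() {
    local input=${1,,} i char
    local -A count=()
    for ((i = 0; i < ${#input}; i++)); do
        char=${input:i:1}
        # ${count[$char]:-0} falls back to 0 for a letter not seen yet.
        count[$char]=$(( ${count[$char]:-0} + 1 ))
    done
    # Associative array keys come back in no particular order.
    for char in "${!count[@]}"; do
        if (( count[$char] > 1 )); then
            printf '%s\n' "$char"
        fi
    done
}

repeated_letters "patter"
```

An empty output means no letter was repeated.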

Thanks for the pointer, @IsaacG. I was able to write a function that does what I want by using an associative array and initialising all keys to 0.

    function check_for_duplicate_char() {
        local input="${1,,}"    # lowercase the input
        local -A myhash         # associative array: char -> count
        local i key
        # Initialise every character's count to 0.
        for ((i = 0; i < ${#input}; i++)); do
            myhash["${input:i:1}"]=0
        done
        # Iterate again to count the chars that show up more than once.
        for ((i = 0; i < ${#input}; i++)); do
            key="${input:i:1}"
            echo "key:$key -- current value:${myhash[$key]}"
            myhash[$key]=$(( ${myhash[$key]} + 1 ))
        done
    }

Note, you shouldn’t need to preset all the values to 0. You can drop the first loop. If you just want to know whether every char is unique, you don’t even need to increment the value! It’s enough to check whether the key already exists.
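Something along these lines would do the existence check in a single loop. This is only a sketch: the name `has_duplicate_char` is made up, and it assumes bash 4 or later.

```shell
#!/usr/bin/env bash

# has_duplicate_char (hypothetical helper): succeeds if any character of $1
# appears more than once, ignoring case.
has_duplicate_char() {
    local input=${1,,} i char
    local -A seen=()
    for ((i = 0; i < ${#input}; i++)); do
        char=${input:i:1}
        # ${seen[$char]+set} expands to "set" only if the key already exists.
        if [[ -n ${seen[$char]+set} ]]; then
            return 0
        fi
        seen[$char]=1
    done
    return 1
}

if has_duplicate_char "patter"; then
    echo "patter has a repeated letter"
fi
```

The stored value is irrelevant here; only the presence of the key matters, so the function can return as soon as it meets a key it has already recorded.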

Indeed. All my tests are now passing ok, so this has now been resolved. Thanks!