Bash function that finds multiple instances of char in a string

turtle666 · October 6, 2023, 10:28pm

So I was doing the bash exercise anagram and all tests except one are passing.

For all the bash exercises I have done so far, I have avoided external (non-builtin) commands, just to explore what is possible with bash alone. Going back to anagram, the one failing test is this:

user@debian:~/programming/exercism.org/bash/anagram$ cat -n anagram.bats
...
    83  @test "anagrams must use all letters exactly once" {
    84    #[[ $BATS_RUN_SKIPPED == "true" ]] || skip
    85    run bash anagram.sh "tapper" "patter"
    86    assert_success
    87    refute_output
    88  }

And the test report:

user@debian:~/programming/exercism.org/bash/anagram$ bats anagram.bats 
anagram.bats
 ✓ no matches
 ✓ detects two anagrams
 ✓ does not detect anagram subsets
 ✓ detects anagram
 ✓ detects three anagrams
 ✓ detects multiple anagrams with different case
 ✓ does not detect non-anagrams with identical checksum
 ✓ detects anagrams case-insensitively
 ✓ detects anagrams using case-insensitive subject
 ✓ detects anagrams using case-insensitive possible matches
 ✓ does not detect a anagram if the original word is repeated
 ✗ anagrams must use all letters exactly once
   (from function `refute_output' in file bats-extra.bash, line 611,
    in test file anagram.bats, line 87)
     `refute_output' failed
   
   -- output non-empty, but expected no output --
   output : patter
   --
   
 ✓ words are not anagrams of themselves
 ✓ words are not anagrams of themselves even if letter case is partially different
 ✓ words are not anagrams of themselves even if letter case is completely different
 ✓ words other than themselves can be anagrams

16 tests, 1 failure

The failure (I think) is due to the character t being repeated in the test word patter. To resolve that, I’m thinking of creating a sort function that walks through the entire string and checks for duplicates. Since I want to implement the sort as a pure bash function, are there any guides or tips you can provide? I understand that it may be a bit silly to avoid the external sort command, but that’s just me.

I did not include my code as it is still messy; I am only looking for some pointers on how to create a bash sort function.

As always, thanks a lot!

IsaacG · October 6, 2023, 11:18pm

Do you need to sort chars to track if you’ve seen the char and/or if it shows up multiple times?

turtle666 · October 6, 2023, 11:21pm

If I can find any char that shows up multiple times without sorting, that would be optimal. But I am not sure whether I need to sort the string first and then check.

This is the structure of my idea:

for ((i=0,x=$((i+1));i<${#mystring};i++)) {
# if mystring is sorted, I can try?
if [[ ${mystring:i:1} -lt ${mystring:x:1} ]]; then ....
...here is where I ran out of ideas on how to compare char by char..
}


I'll change the title to be more specific to what I want.

IsaacG · October 7, 2023, 12:45am

Sure. If you first sorted, you could find dupes by comparing neighboring letters.
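A minimal sketch of that idea using only builtins, assuming the words are plain lowercase-able ASCII letters. The names `sort_chars` and `has_duplicate_char` are hypothetical, and the insertion sort is just one simple choice that is adequate for short words:

```bash
#!/usr/bin/env bash

# Hypothetical helper: print the characters of a word in sorted order,
# using an insertion sort built from bash builtins only.
sort_chars() {
    local word=${1,,} sorted="" char i j
    for ((i = 0; i < ${#word}; i++)); do
        char=${word:i:1}
        j=0
        # Skip past every character that does not sort after $char.
        while ((j < ${#sorted})) && [[ ! ${sorted:j:1} > "$char" ]]; do
            ((j += 1))
        done
        # Insert $char at position j.
        sorted=${sorted:0:j}$char${sorted:j}
    done
    printf '%s\n' "$sorted"
}

# Hypothetical helper: succeed if the sorted word has two equal neighbours.
has_duplicate_char() {
    local sorted i
    sorted=$(sort_chars "$1")
    for ((i = 0; i < ${#sorted} - 1; i++)); do
        [[ ${sorted:i:1} == "${sorted:i+1:1}" ]] && return 0
    done
    return 1
}
```

With this, `sort_chars tapper` prints `aepprt` and `sort_chars patter` prints `aeprtt`, so the two sorted forms differ even though both words use the same set of letters.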

What if you wanted to count the occurrences of each letter? Could you build a mapping of letter to occurrences? Could you use that for detecting dupes?
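One way such a mapping could look, as a rough sketch rather than the intended solution: an associative array (bash 4 or later) keyed by letter. The function name `duplicated_letters` is hypothetical, and the input is assumed to consist of plain letters.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print every letter that occurs more than once,
# in the order in which the second occurrence is encountered.
duplicated_letters() {
    local word=${1,,} char i
    local -A counts=()
    for ((i = 0; i < ${#word}; i++)); do
        char=${word:i:1}
        # Look up the current count (0 if the letter is new) and add one.
        counts["$char"]=$(( ${counts[$char]:-0} + 1 ))
        # Report the letter exactly once, when its count reaches 2.
        if (( ${counts[$char]} == 2 )); then
            printf '%s\n' "$char"
        fi
    done
}
```

For example, `duplicated_letters patter` prints `t` while `duplicated_letters tapper` prints `p`, which shows that the two words have different letter counts.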

turtle666 · October 7, 2023, 11:19pm

Thanks for the pointer @IsaacG . I was able to create a function that does what I want by using an associative array and initialising all keys to 0.

    function check_for_duplicate_char() {
        input="${1,,}"
        declare -A myhash
        for ((i=0;i<${#input};i++)) {
            myhash+=( ["${input:i:1}"]=0 )
        }
        # Iterate through again to find any chars that show up more than once.
        for ((i=0;i<${#input};i++)) {
            key="${input:i:1}"
            echo "key:$key -- current value:${myhash[$key]}"
            myhash+=( [$key]=$((${myhash[$key]}+1)) )
        }

IsaacG · October 7, 2023, 11:25pm

Note, you shouldn’t need to preset all the values to 0. You can drop the first loop. If you just want to know whether every char is unique, you don’t even need to increment the value! It’s enough to just check for existence.
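A minimal sketch of the existence-check variant, assuming bash 4 or later and input made of plain letters. The function name `has_repeated_char` and the array name `seen` are hypothetical, not taken from the code above:

```bash
#!/usr/bin/env bash

# Hypothetical helper: succeed (exit status 0) if any character repeats.
has_repeated_char() {
    local input=${1,,} key i
    local -A seen=()
    for ((i = 0; i < ${#input}; i++)); do
        key=${input:i:1}
        # ${seen[$key]+set} expands to "set" only if the key already exists,
        # which means this character appeared earlier in the string.
        [[ -n ${seen[$key]+set} ]] && return 0
        seen["$key"]=1
    done
    return 1
}
```

Usage could look like `has_repeated_char patter && echo "repeated char found"`. Returning on the first repeat also avoids scanning the rest of the string.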

turtle666 · October 8, 2023, 7:27am

Indeed. All my tests are now passing ok, so this has now been resolved. Thanks!