Hi. Given my enormous confusion about the Pig Latin exercise, I wasn’t sure whether to reply in this conversation or start a separate discussion, but since I’m really new to the community, I decided to just respond here.
First of all - I totally support the concerns of @Arehandoro. I’m already almost halfway through the Python track, and I still consider myself a Python rookie (and I’m even worse at linguistics), but this is the first time I’ve run into an exercise with such an unclear task and, more importantly, with tests that cover only part of the possible cases. Together, these make it very hard to choose an approach to solving the actual task.
I guess the problem here comes from one of the few real flaws of English as a spoken language: its phonetics lack a system and contain a lot of exceptions. Add to this the differences between British and American pronunciation, and the constant evolution of the language, which keeps creating new words and borrowing others from other languages along with their own phonetic rules. English is not my mother tongue, but I will still try to give you a few simple examples. I’m sure there are more, and you will easily come up with your own:
- Hourly [ˈaʊə.li]
- Unicorn [ˈjuː.nɪ.kɔːn]
- Europe [ˈjʊə.rəp]
Let’s take the first word, hourly. It starts with the consonant letter ‘h’, but that letter is silent, so the word actually starts with the vowel sound [aʊ] of the letters that follow. This is what the rules say for this case:
Rule 1: If a word begins with a vowel sound, add an “ay” sound to the end of the word. Please note that “xr” and “yt” at the beginning of a word make vowel sounds (e.g. “xray” → “xrayay”, “yttria” → “yttriaay”).
So, according to Rule 1, this word should be translated into Pig Latin as “hourlyay”. Now let’s feed it to the code from the solution in the Dig Deeper part of the exercise:
VOWELS = {"a", "e", "i", "o", "u"}
VOWELS_Y = {"a", "e", "i", "o", "u", "y"}
SPECIALS = {"xr", "yt"}

def translate(text):
    piggyfied = []
    for word in text.split():
        if word[0] in VOWELS or word[0:2] in SPECIALS:
            piggyfied.append(word + "ay")
            continue
        for pos in range(1, len(word)):
            if word[pos] in VOWELS_Y:
                pos += 1 if word[pos] == 'u' and word[pos - 1] == "q" else 0
                piggyfied.append(word[pos:] + word[:pos] + "ay")
                break
    return " ".join(piggyfied)
We get “ourlyhay” because, according to the logic of this code, ‘h’ produces a consonant sound. The other two examples are the opposite situation: they start with a vowel letter that produces the consonant sound [j], yet the code treats them as starting with a vowel and simply appends “ay”.
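For anyone who wants to reproduce this, here is a small script that runs the same Dig Deeper code on all three words (lowercased, since the code only knows lowercase vowels). The function is copied unchanged from the snippet above; only the loop at the bottom is mine.

```python
# Copy of the Dig Deeper solution quoted above, run on the three example words.
VOWELS = {"a", "e", "i", "o", "u"}
VOWELS_Y = {"a", "e", "i", "o", "u", "y"}
SPECIALS = {"xr", "yt"}


def translate(text):
    piggyfied = []
    for word in text.split():
        # Decision is made on letters, not on sounds.
        if word[0] in VOWELS or word[0:2] in SPECIALS:
            piggyfied.append(word + "ay")
            continue
        for pos in range(1, len(word)):
            if word[pos] in VOWELS_Y:
                # Keep "qu" together when moving the consonant cluster.
                pos += 1 if word[pos] == "u" and word[pos - 1] == "q" else 0
                piggyfied.append(word[pos:] + word[:pos] + "ay")
                break
    return " ".join(piggyfied)


for word in ("hourly", "unicorn", "europe"):
    print(word, "->", translate(word))
# hourly -> ourlyhay
# unicorn -> unicornay
# europe -> europeay
```

So the silent ‘h’ gets moved as if it were pronounced, and the two words that start with a [j] sound are handled as if they started with a vowel sound.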
These exceptions are not caught by any tests, and the top published solutions that I tried also miss these cases. I also don’t think that simply updating the tests with these new cases would solve the problem. As I mentioned above, English, despite all its advantages, lacks systematic phonetics, so there will always be a huge number of exceptions. The way a letter sounds in English also depends on where it stands in a word, so simply shifting a letter or letters from the start to the end of the word might change how they sound. The word translated into Pig Latin could then sound completely different from the expected outcome, which complicates testing even further. Add to this the fact that “w” can sometimes combine with a vowel to make a vowel sound, and you’ll realize this exercise should be moved from “easy” skill level to “nightmare”.
This also creates another problem with the whole exercise: because it’s not crystal clear what the task is, and because a huge number of exceptions won’t be caught by the tests, the exercise creates a false incentive for students like myself to simply write code that passes all the tests and move on to the next task, rather than think of a beautiful solution that actually does the job.
Imagine yourself at the start of this exercise. You think to yourself: which approach should I take? Should I create a mapping of all the combinations of English letters to their sounds and group those into vowels and consonants, effectively trying to solve a linguistic problem rather than a coding exercise? Should I import sklearn and train a language model to predict the sounds? Or, because there are thousands of these letter combinations and I’m likely to miss a lot of them, should I instead write simpler code that will just pass the tests? I guess the answer is obvious.
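To make the first option concrete, here is a rough sketch of what a hand-maintained mapping might look like. Everything in it is my own assumption for illustration: the names SILENT_H_WORDS, Y_SOUND_PREFIXES and starts_with_vowel_sound are made up, and the lists are hand-picked and nowhere near complete.

```python
# Hypothetical exception lists, written by hand for illustration only.
SILENT_H_WORDS = {"hour", "hourly", "honest", "honor", "heir"}
# Crude prefix rule for words that start with a [j] sound.
Y_SOUND_PREFIXES = ("uni", "eu", "use")


def starts_with_vowel_sound(word):
    """Guess whether a word begins with a vowel sound (very incomplete)."""
    word = word.lower()
    if word in SILENT_H_WORDS:
        return True  # silent 'h': the first sound is a vowel
    if word.startswith(Y_SOUND_PREFIXES):
        return False  # vowel letter, but [j] sound
    return word[0] in "aeiou"


for word in ("hourly", "unicorn", "europe", "uninformed"):
    print(word, starts_with_vowel_sound(word))
```

It handles my three examples, but the prefix rule already gives the wrong answer for “uninformed”, which starts with a real vowel sound. Every fix like this calls for another exception, which is exactly why this route feels like linguistics homework rather than a coding exercise.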
This was also the first task where I didn’t opt for mentoring. I simply fear that, instead of looking for a smart solution to the exercise together, the mentor and I would end up arguing about our own interpretations of the rules.
Apologies for the long read, and thanks for making it to this point. It seems that a funny made-up children’s language is indeed too hard for grown-ups to understand. If you ask me, I would suggest removing this exercise from the track completely because, with all respect to the people who added it, it is a bit demotivating.