Clarify Exercise Methodology Required

Thanks for taking the time to reply to my suggestion.

I also take the same approach, but that doesn’t solve an exercise; it only gives you the syntax and right way to use a language.

Let’s try with an example. In the Pig Latin exercise on the Python track, which is marked as ‘easy’, the instructions clearly state the rules of the game. To me, and I guess to the majority of people learning a language, those rules indicate that the exercise requires string manipulation and a simple if/elif statement.

However, in the community solutions, the most starred one consists of importing the re module, using the map data type and lambda. All based on regex. I might be wrong, but I wouldn’t consider that solution on the ‘easy’ side.

Also, that same solution does not have any comments on how the person reached the conclusion that solving the problem in that way was better/more efficient/readable/etc.

Of course, now it would be up to me, or any other person, to google what that data type is, how to use regex in Python, and what the re module does. But I do believe that would defeat the purpose of following a ‘course’ or exercise-based learning approach in the first place.

Perhaps people are way more invested than me on this, but I would bet the churn of users experiencing that is high.

I understand the work required to implement it would need to be done by volunteers, and that is a lot to ask and work to do. I was only replying to the thread’s question :smiley:

The stars may not mean “best solution” and may mean different things to different people, though. I star a solution because it is interesting, for whatever reason, but definitely not because it is the “best solution”.

Interesting may mean esoteric solutions, clever solutions, approaches that are not obvious, or any number of other things.

This is the part that is now missing, the conversation through the iterations with a mentor.

This can be gained, though, individually, through mentorship, or through the automated feedback and dig deeper functionality.

But, again, we are assuming, I think, that the stars are a reflection of the things mentioned (better, more efficient (and in what way?), readable), when that is not necessarily the case.

Also, the stars may not be a reflection of “the easy solution”. And perhaps this is a good example of this being not the case, as you have determined.

Hi. Given my enormous confusion about the Pig Latin exercise, I’ve been thinking if I should reply in this conversation or instead create a separate discussion, but I decided that I’m really new to the community so I would start by just responding here.

First of all - I totally support the concerns of @Arehandoro. I’m already almost halfway through the Python track, and I still consider myself a Python rookie (and I’m even worse at linguistics), but this is the first time I’ve run into an exercise that has such an unclear task and, more importantly, tests that cover only part of the possible cases, which together make it very hard to choose an approach to solving the actual task.

I guess the problem here comes from one of the few real flaws of English as a spoken language: its phonetics lack a system and contain a lot of exceptions. Add to this certain differences between British and American pronunciation and the constant evolution of the language, which results in new words being created and others being borrowed from other languages, with their own phonetic rules. With English not being my mother tongue, I will still try to give you a few simple examples, but I’m sure there are more of them and you will easily come up with your own variants:

  • Hourly [ˈaʊə.li]
  • Unicorn [ˈjuː.nɪ.kɔːn]
  • Europe [ˈjʊə.rəp]

Let’s take the first word hourly which starts with a consonant letter ‘h’ that is silent so the word actually starts with a vowel sound [ˈa] of the subsequent letter ‘o’. This is what the rules say for this case:

Rule 1: If a word begins with a vowel sound, add an “ay” sound to the end of the word. Please note that “xr” and “yt” at the beginning of a word make vowel sounds (e.g. “xray” → “xrayay”, “yttria” → “yttriaay”).

This word should be translated into Pig Latin as “hourlyay”. If we try to feed it to the code from the solution in the Dig Deeper part of the exercise:

VOWELS = {"a", "e", "i", "o", "u"}
VOWELS_Y = {"a", "e", "i", "o", "u", "y"}
SPECIALS = {"xr", "yt"}


def translate(text):
    piggyfied = []

    for word in text.split():
        if word[0] in VOWELS or word[0:2] in SPECIALS:
            piggyfied.append(word + "ay")
            continue

        for pos in range(1, len(word)):
            if word[pos] in VOWELS_Y:
                pos += 1 if word[pos] == 'u' and word[pos - 1] == "q" else 0
                piggyfied.append(word[pos:] + word[:pos] + "ay")
                break

    return " ".join(piggyfied)

We get “ourlyhay” because, according to the logic of this code, ‘h’ produces a consonant sound. The other two examples are the opposite of the described situation, as they start with a vowel letter producing a consonant sound [j].
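
A minimal sketch of the mismatch described above. It assumes the same first-letter check as the quoted solution; the dictionary of "starts with a vowel sound" flags is hand-annotated from the IPA transcriptions listed earlier, and the names `STARTS_WITH_VOWEL_SOUND` and `first_letter_is_vowel` are hypothetical, chosen only for this illustration.

```python
# Letter-based check, in the spirit of `word[0] in VOWELS` from the solution above.
VOWELS = {"a", "e", "i", "o", "u"}

# Hand-annotated from the IPA transcriptions given earlier (an assumption of
# this sketch, not data from the exercise): True means "starts with a vowel sound".
STARTS_WITH_VOWEL_SOUND = {
    "hourly": True,    # silent 'h', so the first sound is a vowel
    "unicorn": False,  # first sound is the consonant [j]
    "europe": False,   # first sound is the consonant [j]
}


def first_letter_is_vowel(word):
    """Classify a word by its first letter only, ignoring pronunciation."""
    return word[0].lower() in VOWELS


# Collect every word where the letter-based answer disagrees with the sound.
mismatches = [
    word
    for word, vowel_sound in STARTS_WITH_VOWEL_SOUND.items()
    if first_letter_is_vowel(word) != vowel_sound
]
print(mismatches)
```

All three sample words end up in `mismatches`, which is the gap between "sounds" in the instructions and "letters" in the code.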

These exceptions are not caught by any tests, and the top published solutions that I tried also miss these cases. I also don’t think that simply updating the tests with these new cases would solve the problem. As I mentioned above, English, despite all its advantages, lacks systematic phonetics, so there will always be a huge number of exceptions. The way a letter sounds in English also depends on where it stands in a word, so simply shifting a letter or letters from the start to the end of the word might affect how they sound, making the word translated into Pig Latin sound completely different from the expected outcome and thus further complicating the testing process. Add to this the fact that “w” can sometimes work with another vowel to make a vowel sound, and you’ll realize this exercise should be moved from “easy” skill level to “nightmare”.

This also creates another problem with the whole exercise - because it’s not crystal clear what the task is, and because of a huge number of exceptions that won’t be caught by tests, the exercise creates a false incentive for the students like myself to simply write the code that passes all the tests and move on to the next task, rather than think of a beautiful solution that should do the job.

Imagine yourself at the start of this exercise. You think to yourself, which approach should I take? Should I create a mapping of all the combinations of English letters to their sounds and group those into vowels/consonants, effectively trying to solve a linguistic problem rather than a coding exercise? Should I import sklearn and train a language model to predict the sounds? Or, because there are thousands of these letter combinations and I’m likely to miss a lot of them, try instead to write a simpler code that will just pass the tests? I guess the answer is obvious.

This was also the first task where I didn’t opt for mentoring. I simply fear that instead of trying to find a smart solution for the exercise together with the mentor, we are going to be arguing with them about our own interpretation of the rules.

Apologies for a long reading and thanks for making it to this point. It seems like a funny made-up children’s language is indeed too hard for the grown-ups to understand. If you ask me, I would suggest completely removing this exercise from the track as, with all respect to the people who added it, it is a bit demotivating.

@ErikSchierboom Thoughts?

Reading this, it seems to me that two main issues are mentioned:

  1. The instructions are not clear
  2. The tests don’t catch all cases

I’m 100% on board with point number 1. The terminology used (vowels, consonants) should absolutely be explained better, preferably with lots of examples. See problem-specifications/exercises/pig-latin/description.md at main · exercism/problem-specifications · GitHub for the current instructions.

As for the second point, it is hard to catch all test cases, especially with exceptions. I’m personally fine with the exercise not dealing with all edge cases, as our goal is not to implement a 100% correct pig latin algorithm. That said, problem-specifications/exercises/pig-latin/canonical-data.json at main · exercism/problem-specifications · GitHub is open for extension.

I think this is (way) too harsh, as the exercise has been successfully solved by thousands of students. I’m not saying that the exercise shouldn’t be improved (it should!), but I don’t think removing it is the solution.

Some links to sites I found when last googling issues with this exercise (in case they help): [Pig Latin]: 'ch' in 'chair' is not consonant cluster - #5 by BethanyG

Thanks for getting back @ErikSchierboom. By no means do I want to push for complete removal of the exercise; let me just make sure I made myself clear on the reasons for this idea and why I think the problem with this exercise is actually bigger.

When you say that the exercise has been successfully solved by thousands of students, I can’t associate myself with these thousands, although I, technically speaking, solved it.

My last attempts to submit a correct solution to the exercise were focused on coming up with something that would just pass the tests so I could move on to the next exercise, rather than on building clean code that would solve the task - but I never behaved that way with other exercises. I guess I’m just the only student so far who willingly confesses to this behaviour.

I hope the others didn’t follow that path, but, looking at a random 10 published solutions for that exercise, I can say that the majority of them do exactly this - they pass the tests but they don’t use the idea of vowel/consonant letter pronunciations, which, in my opinion, is the key idea of the exercise.

So, my main concern about the exercise - again, with all my respect to the people who created it - is that instead of teaching students to create programmatic solutions for real-world problems, it teaches them to build code that merely passes the tests. If such people then apply for a position at SpaceX, I personally wouldn’t like to fly a rocket programmed by them :slight_smile: . I also hope that’s not the mindset that Exercism is trying to teach the students.

Anyway - I’m with you, and if you choose to keep further improving the exercise, I’m more than happy to come up with additional examples of vowel letter / consonant sound and vice versa. Here’s one new by the way:

  • Xbox [ɛ́ksbɔks]

Hope the information above helps. Thank you.

I think it’s totally fine that people’s solutions “just” pass the tests. People don’t need to know everything about the domain in order to solve the exercise. It’s what makes practice exercises useful to, well, practice one’s programming skills. You’re approaching things more from the perspective of becoming a domain expert, which is fine, but it’s not what our exercises are about. They are about helping one become fluent in a language, which does not require becoming a domain expert. That’s not to say that exercises can’t help with that, they can, but it’s not their core focus. Oftentimes, exercises are adaptations of real-world programs, but in a condensed or simplified form.

From this point forward, I’d like us to focus on only two things:

  1. How do we improve the instructions
  2. What extra test cases would make sense

Since I’ve solved this in multiple languages (eleven so far), I do not get the warm fuzzies at the idea of new test cases breaking all of my existing solutions. That’s not what I came to Exercism for.

I think it’s important to keep in mind that we’re trying to give a programming exercise to learn from, not a business activity that is mission critical. It’s supposed to be a fun exercise, not a comprehensive “pig latin” translation service. And it’s supposed to be focussed on exploring a programming language, not learning an algorithm.

As has been said, there are going to be lots of weird exceptions and such in the concept of “pig latin”, but we’re not really fussed about that, as long as things are clear to the student and that that doesn’t get in the way of things. Currently, it seems that’s not happening.

  • So firstly maybe we need to make that clearer in the instructions - that you’re aiming to get a minimal working version, not a comprehensive one.
  • Secondly, maybe we should be very explicit that people should use the tests to drive this exercise, not the description. The instructions are there to introduce people to the idea of the exercise - the tests are the “spec” in this case.

For me, the best approach in solving it is to get a passing version of the code, then refactor it to be nice. I feel like we can encourage that behaviour through the description, and in doing so, release a lot of the tension that’s there when solving the exercises.

Seems like I once again didn’t express myself clearly enough.

The reason I decided to speak up is exactly that the task did not appear very clear for me as a student, and the more I tried to solve it, the less fun for me personally it was.

Anyway - as @ErikSchierboom suggested, let’s talk about how we could possibly improve the instructions or what extra tests would make sense.

I’ve been searching online for additional cases when a starting vowel letter produces a consonant sound and vice versa. These were the examples I already suggested earlier:

And here are some more:

  • honest [ˈɒnɪst]

  • heir [eə]

  • honour [ˈɑnɚ]
    (and their derivatives, such as honesty, honourable, honorary)

  • union [ˈjuːniːən]

  • utility [juːˈtɪlətiː]

  • eucalyptus [ˌjuːkəˈlɪptəs]
    (and their derivatives and cognates, like united, unique, utilitarian, eucaryotic etc.)

I guess we can formulate certain rules from it:

  • consonant ‘h’ followed by vowels can sometimes (but not in all cases, like ‘horror’, ‘hard’, ‘hair’) be silent, and as a result the word would start with a vowel sound. Algorithmising this is, in my opinion, close to impossible; you have to remember all the exceptions to use them in your speech correctly.
  • ‘eu’ and ‘u’ can sometimes (but, again, not in all cases, like ‘until’, ‘ultimate’, ‘uneven’) start with a consonant [j] sound. No idea how to algorithmise that; in common speech I guess we’re again supposed to memorize all the words.
  • x followed by a consonant might actually be an x followed by an omitted dash, like in x-ray, x-box, x-factor etc., and that’s what normally makes it produce a first vowel sound.
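
To illustrate why these rules resist automation, here is a rough sketch of what a lookup-based attempt might look like. Everything in it is an assumption made for illustration: the prefix lists are hypothetical, hand-picked from the examples in this post, deliberately incomplete, and not part of the exercise or its tests.

```python
# Hypothetical, deliberately incomplete exception lists built from the
# examples in this thread. A real list would need every exceptional word.
SILENT_H_PREFIXES = ("hour", "honest", "heir", "honour", "honor")
J_SOUND_PREFIXES = ("uni", "uti", "eu")
VOWELS = "aeiou"


def starts_with_vowel_sound(word):
    """Guess whether a word starts with a vowel sound, using prefix lookups."""
    word = word.lower()
    if word.startswith(SILENT_H_PREFIXES):
        return True   # silent 'h': the first sound is a vowel
    if word.startswith(J_SOUND_PREFIXES):
        return False  # vowel letter pronounced with a leading [j]
    return word[0] in VOWELS


for sample in ("hourly", "horror", "unicorn", "until", "uninformed"):
    print(sample, starts_with_vowel_sound(sample))
```

The lookup handles the listed examples, but it also returns `False` for "uninformed", which starts with a plain vowel sound. Each fix of this kind tends to need its own exceptions, which is the memorisation problem described in the list above.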

I’d be happy to see which of these exceptions you’d also consider meaningful so that the task and the tests could possibly be updated. Thank you.

Thanks for the examples. I’m reading “sometimes”, “sometimes” and “might”, which to me seems to imply that if we were to add these, we should also be adding exceptions too. I personally would not be in favor of doing so.

Thanks for the suggestions. The thing that I’m unclear on here is why adding more rules makes the exercise clearer/more fun. Is the logic that by defining more of pig-latin, it all makes more sense? Or something else I’m not seeing?

It seems to me that (at least as a starting point) what we should be doing here is more along the lines of clarifying the instructions so that the existing tests make sense (you may be doing that and I’m misunderstanding). I’m very wary of adding new rules that break the solutions that >16,000 people have submitted to this, or else we’ll get huge pushback (e.g. Bob’s comment above).

Would you be able to articulate exactly what in the instructions doesn’t currently match with the tests? Then we can amend/add-to the instructions as our starting point?

I think the key thing that would help me get clarity would be to understand the objective we’re aiming for here. Is your objective to make the instructions and tests align, or to make the exercise deeper, or something else? I think starting there would help :slight_smile:

Thanks for getting back @ErikSchierboom and @iHiD. To address this concern:

By no means am I trying to add more rules. What I was trying to convey above was the degree to which the sounds (a concept found a lot in the instructions) differ from the letters or their combinations (concepts with which we have to work in the code and which the tests rely on), and the impossibility of creating a firm mapping between the two.

As it is impossible to predict a sound type (vowel or consonant) based on the letter combinations, I understand that further improving the exercise is a hard task of finding a balance between the strictness of the tests and the clarity of the instructions. I’m assuming that this is the challenging situation the authors and the maintainers of the exercise have found themselves in:

  • increasing the instructions precision can infinitely increase the difficulty level of the exercise and might involve adding tests which are going to break the existing solutions
  • reducing the instructions precision would make them less relevant to the actual task and more confusing for the new students

Is that a fair assumption?

P.S. I can also feel that I may have appeared boring with my concerns at this point - but here is another reference to a person’s experience with this task on Twitter. The guy got so frustrated with the Pig Latin exercise that he stopped coding for a month.

OK, so I think maybe what I find confusing here is that I found this exercise quite straightforward and have what I think is a very simple and readable solution that passes the tests. I just followed through each test getting it green and got a solution that I then cleaned up and refactored.

def translate(word)
  if word.start_with?(*VOWEL_SOUNDS)
    word + "ay"
  else
    first_sound = word[FIRST_CONSONANT_SOUND]
    word[first_sound.length..-1] + first_sound + "ay"
  end
end

VOWEL_SOUNDS = %w{a e i o u xr yt}
CONSONANT_SOUNDS = %w{ch squ qu thr th sch yt rh} + ('a'..'z').to_a - VOWEL_SOUNDS
# Group the alternatives so that ^ anchors every one of them, not only the first.
FIRST_CONSONANT_SOUND = /^(?:#{CONSONANT_SOUNDS.join("|")})/

So it seems that the complexity is appearing when not trying to solve the tests, but trying to deal with other possible constraints of pig latin as a language.

You say in your description:

I guess I’m just the only student so far who willingly confesses to this behaviour. I hope the others didn’t follow that path

But that’s how the exercise is supposed to be solved, using a Shameless Green (see this for a description if you’re not familiar with the term) approach followed by refactoring to something nice and readable.

That’s also how lots of people (myself included) code using TDD. We write tests, write code that makes them pass, then refactor them to be nice.

Maybe the solution here is just to emphasise that in the task. We say “Pig Latin could be very complex based on accents, regionalities etc, but the aim isn’t to create a full language implementation - it’s to practice solving an exercise in this language using Test Driven Development. So focus on the test cases, rather than trying to build out a whole language.” And then maybe we amend the four points given to explain that they’re examples of things found in the tests?
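
As a rough illustration of "the tests are the spec", here is a small Python sketch. The cases are the rule examples quoted in this thread; the `CASES` table and this particular `translate` are hypothetical and written only for illustration, in the spirit of a first passing version that would then be refactored. It handles single words only and is not the exercise's reference solution.

```python
# (input, expected) pairs taken from the rule examples quoted in this thread.
CASES = [
    ("xray", "xrayay"),
    ("yttria", "yttriaay"),
    ("chair", "airchay"),
]

VOWELS = "aeiou"


def translate(word):
    """Simplest letter-based code that turns the cases above green."""
    # Vowel letter, or one of the two special prefixes: just append "ay".
    if word[0] in VOWELS or word[:2] in ("xr", "yt"):
        return word + "ay"
    # Otherwise move everything before the first vowel letter to the end.
    for pos, letter in enumerate(word):
        if letter in VOWELS:
            return word[pos:] + word[:pos] + "ay"
    return word + "ay"  # no vowel letter at all: leave the word in place


# Drive the work from the table: add a case, watch it fail, make it pass.
for text, expected in CASES:
    assert translate(text) == expected, (text, translate(text), expected)
```

Adding a new row to `CASES` before touching `translate` keeps the scope limited to what the tests actually ask for.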

@iHiD your solution indeed looks very nice and clean, but, as mentioned, the reason for my confusion (and I guess other people’s as well) was that the actual task speaks about a completely different concept, which is vowel/consonant sounds, not letters.

This sentence in Rule 1 also added to my confusion:

Please note that “xr” and “yt” at the beginning of a word make vowel sounds (e.g. “xray” → “xrayay”, “yttria” → “yttriaay”).

As we’ve seen, these are just two out of numerous examples where the letters ‘x’ and ‘y’ can be combined with other consonants at the start of a word to produce a starting vowel sound, so I guess it would make sense either to list all of these rules or to skip these complications completely.

That said, if adding more rules is not something that you’re ready to do, perhaps it would be better to remove some of them? I.e.:

  • Rule 1: If a word begins with a vowel sound, add an “ay” sound to the end of the word (e.g. “eat” → “eatay”, “are” → “areay”).
  • Rule 2: If a word begins with a consonant sound, move it to the end of the word and then add an “ay” sound to the end of the word. Consonant sounds can be made up of multiple consonants, such as the “ch” in “chair” or “st” in “stand” (e.g. “chair” → “airchay”).
    There are a number of additional rules for edge cases, and there are regional variants too, but the task is simply to implement the two rules listed above. Check the tests for all the details.

How about such a wording?

What’s wrong with the current wording and rule set? I’m not sure what reducing the rule set accomplishes.

There is a broader problem description, which is meant as a broad justification, goal or context. There are a series of rules which give the shape of what actually needs implementing. There are unit tests which embody the detailed requirements. The same setup applies to most of the exercises and many TDD programming tasks in general.

As I’ve been mentioning in every message here - it appears a bit confusing :slight_smile:

Which part? The fact that those three differ? The fact that the tests don’t perfectly represent the goal? This is true of a large number of exercises and programming in general.

I’m just going to quote one of the previous messages:

I don’t think anyone’s asking here to make the tests perfectly represent the goal. What I’m suggesting is to make it a little less confusing so that it doesn’t frustrate people and does not force them to quit coding for a month.