Tweak language and formatting in `protein-translation/description.md`

I appreciate your comment, but I’m confused: won’t this just be a new thread with the same discussion about the same proposed changes? Or do you mean I should have a separate thread for each change? Or one big thread, but discussing the changes sequentially?

Sorry, I forgot to mention “by the ribosome.” I removed that too since the word “ribosome” isn’t used anywhere else in the description, so this seems confusing at best for those who aren’t familiar with it. “By the program” would be more consistent with the rest of the description, but it’s also redundant. Hence the removal.

The tests don’t check for a STOP value. It’s up to the student how they want to handle it as long as translation stops and the sequence is returned. That’s the behavior the tests related to stop codons are checking.
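To make that concrete, here is a minimal Python sketch of the behavior. The function name `proteins` and the trimmed three-entry codon table are assumptions for illustration, not any track's actual API. The point is only that a stop codon ends the loop and never appears in the result.

```python
# Hypothetical sketch: the names and the trimmed codon table are illustrative only.
STOP_CODONS = {"UAA", "UAG", "UGA"}
CODON_TO_AMINO_ACID = {
    "AUG": "Methionine",
    "UUU": "Phenylalanine",
    "UCU": "Serine",
}


def proteins(strand):
    """Translate codon by codon, ending at the first stop codon."""
    result = []
    for i in range(0, len(strand), 3):
        codon = strand[i:i + 3]
        if codon in STOP_CODONS:
            break  # translation ends here; no "STOP" marker is added
        result.append(CODON_TO_AMINO_ACID[codon])
    return result
```

With this sketch, `proteins("AUGUUUUCUUAAAUG")` returns the three amino acids before `UAA`, and the trailing `AUG` is never translated.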

I was instead referring to the test descriptions in the canonical test data, since it was unclear whether we should update those as well so that “STOP codon” becomes “stop codon”. Generally, the descriptions get reused in a track’s test suite in one form or another, so they’d also be public-facing. However, we can’t just update the descriptions of existing tests; we’d need new tests that reimplement the old ones with updated descriptions. Then maintainers know they have tests to update and can do so. Not all tracks have generators, so that’ll take up maintainer time.

Oh — that must be specific to Elixir:

    @tag :pending
    test "identifies stop codons" do
      assert ProteinTranslation.of_codon("UAA") == {:ok, "STOP"}
      assert ProteinTranslation.of_codon("UAG") == {:ok, "STOP"}
      assert ProteinTranslation.of_codon("UGA") == {:ok, "STOP"}
    end

In fact, in the Elixir test descriptions it’s already “stop codon” almost everywhere, except in one.

Anyway thanks @BNAndras I see your point. I can remove that part of the PR (or rather, change the other instance of “stop codon” in the description to “STOP codon”) if that’s the final answer. :heavy_check_mark:

Rather than using PRs to convey intended changes, can you update this forum thread with the latest proposed change(s) for discussion/approval?

Sure, sorry, I didn’t mean I’d put in a new PR until everything was agreed. Here’s where I believe we’re at:

  • Remove extraneous => for consistency
  • Change 'STOP' codon and stop codon to both be STOP codon (caps but no quotes) to avoid extra test maintenance.
  • Remove “(by the ribosome)” for consistency and accessibility.
  • Remove redundant “after” (given “subsequent”)
  • Remove the empty line just before the redundant “after” because it heavily depends on the previous line, so makes more sense in the same ¶ rather than as a new idea.

^^ @siebenschlaefer sorry, I missed explaining that last one too earlier. I believe what probably happened is that it used to say “after a STOP codon.” But I think what I propose (removing “after” and also the newline) is both more concise and more readable.

The fat arrow sequence section is difficult for me to parse mentally since the info after each => rephrases what the previous transformation is. So I’m going forward in the sequence when I hit the => and expecting the next transformation, but instead I’m seeing the previous transformation rephrased.

I’d suggest we show the steps like this:

RNA => Three-letter codons => proteins
"AUGUUUUCU" => "AUG", "UUU", "UCU" => "Methionine", "Phenylalanine", "Serine"
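
In code, those two arrows might look roughly like the Python sketch below. The helper names `split_into_codons` and `translate` are hypothetical, and the lookup table only covers the three codons in the example.

```python
# Hypothetical helpers mirroring the two arrows above; not taken from any track.
AMINO_ACIDS = {
    "AUG": "Methionine",
    "UUU": "Phenylalanine",
    "UCU": "Serine",
}


def split_into_codons(rna):
    """First arrow: cut the RNA string into three-letter codons."""
    return [rna[i:i + 3] for i in range(0, len(rna), 3)]


def translate(codons):
    """Second arrow: look up the name for each codon."""
    return [AMINO_ACIDS[codon] for codon in codons]
```

Calling `translate(split_into_codons("AUGUUUUCU"))` walks the sequence left to right, one transformation per step.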

There are 64 codons which in turn correspond to 20 amino acids; however, all of the codon sequences and resulting amino acids are not important in this exercise. If it works for one codon, the program should work for all of them. However, feel free to expand the list in the test suite to include them all.

That should read “not all codon sequences and resulting amino acids are important.” However, we should rephrase this, since only CLI users would be in a position to run modified tests locally. That whole section can be shortened to:

There are 64 codons which in turn correspond to 20 amino acids; however, not all codons will be used in this exercise.

We later provide a table of the relevant codons so perhaps we don’t even need this line.
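
As a quick sanity check on the figure of 64: it is just the four RNA nucleotides taken three at a time (4 × 4 × 4). A small Python sketch, with variable names chosen only for illustration:

```python
from itertools import product

# The four RNA nucleotides; every codon is an ordered triple of them.
NUCLEOTIDES = "ACGU"

# Enumerate every possible three-letter codon.
all_codons = ["".join(triple) for triple in product(NUCLEOTIDES, repeat=3)]

# The three stop codons named in the exercise are among them.
stop_codons = [codon for codon in all_codons if codon in {"UAA", "UAG", "UGA"}]
```

Here `len(all_codons)` is 64 and `len(stop_codons)` is 3, so the exercise's table covers only a small subset of the full list.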

Just as an FYI, I did some work on this here: website/bootcamp_content/projects/string-puzzles/exercises/protein-translation/introduction.md at 00934f71f43f8efb5cb28d256bf8b46c99b218a7 · exercism/website · GitHub

(I don’t have brainspace to engage with the discussion, but maybe there’s something there that’s useful?)

Thanks @BNAndras, I agree with both those proposals, except I now see that “proteins” should be “amino acids” for the third step. Could I make it into a horizontal table to align the corresponding elements? (That way it’d still fit a standard 80-char terminal, even for people reading unrendered Markdown.)

| RNA         | Three-letter codons | Amino acids                             |
| ----------- | ------------------- | --------------------------------------- |
| “AUGUUUUCU” | “AUG”, “UUU”, “UCU” | “Methionine”, “Phenylalanine”, “Serine” |

And thanks @iHiD, that’s helpful, I’m seeing ideas we can pull from your changes and make consistent (like active voice):

RNA can be broken into three-nucleotide sequences called codons, and then translated to a protein like so:

becomes

You can break an RNA strand into three-nucleotide sequences called codons and then translate them into amino acids to make a protein like so:

and

There are also three terminating codons (also known as ‘STOP’ codons); if any of these codons are encountered (by the ribosome), all translation ends and the protein is terminated.

All subsequent codons after are ignored, like this:

becomes

There are also three STOP codons. If you encounter any of these codons, ignore the rest of the sequence — the protein is complete. For example, UAA is a STOP codon, so ignore any subsequent codons:
… (similar table here) …
(Note that the latter AUG is not translated into another methionine.)

I haven’t been following the discussion, but just in case it’s helpful, here’s the past PR that updated the instructions

Thanks, good point. I see "property": "proteins" should probably be fixed in the canonical data too, but since we were avoiding changes to that for “stop codon,” I’m not sure. @BNAndras?

I’m not sure about how changes to properties might play out, but I think we should avoid updating the canonical data if possible and focus on the instructions.

What @BNAndras said. Updating the properties may mean reimplementing all the tests, which seems like a lot.

OK, given that, is everyone OK if I PR based on my summary above? Tweak language and formatting in `protein-translation/description.md` - #20 by codingthat

I think you should mention that stop codons exist before explaining what to do when encountering them.

Good catch, my bad.

Also my link wasn’t the best, I meant Tweak language and formatting in `protein-translation/description.md` - #16 by codingthat plus whatever supersedes it in Tweak language and formatting in `protein-translation/description.md` - #20 by codingthat .

Seems reasonable to me. Something like this?

- RNA: `"AUGUUUUCU"` => translates to
- Codons: `"AUG", "UUU", "UCU"`
- => which become a protein with the following sequence =>
- Protein: `"Methionine", "Phenylalanine", "Serine"`
+ RNA `"AUGUUUUCU"` translates to codons `"AUG", "UUU", "UCU"`.
+ That become a protein with the sequence `"Methionine", "Phenylalanine", "Serine"`.
- All subsequent codons after are ignored, like this:
- RNA: `"AUGUUUUCUUAAAUG"` =>
- Codons: `"AUG", "UUU", "UCU", "UAA", "AUG"` =>
- Protein: `"Methionine", "Phenylalanine", "Serine"`
+ All subsequent codons are ignored.
+ For example, RNA `"AUGUUUUCUUAAAUG"` translates to codons `"AUG", "UUU", "UCU"`.
+ That become a protein with the sequence `"Methionine", "Phenylalanine", "Serine"`.

I was thinking a table would be more readable: Tweak language and formatting in `protein-translation/description.md` - #20 by codingthat What do you think?

Got it. That looks good to me.
