Tweak language and formatting in `protein-translation/description.md`

Thanks.

No, it was to use STOP (all caps, no quotes) uniformly as the interpretation of a codon in isolation, not STOPPED as an interpretation of what the program has done during the processing of a sequence of codons. Hence my suggestion: These are different contexts, and to me, STOPPED gives a moment’s pause due to the conflation it presents, whereas (stopped) gives a clearly distinct meaning appropriate to the context.

We are down to 2 changes, thanks to your agreement about the vertical tables. I had actually asked whether I should open new threads about them, but then we juggled them here anyway. So below is an updated proposal. Hopefully this is manageable. If not, I will open 2 other threads and then come back here with the final proposal.

Updated proposal.

Full summary of changes:

  1. Active voice
  2. Use vertical tables, keeping the original sequence in the preceding paragraph text; the program-run sequence uses (stopped) instead of STOPPED, to distinguish it from STOP and avoid conflating contexts
  3. Remove unexplained reference to ribosomes
  4. Clarify wording following STOP codon example
  5. Standardize STOP codon (prioritizing ease of maintenance, not correctness)
  6. Still mention 64 codons for now, but: cut unclear message about expanding the test suite (only applicable to offline users), cut untrue message about the program working for all codons if it works for one codon, and fix grammatical mistake (use “not all are important” instead of “all are not important”)
  7. Remove redundant “after” (given “subsequent”)

Preview render:

Description

Translate RNA sequences into proteins.

You can break an RNA strand into three-nucleotide sequences called codons and then translate them into amino acids to make a protein.
For example, the RNA strand “AUGUUUUCU” is translated like this:

| Codon | Amino Acid |
| --- | --- |
| AUG | Methionine |
| UUU | Phenylalanine |
| UCU | Serine |

There are also three STOP codons. If you encounter any of these codons, ignore the rest of the sequence — the protein is complete.
For example, AUGUUUUCUUAAAUG contains a STOP codon, so ignore any subsequent codons:

| Codon | Amino Acid |
| --- | --- |
| AUG | Methionine |
| UUU | Phenylalanine |
| UCU | Serine |
| UAA | (stopped) |
| AUG | (stopped) |

In other words, the latter AUG is not translated into another methionine here because it’s preceded by a STOP codon.

There are 64 codons which in turn correspond to 20 amino acids; however, not all codons will be used in this exercise. Below are the codons and resulting amino acids needed for the exercise.

| Codon | Amino Acid |
| --- | --- |
| AUG | Methionine |
| UUU, UUC | Phenylalanine |
| UUA, UUG | Leucine |
| UCU, UCC, UCA, UCG | Serine |
| UAU, UAC | Tyrosine |
| UGU, UGC | Cysteine |
| UGG | Tryptophan |
| UAA, UAG, UGA | STOP |
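
As an aside, here is a minimal sketch of how the table above could be held in Python. The names `ROWS` and `CODON_TABLE` are hypothetical and not part of the exercise text; the rows are copied from the table, and the flat lookup dictionary is derived from them so that the grouping stays easy to compare against the description.

```python
# Hypothetical representation of the codon table from the description.
# Each row pairs a group of codons with the value they map to.
ROWS = [
    (("AUG",), "Methionine"),
    (("UUU", "UUC"), "Phenylalanine"),
    (("UUA", "UUG"), "Leucine"),
    (("UCU", "UCC", "UCA", "UCG"), "Serine"),
    (("UAU", "UAC"), "Tyrosine"),
    (("UGU", "UGC"), "Cysteine"),
    (("UGG",), "Tryptophan"),
    (("UAA", "UAG", "UGA"), "STOP"),
]

# Flatten the grouped rows into a codon -> name lookup.
CODON_TABLE = {codon: name for codons, name in ROWS for codon in codons}
```

Keeping STOP as an ordinary table value is just one possible choice; a separate set of STOP codons would work equally well.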

Learn more about protein translation on Wikipedia.

I’m not a fan of using STOP in one table and (stopped) in the other.

I’m in favor of this, which I see as in conflict with using (stopped).

Followed by a list of 7 changes :slight_smile: I’m not sure how that’s 2 changes.

Since there is a new summary, I’ll reiterate my preference for not having a table interrupt the flow of an example.

This discussion has probably gone on too long now, and has got too complex to ever get to a conclusion, so I’m tip-toeing to try and get it over the line.

I’ve read through the latest draft and the objectives, and this is my suggested version, which I think deals with all the various changes people want. I’ve simplified things down to one table, which I’ve put earlier (three tables made it actively more confusing in my eyes). I’ve simplified the English a bit. I’ve gone through a couple of proof-reading iterations with LLMs.

If this is considered significantly better than what exists currently, and has nothing actively harmful or wrong in it, can I suggest we merge this, and then any further suggestions can be addressed as individual items?

If @codingthat, @IsaacG, @BNAndras and @SleeplessByte are in agreement (as you three seem to still be active in the thread), can I suggest that @codingthat creates a PR with this in? If anyone else would like to object, please do so, but I worry we’re in the weeds a little right now! :slight_smile:

Rendered Preview:


Description

Your job is to translate RNA sequences into proteins.

RNA strands are made up of three-nucleotide sequences called codons. Each codon translates to an amino acid. When joined together, those amino acids make a protein.

In the real world, there are 64 codons, which in turn correspond to 20 amino acids. However, for this exercise, you’ll only use a few of the possible 64. They are listed below:

| Codon | Amino Acid |
| --- | --- |
| AUG | Methionine |
| UUU, UUC | Phenylalanine |
| UUA, UUG | Leucine |
| UCU, UCC, UCA, UCG | Serine |
| UAU, UAC | Tyrosine |
| UGU, UGC | Cysteine |
| UGG | Tryptophan |
| UAA, UAG, UGA | STOP |

For example, the RNA string “AUGUUUUCU” has three codons: “AUG”, “UUU” and “UCU”. These map to Methionine, Phenylalanine, and Serine.

“STOP” Codons

You’ll note from the table above that there are three “STOP” codons. If you encounter any of these codons, ignore the rest of the sequence — the protein is complete.

For example, “AUGUUUUCUUAAAUG” contains a STOP codon (“UAA”). Once we reach that point, we stop processing. We therefore only consider the part before it (i.e. “AUGUUUUCU”), not any further codons after it (i.e. “AUG”).
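
The STOP rule described above can be sketched in Python as follows. This is only an illustration under stated assumptions: `translate` and `CODON_TABLE` are hypothetical names, the strand length is assumed to be a multiple of three, and every codon is assumed to appear in the table (anything else would raise a `KeyError` here).

```python
# Hypothetical lookup built from the table in the description.
CODON_TABLE = {
    "AUG": "Methionine",
    "UUU": "Phenylalanine", "UUC": "Phenylalanine",
    "UUA": "Leucine", "UUG": "Leucine",
    "UCU": "Serine", "UCC": "Serine", "UCA": "Serine", "UCG": "Serine",
    "UAU": "Tyrosine", "UAC": "Tyrosine",
    "UGU": "Cysteine", "UGC": "Cysteine",
    "UGG": "Tryptophan",
    "UAA": "STOP", "UAG": "STOP", "UGA": "STOP",
}


def translate(strand):
    """Return the amino acids for a strand, stopping at the first STOP codon."""
    amino_acids = []
    # Walk the strand three nucleotides at a time.
    for start in range(0, len(strand), 3):
        codon = strand[start:start + 3]
        name = CODON_TABLE[codon]  # assumption: codon is always in the table
        if name == "STOP":
            break  # the protein is complete; ignore the rest of the strand
        amino_acids.append(name)
    return amino_acids
```

Under those assumptions, `translate("AUGUUUUCUUAAAUG")` gives the same three amino acids as `translate("AUGUUUUCU")`, because the trailing AUG comes after UAA and is never looked at.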

Sounds good to me,

Re: consistent STOP, should the Stop Codons header be STOP Codons? (Done)

LGTM

Looks GTM.

Looks good to me.

Thanks @iHiD for the cleanup and sidestepping. One minor request is to re-standardize a term, since the proposed version now has three ways of referring to the stop codons: “STOP” codons, “STOP codons,” and STOP codon (without quotes). I’d love to make all three simply STOP codon(s) (without quotes), as agreed upon earlier. If not, I am otherwise in favour of PRing your version and willing to do so. Thanks again.

Given that the proposed change is all signed off on, how about just pushing that change as is, then circling back for further refinements? If you make a PR using the approved change, you should be able to get it merged pretty fast. Once that’s done, any future changes should be small and simple.

None of the 65 tracks (including Cairo and Rust) implement additional codons at the moment.

I’ve edited the first bit of text in the section to be consistent with the title.

I think the STOP should be in quotes because otherwise it looks like someone is shouting “STOP [the] codons” in the same way someone might shout “STOP [the] thief!”. It needs to be clear that it’s a term we’re introducing. Once it’s been defined, it doesn’t need the quotes any more :slight_smile:

LGTM! :slight_smile: