Tweak language and formatting in `protein-translation/description.md`

BNAndras · April 28, 2025, 5:32pm

I want to unravel this since there are two separate issues that were conflated by proximity.

The sentence “However, feel free to expand the list in the test suite to include them all.” should be dropped because it implies a student could edit the test suite in the browser. However, outside of D, I’m not aware of other tracks where one could do that in the browser. If we’re just letting folks know they can edit the test suite locally, why shouldn’t we do this for all the other exercises?

I then suggested we could also remove the preceding text because the table later on introduces the codons used in the canonical test data. So this information both comes a little early and might cause confusion if a track chooses to add additional codons (Rust and Cairo both did for a while). I’m not aware of other tracks that add additional codons, so that was more of a theoretical concern.

codingthat · April 30, 2025, 3:41pm

I agree with everything you said here. In addition, “If it works for one codon, the program should work for all of them” is both dubious (you can easily make a program that works for just one codon…and I have actually seen solutions like this posted on Exercism, that just return whatever the test suite happens to look for, without a general approach) and a bit unclear as to what knowledge or insight it offers the student.

So I’m happy with cutting it.
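
To illustrate the point about single-codon solutions, here is a deliberately bad sketch in Python. The function name `translate_hardcoded` is hypothetical and not taken from any real submission; it only shows how a program can pass a one-codon test without a general approach:

```python
def translate_hardcoded(rna):
    """Hypothetical anti-example: handles exactly one input and nothing else."""
    # Returns the expected answer for the single case a test might check.
    if rna == "AUG":
        return ["Methionine"]
    # Every other strand, including valid ones like "UUU", gets no translation.
    return []
```

It works for one codon and fails for the rest, so “works for one codon” tells you nothing about the others.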

As for the parentheses around the note, I prefer them because it should, theoretically, already be clear without that note, just from the table and its preceding sentence. But in case a student is having trouble connecting the dots, here’s a note — but it’s not, strictly speaking, new information. Maybe a better way to say it would be:

(In other words, the latter AUG is not translated into another methionine here because it’s preceded by a STOP codon.)

But I suppose with that wording I’m OK dropping the parentheses, because the “in other words” is doing something similar.

SleeplessByte · May 8, 2025, 12:51am

Great. Perhaps you could summarise the changes now and we can get some consensus?

codingthat · May 8, 2025, 10:49am

I appreciate the nudge, but which version of the changes? The last two messages are still awaiting replies: Does @IsaacG agree with @BNAndras ’ cut in the end? Does anyone have thoughts on my parentheses and/or prefer the “in other words” route? I don’t want to presume that my opinion is the proposed change in creating a diff or summary (already learned my lesson there, haha). (Unless that’s what everyone would like this time. ;) )

IsaacG · May 8, 2025, 12:47pm

I agree the bit about there being 64 codons could be improved. I don’t think it should be dropped entirely; I think there is value in mentioning that the list is incomplete and there are more codons. I would suggest leaving that alone for now to limit the scope of the change/discussion and optionally discussing that as a second change once the originally proposed changes have been merged.

Feel free to propose/summarize a change here on the forum. It helps move things along. The smaller/simpler the change, the easier it is to reach a consensus. If you propose something that people disagree with, we can discuss it and/or update the proposal. If you propose something that sounds good, we can proceed to a PR.

codingthat · May 9, 2025, 9:52am

OK, here’s what I propose as the final diff. I added back in the 64 codons part with the improvements that had been proposed before a full cut had been proposed. I’ve opened a separate thread about the full cut.

Full summary of changes:

Active voice
Use tables
Remove unexplained reference to ribosomes
Clarify wording following STOP codon example
Standardize STOP codon (prioritizing ease of maintenance, not correctness)
Still mention 64 codons for now, but: cut unclear message about expanding the test suite (only applicable to offline users), cut untrue message about the program working for all codons if it works for one codon, and fix grammatical mistake (use “not all are important” instead of “all are not important”)
Remove redundant “after” (given “subsequent”)

Preview render:

Description

Translate RNA sequences into proteins.

You can break an RNA strand into three-nucleotide sequences called codons and then translate them into amino acids to make a protein like so:

| RNA | Three-letter codons | Amino acids |
| --- | ------------------- | ----------- |
| “AUGUUUUCU” | “AUG”, “UUU”, “UCU” | “Methionine”, “Phenylalanine”, “Serine” |

There are also three STOP codons. If you encounter any of these codons, ignore the rest of the sequence — the protein is complete. For example, UAA is a STOP codon, so ignore any subsequent codons:

| RNA | Three-letter codons | Amino acids |
| --- | ------------------- | ----------- |
| “AUGUUUUCUUAAAUG” | “AUG”, “UUU”, “UCU”, “UAA”, “AUG” | “Methionine”, “Phenylalanine”, “Serine” |

In other words, the latter AUG is not translated into another methionine here because it’s preceded by a STOP codon.

There are 64 codons which in turn correspond to 20 amino acids; however, not all codons will be used in this exercise. Below are the codons and resulting amino acids needed for the exercise.

| Codon | Amino Acid |
| ----- | ---------- |
| AUG | Methionine |
| UUU, UUC | Phenylalanine |
| UUA, UUG | Leucine |
| UCU, UCC, UCA, UCG | Serine |
| UAU, UAC | Tyrosine |
| UGU, UGC | Cysteine |
| UGG | Tryptophan |
| UAA, UAG, UGA | STOP |

Learn more about protein translation on Wikipedia.
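
For readers following the thread, the rules in this preview can be sketched in Python roughly as follows. This is only an illustration and not part of the proposed description: the names `CODON_TABLE` and `proteins` are assumptions, and raising `KeyError` on a codon outside the table is just what this sketch happens to do, not something the description specifies.

```python
# Hypothetical lookup table built from the codon table in the preview.
CODON_TABLE = {
    "AUG": "Methionine",
    "UUU": "Phenylalanine", "UUC": "Phenylalanine",
    "UUA": "Leucine", "UUG": "Leucine",
    "UCU": "Serine", "UCC": "Serine", "UCA": "Serine", "UCG": "Serine",
    "UAU": "Tyrosine", "UAC": "Tyrosine",
    "UGU": "Cysteine", "UGC": "Cysteine",
    "UGG": "Tryptophan",
    "UAA": "STOP", "UAG": "STOP", "UGA": "STOP",
}


def proteins(strand):
    """Translate an RNA strand into a list of amino acid names."""
    result = []
    # Walk the strand three nucleotides (one codon) at a time.
    for i in range(0, len(strand), 3):
        codon = strand[i:i + 3]
        # Assumption: unknown or incomplete codons raise KeyError here.
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "STOP":
            # The protein is complete; ignore any subsequent codons.
            break
        result.append(amino_acid)
    return result
```

With this sketch, `proteins("AUGUUUUCUUAAAUG")` stops at UAA, so the trailing AUG never becomes a second methionine.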

SleeplessByte · May 9, 2025, 10:10am

I like everything except for the first two tables. I’d rather see the RNA string above the table, with each row being one codon.

It doesn’t render well on smaller form factors.

IsaacG · May 9, 2025, 12:24pm

I’m not a huge fan of this paragraph after the example. It feels like the flow is broken to me. IMO the explanation should be in one place before the example.

IsaacG · May 9, 2025, 12:28pm

Note, focusing on small incremental changes tends to get things done faster and with less work. For instance, removing a redundant “after” or standardizing the capitalization of STOP should be a pretty simple change without much discussion. You could get those fixed without having them tangled up or bogged down by discussions about other rewrites. Doing so would also make those rewrite discussions simpler and clearer, as you’d be discussing fewer things. Rather than changing a bunch of things at once, it might be easier to do small, incremental changes.

IsaacG · May 9, 2025, 12:44pm

You mentioned some tracks extend this list? If so, whatever PR rewrites this paragraph should take that into account and avoid communicating misleading information. I’m not sure how to best do so here, but this sentence might be problematic on tracks that extend the list. Conversely, maybe those tracks should follow the spec.

BNAndras · May 9, 2025, 12:54pm

Cairo and Rust don’t anymore. I’ll have to check the other tracks later.

BNAndras · May 9, 2025, 1:18pm

Alright, how about we orient the table vertically then since we’re only dealing with one RNA sequence at a time?

For example, the RNA strand “AUGUUUUCU” is translated like this:

| Codon | Amino Acid |
| ------ | ---------- |
| AUG | Methionine |
| UUU | Phenylalanine |
| UCU | Serine |

and

 For example, AUGUUUUCUUAAAUG contains a STOP codon, so ignore any subsequent codons:

| Codon | Amino Acid |
| ----- | ---------- |
| AUG | Methionine |
| UUU | Phenylalanine |
| UCU | Serine |
| UAA | STOPPED |
| AUG | STOPPED |

SleeplessByte · May 9, 2025, 1:24pm

Works for me

codingthat · May 24, 2025, 2:23pm

Thank you @IsaacG , I heard this earlier. What I tried to do here was include what (I had had the impression) had already been agreed on earlier, rather than splitting it, since the hard work of agreeing (I had thought) had already been done. The part that wasn’t agreed, I split out. I didn’t think it made sense to start again from scratch. If that is what had been proposed, I missed it.

Any update, @BNAndras ?

BNAndras:

Alright, how about we orient the table vertically then since we’re only dealing with one RNA sequence at a time?

For example, the RNA strand “AUGUUUUCU” is translated like this:

| Codon | Amino Acid |
| ------ | ---------- |
| AUG | Methionine |
| UUU | Phenylalanine |
| UCU | Serine |

and

 For example, AUGUUUUCUUAAAUG contains a STOP codon, so ignore any subsequent codons:

| Codon | Amino Acid |
| ----- | ---------- |
| AUG | Methionine |
| UUU | Phenylalanine |
| UCU | Serine |
| UAA | STOPPED |
| AUG | STOPPED |

I like this except for “STOPPED”, since it seems to imply somewhat that it’s a literal value. Is everyone OK with “(stopped)” instead? And @IsaacG , could you weigh in on this vertical table suggestion and my “(stopped)” tweak, before I make another summary of changes?

And @BNAndras and @SleeplessByte, could you weigh in on @IsaacG 's suggestion, before I make another summary of changes:

Personally, to me, it flows better having the before and after. Without it, the table abruptly breaks the flow. But flow or not, I like the “here’s the broad idea” (which most people hopefully will get), followed by an explicit table, followed by an “in other words” (which, if anyone is still lost after seeing the table, should make it abundantly clear; those who aren’t lost can easily skip/skim thanks to the “in other words” preface). However, I welcome others now to weigh in.

Or if everyone wants, I can split just these two things into new threads. If someone wants to split every original suggestion into a new thread, be my guest. Thanks!

IsaacG · May 24, 2025, 2:46pm

One person making a suggestion isn’t quite the same as everyone agreeing. As the number of changes grows, there are more specifics to clarify and it’s much harder for everyone to agree on all the points. People also miss bits of what is changing as the number of changes grows. Splitting things into smaller changes will make things easier for everyone. Seriously.

I’ve worked at Google for years. At Google, “small changes” was a well established guideline for changes. Large changes got automatic feedback telling people to make small changes. Large changes took much more time and effort to get through. Large changes needed a lot of justification explaining why they needed to be large.

See the Google recommendation for small changes.

I highly recommend taking specific parts that are widely agreed upon and making those into small changes, working through the changes incrementally. Trying to gather everything into one large change is not the best option. It doesn’t simplify things. I’m pretty sure this thread would have been wrapped up by now if the changes were done in small increments.

Description

Translate RNA sequences into proteins.

You can break an RNA strand into three-nucleotide sequences called codons and then translate them into amino acids to make a protein.
For example, the RNA strand “AUGUUUUCU” is translated like this:

| Codon | Amino Acid |
| ----- | ---------- |
| AUG | Methionine |
| UUU | Phenylalanine |
| UCU | Serine |

There are also three STOP codons. If you encounter any of these codons, ignore the rest of the sequence — the protein is complete.
For example, AUGUUUUCUUAAAUG contains a STOP codon, so ignore any subsequent codons:

| Codon | Amino Acid |
| ----- | ---------- |
| AUG | Methionine |
| UUU | Phenylalanine |
| UCU | Serine |
| UAA | (stopped) |
| AUG | (stopped) |

In other words, the latter AUG is not translated into another methionine here because it’s preceded by a STOP codon.

There are 64 codons which in turn correspond to 20 amino acids; however, not all codons will be used in this exercise. Below are the codons and resulting amino acids needed for the exercise.

| Codon | Amino Acid |
| ----- | ---------- |
| AUG | Methionine |
| UUU, UUC | Phenylalanine |
| UUA, UUG | Leucine |
| UCU, UCC, UCA, UCG | Serine |
| UAU, UAC | Tyrosine |
| UGU, UGC | Cysteine |
| UGG | Tryptophan |
| UAA, UAG, UGA | STOP |

Learn more about protein translation on Wikipedia.

IsaacG · May 24, 2025, 3:58pm

I’m not a fan of using STOP in one table and (stopped) in the other.

I’m in favor of this, which I see as in conflict with using (stopped).

This was followed by a list of 7 changes; I’m not sure how that’s 2 changes.

Since there is a new summary, I’ll reiterate my preference for not having a table interrupt the flow of an example.