Tweak language and formatting in `protein-translation/description.md`

I want to unravel this since there are two separate issues that were conflated by proximity.

“However, feel free to expand the list in the test suite to include them all.” should be dropped because it implies a student could edit the test suite in the browser. Outside of D, though, I’m not aware of other tracks where one could do that in the browser. If we’re just letting folks know they can edit the test suite locally, why shouldn’t we do this for all the other exercises?

I then suggested we could also remove the preceding text because the table later on introduces the codons used in the canonical test data. So this information both comes a little early and might cause confusion if a track chooses to add additional codons (Rust and Cairo both did for a while). I’m not aware of other tracks that add additional codons, so that was more of a theoretical concern.

I agree with everything you said here. In addition, “If it works for one codon, the program should work for all of them” is both dubious (you can easily make a program that works for just one codon…and I have actually seen solutions like this posted on Exercism, that just return whatever the test suite happens to look for, without a general approach) and a bit unclear as to what knowledge or insight it offers the student.

So I’m happy with cutting it.

As for the parentheses around the note, I prefer them because it should, theoretically, already be clear without that note, just from the table and its preceding sentence. But in case a student is having trouble connecting the dots, here’s a note — but it’s not, strictly speaking, new information. Maybe a better way to say it would be:

(In other words, the latter AUG is not translated into another methionine here because it’s preceded by a STOP codon.)

But I suppose with that wording I’m OK dropping the parentheses, because the “in other words” is doing something similar.

Great. Perhaps you could summarise the changes now and we can get some consensus?

I appreciate the nudge, but which version of the changes? The last two messages are still awaiting replies: Does @IsaacG agree with @BNAndras’ cut in the end? Does anyone have thoughts on my parentheses and/or prefer the “in other words” route? I don’t want to presume that my opinion is the proposed change in creating a diff or summary (already learned my lesson there, haha). (Unless that’s what everyone would like this time. ;) )

I agree the bit about there being 64 codons could be improved. I don’t think it should be dropped entirely; I think there is value in mentioning that the list is incomplete and there are more codons. I would suggest leaving that alone for now to limit the scope of the change/discussion and optionally discussing that as a second change once the originally proposed changes have been merged.

Feel free to propose/summarize a change here on the forum. It helps move things along. The smaller/simpler the change, the easier it is to reach a consensus. If you propose something that people disagree with, we can discuss it and/or update the proposal. If you propose something that sounds good, we can proceed to a PR.

OK, here’s what I propose as the final diff. I added back in the 64 codons part with the improvements that had been proposed before a full cut had been proposed. I’ve opened a separate thread about the full cut.

Full summary of changes:

  1. Active voice
  2. Use tables
  3. Remove unexplained reference to ribosomes
  4. Clarify wording following STOP codon example
  5. Standardize STOP codon (prioritizing ease of maintenance, not correctness)
  6. Still mention 64 codons for now, but: cut unclear message about expanding the test suite (only applicable to offline users), cut untrue message about the program working for all codons if it works for one codon, and fix grammatical mistake (use “not all are important” instead of “all are not important”)
  7. Remove redundant “after” (given “subsequent”)

Preview render:

Description

Translate RNA sequences into proteins.

You can break an RNA strand into three-nucleotide sequences called codons and then translate them into amino acids to make a protein like so:

| RNA | Three-letter codons | Amino acids |
| --- | ------------------- | ----------- |
| “AUGUUUUCU” | “AUG”, “UUU”, “UCU” | “Methionine”, “Phenylalanine”, “Serine” |
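
As a rough illustration of the splitting step described above, here is a minimal Python sketch. The function name `split_codons` is hypothetical and not part of the exercise, and the sketch assumes the strand length is a multiple of three.

```python
def split_codons(strand: str) -> list[str]:
    """Break an RNA strand into consecutive three-nucleotide codons."""
    # Step through the strand three characters at a time.
    # Assumption: the strand length is a multiple of three.
    return [strand[i:i + 3] for i in range(0, len(strand), 3)]


print(split_codons("AUGUUUUCU"))  # ['AUG', 'UUU', 'UCU']
```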

There are also three STOP codons. If you encounter any of these codons, ignore the rest of the sequence — the protein is complete. For example, UAA is a STOP codon, so ignore any subsequent codons:

| RNA | Three-letter codons | Amino acids |
| --- | ------------------- | ----------- |
| “AUGUUUUCUUAAAUG” | “AUG”, “UUU”, “UCU”, “UAA”, “AUG” | “Methionine”, “Phenylalanine”, “Serine” |

In other words, the latter AUG is not translated into another methionine here because it’s preceded by a STOP codon.

There are 64 codons which in turn correspond to 20 amino acids; however, not all codons will be used in this exercise. Below are the codons and resulting amino acids needed for the exercise.

| Codon | Amino Acid |
| ----- | ---------- |
| AUG | Methionine |
| UUU, UUC | Phenylalanine |
| UUA, UUG | Leucine |
| UCU, UCC, UCA, UCG | Serine |
| UAU, UAC | Tyrosine |
| UGU, UGC | Cysteine |
| UGG | Tryptophan |
| UAA, UAG, UGA | STOP |

Learn more about protein translation on Wikipedia.
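
For anyone who wants to see how the preview above maps to code, here is one possible Python sketch of the whole translation, assuming the codon table from the preview. The names `CODON_TABLE` and `translate` are illustrative only, not a required interface, and letting unknown codons raise `KeyError` is my own assumption rather than something the description specifies.

```python
# Codon table copied from the preview above. The names used here are
# illustrative; the exercise does not prescribe them.
CODON_TABLE = {
    "AUG": "Methionine",
    "UUU": "Phenylalanine", "UUC": "Phenylalanine",
    "UUA": "Leucine", "UUG": "Leucine",
    "UCU": "Serine", "UCC": "Serine", "UCA": "Serine", "UCG": "Serine",
    "UAU": "Tyrosine", "UAC": "Tyrosine",
    "UGU": "Cysteine", "UGC": "Cysteine",
    "UGG": "Tryptophan",
    "UAA": "STOP", "UAG": "STOP", "UGA": "STOP",
}


def translate(strand: str) -> list[str]:
    """Translate an RNA strand into amino acids, halting at a STOP codon."""
    protein = []
    for i in range(0, len(strand), 3):
        # Assumption: an unknown codon simply raises KeyError.
        amino_acid = CODON_TABLE[strand[i:i + 3]]
        if amino_acid == "STOP":
            break  # Ignore every codon after the STOP codon.
        protein.append(amino_acid)
    return protein
```

With this sketch, `translate("AUGUUUUCUUAAAUG")` yields the same three amino acids as `translate("AUGUUUUCU")`, because the trailing AUG comes after UAA.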

I like everything except for the first two tables. I’d rather see the RNA string above the table, with each row being one codon.

It doesn’t render well on smaller form factors.

I’m not a huge fan of this paragraph after the example. It feels like the flow is broken to me. IMO the explanation should be in one place before the example.

Note, focusing on small incremental changes tends to get things done faster and with less work. For instance, removing a redundant “after” or standardizing the capitalization of STOP should be a pretty simple change without much discussion. You could get those fixed without having them tangled up or bogged down by discussions about other rewrites. Doing so would also make those rewrite discussions simpler and clearer, as you’d be discussing fewer things. Rather than changing a bunch of things at once, it might be easier to do small, incremental changes.

You mentioned some tracks extend this list? If so, whatever PR rewrites this paragraph should take that into account and avoid communicating misleading information. I’m not sure how to best do so here, but this sentence might be problematic on tracks that extend the list. Conversely, maybe those tracks should follow the spec.

Cairo and Rust don’t anymore. I’ll have to check the other tracks later.

Alright, how about we orient the table vertically then since we’re only dealing with one RNA sequence at a time?

For example, the RNA strand “AUGUUUUCU” is translated like this:

| Codon | Amino Acid |
| ------ | ---------- |
| AUG | Methionine |
| UUU | Phenylalanine |
| UCU | Serine |

and

For example, AUGUUUUCUUAAAUG contains a STOP codon, so ignore any subsequent codons:

| Codon | Amino Acid |
| ----- | ---------- |
| AUG | Methionine |
| UUU | Phenylalanine |
| UCU | Serine |
| UAA | STOPPED |
| AUG | STOPPED |

Works for me

Thank you @IsaacG, I heard this earlier. What I tried to do here was include what (I had had the impression) had already been agreed on earlier, rather than splitting it, since the hard work of agreeing (I had thought) had already been done. The part that wasn’t agreed, I split out. I didn’t think it made sense to start again from scratch. If that is what had been proposed, I missed it.

Any update, @BNAndras?

I like this except for “STOPPED” since it seems to imply somewhat that it’s a literal value. Is everyone OK with “(stopped)” instead? And @IsaacG, could you weigh in on this vertical table suggestion and my “(stopped)” tweak, before I make another summary of changes?

And @BNAndras and @SleeplessByte, could you weigh in on @IsaacG’s suggestion, before I make another summary of changes:

Personally, to me, it flows better having the before and after. Without it, the table abruptly breaks the flow. But flow or not, I like the “here’s the broad idea” (which most people hopefully will get), followed by an explicit table, followed by an “in other words” (which, if anyone is still lost after seeing the table, should make it abundantly clear; those who aren’t lost can easily skip/skim thanks to the “in other words” preface). However, I welcome others now to weigh in.

Or if everyone wants, I can split just these two things into new threads. If someone wants to split every original suggestion into a new thread, be my guest. Thanks!

One person making a suggestion isn’t quite the same as everyone agreeing. As the number of changes grows, there are more specifics to clarify and it’s much harder for everyone to agree on all the points. People also miss bits of what is changing as the number of changes grows. Splitting things into smaller changes will make things easier for everyone. Seriously.

I’ve worked at Google for years. At Google, “small changes” was a well established guideline for changes. Large changes got automatic feedback telling people to make small changes. Large changes took much more time and effort to get through. Large changes needed a lot of justification explaining why they needed to be large.

See the Google recommendation for small changes.

I highly recommend taking specific parts that are widely agreed upon and making those into small changes, working through the changes incrementally. Trying to gather everything into one large change is not the best option. It doesn’t simplify things. I’m pretty sure this thread would have been wrapped up by now if the changes were done in small increments.

See also

The vertical table sounds good to me.

I thought the whole point here was to use STOPPED uniformly. Now I’m not sure what you’re suggesting we use and if it’s going to be uniform or not.

Can we focus on one change at a time? Juggling so many changes is confusing and makes it hard to understand what’s being proposed/changed. I’m not sure what I’m agreeing to anymore without seeing all the changes.

I already acknowledged this general strategy and will definitely do so going forward — I am not arguing this as a general approach. This was about what to do with this one. And here, I had already summarized these changes a few times, with nobody objecting to most of the summary. So I only separated out the part where there had been disagreement.

Thanks.

No, it was to use STOP (all caps, no quotes) uniformly as the interpretation of a codon in isolation, not STOPPED as an interpretation of what the program has done during the processing of a sequence of codons. Hence my suggestion: These are different contexts, and to me, STOPPED gives a moment’s pause due to the conflation it presents, whereas (stopped) gives a clearly distinct meaning appropriate to the context.
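
To make the distinction concrete, here is a rough Python sketch, offered only as an illustration. STOP is what a codon means in isolation (membership in `STOP_CODONS`), while “(stopped)” describes the state of the run once such a codon has been seen. The names `STOP_CODONS`, `AMINO_ACIDS`, and `trace_rows` are hypothetical, and `AMINO_ACIDS` holds only the codons needed for this example.

```python
# STOP is a property of a single codon, independent of any sequence.
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Partial lookup: only the codons needed for this example (assumption).
AMINO_ACIDS = {"AUG": "Methionine", "UUU": "Phenylalanine", "UCU": "Serine"}


def trace_rows(strand: str) -> list[tuple[str, str]]:
    """Return (codon, label) rows like the vertical table in the proposal."""
    rows = []
    stopped = False  # State of the run, not a property of any one codon.
    for i in range(0, len(strand), 3):
        codon = strand[i:i + 3]
        if codon in STOP_CODONS:
            stopped = True
        # Once stopped, every row is labelled "(stopped)", including
        # codons such as AUG that would otherwise be translated.
        rows.append((codon, "(stopped)" if stopped else AMINO_ACIDS[codon]))
    return rows


for codon, label in trace_rows("AUGUUUUCUUAAAUG"):
    print(codon, label)
```

The final AUG is labelled “(stopped)” because of where it falls in the run, even though AUG on its own still means Methionine.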

We are down to 2 changes, thanks to your agreement about the vertical table. I actually had asked if I should open new threads about them. But then we juggled anyway. So below is an updated proposal, thanks to your assent regarding the vertical tables. Hopefully this is manageable. If not, I will open 2 other threads then come back here with the final proposal.

Updated proposal.

Full summary of changes:

  1. Active voice
  2. Use vertical tables, with the original sequence kept in preceding ¶ text, with the program run sequence using (stopped) instead of STOPPED, to distinguish from STOP and avoid conflating contexts
  3. Remove unexplained reference to ribosomes
  4. Clarify wording following STOP codon example
  5. Standardize STOP codon (prioritizing ease of maintenance, not correctness)
  6. Still mention 64 codons for now, but: cut unclear message about expanding the test suite (only applicable to offline users), cut untrue message about the program working for all codons if it works for one codon, and fix grammatical mistake (use “not all are important” instead of “all are not important”)
  7. Remove redundant “after” (given “subsequent”)

Preview render:

Description

Translate RNA sequences into proteins.

You can break an RNA strand into three-nucleotide sequences called codons and then translate them into amino acids to make a protein.
For example, the RNA strand “AUGUUUUCU” is translated like this:

| Codon | Amino Acid |
| ----- | ---------- |
| AUG | Methionine |
| UUU | Phenylalanine |
| UCU | Serine |

There are also three STOP codons. If you encounter any of these codons, ignore the rest of the sequence — the protein is complete.
For example, AUGUUUUCUUAAAUG contains a STOP codon, so ignore any subsequent codons:

| Codon | Amino Acid |
| ----- | ---------- |
| AUG | Methionine |
| UUU | Phenylalanine |
| UCU | Serine |
| UAA | (stopped) |
| AUG | (stopped) |

In other words, the latter AUG is not translated into another methionine here because it’s preceded by a STOP codon.

There are 64 codons which in turn correspond to 20 amino acids; however, not all codons will be used in this exercise. Below are the codons and resulting amino acids needed for the exercise.

| Codon | Amino Acid |
| ----- | ---------- |
| AUG | Methionine |
| UUU, UUC | Phenylalanine |
| UUA, UUG | Leucine |
| UCU, UCC, UCA, UCG | Serine |
| UAU, UAC | Tyrosine |
| UGU, UGC | Cysteine |
| UGG | Tryptophan |
| UAA, UAG, UGA | STOP |

Learn more about protein translation on Wikipedia.

I’m not a fan of using STOP in one table and (stopped) in the other.

I’m in favor of this, which I see as in conflict with using (stopped).

Followed by a list of 7 changes :) I’m not sure how that’s 2 changes.

Since there is a new summary, I’ll reiterate my preference for not having a table interrupt the flow of an example.