Incorrect test in protein-translation

I’m trying to sync the Rust track’s version of protein-translation with problem-specifications. The canoncial description lists the following codons and their translations:

Codon Protein
AUG Methionine
UUU, UUC Phenylalanine
UUA, UUG Leucine
UCU, UCC, UCA, UCG Serine
UAU, UAC Tyrosine
UGU, UGC Cysteine
UGG Tryptophan
UAA, UAG, UGA STOP

However, there exist more codons. In particular, lysine maps to the codons AAA and AAG. But there is a test case in canonical-data.json against “invalid” codons, but the string "AAA" is used for it:

{
  "uuid": "1e75ea2a-f907-4994-ae5c-118632a1cb0f",
  "description": "Non-existing codon can't translate",
  "comments": [
    "It's up to a track how to behave when a non-existing codon is    ",
    "passed; to simplify the exercise, this test may be excluded in   ",
    "order to remove the requirement for input validation.            "
  ],
  "scenarios": ["input-validation"],
  "property": "proteins",
  "input": {
    "strand": "AAA"
  },
  "expected": {
    "error": "Invalid codon"
  }
},

The example solution on the Rust track fails this test case, because it correctly maps "AAA" to lysine. Due to how the exercise was designed, I can fix that on the Rust side without breaking any existing solutions. (The codon to protein mapping is an input to the students submission. So, a submission should still work even with a technically incorrect input mapping.)

However, maybe it would be better to fix the test with the reimplements mechanism? What do you think?

1 Like

I don’t really know anything about proteins, but the link you’ve provided seems to be quite clear :) So yes, I do think we should reimplement it.

Oh, wait. I just found the full table. Invalid codons don’t even exist :flushed: (on this wikipedia page)

I guess we could say a string with two characters is an invalid codon?

That would be a different type of test I feel. But “XYZ” would be an invalid codon right?

Yes, that’s much better. I was about to make the PR, turns out that test case exists already :upside_down_face:

At this point, should we just delete / deprecate it?

Ah, I was just hit by this on the C track. I think this test is in contradiction with the following paragraph from the description (emphasis mine):

There are 64 codons which in turn correspond to 20 amino acids; however, all of the codon sequences and resulting amino acids are not important in this exercise. If it works for one codon, the program should work for all of them. However, feel free to expand the list in the test suite to include them all.

So, to me it clearly says we can implement the full mapping if we’d like, even if that’s not required and our solution will still pass the tests if we only use the partial list at the end of the description.

Now, this contradiction could either be resolved by removing the test, or by changing the description. IMO we should preserve the description and change/remove the test because:

  1. this leaves more freedom in coding;
  2. that’s how RNA works for real.
2 Likes

I’m not sure we can deprecate test cases. And for deletion I’d need to overrule CI (which I can). What do other people think?

I think I’d add a new test so we can mark it as reimplementing AAA. Can a test reimplement both AAA and XYZ? They test different things now, but the proposed replacement for AAA will be testing the same thing as XYZ.

If we retroactively mark XYZ as reimplementing AAA, that might be confusing on tracks that already have both tests. That’s why I would like a new test reimplementing either AAA or both XYZ and AAA.

Both isn’t an option, but we couldn’t we just mark the existing XYZ test case to be reimplementing the test case we want to exclude?

1 Like

Yeah, that’s my preference. I was just wondering if it’d be confusing to maintainers that an existing test now reimplements a test when it previously didn’t.

I created a PR which makes the existing test case against XYZ reimplement the one against AAA.

1 Like

I merged the PR, but there is something wrong. (I think?)

Running configlet sync, I get the following in tests.toml:

[1e75ea2a-f907-4994-ae5c-118632a1cb0f]
description = "Non-existing codon can't translate"

[9eac93f3-627a-4c90-8653-6d0a0595bc6f]
description = "Unknown amino acids, not part of a codon, can't translate"
reimplements = "1e75ea2a-f907-4994-ae5c-118632a1cb0f"

The upper case should’ve received a include = false due to the reimplementation, right?

Sorry, figured it out. It was an issue with how my test generator automatically called configlet with some flags. Everything good.