Test runner - update in docker, results interpretation by UI

Hi exercists! I have two questions / topics related to test runner functionality.

  1. When is test runner docker image updated exactly? I’ve made commit to main to test runner repo, but it seems that docker image doesn’t contain latest changes (newly implemented exercise). I’ve tested manually - downloaded latest repo version and install script and it works locally. However version of Docker image in UI still uses older one.
    image

  2. My second problem is related to interpretation of results.json. How does UI really print result.json output? I have this file (sample results.json):

{
  "version" : "2",
  "status" : "error",
  "tests" : [
    {
      "name" : "Reading empty buffer should fail",
      "status" : "pass",
      "test_code" : "test01_ReadingEmptyBufferShouldFail\n\t\"Tip: Remember to review the class [Comment] tab\"\n\n\t<exeTestName: 'eading empty buffer should fail'>\n\t<exeTestUUID: '28268ed4-4ff3-45f3-820e-895b44d53dfa'>\n\tcircularBufferCalculator capacity: 1.\n\tself should: [circularBufferCalculator read] raise: Error."
    },
    {
      "name" : "Can read an item just written",
      "status" : "error",
      "message" : "Message not understood: CircularBuffer >> #capacityu\n---Stack-trace---
....

But it is shown in UI like this. Why aren’t cases interpreted by UI?
image

Error, means that a compile error/syntax error has occurred. Meaning no tests should have been able to execute. I personally recommend test runners pr to be reviewed.

Ok, this is where Pharo differs: it distinguishes between test case error - which is runtime exception and failure - assertion missmatch. None of these blocks other tests to run. So how it should be adopted? Change overall state to “fail” ?
PS: PR has been reviewed even vanilla / golden tests.

There are situations where a test case gets an error, but aslong as all test cases didn’t get errors, so will it need to be marked as fail. Which is mentioned in the docs: " Where the status is error (no test was executed correctly), the top level message key should be provided. It should provide the occurring error to the user. As it is the only piece of information a user will receive on how to debug their issue, it must be as clear as possible:"

Thanks for explanation and pointing me to docs. I’ve seen the description of interface at the time of initial implementation, but now it is definitely more clear than before. I will change runner accordingly.
Any hints to the docker image update? - point 1)

Perhaps the fact that is was not pushed through a pr could have affected the workflow but I am not sure. @iHiD, do you know?

If any test was executed, the top-level status must be either pass or fail. If none of the tests could be executed, the top-level status must be error (and the message property should then also be present).

I don’t know exactly how the test runner works, but it is only redeployed when a commit is made to the test runner repo. Even then, the commit has to be one that invalidates at least part of the Docker layer caching, otherwise no changes are made to the Docker image.

The issue is likely that docker caches this line:

RUN pharo metacello install github://exercism/pharo-smalltalk:main/releases/latest BaselineOfExercism --groups=$GROUP_NAMES

So it is not rerun and therefore the test runner won’t sync with the website.

You need to add something either on or before this line that includes the sha of main the website I think.

@ErikSchierboom Thoughts on the best way to do this?

Probably something like this:

  • Add a new file to the pharo-test-runner repo that contains the SHA of the pharo repository it is synced to
  • Add a GitHub Actions workflow that will create a PR to the pharo-test-runner repo once a commit was pushed to main in which one or more exercises where changed

Sounds very sensible!

I’ve created PR to trigger new docker image in test runner repo manually, but it didn’t really help. Question is, why it was working for prior 2 exercises that were pushed to Pharo repo before and now it doesn’t. Is it because of the docker cashes the line iHiD mentioned?

It is. The PR submitted does not invalidate the line where image is loaded. That is this line, right? pharo-smalltalk-test-runner/Dockerfile at main · exercism/pharo-smalltalk-test-runner · GitHub Is it possible to refer to a SHA there?

Side note: I just noticed that the branch protections were not correctly set, which we’ve corrected. This means that you’ll need a code owner (me or Jeremy) to sign off on PRs, which is how we’ve organized it for all tooling repos.

Yes, it should be possible. I’ve tested with:

./pharo Pharo.image metacello install github://exercism/pharo-smalltalk:fa29b3b/dev/src BaselineOfExercism --groups=testRunner

So commit SHA can be used as well. How to pass SHA to that line? Using build args? Or hardcode it to dockerfile?

that you’ll need a code owner (me or Jeremy) to sign off on PRs, which is how we’ve organized it for all tooling repos.

@ErikSchierboom I don’t think it does work fully. I can’t do direct commits to main, but still can merge or squash commits of PRs to main without any review.

Regarding invalidation of cache:

Add a new file to the pharo-test-runner repo that contains the SHA of the pharo repository it is synced to
Add a GitHub Actions workflow that will create a PR to the pharo-test-runner repo once a commit was pushed to main in which one or more exercises where changed

I found a different approach using this one:
--build-arg CACHEBUST=$(date +"%s")
and then in Dockerfile:

ARG CACHEBUST=1
# RUN pharo install ...

This will effectively invalidate all steps after CACHEBUST use. What do you think?
PS: I’m not sure where to put build argument (where is docker build command defined). Is it this line: ci.yml#L24 ?

It will, but it will also do that unnecessarily (i.e. when nothing has changed) as the cache will be busted every single time.

I see. There is very low traffic (in terms of incoming PRs) so I didn’t consider as an issue. What about this change in Dockerfile:

FROM ghcr.io/ba-st/pharo-loader:v10.0.1 AS loader

ARG GROUP_NAMES=testRunner

ARG SHA_OR_BRANCH=main

RUN pharo metacello install github://exercism/pharo-smalltalk:$SHA_OR_BRANCH/releases/latest BaselineOfExercism --groups=$GROUP_NAMES

But more trickier would be how to pass $SHA_OR_BRANCH arg with given content of file (where commit SHA occurs).

That looks good. Does the GROUP_NAMES bit also need to be an ARG?

I think I have some ideas on how to do this. I’ll probably get to this next week.

@ErikSchierboom Please take a look at PR, which should provide a solution: Changed CI and use of buld arg to pass. commit SHA by Bajger · Pull Request #26 · exercism/pharo-smalltalk-test-runner · GitHub
Regarding GROUP_NAMES - yes it is used in running example tests: Run tests in Docker - CI Job