Workflow to suggest `[no important files changed]`

Hello friends,

So there is the magic string [no important files changed] we can put in commit messages to avoid rerunning tests unnecessarily. (post announcing the feature) One concern that has been raised is that adding this flag is easy to forget. I can confirm this is a problem: I have personally caused all submissions on the Rust track to be retested unnecessarily :see_no_evil: I'm also always anxious about mistyping it, so I have to go searching for some place where I can copy-paste the string.
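For reference, here is a hypothetical example of where the marker goes, demonstrated in a throwaway repo (the file name and commit message are made up):

```shell
# Hypothetical demo in a throwaway repo: the marker goes in the
# commit message body, on its own line.
tmp=$(mktemp -d) && cd "$tmp" && git init -q
echo "reformatted test" > dummy_test.rs
git add dummy_test.rs
git -c user.name=demo -c user.email=demo@example.com \
  commit -q -m "Reformat test files (whitespace only)

[no important files changed]"
git log -1 --format=%B   # the marker is part of the message body
```

Note that when a PR is squash-merged on GitHub, it is the squash commit's message that must contain the marker, not just an individual commit on the branch.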

@MatthijsBlom suggested automating part of this process. I picked that suggestion up and made a little proof-of-concept workflow on the Rust track.

The main purpose of this workflow is not to be merged as-is, but to serve as a basis for discussion.

What do you think?

  • Is this useful or bloat?
  • Can it be improved?
  • Is this easy to port to other language tracks? (I guess at least the path which triggers the workflow has to be adjusted)

I think this is an interesting idea! :smiley:

Personally I'd like the workflow to post a comment for all changes that would trigger a rerun of tests, not just when tests are updated.

The comment can then simply point out what to consider: adding [no important files changed], or, if a rerun is preferred, limiting the number of exercises being updated in a single PR.

Yes, agreed. I'm not 100% sure how to determine that. The documentation has some material on this, and I believe it has to do with the "files" property in config.json. (Both the track config and the exercise configs have it, though.)

The point of that would be to batch the test reruns? So they don't all happen at once and clog up the system?

Yes, if I understand the docs correctly, a rerun is triggered if any of the files listed in files.editor, files.test or files.invalidator in the exercise's .meta/config.json have been touched.
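If that reading is right, the rerun-triggering files of an exercise could be collected along these lines (a sketch; it assumes `jq` is available, and the sample JSON below merely stands in for a real `.meta/config.json`):

```shell
# Sketch: list the files whose changes would trigger a test rerun.
# The sample JSON stands in for an exercise's .meta/config.json;
# missing arrays (here: editor) are treated as empty.
cat > config.json <<'EOF'
{"files": {"solution": ["src/lib.rs"],
           "test": ["tests/demo.rs"],
           "invalidator": ["Cargo.toml"]}}
EOF
jq -r '[.files.test, .files.invalidator, .files.editor]
       | map(. // []) | add | .[]' config.json
```

For this sample config it would print `tests/demo.rs` and `Cargo.toml`, one per line.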

Yes, that's the point indeed.


One thing that I'm unsure of is when we actually want solutions to be re-tested automatically after making changes to a track. What is a typical use case where automatically rerunning tests is preferred, @iHiD / @ErikSchierboom?

I think rerunning tests is the default. The website shows community solutions based on whether or not they pass the latest tests. Assume the test files change because a test was added. We want the tests to be rerun so we can show the community solutions which still pass the new test. The ones that don't pass the new test should not be shown, because presumably those were incorrect implementations all along.

We only want to override this and avoid test reruns if we know that the testing logic did not change, e.g. when running a formatter over test files and the changes are only whitespace. Or, in my case: I'm changing the test files of all Rust exercises because I'm incrementally applying the test case generator I wrote. Since that usually doesn't change the test logic (assuming the previous test cases were already up to date with problem-specifications), we don't want to rerun the tests in this case.

Given that an exercise's .meta/config.json is authoritative for triggering test reruns, I guess it should be quite feasible to make the workflow independent of the language track. Maybe this could then be upstreamed to the exercism/github-actions repo, and maintainers who would like to enable this automation could copy a very small workflow file that simply uses this action into their track repo.

Yes, that is also my understanding of how it works. The reason for asking for clarification is recent comments made by @ErikSchierboom on PRs syncing test cases, like this one: Sync tests for practice exercise `custom-set` by sanderploegsma · Pull Request #2533 · exercism/java · GitHub

This makes it unclear to me when we choose correctness of submissions (by retesting them) over cost reduction.

The way it works is that when we sync an exercise, we calculate a hash of its "important" files. These files consist of:

  • .docs/instructions.md
  • .docs/instructions.append.md
  • .docs/introduction.md
  • .docs/introduction.append.md
  • .docs/hints.md
  • Any files in the .files.test[] array in the exercise's .meta/config.json file
  • Any files in the .files.invalidator[] array in the exercise's .meta/config.json file
  • Any files in the .files.editor[] array in the exercise's .meta/config.json file

The primary goal of this "important files hash" is to determine whether to present the student with the option to update an exercise to the latest version.
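As an illustration only (this is not Exercism's actual hashing code), the idea of hashing a fixed set of important files could be sketched as follows; the file names and the hashing scheme here are made up for the demo:

```shell
# Illustration of the "important files hash" idea: concatenate the
# important files in a stable order and hash the result.
mkdir -p demo/.docs demo/tests
echo "instructions" > demo/.docs/instructions.md
echo "test code"    > demo/tests/demo_test.rs

# In reality the list would also include the other .docs files and the
# files.{test,invalidator,editor} entries from .meta/config.json.
printf '%s\n' demo/.docs/instructions.md demo/tests/demo_test.rs \
  | sort | xargs cat | sha256sum
```

Because the file list is sorted before hashing, the hash only changes when the content of one of the important files changes.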

When determining whether or not to re-run the tests, we only consider:

  • Any files in the .files.test[] array in the exercise's .meta/config.json file
  • Any files in the .files.invalidator[] array in the exercise's .meta/config.json file
  • Any files in the .files.editor[] array in the exercise's .meta/config.json file

We consider just these files because they have the potential to affect the test outcome (something the docs never should). Now, if a maintainer knows that the changes they're making to these files won't influence the test outcome (e.g. they're only changing the formatting of the files), the [no important files changed] marker can be added to the commit.
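A quick way to sanity-check the formatting-only case is `git diff -w`, which ignores whitespace differences: if the `-w` diff for the test files is empty, the change was whitespace-only. A throwaway-repo sketch (the file name is made up):

```shell
# Hypothetical check: an empty `git diff -w` means the change was
# whitespace-only, so the marker is likely safe to add.
tmp=$(mktemp -d) && cd "$tmp" && git init -q
echo "assert_eq!(1,1);" > test.rs
git add test.rs
git -c user.name=t -c user.email=t@example.com commit -qm init
printf 'assert_eq!(1, 1);\n' > test.rs   # whitespace-only edit
if git diff -w --quiet -- test.rs; then
  echo "whitespace-only change"
fi
```

This is only a heuristic for the formatting case; other "safe" changes (e.g. renaming a helper) still need human judgment.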

100%! I started working on this myself but never got around to finishing it.

Maybe :man_shrugging: But I'd be more than happy with this initial version!

As per my above post, I think it should be easy to make it track-agnostic.

Alright, I will derp around on the Rust track to make the workflow based on the invalidating files specified in .meta/config.json, and post again here when I have presentable results.


Little devil pops up over shoulder: "Write an exercise with tests that read from the documentation files" :smiling_imp:

@ErikSchierboom

I now have a version which I think should be language independent. It reads files.test, files.invalidator and files.editor from an exercise's config file. I was not yet able to test invalidator and editor, as the Rust track generally doesn't have these two fields.

The script also assumes exercises are always located at exercises/{practice,concept}/$SLUG. I believe this is the case for all language tracks, right?

The following PRs are only meant as a demonstration and a basis for work upstream in the exercism/github-actions repo.

That is correct.

Thanks for all that! I'll have a look once I've got some time.

@senekor I totally forgot about this! I'll get to it sometime.


I've finally found/made time to review the PR :)

I have not done so, but if someone wanted to produce a git "pre-commit" hook that adds this to messages based on the requirements, that might be a thing.

I'm not sure that would be the right place. (I assume by "adds this to messages" you mean commit messages?) The magic string needs to be in the commit message of the squash commit when the PR is merged, so I think the PR itself on GitHub is the right place for the automation. It's also not possible to apply the magic string automatically, because only a human can determine whether the test file has changed in a way that warrants rerunning tests.

So I think the best we can do is to remind people to consider adding the string themselves before they merge.

I've done some testing too and it seems to work! I'll add it to the org-wide-files repo to sync it to all tracks.

@senekor I've found one issue: it also comments on new files: New exercise (do not merge) by ErikSchierboom · Pull Request #1217 · exercism/fsharp · GitHub. Could you fix that please?

@ErikSchierboom I think I might have a solution, but it is not e2e tested: adding the --diff-filter=M option:

for changed_file in $(git diff --diff-filter=M --name-only origin/main); do

It seems that with that option, the git command outputs only the names of files that were modified (not added, renamed, or deleted).
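Put together in a throwaway repo, the behavior can be sketched like this (the exercise path and file names are made up; `git init -b` needs git ≥ 2.28):

```shell
# Throwaway-repo sketch: only *modified* files pass --diff-filter=M,
# so brand-new files would no longer cause a comment.
tmp=$(mktemp -d) && cd "$tmp" && git init -qb main
ex=exercises/practice/demo
mkdir -p "$ex/.meta" "$ex/tests"
printf '{"files":{"test":["tests/demo.rs"]}}' > "$ex/.meta/config.json"
echo "old" > "$ex/tests/demo.rs"
git add -A && git -c user.name=t -c user.email=t@e.com commit -qm init

git checkout -qb update
echo "new"   > "$ex/tests/demo.rs"   # modified: should be reported
echo "extra" > "$ex/tests/new.rs"    # added: should be ignored
git add -A && git -c user.name=t -c user.email=t@e.com commit -qm update

for changed_file in $(git diff --diff-filter=M --name-only main); do
  echo "would check against config: $changed_file"
done
```

Here only `exercises/practice/demo/tests/demo.rs` would be reported; the newly added `new.rs` is filtered out.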