Could local tests count as verified when submitted?

For a future idea, it would be great if locally passed tests could be accepted on the platform, perhaps via some kind of signed request from the CLI. I'm sure a solution could be found. It could even take some pressure off the online platform.
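Just to illustrate what I mean, here is a rough Python sketch of what a signed payload could look like. The secret handling, payload fields, and flow are all made up for illustration, not anything the CLI actually does:

```python
import hashlib
import hmac
import json
import time

# Hypothetical: a per-user secret the platform would issue alongside the API token.
CLI_SECRET = b"issued-by-the-platform"

def sign_local_result(exercise_slug, files):
    """Build a submission payload and attach an HMAC so the server can check
    it was produced by the official CLI rather than hand-crafted."""
    payload = {
        "exercise": exercise_slug,
        "tests_passed": True,
        "timestamp": int(time.time()),
        "file_hashes": {
            name: hashlib.sha256(content.encode()).hexdigest()
            for name, content in files.items()
        },
    }
    body = json.dumps(payload, sort_keys=True).encode()
    payload["signature"] = hmac.new(CLI_SECRET, body, hashlib.sha256).hexdigest()
    return payload
```

The server would then recompute the HMAC with the same secret and reject the submission if the signature doesn't match.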

Not everybody would be willing to work locally, but it could be a great addition.

2 Likes

That’d be great for tracks without online test runners like Dart.

This would be so cool! :star_struck:

It would be interesting to discuss this topic further and get the team's point of view. Maybe it needs a post of its own?

In particular, what would be an acceptable trade-off between reducing server load and keeping exercise verification reliable?

If you hash the files that the user isn't supposed to edit in order to pass the tests, this might be easy to validate.
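Roughly like this, assuming the track shipped a manifest of expected hashes (the manifest name and location are made up here):

```python
import hashlib
import json
from pathlib import Path

def tests_untouched(exercise_dir: Path) -> bool:
    """Compare the protected files against a hypothetical manifest of known
    hashes, so the platform can trust the test suite wasn't weakened."""
    manifest = json.loads((exercise_dir / ".exercism" / "test_hashes.json").read_text())
    return all(
        hashlib.sha256((exercise_dir / rel_path).read_bytes()).hexdigest() == expected
        for rel_path, expected in manifest.items()
    )
```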

I see the problem going in the other direction: there can be cases that fail locally but would pass online.

2 Likes

If you go down this path, I guess you'd need to handle that case too. I've usually seen it the other way around, though, especially with C, because of Alpine and musl.

But I'm suggesting this mostly for convenience and to offload computation from the platform.

If you're not concerned about people gaming the system, why not just have users assert it in the submit command? (e.g., exercism submit --local-tests-passed file1 file2)

If you’re concerned about people gaming the system, then you have much bigger issues to solve (that I won’t enumerate here).

You could add a notice to any published solution that was not tested on the official servers, as a caution to others that it might not be valid.

The biggest issue with this is that people can modify the test suites (accidentally), and that in fact we need people to modify the test suites, because the tests are skipped by default. The reason we've not done this before is that we don't have a good unskip+run mechanism.

Maybe one approach would be to download a meta version of the tests with everything unskipped and run that as part of the CLI submit. However, it does rely on us knowing how to run the tests, which is not always that obvious (e.g. needing to be in the right directory, potential env issues, multi-platform, etc.).
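As a very rough sketch of that unskip step for a track like Python, assuming the tests are skipped with a decorator such as @unittest.skip (the exact marker varies by track):

```python
import re
import subprocess
import sys
from pathlib import Path

# Assumed skip marker; real tracks use different conventions (decorators, pragmas, etc.).
SKIP_PATTERN = re.compile(r"^\s*@unittest\.skip\(.*\)\s*$")

def run_unskipped(test_file: Path) -> int:
    """Write a copy of the test file with every skip decorator removed,
    run it with pytest, and return the exit code."""
    lines = [line for line in test_file.read_text().splitlines(keepends=True)
             if not SKIP_PATTERN.match(line)]
    unskipped = test_file.with_name(f"unskipped_{test_file.name}")
    unskipped.write_text("".join(lines))
    return subprocess.run([sys.executable, "-m", "pytest", str(unskipped)]).returncode
```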

I'm open to this idea, but it's one of those things that gets more complex when scaling to 65 languages across numerous OS setups, etc. But I'm open to proofs of concept :slight_smile:

Also, I don’t care about people “cheating”.

To check that the workspace is set up correctly, we could run the exemplar files first and abort if those fail.
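Something like this, as a sketch (the exemplar location and the test command are assumptions, and the exemplar would of course have to be available locally in the first place):

```python
import shutil
import subprocess
import sys
from pathlib import Path

def environment_looks_sane(exercise_dir: Path, solution_file: str) -> bool:
    """Temporarily swap the exemplar in for the student's solution and run the
    tests; if even the exemplar fails, the local setup is broken and we abort
    instead of reporting a bogus failure."""
    exemplar = exercise_dir / ".meta" / "exemplar.py"  # assumed location
    solution = exercise_dir / solution_file
    backup = solution.read_bytes()
    try:
        shutil.copyfile(exemplar, solution)
        result = subprocess.run([sys.executable, "-m", "pytest", str(exercise_dir)],
                                capture_output=True)
        return result.returncode == 0
    finally:
        solution.write_bytes(backup)
```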

I guess cheating really isn't a problem. I can already cheat by copy-pasting a working solution from somebody else; you'd only be cheating yourself.

The very first version could work with the official test runners. Podman is great for local installation, since it also comes with a nice GUI on all platforms, whereas with original Docker there are already limitations. I like using Podman via WSL on my Windows 11 machine.
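Just to illustrate the idea, a rough sketch of running a track's test runner image locally with Podman; the image name and the arguments it expects are my assumptions about the test runner interface, not something I've verified:

```python
import subprocess
from pathlib import Path

def run_test_runner_image(slug: str, exercise_dir: Path, output_dir: Path) -> Path:
    """Run an (assumed) official test runner image with podman and return the
    path to the results file it writes."""
    output_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run([
        "podman", "run", "--rm", "--network", "none",
        "-v", f"{exercise_dir}:/solution:ro",
        "-v", f"{output_dir}:/output",
        "exercism/python-test-runner",  # assumed image name
        slug, "/solution", "/output",   # assumed argument convention
    ], check=True)
    return output_dir / "results.json"
```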

The next step could be supporting purely local setups. Of course, how the tests must be run differs for every language, so it's certainly no tiny task.

Instinctively, I don't think we could ask people to install Podman (or Docker or whatever). That's a huge dependency, and our aim with the CLI has always been to keep it as small as possible. So I think we'd need to find a native way to do this.

3 Likes

To be honest, I have a hard time seeing this working well without making all sorts of assumptions. Consider the fact that some tracks allow multiple implementations to be used (Chez Scheme vs. Guile Scheme). I also absolutely don't want to complicate things for our students, so whatever we come up with needs to be seamless for them.

5 Likes

I’m also concerned about complexity, as well as things like third-party libraries.

For example, the Python track doesn't currently allow any third-party libraries in student solutions. This is partially due to space and security constraints in Docker, but it also has to do with learning objectives.

Solving something with core Python is different from solving it with NumPy or Pandas or any of the wide selection of other libraries out there. And learning library quirks can really hamper a student's understanding (at least at first) of the core idioms of the language.

I'd be concerned that someone experimenting locally uploads a solution that passes all the tests (in their environment), and then others look at that solution and get stymied, confused, or misled by its use of third-party libraries.

7 Likes

That sounds 100% valid to me; very good points. I only came up with the idea because of problems with the test runners: opaque errors on the C track, or runs that simply break. It's probably more about making the test runners more stable than about an alternative approach. :thinking:

2 Likes