Elixir submission failing: GenServer Mix.Sync.PubSub terminating

Have already tried adding comments and re-submitting a couple of times, but the build is always failing on the error below. Tests are all passing locally.

12:47:06.474 [error] GenServer Mix.Sync.PubSub terminating
** (File.Error) could not make directory (with -p) "/tmp/mix_pubsub/P2SrvbEiUDXbwyyvCwHtmA": permission denied
    (elixir 1.18.1) lib/file.ex:346: File.mkdir_p!/1
    (mix 1.18.1) lib/mix/sync/pubsub.ex:222: Mix.Sync.PubSub.create_subscription_file/2
    (mix 1.18.1) lib/mix/sync/pubsub.ex:142: Mix.Sync.PubSub.handle_call/3
    (stdlib 6.2) gen_server.erl:2381: :gen_server.try_handle_call/4
    (stdlib 6.2) gen_server.erl:2410: :gen_server.handle_msg/6
    (stdlib 6.2) proc_lib.erl:329: :proc_lib.init_p_do_apply/3
Last message (from Mix.PubSub.Subscriber): {:subscribe, #PID<0.111.0>, "/tmp/solution_i2Nb99z1Gd/_build/test"}
State: %{port: nil, hash_to_pids: %{}}
Client Mix.PubSub.Subscriber is alive

    (stdlib 6.2) gen.erl:241: :gen.do_call/4
    (elixir 1.18.1) lib/gen_server.ex:1125: GenServer.call/3
    (mix 1.18.1) lib/mix/sync/pubsub.ex:44: Mix.Sync.PubSub.subscribe/1
    (mix 1.18.1) lib/mix/pubsub/subscriber.ex:21: Mix.PubSub.Subscriber.init/1
    (stdlib 6.2) gen_server.erl:2229: :gen_server.init_it/2
    (stdlib 6.2) gen_server.erl:2184: :gen_server.init_it/6
    (stdlib 6.2) proc_lib.erl:329: :proc_lib.init_p_do_apply/3

12:47:06.574 [notice] Application mix exited: shutdown
** (ArgumentError) errors were found at the given arguments:

  * 1st argument: the table identifier does not refer to an existing ETS table

    (stdlib 6.2) :ets.lookup(Mix.State, :debug)
    (mix 1.18.1) lib/mix/state.ex:26: Mix.State.get/2
    (mix 1.18.1) lib/mix/cli.ex:112: Mix.CLI.run_task/2

Same here.

Angelika Cathor is helping work it out on this Discord thread.

(@angelikatyborska for your reference)

They reverted the PR while trying to figure it out. Just re-submitted and build is passing!

@iHiD / @ErikSchierboom it seems that the /tmp directory is not writable in the test-runner container. Elixir 1.18.x added a feature that prevents compilation units from running concurrently in separate OS processes, and I suspect that, to facilitate that, it uses a file-based semaphore to coordinate the work. The test framework (outside of our control) wants to write to a file in a subdirectory of /tmp, which it creates with mkdir -p.

@angelikatyborska mentioned that, based on the test runner docs, it should be writable by the test process’ user. Can you confirm whether this is the case?

Actually, all directories are writable right now; we just encourage containers to write only to the output and tmp dirs.

I’m pretty certain that /tmp is generally writable, because part of the elixir-test-runner script runs mktemp -d /tmp/solution_XXXXXXXXXX. This happens in a shell script, before any Elixir code runs and therefore before the error described in this post.

Note that we use a custom appuser user in the Elixir docker container to run the tests.

What started happening in Elixir 1.18.x is that mkdir -p /tmp/mix_pubsub/[hash] runs whenever Elixir code gets compiled. The hash is unique for every run. I can see a few possible scenarios in which this command would fail with a permission error:

  1. /tmp/mix_pubsub already exists, with the default permissions of 755, and belongs to a different user.
    • This could happen if the contents of /tmp created during the Docker image build get merged to the tmpfs that gets mounted when running the Docker container. The Docker image build uses the root user to compile some Elixir code which would create a /tmp/mix_pubsub directory. However, AFAIK, Docker’s default behavior is to completely overwrite contents of a mount path.
  2. /tmp/mix_pubsub already exists, and belongs to the same user (appuser), but lacks write permissions, e.g. has permissions of 555 for some reason.
    • This could happen if the Elixir compiler code modifies default permissions of this directory (doubtful, I didn’t find anything like that in its source code).
    • It could also happen if the umask in the Docker container is not equal to the default value of 022, causing new directories to get permissions other than the default 755.
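
Scenario 2 can be reproduced outside the container. The sketch below is only an illustration: it works in a scratch directory from mktemp -d, never in the real /tmp/mix_pubsub, and the function names and the "somehash" subdirectory are made up for the example. It shows which mode a new directory gets under a given umask, and whether mkdir -p of a subdirectory succeeds when the parent has a given mode.

```shell
#!/bin/sh
# Sketch only: helper names and "somehash" are hypothetical, and everything
# happens in a throwaway directory rather than in /tmp/mix_pubsub itself.

# Print the 10-character mode string (e.g. drwxr-xr-x) of a path.
mode_of() {
  ls -ld "$1" | cut -c1-10
}

# Create a directory under the given umask and print the mode it ends up with.
mode_under_umask() {
  scratch=$(mktemp -d) || return 1
  # The subshell keeps the umask change from leaking into the caller.
  (umask "$1" && mkdir -p "$scratch/mix_pubsub")
  mode_of "$scratch/mix_pubsub"
  rm -rf "$scratch"
}

# Force the parent directory to the given mode, then attempt the same kind of
# `mkdir -p parent/hash` call that fails in the log. Prints "ok" or "denied".
try_subdir() {
  scratch=$(mktemp -d) || return 1
  mkdir "$scratch/mix_pubsub"
  chmod "$1" "$scratch/mix_pubsub"
  if mkdir -p "$scratch/mix_pubsub/somehash" 2>/dev/null; then
    echo ok
  else
    echo denied
  fi
  chmod 755 "$scratch/mix_pubsub"   # restore so cleanup always works
  rm -rf "$scratch"
}

echo "umask 022 -> $(mode_under_umask 022)"
echo "umask 222 -> $(mode_under_umask 222)"
echo "parent 755 -> $(try_subdir 755)"
echo "parent 555 -> $(try_subdir 555)"
```

With a umask of 222 the new directory comes out as dr-xr-xr-x (555), and an unprivileged user then gets "denied" when creating a subdirectory in it. Note that root bypasses these permission checks, so the last line prints "ok" when the script runs as root.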

I don’t really know how to begin to debug this without ssh access to the container…
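
One option that doesn't need ssh access might be to have the test-runner shell script print a few facts to the build log right before it invokes Elixir. This is a rough sketch under that assumption; describe_dir is a hypothetical helper, not something that exists in elixir-test-runner, and where exactly to call it is a guess.

```shell
#!/bin/sh
# Sketch only: describe_dir is a hypothetical helper, not part of the real
# test-runner script.

# Print who we are, the current umask, and the owner/mode of a directory,
# which is enough to tell the scenarios above apart.
describe_dir() {
  dir=$1
  echo "uid: $(id -u) gid: $(id -g)"
  echo "umask: $(umask)"
  if [ -e "$dir" ]; then
    echo "exists: yes"
    # -n prints numeric owner and group, so missing passwd entries don't matter.
    ls -ldn "$dir"
  else
    echo "exists: no"
  fi
}

describe_dir /tmp/mix_pubsub
```

If the directory already exists before any Elixir code has run, comparing its numeric owner with the printed uid would distinguish scenario 1 from scenario 2, and the printed umask covers the last bullet.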

The problem has been fixed by @jiegillet and the test runner is now running Elixir 1.18 :slightly_smiling_face:

Thanks Team Elixir!
