I think it would be better to first implement it for a number of (preferably markedly different) languages, before standardizing the instructions and tests.
I think this could be a nice exercise! Cellular automata are really cool.
We don’t usually go about it this way for new exercises though, as it requires a bit of upstream effort.
I’d say, let some other people weigh in and then we can decide if it would be a worthwhile addition.
The reason I suggest implementing it for a few languages first is so that unforeseen avoidable incompatibilities can be discovered.
On reflection, I do not really expect any such incompatibilities for this exercise, so I guess going via problem-specifications first should be fine.
It shouldn’t matter much what the representation in problem-specifications is. Individual tracks can parse the canonical data into whatever shape they prefer.
The only thing I can think of that matters is the suggested function interface. E.g. in Binary Search Tree, problem-specifications makes no distinction between an empty tree and no tree, which does not match well with languages that have algebraic data types, such as Rust, Haskell, and Gleam.
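To make the point about reshaping canonical data concrete, here is a rough sketch of a track-side conversion. The JSON case, its field names, and the 0/1 matrix encoding are all made up for illustration; nothing in problem-specifications is being quoted here. The hypothetical helper `live_cells` turns the matrix into a set of coordinates, which is just one shape a track might prefer.

```python
import json

# Hypothetical canonical-data case. The field names and the 0/1 matrix
# encoding are assumptions for illustration, not the actual specification.
CASE_JSON = """
{
  "description": "vertical line of three live cells",
  "input": {"matrix": [[0, 1, 0], [0, 1, 0], [0, 1, 0]]}
}
"""


def live_cells(matrix):
    """Convert a row-major 0/1 matrix into a set of (row, col) live cells."""
    return {
        (r, c)
        for r, row in enumerate(matrix)
        for c, cell in enumerate(row)
        if cell == 1
    }


case = json.loads(CASE_JSON)
cells = live_cells(case["input"]["matrix"])
```

A track that prefers strings, bit sets, or a dedicated grid type could do the same kind of conversion in its test generator, so the upstream encoding stays a detail.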
I like the idea! This gives me strong AoC flashbacks. One approach would be to represent the input and output as a string or list of strings (one per row) using, say, # for alive cells and space characters for dead ones. The other input would be the number of rounds to simulate.
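A minimal sketch of what that interface could look like, with a few assumptions the thread has not settled: # marks a live cell and a space a dead one, the rules are Conway's Game of Life (a dead cell with exactly three live neighbours comes alive, a live cell with two or three survives), all rows have the same length, and anything outside the grid counts as dead. The names `step` and `simulate` are placeholders, not a proposed API.

```python
def step(rows):
    """Advance a grid of equal-length strings by one round.

    Assumes '#' is alive, ' ' is dead, and cells outside the grid are dead.
    """
    height = len(rows)
    width = len(rows[0]) if rows else 0

    def alive(r, c):
        # Out-of-bounds positions are treated as dead (an assumption).
        return 0 <= r < height and 0 <= c < width and rows[r][c] == "#"

    next_rows = []
    for r in range(height):
        line = []
        for c in range(width):
            # Count the live cells among the eight surrounding positions.
            neighbours = sum(
                alive(r + dr, c + dc)
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
            )
            # Conway's rules: birth on 3, survival on 2 or 3.
            if neighbours == 3 or (neighbours == 2 and alive(r, c)):
                line.append("#")
            else:
                line.append(" ")
        next_rows.append("".join(line))
    return next_rows


def simulate(rows, rounds):
    """Apply `step` the given number of times and return the final grid."""
    for _ in range(rounds):
        rows = step(rows)
    return rows
```

For example, `simulate([" # ", " # ", " # "], 1)` turns a vertical line of three cells into a horizontal one, and a second round turns it back. Whether the grid should grow, wrap around, or stay fixed at the edges is exactly the kind of thing the canonical data would need to pin down.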
You can start by copying another exercise in the problem specs and modifying all the files.