Say you are processing a list using one of the Enum.______ functions. How does Elixir know when you have reached the end of the list and need to stop trying to process with that Enum.____ function? I’m not looking for “just have faith in the language” type of answers, please. I want to know how Elixir knows to stop. Is it checking something internally, and if so, what?
The source code is here: elixir/enum.ex at v1.14.5 · elixir-lang/elixir · GitHub
I hardly know Elixir, but it looks to me like each data structure is required to provide their own implementation, or else use a generic default implementation. In the case of lists and Enum.map
, it looks like evaluation is delegated to :lists.map
, which I do not know how to find the source code of. However, as Elixir’s lists are singly linked lists, I expect it to repeatedly pattern match against [head | tail]
until this fails.
I do not know how exactly Elixir represents its lists, but it is probably essentially a struct a struct (element_pointer, tail_pointer)
, with the null pointer indicating an empty list(tag, element_pointer, tail_pointer)
where the tag
indicates whether this struct represents an empty list or a cons cell.
As MatthijsBlom says, it’s up to the enumerated data type to supply that information. See here: Enum — Elixir v1.14.5. “In Elixir, an enumerable is any data type that implements the Enumerable
protocol.”
To add more details to @MatthijsBlom’s answer, lists.map
is using recursion (Source).
As result, at some points, the list that you have passed as argument to the Enum.______ function will be empty and it will stop looping. It uses recursion because data are immutable in Elixir:
Due to immutability, loops in Elixir are written differently from imperative languages.
The source code is
map(F, List) when is_function(F, 1) ->
case List of
[Hd | Tail] -> [F(Hd) | map_1(F, Tail)];
[] -> []
end.
map_1(F, [Hd | Tail]) ->
[F(Hd) | map_1(F, Tail)];
map_1(_F, []) ->
[].
Do you know why map
pattern matches first, instead of immediately delegating to map_1
? That is, why is the implementation not
map(F, List) when is_function(F, 1) ->
map_1(F, List)
map_1(F, [Hd | Tail]) ->
[F(Hd) | map_1(F, Tail)];
map_1(_F, []) ->
[].
According to this commit title, the reason is for optimization, but I don’t know more than that.
Previously, the map
function looked more simple:
map(F, [H|T]) ->
[F(H)|map(F, T)];
map(F, []) when is_function(F, 1) -> [].
So it “knows” its done when it patterrn matches the empty tail?
map
is implemented as a recursive function, so technically no: it does not itself pattern match on the tail, but instead delegates that to the recursive call. But otherwise yes: a pattern match on the empty list signals the end of operations.