Issues with Parallel Letter Frequency in JavaScript

I’ve read the example solution and the comments on the pull request introducing the exercise.

The objection raised by Erik in the first comment is still present in the exercise.

proof.js seemingly uses Promises, but actually they (the Promise objects) do not serve anything. The executor functions have reference to the result object and update it directly when the Promises are created, not when they are resolved. resolve(res) and Promise.all have no effect. You can remove them and still pass the tests.

I propose to change the tests to expect Promises - they currently do not - and change the example solution accordingly. I’m willing to make a PR with the changes.

I have not yet coded it out, but I expect that if each thread completion triggers its Promise resolution, that’s the way to extend the solution to use real parallelism with Workers.

I am in no way an expert in JavaScript nor in concurrent programming. But I spent about 2 days trying to make this work with Worker, locally and in the online editor, and all attempts fail.

I can now strongly support the argument for async tests for this exercise.

The problem is, that new Promise() does not really do anything concurrently, until you do something that “forces” asynchronous behaviour. I’m elaborating this on @ErikSchierboom solution:

export const parallelLetterFrequency = (texts) => {
  let frequencies = {};
  Promise.all(texts.map((text) => letterFrequency(frequencies, text)));
  return frequencies;
};

const letterFrequency = (frequencies, text) => {
  return new Promise((resolve) => {
    for (const [letter] of text.toLowerCase().match(/\p{Letter}/gu) || [])
      frequencies[letter] = (frequencies[letter] || 0) + 1;

    resolve(frequencies);
  });
};

This works perfectly well, because it does not do anything asynchronously. It looks like it does, but it doesn’t. It immediately fails (always returns {}), when “forcing” the Promise to go async:

const letterFrequency = (frequencies, text) => {
  return new Promise((resolve) => {
setTimeout(() => {
    for (const [letter] of text.toLowerCase().match(/\p{Letter}/gu) || [])
      frequencies[letter] = (frequencies[letter] || 0) + 1;

    resolve(frequencies);
}, 100);
  });
};

In theory, this should yield exactly the same result as before. But it doesn’t. The real async’ed timeout callback does return the correct result, but the main thread returns {} for all inputs before the Promises are resolved. Promise.allSettled() doesn’t change that.

When waiting for all Promises being resolved using .then(), the variable is indeed filled in correctly. But you cannot wait to return it after .then() is done:

export const parallelLetterFrequency = (texts) => {
  let frequencies = {};
  Promise.all(texts.map((text) => letterFrequency(frequencies, text)))
    .then(() => console.log(frequencies));
  return frequencies;
};

There is no way to get to the correct result reliably from that function, because nothing can wait for the Promise to be resolved, but .then(). And this makes it impossible to do an async solution or one using Worker without real async testing.

Hey there,

Seeing that I’m the one that wrote this implementation, I see it fit to add my comment to this thread.

You can pass the tests for this exercise even if your code doesn’t use Promises and you solve this with a regular for loop. This is so by design, to illustrate the point that

  • Javascript is a single-thread-by-default language and it’s really hard to actually execute it in a parallel fashion.

My example solution uses Promise.all() to “simulate” parallelism (read as “cheat”). What it does is receive an iterable of Promises and wait for all of them to resolve, effectively in this way you get the results in the same time, but the code still executes on a single thread.

Promises and asynchronous functions may look like they execute in parallel but in fact they execute concurrently, so it doesn’t matter if you do the actual processing in the Promise constructor or in the resolve step, you cannot force the use of multiple threads and thus, you cannot execute parallel processes.

The only way to use multiple threads in Javascript, that I know of is Workers. The problem with that is that the implementation in different runtimes (Node and the Chrome engine come to mind) is different, so if your code seems to work in your local environment, it will most certainly break on the web, and vice versa. In other words, it’s unreliable.


if you have an asynchronous function, you’d have to await it to actually block the thread until that function resolves, otherwize your function will execute synchronously. Internally .then() basically does the same as async/await so it’s definitely not the only way to wait for a Promise to fulfill.

Promise.all() does in fact not simulate parallelism, it does not even do anything concurrenctly. The way it is used here only makes students believe they are stupid because it is used against documentation and the way Promises work.

In synchronous code, nothing can wait for a Promise to resolve. You cannot use await, because await requires an async function. Which returns a Promise, not a value.

This has nothing to do with JavaScript being single-threaded or async/await. Promise.all() is documented to collect a bunch of promises into an array and pass on execution to .then() if all of them resolve().

Why the example code works is the simple fact, that all Promises constructors execute the given Closures before returning control to the caller. That is synchronous to the core. If you go concurrent in the constructor closure (like setTimeout() does), the Promise will concurrently resolve once control returns to the Event handler loop (which is, when the test function ends, if I remember the step debugger session correctly).

So, using an external collector object to collect the results from the Promises simply is wrong. It is teaching the wrong thing to students.

1 Like

Sorry to pile on @Cool-Katt , but

Your solution does not wait for them, that’s the problem. It doesn’t simulate anything. Try commenting out resolve(res) (promises forever remaining unresolved), it’ll make no difference to the tests.

I just saw that they’ve included it in the video. Oh my God! :grin:

The point of this specific exercise isn’t to teach how Promises work. We already have an exercise for that.

The point is to show that Javascript is not a parallel language and executing tasks in parallel is not possible, unless one uses Worker threads, which is a whole can of worms on it’s own.

Yes, your criticism is valid, the tests do not expect asynchronous execution and my specific solution works because the actual processing happens when a new Promise is created, not fulfilled.

As I said before, you can complete the exercise even with a simple for loop, and that is so by design. We don’t want to limit people to only asynchronous solutions and tell them that “it’s good enough”, we want to teach why JS works the way it does, and i’ve tried to illustrate the point as best as I can in the docs for this exercise.

I’m open for suggestions to improve things (tho it’s up to the maintainers if they’d get accepted), after all everyone is free to make their own implementation. Mine may not be the perfect one, but under the circumstance, I don’t think such exists

So you limit students to only synchronous solutions instead, which is totally against the idea of parallel letter frequency? JavaScript allows for single-threaded concurrency when using async testing, and that’s what I expected it to be.

Putting this Promise solution into the track example, the community solutions and the video is showing wrong JavaScript usage. It helps in no way. Sorry. Frustrating people with wrong promises doesn’t teach anything about limitations.

Going real parallel with Worker is totally impossible with the current tests and telling someone about Workers in the instructions is totally misleading. Async testing would allow to use Workers and open much more headway for experiences. Having only synchronous solutions helps not at all.

Just to be clear, the example solution is entirely irrelevant here. Examples are purely used for CI purposes. They can be the most terrible code in the world and that’s absolutely fine. So I don’t see any value in discussing the merits of the example. Equally, people’s community solutions are individual’s solutions to the exercise - I’m not happy for those to be debated either.

However, there are clearly merits to designing the tests in a way to allow concurrency or parallelism. So a more tangible discussion here is around the question of “how could the tests be rewritten to test concurrency/parallelism?” In Ruby, we do that by benchmarking a sequential solution and then the students and ensuring the student’s is faster.

A second question is whether the exercise should be on the track at all. If it’s impossible to write concurrent code in JavaScript, then it probably shouldn’t be there. However, if it’s possible to write concurrent code in JavaScript, we should test that concurrency somehow in the tests.

A final point would be how we use Approaches/Articles for this (which Katt refers to the initial PR too)


Finally, let’s turn the tone down a little in this thread. It’s getting unnecessarily heated. We’re all trying to do the best for the students - let’s respect that about each other in how we’re communicating please :slight_smile:

3 Likes

I also added code to proof, that this supports synchronous, concurrent and parallel execution with no changes required to existing solutions.

Forgive me if I made a wrong assumption, but I thought that the examples serve to demonstrate (to the CI system) that there exist a solution, the proposed exercise can be solved.

The tests are the crux of the problem here. I couldn’t figure out how how are we supposed to pull an asynchronous value into a synchronous function and I turned to the provided example as a sanity check.

I think it does merit discussion that the exercise was at fault. Let’s not forget, the exercise introduction’s stated goals are to show parallelism in JavaScript using Promises and Workers . The exercise, as it now stands, doesn’t live up to its own goals.

The exercise should be in the track, it can be in the track - just not like this.

I think that this exercise is still perfectly valid for the track, as you can definitely execute code concurrently and there is a way to execute code in parallel but the point would be to show that Javascript is not really designed as a parallel language.

The exercise introduction also clearly states that JS is a single-threaded language and there is only one way to achieve actual parallelism. It also acknowledges that a solution that doesn’t implement concurrency can pass the tests. My reasoning behind that was that I don’t want to limit students to a particular way of solving the exercise, as I couldn’t really find a proper way to test if the implementation uses Workers. (Worker threads don’t have to be used is asynchronous functions, so an async test wouldn’t necessarily prove if a student has used Worker threads or not.)

As I said earlier, improvements are welcome, as long as they don’t disrupt current solutions and don’t limit ways to solve the exercise.

Also, for the sake of completeness, if the idea is that you need to await the Promise.all() in a synchronous function, you can do this

(async () => {
    await Promise.all(formatedTexts.map((t) => 
        processSingleText(t, result)))
})()

Not that it helps with the tests. I agree with the point that the current tests do not really apply to implementations using an async function. Maybe we can focus on how to make them better?

@iHiD that part about benchmarking against a synchronous solution seems interesting. Mind sharing an example?

Well, to put it bluntly, you shouldn’t have done that.

Let’s go a level above to the problem spec:

Count the frequency of letters in texts using parallel computation.

The whole reason for this exercise to exist is to do stuff in parallel. That’s what I was trying to express above when Jeremy replied with we should “allow” concurrency. We shouldn’t just allow it. We should expect it in this exercise.

As for JS being inherently single-threaded: With Promises you can fire off requests to separate servers and wait for their responses - parallelism! not necessarily on the same machine. Fibbing on the exact way that the sub-tasks execute is okay. That is not what I and @mk-mxp are arguing about.

The goal of the exercise is to show how to execute code in a parallel fashion, on the language level. However, Javascript, as a language, was not designed to work in a parallel fashion, so it lacks the language features that many other languages have in order to handle parallel code execution.

There is, according to the docs and to my knowledge, only one way to execute code in parallel - Worker threads. If we have to be pedantic about things, we should expect a real parallel execution solution to use those and we should test for it.

I wasn’t able to find a way to test for worker threads implementation that would be reliable, so I didn’t do it.

If we point to a regular synchronous solution and say that it doesn’t work since it doesn’t fit the goal, then by extension, a concurrent one also doesn’t fit the goal, since it is not executing code in parallel, no matter how much it may look like it does, so why force students to comply with a requirement, only to tell them that “we know this isn’t parallelism but it’s good enough”?

It is better to teach students why concurrency is not the same as parallelism and why we say that it’s good enough, under the circumstance. Something that i’ve tried to explain as best as I can in the instructions to the exercise.

It does. But my point is that the issue is with the tests, not with the example. If the example passes the tests, it’s a valid file. Obviously, discussing the example is a good way to discuss what the tests should do, but I’d like the focus of the discussion not to be on whether the example is good/bad - but rather how the tests should be changed.


Here’s my take. If someone can write some tests that ensure only parallel (or concurrent) solution’s can pass, then we should probably make that change.

@mk-mxp’s set of tests allow concurrent/parallel code to be written, but I’m not sure they force people in that direction. Can anyone come up with some tests that do force concurrency or parallelism?

(Here’s the Ruby Bencmarking comparison test I mentioned - not that it’ll necessarily help, but for completeness).

I’d prefer sticking with “students can solve it the way it is meant to be solved, mentors teach how to do it properly”. Or, as you said here: “we don’t invest in anti-cheating mechansims”.

jest does have mechanisms to test asynchronous code, in particular something like this (from the docs):

test('the data is peanut butter', async () => {
  await expect(fetchData()).resolves.toBe('peanut butter');
});

We could refactor the tests to expect asynchronous functions, but then we are going to invalidate all solutions that use a regular for loop, i.e. regular synchronous solutions. I’m not sure if that’s the desired outcome but I suppose it’s closer to it than what we have now.

I don’t think I follow. Care to elaborate?
If we’re just going to agree to disagree and not limit the solutions in any way, then what was the point of this discussion?


I’m also not sure if there is an easy way to tackle the bigger problem here, which I think should be ‘How do we test if a solution is really using Worker threads to execute code in parallel?’

one proposal was:

See Allow synchronous and concurrent and parallel solutions to Parallel Letter Frequency by mk-mxp · Pull Request #2418 · exercism/javascript · GitHub, as posted above. No existing code is invalidated.