Hi, how are you? Hope everyone is having a great time.
I have a little doubt I would like your opinion about.
I solved the Parallel Letter Frequency exercism in Javascript. I mean all the local tests pass. Local tests meaning they all pass on my computer.
My solution includes using Worker threads. When i submit all the files for my solution i see that one of the tests fails when run on exercism’s servers.
Error: thrown: "Exceeded timeout of 5000 ms for a test.
Add a timeout value to this test to increase the timeout, if this is a long-running test. See https://jestjs.io/docs/api#testname-fn-timeout."
at test (<solution>/parallel-letter-frequency.spec.js:163:3)
at _dispatchDescribe (<solution>/node_modules/jest-circus/build/index.js:91:26)
at describe (<solution>/node_modules/jest-circus/build/index.js:55:5)
Locally, this test passes in less than the time it takes the online counterpart:
✓ many small texts (1009 ms)
The local test passes in 1 second, the online counterpart takes more than 5 seconds to complete and fails.
We can’t tell you if there is something wrong with your code if we can’t see it Would you post it here in a code block?
I have no doubt that your code works given more time to execute. One of the lessons of this exercise is, that Worker are very slow to launch and really should be used rarely. Nevertheless it is possible to stay under the time limit.
const { Worker } = require("worker_threads");
const path = require("path");
const WORKERS = 10;
function divideTextIntoSegments(text, numThreads) {
const textLength = text.length;
const segmentLength = Math.ceil(textLength / numThreads); // Calculate length of each segment
const segments = [];
let startIndex = 0;
// Iterate over the text and create substrings
for (let i = 0; i < numThreads; i++) {
const endIndex = Math.min(startIndex + segmentLength, textLength); // Calculate end index of current segment
const segment = text.substring(startIndex, endIndex); // Extract substring
if (segment.length > 0) segments.push(segment); // Add substring to segments array
startIndex = endIndex; // Update start index for next segment
}
return segments;
}
function createWorker(stringSegment) {
return new Promise((resolve, reject) => {
const worker = new Worker(path.resolve(__dirname, "./worker.js"));
// Listen for the 'message' event to indicate task completion
worker.on("message", (message) => {
resolve(message); // Resolve the promise when the task is complete
});
// Listen for errors
worker.on("error", (error) => {
reject(error); // Reject the promise if an error occurs
});
worker.postMessage(stringSegment);
});
}
function containsOnlyIllegalCharacters(str) {
// Check if the string contains only illegal characters
return /^[^\p{Letter}]+$/u.test(str);
}
function mergeObjects(mergedResults, ...objects) {
// Iterate over each object
for (const obj of objects) {
// Iterate over the keys of the current object
for (const key in obj) {
if (obj.hasOwnProperty(key)) {
// Add the value from the current object to the corresponding property in mergedResults
mergedResults[key] = (mergedResults[key] || 0) + obj[key];
}
}
}
}
export const parallelLetterFrequency = async (texts) => {
// let's clean the global variable where we return results
let mergedResults = {};
if (texts.length < 1) return mergedResults;
for (let i = 0; i < texts.length; i++) {
if (containsOnlyIllegalCharacters(texts[i])) continue;
const segments = divideTextIntoSegments(texts[i].toLowerCase(), WORKERS);
const workerPromises = [];
segments.forEach((segment) => {
if (segment.length > 0) workerPromises.push(createWorker(segment));
});
try {
const results = await Promise.all(workerPromises);
mergeObjects(mergedResults, ...results);
} catch (error) {
console.error("Error occurred:", error);
}
}
return mergedResults;
};
This is worker.js:
const { parentPort } = require("worker_threads");
function isOtherThanLetters(str) {
return /[^\p{Letter}]/u.test(str);
}
parentPort.on("message", (stringSegment) => {
const letterFrequencies = {};
for (let i = 0; i < stringSegment.length; i++) {
if (isOtherThanLetters(stringSegment[i])) continue;
letterFrequencies[stringSegment[i]] =
(letterFrequencies[stringSegment[i]] || 0) + 1;
}
parentPort.postMessage(letterFrequencies);
parentPort.close(); // Close the communication channel
});
Thank you for your help.
I did not know Workers were slow. Maybe i should use another approach?
@glennj That doesn’t prevent Workers from working as required. It just does not help with speeding things up through parallel execution.
@aleleonian I read your code. You always start max. 10 Workers for each texts item (or 1 Worker for a single character). Then you wait for the 10 Workers to finish counting before launching the next 10 Workers for the next texts item.
The test that times out has 50 texts items with 6 characters each. So you launch 50x a batch of 6 Workers and wait for the batch to finish before going to the next batch.
This is very inefficient. And it doesn’t return a Promise that continues to execute in the background, as you await the batch result and mergeObjects() of each batch. These are the points I see where you can do something to provide a solution that does not hit the timeout.
@SleeplessByte Is the 5 seconds limit for each test intentional? I haven’t read about that in the track docs…
To chime in on the issue, Worker threads are probably the least efficient way to solve this exercise (in JS), so if you spawn a worker for each letter of your input, it will definitely take longer to execute than the test permits, in any case.
If you limit the amount of workers it should be fine.
Even if it’s unintentional, this is a good guard against solutions that spawn too many Workers, I think.