Community Solutions search oddity

Searching for ‘105’ in solutions finds results like this one with no two-digit numbers in it. I guess it’s doing some sort of fuzzy search, but – it’s too fuzzy!

If there’s going to be any sort of fuzzy matching, please give us a checkbox to disable it. In this case I was curious who might have used this specific constant; any fuzzy matching defeats that purpose.

I have no idea how the search is implemented (I’d be curious to know), but in this particular case there seems to be a raindrops.spec.js file that contains the 105 constant.

Thanks, that is indeed the cause. All of the matches on ‘105’ either (1) actually used that in their code, (2) had it in a comment or debug code, or (3) had for some reason submitted the spec file along with their code.

So, revised ask: can the search please omit spurious files? In particular, any file which came as part of the problem set and is unmodified should be excluded from the search domain. (This leaves a couple of fuzzy areas: files which were part of the original and were modified; and how to tell if files are unmodified, when the exercise itself may have changed over time. I would say ‘restrict it to the main solution file’, but I don’t think that is a firm, stable concept across all tracks & exercises…)
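To make the ask concrete, here is a minimal sketch of the "drop unmodified provided files" idea. I don't know how the real indexer works, so everything here is an assumption: the function names, the dict-of-file-contents shape, and the idea that the site could keep a set of content hashes for every known version of each provided file (which is one way to cope with exercises changing over time).

```python
import hashlib


def sha256_of(text: str) -> str:
    """Content hash used to compare a submitted file with a provided one."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def searchable_files(
    submitted: dict[str, str],
    known_provided_hashes: dict[str, set[str]],
) -> dict[str, str]:
    """Return only the submitted files that should be searchable.

    `submitted` maps filename -> file contents (hypothetical shape).
    `known_provided_hashes` maps filename -> hashes of every known
    version of that file as shipped with the exercise (hypothetical).
    A file is dropped only if it matches one of those versions exactly.
    """
    return {
        name: body
        for name, body in submitted.items()
        if sha256_of(body) not in known_provided_hashes.get(name, set())
    }


# Example: the spec file is byte-for-byte the provided one, so it is dropped.
provided_spec = "expect(convert(105)).toEqual('PlingPlangPlong');"
solution = {
    "raindrops.js": "export const convert = (n) => String(n);",
    "raindrops.spec.js": provided_spec,
}
known = {"raindrops.spec.js": {sha256_of(provided_spec)}}
kept = searchable_files(solution, known)
```

Note that a modified copy of a provided file would still be kept by this check, which is one of the fuzzy areas mentioned above.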

The person has submitted a modified tests file as part of their solution, so excluding unmodified files wouldn’t help here.

We could block test files altogether from search, but I’m not keen to rebuild all our search indexes for this (which will cost time and $$$). How widespread an issue is this? Is this something you’ve run into once or repeatedly?

At least twice; hitting it a 2nd time was what made me go to the trouble of raising it as an issue.

Modifying the indexer to ignore test files seems like an improvement, though there might be rare corner cases where someone does want to search users’ modified test files. But I am confident that the use case for that is an order of magnitude smaller than the use case for ‘just search the solutions’.
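For illustration only, "ignore test files" could be as small as a filename filter applied before indexing. This is a sketch under assumptions: the pattern list is made up (the only real filename in this thread is raindrops.spec.js), and the actual indexer may well have a better source of truth for which files are tests on each track.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns; each track would need its own list.
TEST_FILE_PATTERNS = ("*.spec.js", "*_test.*", "test_*.py")


def is_test_file(path: str) -> bool:
    """True if the file's base name matches any test-file pattern."""
    name = path.rsplit("/", 1)[-1]
    return any(fnmatchcase(name, pattern) for pattern in TEST_FILE_PATTERNS)


def files_to_index(paths: list[str]) -> list[str]:
    """Keep only the paths that do not look like test files."""
    return [path for path in paths if not is_test_file(path)]


indexed = files_to_index(
    ["raindrops.js", "raindrops.spec.js", "src/raindrops_test.go"]
)
```

Unlike a check for unmodified files, this also drops test files that the submitter has edited, which is the case that caused the spurious match here.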

Reindexing with only the actual solutions should make the index slightly smaller and faster to search in the future, so any CPU costs would be recouped over time. Dev cost of ‘leave out the test files’ is nonzero, but should be small. Dev / ops cost of issuing ‘hey, do a reindex now’ should, I hope, be both constant and tiny!
