Why sort community solutions by most submitted?

senekor · August 3, 2024, 5:54pm

I was surprised to notice that the community solutions tab sorts by “most submitted” by default. I wonder whether this isn’t counter-productive? After all, this deliberately suppresses solutions that are talented, brilliant, incredible, amazing, show stopping, spectacular, never the same, totally unique, completely not ever been done before. On the other hand, one would expect the uninteresting output of AI assistants to rise to the top quickly.

I guess one could argue the most submitted solutions are probably the most idiomatic ones? I’m not sure if I find that convincing though.

Also, “most starred” doesn’t even seem to be a sorting option anymore, which I find a little strange as well.

This isn’t the first post about this topic, but I didn’t find any talking about the potential downside of sorting by “most submitted” specifically.

kotp · August 3, 2024, 6:59pm

It is kind of nice to see that your submission is not included in the “Most Submitted” immediately. And maybe also reassuring when it does happen. Then go look at other solutions to get some other insights.

The “most starred” does not inherently mean anything. What do you star solutions for? I star them when they are interesting “to me”, not as the “best solution”, for example, while others may star for “Best Solution”, “Lowest memory consumption”, “Fastest Solution”, “Easiest to Read and Understand”, or for any other reason of their own.

So what are your points on the downside of sorting by “most submitted”?

senekor · August 3, 2024, 7:24pm

“Best Solution”, “Lowest memory consumption”, “Fastest Solution”, “Easiest to Read and Understand”

I want to see all of these. It doesn’t matter what the star means to anyone. Most generally, the star means “I think others might benefit from seeing this too”. At least that’s what it meant when it was possible to sort by most starred.

So what are your points on the downside of sorting by “most submitted”?

The more unique a solution is, the more it will be hidden. There are often very interesting solutions that are unique in the sense that it’s very unlikely many other people will submit the exact same code.

Then go look at other solutions to get some other insights.

When you look at other solutions, do you specifically navigate to the last page to look at unique solutions? Even if someone did that, unique+interesting solutions will be buried among unique+uninteresting solutions.

What bugs me is that the star mechanism was a decent, if imperfect, mechanism to surface unique+interesting solutions. Sorting by most submitted makes it very hard to find these.

kotp · August 3, 2024, 7:46pm

Sorting by most submitted is probably not the sort that you would use for “searching”. So then I select a different order, and search that way. Often by “Most reputation” and “newest” to get a mix of “experienced on the platform” (since that is what “reputation” represents here) as well as seeing new solutions that are being submitted, sometimes as an nth pass, but by all levels of experience.

I will then look for likely code statements when I can, for example looking for solutions that use yield or something that I am interested in.

The search facility is not exhaustive, so I will often look for community solutions with a search engine where I can use some kind of “fuzzy matching” to find what I am interested in seeing, but this is, of course, off platform.

Yes, or maybe not “last” but “later” pages. So I will skip around pages to see things that are happening based on paging, but not just based on the “Most submitted”. I will also look at “lines of code”, though I know that sometimes a solution with few lines of code may not show up as such because its lines are heavily commented, and I have not noticed that the LOC count (lines of code count) strips out comment lines.

senekor · August 3, 2024, 8:07pm

That sounds very tedious. From a UX perspective, I think it would be very beneficial to provide better discoverability of unique+interesting solutions. Most people probably navigate to community solutions, check the first few ones, maybe scroll a bit and maybe click on a couple solutions to see the full code.

On a side note, the LOC counter uses tokei, which generally does ignore comments, but it depends on the language support. If one notices that comments are counted, that can probably be fixed. (relevant docs)
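
To illustrate the idea of a line count that ignores comments, here is a minimal sketch. It is not how tokei works internally; `countCodeLines` is a hypothetical helper that only understands blank lines and whole-line `//` comments, and it assumes block comments and `//` inside strings do not occur.

```javascript
// Hypothetical helper (not tokei's algorithm): counts lines that contain
// code, skipping blank lines and lines that are only a `//` comment.
// Block comments and `//` inside string literals are not handled.
function countCodeLines(source) {
  return source
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line !== "" && !line.startsWith("//")).length;
}

const sample = [
  "// Build an acronym from a sentence",
  "const parse = (s) =>",
  "  s.toUpperCase() // a trailing comment still sits on a code line",
  "",
  "// done",
].join("\n");

console.log(countCodeLines(sample)); // 2
```

With a counter like this, a heavily commented solution and its uncommented twin report the same length, which is the behaviour one would want when sorting by lines of code.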

kotp · August 3, 2024, 8:43pm

Using a search engine to find interesting solutions? Why would that be tedious? Or are you responding to something different?

Nice, thank you for that.

andrerfcsantos · August 3, 2024, 9:52pm

I think the idea is that the most submitted solutions are supposed to be the ones the student worked on the most, ideally with a mentor. I believe this default sorting is trying to reward solutions that were probably mentored and solutions that show the most progress made, through mentoring or not.

This is to avoid someone seeing a “brilliant” solution with just one submission and thinking “I must be dumb, this person got this solution first try!”. Of course the person who submitted probably didn’t get that solution first try, but their history of solutions doesn’t show that. By the default sorting being by most submitted, you are passing the message that the “best” solutions are the ones you iterate on several times.

So, to answer your question, yes, I think the idea here is that solutions that have the most submissions are the most worked on, hence being more idiomatic.

I don’t take a look at community solutions very often, so I don’t really have an opinion on whether this is a good sorting or not. From your experience seeing community solutions, do you think sorting solutions by “most submitted” is rewarding solutions that have more work put into them?

I can see the system being abused easily - one can just do trivial submissions to increase the count and have their solution at the top. So, while I think I understand the idea behind the sorting in theory, in practice, I’m not sure it’s rewarding the solutions with the most work.

IsaacG · August 3, 2024, 9:54pm

For people newer to a language, most idiomatic is often quite useful. And there are a lot of people new to the language that use the community solutions to see if they approached it in an idiomatic manner! I wouldn’t write that off.

Historically early solutions got stars for being early and solutions with stars tended to accumulate more stars. Most stars historically did not flag great solutions. Once the community solutions started getting bucketed by “similar solutions” using the representer, per-solution stars made less sense as the new UI sorts buckets and not individual solutions.

kotp · August 3, 2024, 10:37pm

I do not think the iterations are measured in the “Most Submitted” report, just that the published solution is identical to others. In other words, one published submission from a student is identical to that many other published submissions; it is not a count of iterations, and so gives no indication of the “number of times iterated”.

However, seeing a collection of iterations from one person may be interesting to see, but this is something that I might look at from a “High Reputation User” that likely has an interesting number of different approaches for a single exercise.

mk-mxp · August 4, 2024, 5:09am

I’m looking at many community solutions of 48in24 exercises in their week. I do that in PHP, JavaScript, TypeScript and Bash. And I also look at many of those solutions (especially the most submitted ones) when checking whether updates to an exercise in the PHP track might break existing solutions.

I can’t support the assumption that “most submitted” has anything to do with “idiomatic”. Or “most worked on”. Or mentoring efforts.

To me, “most submitted” reflects nothing else but “these were seen identical by the representer”. The quality of the solutions bucketed together rarely is better than the quality of others. Sometimes the “most reputation” solutions are even worse than those, but interestingly the “most reputation” ones are seldom solving the problem using the same approach as the “most submitted” ones.

Because of that, I believe the only benefit of showing the “most submitted” ordering first is the least amount of (nearly) identical solutions per page.
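
As a rough sketch of what “most submitted” appears to measure according to the posts above: published solutions are bucketed by their representation, and the buckets are ordered by size. The `represent` function below is a hypothetical stand-in that only collapses whitespace; real track representers differ from this and from each other.

```javascript
// Hypothetical stand-in for a track representer: it only collapses
// whitespace. Real representers are track-specific and more involved.
const represent = (code) => code.replace(/\s+/g, " ").trim();

// Bucket published solutions by representation, then order the buckets
// by how many solutions landed in each one ("most submitted" first).
function groupBySubmissions(solutions) {
  const buckets = new Map();
  for (const code of solutions) {
    const key = represent(code);
    const bucket = buckets.get(key) ?? { example: code, count: 0 };
    bucket.count += 1;
    buckets.set(key, bucket);
  }
  return [...buckets.values()].sort((a, b) => b.count - a.count);
}

const published = [
  "const twoFer = (name = 'you') => `One for ${name}, one for me.`;",
  "const twoFer = (name = 'you') =>\n  `One for ${name}, one for me.`;",
  "function twoFer(name) { return 'One for ' + (name || 'you') + ', one for me.'; }",
];

const ranked = groupBySubmissions(published);
console.log(ranked.map((bucket) => bucket.count)); // [ 2, 1 ]
```

Nothing in this count looks at iterations, mentoring, or quality; it only records how many published solutions ended up with the same representation.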

kotp · August 4, 2024, 6:38am

I agree with your assessment regarding the idiomatic or effort.

Also, I have a bit of reputation, and I do not typically try to solve them idiomatically, though I might argue “worse” is subjective, while also not being “what you would do in production”. That is not the reason for those solutions that I create. (I can only speak for myself.)

I think your belief is correct, in that it reduces potential pages and pages and pages of the same solutions as the representer would reduce them to. (Though not sure why it will group solutions where they are identical other than the double quotes or single quote changes, for ones that I have recently looked at, which is Two-Fer on Ruby, for example.)

iHiD · August 4, 2024, 11:20am

I’ll answer this from the perspective of the logic we chose to implement. I’m repeating things that others have said, but just so you understand why we made the decision we did.

There are two use-cases for community solutions:

1. How does everyone else do this?
2. What really cool ways are there to do this?

From an educational perspective, both have merit, but (1) is the more important in my eyes. Seeing the most common approaches tends to help you quickly identify “normal” / idiomatic things you might not have done. This is why we default to most submitted.

The “cool ways to do this” is well served by the “Sort by Highest Rep User”, where high-rep users tend to do more clever cool things. Maybe we should move this into second place in the dropdown.

The starred system fundamentally wasn’t working (see Isaac’s post) and it also wasn’t possible to do in the new way of grouping solutions (which removed a huge volume of duplicates).

Maybe “remembering” which option you last chose and defaulting to it in future would be good.

thelmalu · August 28, 2024, 2:42am

I don’t understand ‘most submitted’ at all. I recently came across a solution with 24 claimed submissions. In the sort for highest reputation user solutions I found a solution that differed from these in only two places: a variable named ‘s’ in the multi-submission code now became a word starting with ‘s’. A loop that was otherwise identical in code and variable names had a different spacing after its beginning line. This was listed as having one submission. Surely you’re not looking for clones – all that would indicate is that people are simply copying published code.
I am not particularly good at this and my solutions tend to be clumsier than many. But nor am I creative. So I’m getting essentially the same code as you more facile people. I don’t feel that everything I do is unique, yet I have never seen a solution of mine ring up more than one submission. --thelma

IsaacG · August 28, 2024, 4:00am

Yes, “most submitted” looks for similar solutions. No, it does not indicate copy/pasting. The way “similar solutions” are computed differs from track to track, but the default sort is the same across all tracks. Some track representers are able to mark solutions as “similar” even when the variable names are different while other representers are not able to do so.
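
To make the difference between representers concrete, here is a toy sketch. Both `strict` and `lenient` are hypothetical: the first only collapses whitespace, the second additionally renames single arrow-function parameters to placeholders. Neither is the actual representer of any track.

```javascript
const collapseWhitespace = (code) => code.replace(/\s+/g, " ").trim();

// Toy renamer: finds single arrow-function parameters such as `s =>`
// and replaces every whole-word use with a numbered placeholder.
// A real representer would work on a parsed syntax tree instead.
function renameParameters(code) {
  const params = [...code.matchAll(/\(?\b([A-Za-z_]\w*)\)?\s*=>/g)].map(
    (match) => match[1]
  );
  return [...new Set(params)].reduce(
    (out, name, i) => out.replace(new RegExp(`\\b${name}\\b`, "g"), `P${i}`),
    code
  );
}

// Hypothetical representers: one ignores variable names, one does not.
const strict = (code) => collapseWhitespace(code);
const lenient = (code) => renameParameters(collapseWhitespace(code));

const a = "const parse = s => s.toUpperCase()";
const b = "const parse = sentence =>\n  sentence.toUpperCase()";

console.log(strict(a) === strict(b)); // false: counted as two solutions
console.log(lenient(a) === lenient(b)); // true: counted as one bucket of 2
```

Under the `strict` variant, renaming a single variable is enough to land a solution in its own bucket with one submission, which matches the kind of observation described above.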

thelmalu · August 28, 2024, 11:47pm

export const parse = (sentence) =>
  sentence
    .toUpperCase()
    .match(/[a-z']+/gi)
    .map((x) => x[0])
    .join("")

export const parse = s => s
  .toUpperCase()
  .match(/[A-Z']+/g)
  .map(x => x[0])
  .join("")

These are two Community Solutions to Acronym on the JavaScript track. The first claims to be a single submission. The second is listed as 1 of 26.

iHiD · August 29, 2024, 12:14am

They use different regular expressions, so they’re different ways to solve an exercise in my eyes.

thelmalu · August 29, 2024, 12:46am

I give up. You’re right. The first one makes the text all upper case, matches it against a lower case alphabet and makes up for that with a case-insensitive match – I guess it’s reasonable to leave that out of the others…so I not only code sloppy; I read sloppy. Thanks
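
For reference, a quick check of the two solutions quoted above, with the `export` keyword dropped and the functions renamed to `parseA` and `parseB` (names chosen here for illustration). Because the input is upper-cased first, `/[a-z']+/gi` and `/[A-Z']+/g` match the same runs of characters, so the two should agree even though the source text differs.

```javascript
// The two quoted solutions, renamed so they can sit side by side.
const parseA = (sentence) =>
  sentence
    .toUpperCase()
    .match(/[a-z']+/gi) // lowercase class, rescued by the `i` flag
    .map((x) => x[0])
    .join("");

const parseB = (s) =>
  s
    .toUpperCase()
    .match(/[A-Z']+/g) // uppercase class, no flag needed after toUpperCase()
    .map((x) => x[0])
    .join("");

const inputs = [
  "Portable Network Graphics",
  "Halley's Comet",
  "complementary metal-oxide semiconductor",
];

for (const input of inputs) {
  console.log(input, "->", parseA(input), parseB(input));
}
```

Same output, different regular expression: a grouping that compares the code itself will treat them as two solutions.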