Matching brackets tests don't cover more than one balanced bracket group

This exercise can be solved using recursive regular expressions (see PCRE RECURSIVE PATTERNS or Perl regular expressions - Extended Patterns for more details). Some incorrect solutions pass all current tests: for example, input like ()txt() is handled wrongly by solutions that nevertheless pass every current test.
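For context, the classic recursive pattern for balanced brackets in PCRE syntax looks roughly like this (an illustration only, not a vetted solution; subtly wrong variants of it still pass the current test suite, which is the point of this issue):

```
^(?<b>(?:[^()\[\]{}]|\((?&b)\)|\[(?&b)\]|\{(?&b)\})*)$
```

Here `(?&b)` recurses into the named group `b`, so the pattern accepts any number of nested or adjacent balanced groups mixed with non-bracket text.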

There is a prepared PR: Add tests for matching brackets for multiple groups #2167

  1. Does the complex latex expression test not cover this? Or are you saying that an even number vs odd number of brackets need to be tested independently?
  2. Your PR includes four tests. Is the fourth test sufficient to cover all the cases presented by the prior three tests?
  1. Does the complex latex expression test not cover this? Or are you saying that an even number vs odd number of brackets need to be tested independently?

No, the complex latex expression test is basically "some prefix<balanced brackets group>": one long balanced bracket group with a prefix. The math expression test then covers "<balanced brackets group>some suffix". What I observed was a solution that passed all test cases but rejected "<balanced brackets group>some inserted text<balanced brackets group>". Every current test case contains a single balanced or unbalanced bracket group; there is not a single test case with more than one bracket group.

No, even vs odd numbers of brackets were not crucial in the cases I observed, but I’m not sure there is no case where it could be an issue.

  1. Your PR includes four tests. Is the fourth test sufficient to cover all the cases presented by the prior three tests?

No. It may seem counterintuitive, but regex-based solutions can be written incorrectly in a way that correctly accepts "<balanced brackets group>some inserted text<balanced brackets group>" yet erroneously rejects "<balanced brackets group><balanced brackets group>", while another wrong solution behaves the other way around. This is usually not a problem with stack-based solutions.

Unfortunately, there is an infinite number of possible erroneous solutions to the exercise. The ideal would be property-based tests, as with PropEr or QuickCheck, but such tests can’t be expressed in canonical-data.json.
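The spirit of such property-based tests can be sketched without external libraries: generate random inputs and compare a candidate solution against a stack-based oracle (a Python illustration; the function names here are ours, not from any track):

```python
import random


def reference_is_balanced(text: str) -> bool:
    """Stack-based oracle: True if all brackets are balanced and nested."""
    pairs = {")": "(", "]": "[", "}": "{"}
    stack = []
    for char in text:
        if char in "([{":
            stack.append(char)
        elif char in pairs:
            if not stack or stack.pop() != pairs[char]:
                return False
    return not stack


def random_input(max_len: int = 12) -> str:
    """Random mix of brackets and filler text; some balanced, some not."""
    alphabet = "()[]{}ab"
    return "".join(random.choice(alphabet)
                   for _ in range(random.randint(0, max_len)))


def property_check(candidate, trials: int = 1000) -> bool:
    """The property: the candidate must agree with the oracle on every input."""
    return all(candidate(s) == reference_is_balanced(s)
               for s in (random_input() for _ in range(trials)))
```

A buggy candidate (say, one that only counts brackets without tracking order) would typically be exposed within a few hundred random trials, which is exactly what a fixed set of hand-picked cases cannot guarantee.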

Yup. There are. That is why it makes sense to pick a minimal sample of tests that provide reasonable coverage of most implementations. There’s always the possibility of some implementation passing all the unit tests but failing on some other input. The value of adding specific tests to detect those specific implementations is relatively low: one of an infinite number of bad implementations will be caught while the infinitely many others are not.

Adding tests does carry a cost, though: more PRs, more code to maintain, more tests to run (times n students times m runs of the tests). If there is a specific and clear value in adding a test, and that value outweighs the cost, it may make sense to discuss adding said test.

If one test is as good as four, it would seem the value the additional three tests is very low and probably not worthwhile. It sounds like testing for “{}” would suffice for this specific implementation.

The tests already have a “{}” input. Testing for “{}” vs “{}{}” sounds like it might catch one very specific implementation; I’m not convinced it adds a whole lot of value overall, though.

You are right, I missed that one. I will remove that one test case. Unfortunately, the changes will not be visible until the PR is reopened.

I didn’t mean that one of the cases was not needed. I meant that only one is needed and covers all the cases, i.e. there is no value in having more than one test case in the PR.

That said, I’m still not convinced that this test case would be commonly helpful across student submissions.

I didn’t mean that one of the cases was not needed. I meant that only one is needed and covers all the cases, i.e. there is no value in having more than one test case in the PR.

I don’t understand. If this is true, why are there already twenty test cases when only one is needed?

That said, I’m still not convinced that this test case would be commonly helpful across student submissions.

When I was solving this exercise I came up with several erroneous solutions, all of which passed the current tests but would fail one of the proposed tests. Fortunately, I noticed those bugs. Writing correct complex regular expressions is a notoriously hard problem. Test cases that catch a wider range of bugs would be very helpful; the current set is not sufficient.

At one point I had a solution that passed all tests (including "{}[]") but failed on input like "[({}[])]([{}])". More specifically, it failed on inputs like "(){[]}" and "[{}]()" but not on "{}[]".

At another point I had a solution that passed all of the above but failed on "()foo[]".

Then I managed a radical redesign and came up with a correct solution, but in the process I noticed that I could easily produce more wrong ones. The proposed tests try to cover most of them.

I’m not sure what you would like from me. The simplest tests? Then I don’t understand why "math expression" and "complex latex expression" are already there. One catch-all complex test? That can’t be done in principle, which is the reason there are already 18 different simple test cases.

In good humour:

Some people, when confronted with a problem, think
“I know, I’ll use regular expressions.” Now they have two problems.

https://web.archive.org/web/20221028113912/http://regex.info/blog/2006-09-15/247

Regarding the value of testing for this:

This exercise isn’t best solved by using regular expressions. Bracket matching requires a context-free grammar, while regular expressions typically only support regular grammars; a regex engine extended with recursion is no longer matching a regular language. This exercise is optimally, and pretty easily, solved using a stack.
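For illustration, a minimal stack-based check looks like this (a Python sketch; the function name is ours, not taken from any track's example solution):

```python
def is_balanced(text: str) -> bool:
    """Return True if all brackets in `text` are balanced and properly nested."""
    pairs = {")": "(", "]": "[", "}": "{"}
    stack = []
    for char in text:
        if char in "([{":
            stack.append(char)          # remember every opener
        elif char in pairs:
            # a closer must match the most recent unmatched opener
            if not stack or stack.pop() != pairs[char]:
                return False
    return not stack                    # leftover openers mean unbalanced
```

This handles any number of adjacent groups ("()txt()") for free, which is exactly the gap the proposed tests target.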

Similar to my comment on the ISBN spec change, tests should reflect a gap that is common across multiple students or multiple approaches, not specific to one solution.

Regarding the number of tests added:

When adding tests (problem spec tests, code unit tests), each new test should ideally be testing for one specific case, e.g. balanced brackets. That case ideally would be handled by exactly one single test. If balanced brackets is a gap that should be covered by the tests, there should ideally be one test for balanced brackets. There is no value in testing for balanced brackets multiple times if the various tests aren’t testing for different cases.

You don’t have to waste so many words just to say you are not interested in contributions. I will not waste your precious time. Bye.

@pichi-42 - A Python track maintainer here. I’m sorry that you feel like this is too much work and discussion for what looks like a small or straightforward contribution. The conversation is not meant as a gate or a challenge to your expertise or effort. It is also not intended to keep you from contributing.

What you may or may not know is that problem specifications, and the test data within them, are shared across all 61 programming language tracks. While tracks can choose to opt out of the data, skip cases for an exercise, or in some cases add track-specific cases of their own, it’s important that we keep the canonical data as small and clean as possible, since every change triggers exercise updates for the programming language tracks that use it.

We also try to keep the data as generic as we can, and only add test cases that are significant for a wide swath of tracks or approaches. That means we debate and discuss a lot more, and for more reasons than we do when considering track-specific updates. We also accept way fewer PRs and at a slower pace, because at least three maintainers (and often more than that) have to form a consensus around the proposed changes.

@IsaacG is weighing in from his perspective, as will other track maintainers as they have time and interest. He may sound super-critical to you - but he really does want to understand if what is being proposed will benefit a wide range of programming language tracks and student solutions.

One of the big issues here is that very few programming languages have regex engines that support recursion in the core of the language. A quick look shows Ruby 2+, Elixir, C, R, Erlang, PHP, Perl, Raku, and those that embed or use PCRE. Python only has access to recursive regexes through a third-party library, and we currently don’t allow third-party libraries on our track.

So I am rather ambivalent here, since solutions on my track must take the approach that Isaac mentions above in his spoiler tag.

That doesn’t mean your contribution is not wanted. It means that you need to make a case that this would benefit the tracks that have recursive regexes, or would be an important track-specific add-on for those tracks. The tracks would then decide whether the additional test cases were wanted in canonical data or in track addenda.

And that is a lot of work and negotiation. What I might suggest instead is that you consider writing up an approach document (see this post and this one for how those look on the website) for Elixir and Erlang (and any other languages where you have expertise) to point out to students what recursive regex is, what its pitfalls are, and how to test for errors.
