Bob: Exercise text should mention linebreak handling

I suggest adding the following line at the end:

Bob always responds with a single sentence, regardless of the number of lines in the input.

This is done in bob: Mention linebreak handling by oleks · Pull Request #2489 · exercism/problem-specifications · GitHub

In particular, the canonical-data.json lists the example:

\nDoes this cryogenic chamber make me look fat?\nNo.

To which Bob should evidently respond simply:

Whatever.
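For readers who want to see this concretely, here is a minimal sketch of a string-in, string-out solution. The function name `response` is taken from the Python test quoted later in this thread; the reply rules are the usual Bob rules as I understand them, and the `strip()` call is an assumption about how a track might handle surrounding whitespace, not something the exercise text prescribes.

```python
def response(hey_bob: str) -> str:
    """Return Bob's single reply to the whole input string (sketch)."""
    # Assumption: leading/trailing whitespace, including newlines, is ignored.
    text = hey_bob.strip()
    if not text:
        return "Fine. Be that way!"
    is_question = text.endswith("?")
    # str.isupper() is True only if there is at least one cased character
    # and all cased characters are uppercase.
    is_yelling = text.isupper()
    if is_question and is_yelling:
        return "Calm down, I know what I'm doing!"
    if is_question:
        return "Sure."
    if is_yelling:
        return "Whoa, chill out!"
    return "Whatever."


# The whole input ends in "No.", so it is neither a question nor yelling.
print(response("\nDoes this cryogenic chamber make me look fat?\nNo."))
```

Under these assumptions the embedded newline needs no special handling: only the end of the whole string decides whether the input counts as a question.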

I’m tentatively ok with this. Maintainers?

In most languages, the input is a string and the output is a string. Discussing “lines” doesn’t make a ton of sense in that context.

Maybe the term line breaks is more fitting than lines.

A line is not the same as a string. But ok, how about we let it say, “Bob always responds with a single sentence, regardless of the span of the input.”

Exactly. If the input is a string, why would anyone think it matters how many lines it is? When you use a length function to get the length of a string, do you need a caveat that the number of lines or paragraphs matters? If you count the number of characters in a string, would you think the span of the string matters?

Most exercises run code with some given input data and expect back another piece of data. Typically the input is one or more strings or integers, and the result is a string, integer, or Boolean. Tacking on, “the length of the string doesn’t matter” seems odd to me.

I think that what @IsaacG says is a sensible way to think about strings.

However, there are languages where a line break may be considered as termination of input, particularly if this was a sort of scenario where the user would input strings from a console.

So I can also see a case where, having seen just this particular test, someone might wonder why it is handled as one string and not as two separate strings/sentences, particularly if that person is not an experienced learner.


In this particular exercise, however, it is never explicitly stated whether the input is considered to be the whole string, or if each sentence is a separate input.

I’m not following. Why do the instructions need to mention anything about the span or line breaks of the input? Don’t they already cover all possible answers?

I’m with Isaac here.

In such languages, there can be an instruction append that clarifies this. Having it in the main instructions could be really confusing for people in languages that are not that specific.

Yes - this is a bias - of all the issues that I’ve seen on the Python track with the Bob exercise, the linefeed or \n character/codepoint is not a common one. The logic and the punctuation are the biggest gotchas for learners, IMHO.

But the Python test generator trims the \n from all test cases except this one:

    def test_other_whitespace(self):
        self.assertEqual(response("\n\r \t"), "Fine. Be that way!")

So, if we stopped doing that, we would probably put in a hint and an instruction append warning students that they should probably use trim() or NOT use trim() depending on what we wanted them to match on.
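To illustrate the trimming trade-off: Python has no `trim()`; the closest built-in is `str.strip()`, which with no arguments removes leading and trailing whitespace, including `\n`, `\r`, and `\t`. The helper names and the sample input below are hypothetical, chosen only to show how the two choices diverge.

```python
def is_question_raw(text: str) -> bool:
    # No trimming: a trailing newline hides the question mark.
    return text.endswith("?")


def is_question_stripped(text: str) -> bool:
    # Trim first, then look at the last visible character.
    return text.strip().endswith("?")


def is_silence(text: str) -> bool:
    # Whitespace-only input counts as saying nothing.
    return text.strip() == ""


sample = "Are you there?\n"  # hypothetical input, not from the canonical data
print(is_question_raw(sample))
print(is_question_stripped(sample))
print(is_silence("\n\r \t"))
```

With the sample above, the untrimmed check misses the question while the trimmed one catches it, which is the kind of difference a hint or instruction append would need to warn students about.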

Maybe this is less about lines and more about sentences. I think the point of the OP is that Bob doesn’t reply once to each sentence in the input, but only replies to the final sentence. I can see why, given two sentences, someone might think you should give two replies back (maybe even using the newline as a delimiter in the output).

Obviously, we can say that the student should read the tests to determine that, and I’m all for that in principle, but I guess that the test for this was added for a reason (probably users not knowing how to deal with this scenario), and maybe that reason should translate somehow into the instructions too.

I think what’s a bit weird is the test being called “multiple line question” but really it’s “two sentences split over lines”. It’s not really testing the newline as a delimiter (which I’d argue falls under TDD), it’s more testing that you only respond to the last bit. I think :person_shrugging:

I can definitely see this from both angles.

Does this imply all inputs must be sentences?

This exercise asks you to write code which takes a string and returns a string. “A string” is fairly unambiguous in most contexts here.

It is, but a string also has some meaning/context in the model of the exercise, which is about responding to a person. We don’t mention “string” in the instructions. We say “Your task is to determine what Bob will reply to someone when they say something to him or ask him a question.”

What’s weird about this example is that it isn’t clear who is saying what to Bob, etc. If you take it purely as a string-in, string-out formula, then yes, it’s probably clear. But if you enter the theme of the story, it’s less obvious to me all the ways someone might interpret things.

But let’s get into some more firm territory by asking a question…

@oleksss In proposing this, did you hit the issue/confusion yourself when solving the exercise? Did you try responding with multiple statements? Or what was it that you ran into that made you suggest this? Or did you just think it was a good idea?

My initial implementation on the awk track (albeit based on the description alone, without reading the tests) was to respond to each line of input. That is not an unreasonable interpretation, given that awk is, quite inherently, a line-processing utility.

I then discovered that this wasn’t what the awk tests wanted, and adjusted my implementation. My motivation for suggesting this change is to not lead future learners astray with an insufficiently precise description, as I was.

However, I think my confusion comes naturally in a setting where the skeleton processes a stream of input characters (e.g., awk or bash). In languages where the skeleton takes in and outputs a string, I agree that the confusion is probably unlikely.

Hence, I no longer think that we need to change the general exercise text. Just make additions on the awk track, and perhaps others.
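The two readings can be contrasted in a short sketch. It is written in Python rather than awk, and `reply` and `reply_per_line` are hypothetical names; the per-line variant only imitates what a line-oriented tool would naturally do and is not taken from any track's tests.

```python
def reply(text: str) -> str:
    """Bob's reply to one piece of input (usual Bob rules, assumed)."""
    text = text.strip()
    if not text:
        return "Fine. Be that way!"
    question, yelling = text.endswith("?"), text.isupper()
    if question and yelling:
        return "Calm down, I know what I'm doing!"
    if question:
        return "Sure."
    if yelling:
        return "Whoa, chill out!"
    return "Whatever."


def reply_per_line(text: str) -> list[str]:
    """Line-oriented reading: one reply per input line, empty lines included."""
    return [reply(line) for line in text.split("\n")]


data = "\nDoes this cryogenic chamber make me look fat?\nNo."
print(reply_per_line(data))  # three replies, one per line
print(reply(data))           # one reply to the whole input
```

For this input the per-line reading produces three replies ("Fine. Be that way!", "Sure.", "Whatever.") while the whole-string reading produces only "Whatever.", which is why a track-level note seems more useful on stream-oriented tracks.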

This exercise asks you to write code which takes a string and returns a string.

Nowhere in the exercise text does it mention the word “string”. That is a language-specific detail that on some tracks, the skeleton will take in and output a string.

It also doesn’t say anything about sentences. Actually, an equally valid implementation is to just answer “Whatever.” if you query Bob with multiple sentences (regardless of the format of the last sentence), due to the formulation of the “Whatever.” case.

I feel like an amendment may be worth considering because, as it is now, there can be ambiguity, as Jeremy said.

It is never explicitly stated that the entire string should be considered one input, or if each sentence is a separate input. This may be considered trivial in many languages and to many people, but it can also be misinterpreted by some learners.

Possible solutions would be, either of:

  • do nothing and assume people will look at the tests to figure it out
  • change the canonical description to mention that “Bob always gives a single reply, regardless of how many things are said to him at a time”
  • leave it to per-track instruction appends, for languages where it can easily be misinterpreted (for example, because the default input is separated by newlines)
  • rename the existing test/make a new one that better represents the situation.

Another option available to tracks like awk and bash is to exclude specific test cases with newlines in the input.

It would be a judgement call that the track maintainers might have already considered and rejected.

I don’t think we need to change the canonical description.

Having pondered this and read everyone’s comments, I think the issue is with the test case. I think it’s an unnatural test for the exercise.

I propose changing (via a reimplemented case)
"\nDoes this cryogenic chamber make me look fat?\nNo."
to
"\nDoes this cryogenic chamber make\n me look fat?"

I think that’s a better example for something described as “multiple line question” and I think it’s intuitive to the consumer what to do with it.

If tracks like AWK then want to add a clarifying statement, that would be extra nice. But I think the issue is with the test case being “unnatural”, rather than the problem statement being insufficiently descriptive.

How would people feel about that change?

@oleksss would you be willing to propose this change instead?

9 people (including you) have given their thumbs-up and no one has complained, so I think we have enough consensus. The great thing about this is that tracks do not need to accept the change, but the overall state will indeed be better with it.

Sure. However, I am not sure about the intended procedure: should I create a new PR and reference this forum topic, have that PR automatically closed, and then manually re-opened, or do I augment the original PR and ask you to re-open it?
