Any guidance for String vs Numeric comparison

Hi

I was wondering if anybody could provide some guidance on string vs numeric comparison.

You see, I recently completed the change exercise for AWK. My solution worked in my local AWK (GNU Awk 5.2.2). On Exercism, though, one particular test was failing. After a lot of experimenting in the online editor, I found that this was because, for one test case, my comparison operators > and < were interpreted as operating on strings, while in all the other test cases they were (correctly) interpreted as operating on numbers.

This baffled me, because the other tests passed, even though a string interpretation would have given the wrong answer in some of those cases too.

Can anybody give any guidance on string vs numeric interpretation in awk?

It’s hard to answer without seeing your code.

Awk handles strings and numbers mostly interchangeably, converting from one type to the other as needed.
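
As a rough illustration of that conversion, here is a small sketch run from a POSIX shell. The function name `convert_demo` is made up for this example; the awk program inside it uses only standard arithmetic and concatenation.

```shell
# Hypothetical helper: shows awk converting a string for arithmetic
# and a number for concatenation.
convert_demo() {
    awk 'BEGIN {
        s = "3"        # a string constant
        n = 4          # a numeric constant
        print s + n    # arithmetic converts s to the number 3, prints 7
        print s n      # concatenation converts n to the string "4", prints 34
    }'
}
convert_demo
```

Arithmetic and concatenation each force one interpretation, so these two lines are unambiguous. Comparison is the case where awk has to decide which interpretation to use.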

I’ve got some work-in-progress material for an AWK learning track; see if this helps: awk/concepts/nums-strs/about.md at main · exercism/awk · GitHub

Thank you, it is.

It is. I think my code is too long to post here for this. I suppose the take-home message is to always make sure one “casts” to the right type.

MAWK operates with the following type states: not initialised, number and NOT string, string and NOT number, number AND string, and “could be either, I haven’t checked yet”. The dual state is there to ensure that a particular string or number gets converted at most once. Things from ARGV, ENVIRON, and $i are in the “could be either” state until you do something that forces them to be one or the other.

When you do a comparison, a “could be either” operand is checked. If it looks like a number (possibly with leading and/or trailing white space) it will be converted to “string AND number”.
An uninitialised operand is treated as string “” AND number 0. Finally, if either operand is string and NOT number, a string comparison will be done, otherwise a number comparison. Bear in mind that string literals in a program are string and NOT number. “99 bottles of beer” can be automatically converted to 99 by arithmetic but not by comparison.
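
The comparison rules above can be tried out with a small sketch, run from a POSIX shell. The helper names `cmp_fields` and `cmp_literal` are made up for illustration. The results in the comments are what I would expect from gawk and mawk, and I believe they match the POSIX rules.

```shell
# Hypothetical helper: compare two input fields with <.
cmp_fields() {
    echo "$1 $2" | awk '{ print (($1 < $2) ? "less" : "not less") }'
}

cmp_fields 10 9     # both fields look like numbers: numeric comparison, "not less"
cmp_fields 10 9x    # "9x" is string and NOT number: string comparison, "less"

# Hypothetical helper: a string constant and an uninitialised variable.
cmp_literal() {
    awk 'BEGIN {
        s = "99 bottles of beer"
        print s + 0        # arithmetic converts the leading digits: 99
        print (s > 100)    # string comparison of "99 bottles..." with "100": 1
        print (u == 0)     # uninitialised u compares equal to the number 0: 1
        print (u == "")    # ...and also equal to the empty string: 1
    }'
}
cmp_literal
```

The second `cmp_fields` call is the kind of case that can catch you out: one operand that does not look like a number is enough to turn the whole comparison into a string comparison, and "10" sorts before "9x".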

I take care not to rely on automatic conversion to number in arithmetic by using s+0 as an explicit cast, and not to rely on automatic conversion to string in string operations like concatenation, substring, and matching by using n "" as an explicit cast. Comparison of a field with a literal string or number will do the right thing; in other cases I use an explicit cast again.
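
A minimal sketch of those two casts applied to a comparison, again run from a POSIX shell. The function name `cast_demo` and the variables `a` and `b` are only for illustration.

```shell
# Hypothetical helper: force the comparison type explicitly.
cast_demo() {
    echo "10 9" | awk '{
        a = $1; b = $2
        print ((a + 0) < (b + 0))    # forced numeric: 10 < 9 is false, prints 0
        print ((a "") < (b ""))      # forced string: "10" < "9" is true, prints 1
    }'
}
cast_demo
```

The parentheses around each cast matter here, since they keep the concatenation or addition from being parsed together with the comparison operator.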
