Rotational-cipher, atbash-cipher, affine-cipher: the English alphabet is not the Latin/Roman alphabet

ellnix · February 9, 2025, 8:46am

In problem-specifications,

atbash-cipher transforms what it calls the “Latin” alphabet, but it’s really the English alphabet
affine-cipher claims the “Roman” alphabet has 26 letters
rotational-cipher, like atbash-cipher, transforms the “Latin” alphabet, which is really just the English alphabet

I’m no alphabet expert, but there are many Latin/Roman alphabets (including the English one), and the classical Roman alphabet had 23 letters, not 26. Besides, I think just saying “English” would be clearer, since what really matters here is that the letters are contiguous in ASCII.

P. S.: Sorry if I’m in the wrong category, wrote this in a bit of a rush.
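The point about the letters being contiguous in ASCII can be illustrated with a minimal sketch. This is not the exercise's reference solution; the function name `atbash` and the choice to leave non-letters unchanged are assumptions made for illustration only.

```python
import string

# The 26 lowercase letters occupy consecutive code points 97..122 in ASCII.
assert [ord(c) for c in string.ascii_lowercase] == list(range(97, 123))


def atbash(text: str) -> str:
    """Mirror each lowercase ASCII letter (a<->z, b<->y, ...).

    Hypothetical helper: other characters are passed through unchanged.
    """
    out = []
    for ch in text.lower():
        if "a" <= ch <= "z":
            # Contiguity makes the mirror a single arithmetic step.
            out.append(chr(ord("a") + ord("z") - ord(ch)))
        else:
            out.append(ch)
    return "".join(out)


print(atbash("test"))  # gvhg
```

Because the mapping relies only on code-point arithmetic, it works for exactly the 26 letters a to z and nothing else, which is the property the naming question turns on.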

ellnix · February 9, 2025, 12:33pm

Here’s the PR:

IsaacG · February 9, 2025, 12:49pm

The Latin alphabet is also called the Roman alphabet (see “Latin alphabet” on Wikipedia).

The modern Latin alphabet is used by many, many languages, not just English (see “ISO basic Latin alphabet” on Wikipedia).

I don’t think this should be changed.

ellnix · February 9, 2025, 1:22pm

I was not aware of the ISO basic Latin alphabet at all, and it seems very recent (post-ASCII). Your first Wikipedia link for “Latin alphabet” itself shows 23 letters; I would argue that nobody takes “Latin alphabet” to mean “ISO basic Latin Alphabet”.

I don’t really have a horse in this race. I only noticed this because the atbash exercise mentions how ancient the cipher is, so the reference to the Latin alphabet confused me a bit.

tasx · February 9, 2025, 1:44pm

The canonical data for both the Atbash and Affine ciphers use the word ‘English.’ The Rotational cipher doesn’t specify anything, but I prefer using ‘English’ for this one as well. I don’t know much about alphabets, and I’d rather avoid confusion by using terminology I’m unfamiliar with.

iHiD · February 9, 2025, 1:48pm

I appreciate the post.

I think that “latin” is very commonly used to mean what these exercises imply. In encoding we see latin1 in MySQL, for example. I think it’s a commonly enough used word in tech now that it’s useful for people to be familiar with it.

That said if you are from a part of the world where the Romans and Latin have basically zero meaning to you, I imagine this could be confusing without some explanation.

So my initial feeling is that changing these to “English alphabet” doesn’t really have any disadvantages and does improve clarity. I’m a tentative +1, but I don’t have a very strong feeling.

If people don’t want to change Latin to English, I feel more strongly that changing Roman to Latin is good, so that we’re internally consistent.

BNAndras · February 9, 2025, 4:19pm

In high school and college, I don’t recall someone ever saying the Roman alphabet. The people are Roman, and the language is Latin.

As an aside, the Latin alphabet has been different lengths over the years: 21 letters (archaic), 23 (classical), 25 (medieval), or 26 (modern). So if we say Latin, we should specify modern Latin to remove ambiguity.

IsaacG · February 9, 2025, 4:58pm

It feels a bit colonialist to me to claim the Latin alphabet is the English alphabet. We could just as easily call it the Indonesian alphabet or Malaysian alphabet. That’s why I prefer sticking to Latin over English.

iHiD · February 9, 2025, 5:08pm

I like “Modern Latin”.

BethanyG · February 9, 2025, 6:07pm

I like ‘Modern Latin’ as well.

My temptation would be to refer to the ISO latin1 (or IBM CP 437 / Windows CP 1252) standard in parentheses, but the reality is that these exercises are only using ASCII < 128 (and not even upper case at that), so those standards don’t describe it. But caveating to ASCII < 128 might lead to more student confusion than leaving it.

And maybe we change the ASCII < 128 at some point in the tests? It would break fancy bit shifting solutions (or might), but it would then also be more accurately “latin”, as described by the ISO/IBM/MS standards.
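To make the concern about non-ASCII input concrete, here is a hedged sketch of how code-point arithmetic can go wrong once a Latin letter outside a to z appears. Both function names are hypothetical, and both handle lowercase only for brevity; neither is taken from the exercises or their tests.

```python
def rotate_naive(text: str, key: int) -> str:
    """Hypothetical solution that treats every alphabetic character as a-z."""
    return "".join(
        chr((ord(ch) - ord("a") + key) % 26 + ord("a")) if ch.isalpha() else ch
        for ch in text
    )


def rotate_guarded(text: str, key: int) -> str:
    """Hypothetical solution that shifts only a-z and passes the rest through."""
    return "".join(
        chr((ord(ch) - ord("a") + key) % 26 + ord("a")) if "a" <= ch <= "z" else ch
        for ch in text
    )


word = "caf\u00e9"  # "café"; U+00E9 is alphabetic but lies outside a-z
print(rotate_naive(word, 13))    # pnst  (the accented letter is silently mangled)
print(rotate_guarded(word, 13))  # pnsé  (the accented letter is left alone)
```

The naive version maps "é" to "t" because `str.isalpha()` accepts any Unicode letter, while the modular arithmetic assumes a 26-letter contiguous range. Whether tests should ever include such input is exactly the open question above.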

BNAndras · February 10, 2025, 3:14am

I think it’s preferable to say English alphabet here. Our test data consists of English words and phrases. Those are written in the English alphabet, which just happens to be the same set of letters as the modern Latin letters. If we had Latin words or phrases, the modern Latin alphabet would be appropriate, and English alphabet would not.

tasx · February 10, 2025, 8:48am

The instructions should also be clarified in some places:

> The Caesar cipher is a simple shift cipher that relies on transposing all the letters in the alphabet using an integer key between 0 and 26.

This is valid only when the alphabet has 26 letters.

> Using a key of 0 and 26 will always yield the same output due to modular arithmetic.

What does “output” refer to here? Why does it not say “letter”?

This is another one of those math oriented exercises. The more specific they are, the better.

As for “modern latin”, this will force most people to start googling about it. Not really the best option when someone is already struggling to understand the math. The affine cipher does not even mention the letters of the alphabet.

IsaacG · February 10, 2025, 4:08pm

0 to 26 inclusive actually requires 27 letters.
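The arithmetic behind this exchange can be checked with a short sketch. It assumes a 26-letter alphabet and a hypothetical `rotate` function written for illustration; it is not the exercise's reference implementation.

```python
ALPHABET_SIZE = 26  # assumption: the 26 letters a-z / A-Z


def rotate(text: str, key: int) -> str:
    """Shift ASCII letters by `key` positions, wrapping with modular arithmetic."""
    out = []
    for ch in text:
        if "a" <= ch <= "z":
            base = ord("a")
        elif "A" <= ch <= "Z":
            base = ord("A")
        else:
            out.append(ch)  # non-letters are left unchanged
            continue
        out.append(chr((ord(ch) - base + key) % ALPHABET_SIZE + base))
    return "".join(out)


keys = range(0, 27)  # "between 0 and 26", read as inclusive
distinct = {rotate("abc", k) for k in keys}

print(len(keys))      # 27 keys
print(len(distinct))  # 26 distinct results
print(rotate("abc", 0) == rotate("abc", 26))  # True
```

With 26 letters, the 27 keys from 0 to 26 produce only 26 distinct shifts, since 26 is congruent to 0 modulo 26. For all 27 keys to behave differently the alphabet would need at least 27 letters, which is the point being made here.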

shebang · February 11, 2025, 12:38am

> I would argue that nobody takes “Latin alphabet” to mean “ISO basic Latin Alphabet”.

Perhaps outside the IT industry that’s true, but in a computer science context the terminology is based on this standard, which was first published in 1987 (the current edition dates from 1998). Its first 128 code points (0 to 127) are encoded identically to US-ASCII.

https://www.charset.org/charsets/iso-8859-1
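The overlap with US-ASCII can be checked with Python's built-in codecs. This is a small sketch using the standard `"ascii"` and `"latin-1"` codec names; the variable names are chosen for illustration.

```python
# Every byte value 0..127 decodes to the same character under both codecs.
ascii_bytes = bytes(range(128))
same = ascii_bytes.decode("ascii") == ascii_bytes.decode("latin-1")
print(same)  # True

# Above 127 the two diverge: Latin-1 has accented letters, ASCII has nothing.
e_acute = "\u00e9".encode("latin-1")  # "é"
print(e_acute)  # b'\xe9'

try:
    "\u00e9".encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print(ascii_ok)  # False
```

So the letters a to z and A to Z have the same encoding in both, but "Latin-1" as a character set covers more letters than the 26 the exercises use.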

The expected result of most of the exercises (at least the ones I’ve done so far) isn’t evident strictly from the textual descriptions on the website, but rather from the contents and output of the test script. Learning to interpret those is as much a valid and necessary aspect of software development as writing the actual code.

BNAndras · February 14, 2025, 11:32pm

If we don’t have a consensus on changing all three references to English or modern Latin alphabet, do we have a consensus on renaming the affine-cipher reference to Latin for consistency with the other two references?

iHiD · February 19, 2025, 5:24pm

You have an executive nod. Please go ahead. (or @ellnix as they opened this if they’d like the rep! :))

ellnix · February 19, 2025, 6:26pm

It’s fine by me, @BNAndras can go ahead

BNAndras · February 22, 2025, 1:57am