If you don’t come from a computer science background, the current instructions for this exercise are hard to make heads or tails of. I asked ChatGPT to write a clearer version, which you can copy and paste directly. I can’t contribute to the GitHub repo, so I’m posting it here.
Problem Overview
You are going to write two functions, `encode` and `decode`, which work with a way of storing numbers called Variable Length Quantity (VLQ).
VLQ is a method for storing integers using fewer bytes when the number is small. This helps save space when encoding many numbers.
Each number is broken into 7-bit chunks, and each chunk is stored in a single byte. The most significant bit (leftmost) of each byte is used to indicate whether there are more bytes to come:
- If the most significant bit is `1`, it means “more bytes follow.”
- If the most significant bit is `0`, it means “this is the final byte.”
This way, small numbers (0–127) need just one byte, while larger numbers can take more.
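To make the chunking concrete, here is a minimal sketch of encoding a single number. The helper name `encode_one` is my own and not part of the exercise, and it assumes the input is a non-negative integer:

```python
from typing import List


def encode_one(number: int) -> List[int]:
    """Encode one non-negative integer as VLQ bytes (hypothetical helper)."""
    # The final byte carries the lowest 7 bits and keeps its top bit clear.
    chunks = [number & 0x7F]
    number >>= 7
    # Every earlier byte carries 7 more bits and has its top bit (0x80) set.
    while number > 0:
        chunks.append((number & 0x7F) | 0x80)
        number >>= 7
    # Chunks were collected lowest-first; VLQ stores the highest chunk first.
    return chunks[::-1]


print([hex(b) for b in encode_one(128)])  # ['0x81', '0x0']
```

A full `encode` could call a helper like this for each number in the list and concatenate the results.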
Example Conversions:
Decimal Number | Binary (7-bit chunks) | VLQ Bytes (in hex) |
---|---|---|
0 | 0000000 | 0x00 |
127 | 1111111 | 0x7F |
128 | 0000001 0000000 | 0x81 0x00 |
8192 | 1000000 0000000 | 0xC0 0x00 |
You will only work with unsigned 32-bit integers in this exercise.
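Since each byte carries 7 bits of the number, a 32-bit value never needs more than five bytes. A quick way to check this, using a helper name (`vlq_length`) that I made up for illustration:

```python
def vlq_length(number: int) -> int:
    """Return how many VLQ bytes a non-negative integer needs (hypothetical helper)."""
    # bit_length() is 0 for the number 0, which still needs one byte.
    bits = max(number.bit_length(), 1)
    # Ceiling division by 7, because each byte holds 7 bits of payload.
    return (bits + 6) // 7


print(vlq_length(0xFFFFFFFF))  # 5
```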
Functions to Implement
encode(numbers: List[int]) -> List[int]
Takes a list of integers and returns a list of bytes (as integers) in VLQ format.
decode(bytes: List[int]) -> List[int]
Takes a list of bytes (as integers) and reconstructs the original list of numbers.
Handling Errors
While decoding, you may get a list of bytes that doesn’t end properly — for example, it keeps saying “more bytes are coming,” but no more bytes are actually present.
This is known as an incomplete sequence, and your function must raise a `ValueError` when it happens:
raise ValueError("incomplete sequence")
This helps catch and report corrupted or invalid input.
What is an “Incomplete Sequence”?
When decoding, a byte with the most significant bit set to `1` tells your code: “there is another byte coming after me.”
If the list ends without a final byte (i.e. one with a leading `0` bit), then the input is incomplete, and it’s impossible to fully decode the number.
Example of invalid input:
decode([0x81]) # invalid — no final byte with MSB = 0
# should raise ValueError("incomplete sequence")
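Here is one possible sketch of a decoder that detects this case. It is only an illustration, not the official solution. I named the parameter `data` instead of `bytes` to avoid shadowing Python's built-in `bytes` type, and the `in_progress` flag is my own choice for tracking an unfinished number:

```python
from typing import List


def decode(data: List[int]) -> List[int]:
    """Decode a list of VLQ bytes into the numbers they represent."""
    numbers = []
    current = 0
    in_progress = False  # True while the last byte seen promised more bytes
    for byte in data:
        # Make room for 7 more bits and add this byte's payload.
        current = (current << 7) | (byte & 0x7F)
        if byte & 0x80:
            # Top bit set: more bytes belong to this number.
            in_progress = True
        else:
            # Top bit clear: this was the final byte of the number.
            numbers.append(current)
            current = 0
            in_progress = False
    if in_progress:
        # The input ended while a number was still waiting for its final byte.
        raise ValueError("incomplete sequence")
    return numbers


print(decode([0x81, 0x00, 0x7F]))  # [128, 127]
```

Checking the flag once after the loop is enough, because the only way to end up incomplete is for the very last byte to have its top bit set.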