My exercise solution vs performance benchmark

walker-dev · January 4, 2024, 9:08am

This article was fun to read but it would be awesome if I could hit compare and see how my code stacks up to the benchmark as well. I know you had a link to the code but integrating it as an advanced stat in exercism would be fun too.

IsaacG · January 4, 2024, 2:59pm

Benchmarking tends to be CPU intensive since you often need to run the code thousands of times. Exercism runs on the cloud so CPU intensive loads means expensive.

Benchmarking is affected by other loads on the machine. Exercism uses shared containers which are often highly loaded. This makes their use of containers more efficient and less wasteful, but makes them not good for benchmarking.

timeit is part of the Python Standard Library. This would be a good opportunity to learn to use it and to implement different approaches if you have a local computer with Python installed!

walker-dev · January 4, 2024, 3:35pm

I thought those might be along the lines of why it’s not done.

I’ll give that a go locally thanks.

kotp · January 4, 2024, 3:41pm

That would be good to see, you can always post your findings, either as a comment in the “published” comment area, or as comments in the code, if you want to share them on the platform, tied to your solution/exercise.

walker-dev · January 6, 2024, 6:41am

Mine ended up being a decent bit slower than the non recursive deep dive solutions but a touch quicker than the recursive.

So strange seeing the code run from a docstring but I guess doctest can do similar sort of things. Thanks for the suggestion to give it a go.

reversed for:                 3.921377321999898e-06
replace reverse enumerate:    3.884075114000552e-06
recursion:                    1.2924645969999801e-05
my regex solution:                    6.8628480469997155e-06

IsaacG · January 6, 2024, 1:01pm

Note, with times like that, one solution can be significantly impacted by, say, a CPU spike caused by your browser getting an update in a tab. It might be worth running 1,000 to 1,000,000 more iterations to get a more stable value

walker-dev · January 12, 2024, 1:17am

rna-transcription (performance)

There is a cool little comment at the end of the dig deeper section saying that the translate / maketrans method tested four times faster which made want to ask:

Is there a good way to anticipate this performance gain? and/or,
A good way to work out why it is faster rather than just writing lots of implementations and benchmarking?

IsaacG · January 12, 2024, 3:27am

There is big-O analysis which gets you the broad picture. To understanding why translate is faster would require understanding what is going on under the hood, and would likely depend on which Python implementation you are using.

Benchmarking is usually the way to go. Reading the byte code can help. Without actually looking at it, I would guess translate uses compiled C code, as many built-in cpython method do.

IsaacG · January 12, 2024, 3:39am

>>> LOOKUP = str.maketrans("GCTA", "CGAU")
>>> def to_rna_translate(dna_strand):
...     return dna_strand.translate(LOOKUP)

>>> D_LOOKUP = {"G": "C", "C": "G", "T": "A", "A": "U"}
>>> def to_rna_dict(dna_strand):
...     return ''.join(D_LOOKUP[chr] for chr in dna_strand)

>>> import dis
>>> dis.dis(to_rna_dict)
  1           0 RESUME                   0

              # ''.join
  2           2 LOAD_CONST               1 ('')
              4 LOAD_METHOD              0 (join)  
              # the generator expression passed to join() -- see below
             26 LOAD_CONST               2 (<code object <genexpr> at 0x7fe41cfbba50, file "<stdin>", line 2>)
             28 MAKE_FUNCTION            0
             30 LOAD_FAST                0 (dna_strand)
              # Call the generator and join() function
             32 GET_ITER
             34 PRECALL                  0
             38 CALL                     0
             48 PRECALL                  1
             52 CALL                     1
             62 RETURN_VALUE

Disassembly of <code object <genexpr> at 0x7fe41cfbba50, file "<stdin>", line 2>:
  2           0 RETURN_GENERATOR
              2 POP_TOP
              4 RESUME                   0
              6 LOAD_FAST                0 (.0)
        >>    8 FOR_ITER                17 (to 44)
              # Load chr and the dictionary
             10 STORE_FAST               1 (chr)
             12 LOAD_GLOBAL              0 (D_LOOKUP)
             24 LOAD_FAST                1 (chr)
              # Fetch the value from the dictionary
             26 BINARY_SUBSCR
              # yield one character to join
             36 YIELD_VALUE
             38 RESUME                   1
             40 POP_TOP
             42 JUMP_BACKWARD           18 (to 8)
        >>   44 LOAD_CONST               0 (None)
             46 RETURN_VALUE

# Ignoring the maketrans() call...
>>> dis.dis(to_rna_translate)
  1           0 RESUME                   0

  2           2 LOAD_FAST                0 (dna_strand)
              4 LOAD_METHOD              0 (translate)
             26 LOAD_GLOBAL              2 (LOOKUP)
             38 PRECALL                  1
              # Not much to see. I think this means it's a C-function that is called with the two inputs.
             42 CALL                     1
             52 RETURN_VALUE