# Comparison between `CAI` and `Biopython` Performance

To compare the performance of `Biopython` and `CAI`, we'll benchmark both on the same task. First, let's install the local development version of `CAI` from the parent directory.

In [1]:
! pip install -e ../

Obtaining file:///Users/BenjaminLee/Desktop/Python/Research/cai/CodonAdaptionIndex
Installing collected packages: CAI
  Found existing installation: CAI 0.1.8
    Uninstalling CAI-0.1.8:
      Successfully uninstalled CAI-0.1.8
  Running setup.py develop for CAI
Successfully installed CAI


Now, we'll import the two libraries.

In [2]:
from Bio import SeqIO
from Bio.SeqUtils import CodonUsage

from CAI import CAI, relative_adaptiveness

We're going to use the highly expressed genes of _E. coli_ as our reference set, along with a test set of 100 CDSs of 3,000 bp each, generated with the [Sequence Manipulation Suite](http://www.bioinformatics.org/sms2/random_coding_dna.html).
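For orientation, the codon adaptation index is the geometric mean of the relative adaptiveness weights of a sequence's codons, where each weight is a codon's count in the reference set divided by the count of its most used synonym. The sketch below illustrates that definition only; it is not either library's implementation. The two-family codon table, the 0.5 pseudocount for unseen codons, and all `toy_*` names are assumptions made for the example.

```python
import math
from collections import Counter

# Hypothetical two-family codon table (lysine and glutamate), for illustration only.
TOY_FAMILIES = [("AAA", "AAG"), ("GAA", "GAG")]


def split_codons(seq):
    """Split a sequence into complete codons, dropping any trailing bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def toy_relative_adaptiveness(reference):
    """Weight each codon by its count relative to its most used synonym."""
    counts = Counter(c for seq in reference for c in split_codons(seq))
    weights = {}
    for family in TOY_FAMILIES:
        # Assumed convention: codons absent from the reference get a 0.5 pseudocount.
        family_counts = {c: counts[c] or 0.5 for c in family}
        top = max(family_counts.values())
        for codon, n in family_counts.items():
            weights[codon] = n / top
    return weights


def toy_cai(seq, weights):
    """Geometric mean of the weights of the codons in seq that have a weight."""
    logs = [math.log(weights[c]) for c in split_codons(seq) if c in weights]
    return math.exp(sum(logs) / len(logs))


toy_reference = ["AAAAAAAAGGAAGAGGAG"]  # AAA AAA AAG GAA GAG GAG
toy_w = toy_relative_adaptiveness(toy_reference)
print(toy_w)
print(toy_cai("AAAGAG", toy_w))  # only the preferred codons
print(toy_cai("AAGGAA", toy_w))  # only the less used codons
```

A sequence built from the reference set's preferred codons scores higher than one built from the rarer synonyms, which is the behaviour both benchmarked functions are expected to share.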

In [3]:
reference = [str(seq.seq) for seq in SeqIO.parse("ecoli.heg.fasta", "fasta")]
sequence = [str(seq.seq) for seq in SeqIO.parse("test.fasta", "fasta")]
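Before timing anything, it can be worth confirming that the loaded test set matches its description (100 sequences of 3,000 bp, each a whole number of codons). The helper below is a minimal sketch; `check_cds_set` and the toy lists are hypothetical names introduced here, and the toy data stands in for the parsed FASTA records.

```python
def check_cds_set(seqs, expected_count=100, expected_length=3000):
    """Return a list of problems found; an empty list means the set looks as described."""
    problems = []
    if len(seqs) != expected_count:
        problems.append(f"expected {expected_count} sequences, found {len(seqs)}")
    for index, seq in enumerate(seqs):
        if len(seq) != expected_length:
            problems.append(
                f"sequence {index}: length {len(seq)}, expected {expected_length}"
            )
        if len(seq) % 3 != 0:
            problems.append(f"sequence {index}: length is not a multiple of 3")
    return problems


# Toy data standing in for the parsed FASTA records.
toy_good = ["ATG" * 1000] * 100
toy_bad = ["ATG" * 1000] * 99 + ["ATGA"]

print(check_cds_set(toy_good))
print(check_cds_set(toy_bad))
```

In this notebook the same check would be run as `check_cds_set(sequence)`.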

## `Biopython`

In [4]:
bp = CodonUsage.CodonAdaptationIndex()
bp.generate_index("ecoli.heg.fasta")
%timeit [bp.cai_for_gene(seq) for seq in sequence]

777 ms ± 36.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


## `CAI`

In [5]:
# Derive the weights from the E. coli reference set, matching the Biopython index above.
weights = relative_adaptiveness(sequences=reference)
%timeit [CAI(seq, weights=weights) for seq in sequence]

469 ms ± 18.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
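To put the two results side by side, the ratio of the reported mean times can be computed directly. This is a small sketch that uses only the means printed above; the variable names are illustrative.

```python
# Mean time per loop reported by %timeit, in milliseconds.
biopython_ms = 777.0
cai_ms = 469.0

speedup = biopython_ms / cai_ms          # how many times faster CAI is
reduction = 1 - cai_ms / biopython_ms    # fraction of time saved per loop

print(f"CAI is about {speedup:.2f}x faster ({reduction:.0%} less time per loop)")
```

On these figures, `CAI` processes the 100 test sequences roughly 1.66 times as fast as `Biopython`, a gap well outside the reported standard deviations.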
