Competitive Programming Lecture

Contest strategies

Preparation
  • Thinking costs energy!
  • Sleep enough; early to bed the 2 nights before.
  • No practising on contest day (and the day before); it just takes energy.
During the contest
  • Eat! At the very least take a break halfway with the entire team and eat some snacks.
  • Make sure to read all the problems before the end of the contest. In the beginning, split the problems to find the simple ones, but towards the end, find a problem you think you can solve (because of the scoreboard or because you like it), and work on it as a team.
Coding
  • Ideally, use C++. Otherwise, Python can be used too.
    • For big-integer problems, prefer Python.
  • Use a TCR (e.g. https://github.com/TimonKnigge/TCR): a 25 page document containing algorithms. Ideally, implement all of them yourself so you know how they work. Otherwise download one.
  • Make a template, and add it to your TCR. One person should type this in the first minutes of the contest and copy it to A.cpp, B.cpp, … .
  • When you think you solved a problem:
    • Decide exactly how the code will look. Maybe write pseudocode on paper.
    • For hard problems: verify your solution with a teammate.
    • Once the keyboard is free, start typing it out. If needed, ask one teammate to look while you code.
    • Typical distribution:
      • 1 person typing
      • 1 person solving a new problem
      • 1 person helping the other 2: spotting typos or working on problems.

Pairwise Alignment using A*

Some resources you can use:

Exercises

  1. Implement Needleman-Wunsch’s \(O(n^2)\) DP algorithm
  2. Implement Dijkstra’s algorithm
  3. Implement Ukkonen’s \(O(nd)\) band-doubling algorithm
  4. Optional: Implement the Seed heuristic
  5. Optional: Implement A*PA
  6. Very optional: Implement the \(O(d^2)\) diagonal-transition/WFA algorithm (Ukkonen 1985; Myers 1986; Marco-Sola et al. 2020). I didn’t discuss this in detail in the class, so you will have to read the papers yourself.

You can test your code on the testdata here. Each file has two lines containing a string. The first line starts with > and the second line with <. All other characters are from ACTG. There are testcases for length from \(10^3\) to \(10^6\) and error rate from \(1\%\) to \(30\%\). Ideally each algorithm you implement is faster than the one before, and can handle larger inputs.

References

Groot Koerkamp, Ragnar, and Pesho Ivanov. 2022. “Exact Global Alignment Using A* with Seed Heuristic and Match Pruning,” September. https://doi.org/10.1101/2022.09.19.508631.
Hart, Peter, Nils Nilsson, and Bertram Raphael. 1968. “A Formal Basis for the Heuristic Determination of Minimum Cost Paths.” Ieee Transactions on Systems Science and Cybernetics 4 (2): 100–107. https://doi.org/10.1109/tssc.1968.300136.
Marco-Sola, Santiago, Juan Carlos Moure, Miquel Moreto, and Antonio Espinosa. 2020. “Fast gap-affine pairwise alignment using the wavefront algorithm.” Bioinformatics 37 (4): 456–63. https://doi.org/10.1093/bioinformatics/btaa777.
Myers, Eugene W. 1986. “An $O(ND)$ Difference Algorithm and Its Variations.” Algorithmica 1 (1–4): 251–66. https://doi.org/10.1007/bf01840446.
Needleman, Saul B., and Christian D. Wunsch. 1970. “A General Method Applicable to the Search for Similarities in the Amino Acid Sequence of Two Proteins.” Journal of Molecular Biology 48 (3): 443–53. https://doi.org/10.1016/0022-2836(70)90057-4.
Ukkonen, Esko. 1985. “Algorithms for Approximate String Matching.” Information and Control 64 (1–3): 100–118. https://doi.org/10.1016/s0019-9958(85)80046-2.
Šošić, Martin, and Mile Šikić. 2017. “Edlib: A c/c ++ Library for Fast, Exact Sequence Alignment Using Edit Distance.” Edited by John Hancock. Bioinformatics 33 (9): 1394–95. https://doi.org/10.1093/bioinformatics/btw753.