CS 5660 Bioinformatics I, Spring 2006
- Time and place: Mon Wed Fri 2:30pm - 3:20pm, Main 121
- Course website: http://www.cs.usu.edu/~mjiang/cs5660/spring2006/
- Professor: Dr. Minghui Jiang
- Contact: mjiang at cc.usu.edu, 435-797-0347
- Office hours: Mon Wed Fri 1:25pm - 2:25pm, Main 402G
- Textbook: Neil C. Jones and Pavel A. Pevzner.
An Introduction to Bioinformatics Algorithms.
MIT Press.
ISBN: 0-262-10106-8.
- Course goals: The student will
- Understand the basic concepts of biology and computer science related to bioinformatics.
- Be able to identify the biological motivation and the mathematical abstraction of common bioinformatics problems.
- Be able to design algorithms for new bioinformatics problems using standard algorithmic techniques.
- Be able to compare the effectiveness of different algorithmic techniques in solving different bioinformatics problems.
- Preparation:
This is an introductory course on bioinformatics algorithms.
As an algorithms course, it is mathematically rigorous and requires intensive
abstract thinking.
Students are expected to prepare for each lecture
by reading the related sections of the textbook beforehand,
and to reinforce their learning by diligently doing algorithmic exercises
in addition to the homework assignments.
- Grading
- Homework (30%):
- Homework 1 (3 points): Problem 2.20, due before class on Mon, Jan 23.
- Homework 2 (3 points): Problem 4.10, due before class on Mon, Jan 30.
- Homework 3 (3 points): Problem 5.3, due before class on Mon, Feb 6.
- Homework 4 (8 points): Implement the Smith-Waterman local alignment algorithm, due before class on Wed, Feb 22.
- Homework 5 (3 points): Problem 6.25, due before class on Fri, Mar 3.
- In-class test (25%)
- Project (45%)
- Literature survey of a research problem.
- In-depth study of a research paper.
- Implementation of an algorithm.
- Project report and peer-reviewed presentation.
- Sources: Bioinformatics, TCBB, RECOMB, JBCB, Briefings, BMC, NAR
- Papers
- Zheng Zhang.
An exponential example for a partial digest mapping algorithm.
Journal of Computational Biology,
1(3):235-239, 1994.
- Alain Daurat, Yan Gerard, and Maurice Nivat.
The chords' problem.
Theoretical Computer Science,
282:319-336, 2002.
- Laurent Marsan and Marie-France Sagot.
Algorithms for extracting structured motifs using a suffix tree with an application to promoter and regulatory site consensus identification.
Journal of Computational Biology,
7:345-362, 2000.
- Eleazar Eskin and Pavel A. Pevzner.
Finding composite regulatory patterns in DNA sequences.
Bioinformatics,
18(1):354-363, 2002.
- Gerald Z. Hertz and Gary D. Stormo.
Identifying DNA and protein patterns with statistically significant alignments of multiple sequences.
Bioinformatics,
15(7-8):563-577, 1999.
- William H. Gates and Christos H. Papadimitriou.
Bounds for sorting by prefix reversal.
Discrete Mathematics,
27:47-57, 1979.
- John Kececioglu and David Sankoff.
Exact and approximation algorithms for sorting by reversals, with application to genome rearrangement.
Algorithmica,
13:180-210, 1995.
- T.F. Smith and M.S. Waterman.
Identification of common molecular subsequences.
Journal of Molecular Biology,
147:195-197, 1981.
- Mikhail S. Gelfand, Andrey A. Mironov, and Pavel A. Pevzner.
Gene recognition via spliced sequence alignment.
Proceedings of the National Academy of Sciences of the USA,
93:9061-9066, 1996.
- Sing-Hoi Sze, Yue Lu, and Qingwu Yang.
A polynomial time solvable formulation of multiple sequence alignment.
In Proceedings of the Ninth Annual International Conference on Research in Computational Molecular Biology (RECOMB'05), LNBI 3500,
pages 204-216, 2005.
- Vineet Bafna, Eugene L. Lawler, and Pavel A. Pevzner.
Approximation algorithms for multiple sequence alignment.
Theoretical Computer Science,
182:233-244, 1997.
- Bhaskar DasGupta, Tao Jiang, Sampath Kannan, Ming Li, and Elizabeth Sweedyk.
On the complexity and approximation of syntenic distance.
Discrete Applied Mathematics,
88(1-3):59-82, 1998.
- Lectures (schedule subject to change)
- Mon, Jan 9: Introduction. (Chapter 1)
- Wed, Jan 11: Molecular biology primer. (Chapter 3)
- Fri, Jan 13: Algorithms and complexity. (Chapter 2)
- Mon, Jan 16: No class. (Dr. Martin Luther King, Jr. Day)
- Wed, Jan 18: Exhaustive search: DNA restriction mapping. (Sections 4.1-4.3)
- Fri, Jan 20: Exhaustive search: regulatory motif finding. (Sections 4.4-4.9)
- Mon, Jan 23: Exhaustive search: regulatory motif finding. (Problem 4.17 and Section 5.5)
- Wed, Jan 25: Greedy algorithms: sorting by reversals and pancake flipping. (Sections 5.1-5.2)
- Fri, Jan 27: Sorting by reversals: 4-approximation. (Sections 5.3-5.4)
- Mon, Jan 30: Sorting by reversals: 2-approximation. (Read the KS95 paper)
- Wed, Feb 1: Dynamic programming: coin changing problem. (Sections 6.1-6.2)
- Fri, Feb 3: Dynamic programming: Manhattan tourist problem. (Section 6.3)
- Mon, Feb 6: Dynamic programming: edit distance and longest common subsequence. (Sections 6.4-6.5)
- Wed, Feb 8: Dynamic programming: global and local sequence alignment. (Sections 6.6-6.8)
- Fri, Feb 10: Dynamic programming: local alignment with affine gap penalties. (Sections 6.8-6.9)
- Mon, Feb 13: Multiple alignment. (Section 6.10)
- Wed, Feb 15: Gene prediction with statistical approaches. (Sections 6.11-6.12)
- Fri, Feb 17: Similarity-based gene prediction and exon chaining problem. (Section 6.13)
- Mon, Feb 20: No class. (Presidents' Day)
- Wed, Feb 22: Spliced alignment. (Section 6.14)
- Fri, Feb 24: Divide-and-conquer and linear-space sequence alignment. (Sections 7.1-7.2)
- Mon, Feb 27: Block alignment and the Four-Russians speedup. (Section 7.3)
- Wed, Mar 1: Subquadratic-time LCS. (Section 7.4)
- Fri, Mar 3: Review for in-class test.
- Mon, Mar 6: In-class test.
- Wed, Mar 8: Review solutions for in-class test.
- Fri, Mar 10: Project topics.
- Mon Wed Fri, Mar 13/15/17: No class. (Spring break)
- Mon, Mar 20: Project proposals (Phillips, Anderson, Qi).
- Wed, Mar 22: Project proposals (Lu, Wahal).
- Fri, Mar 24: Research topic: protein folding.
- Mon, Mar 27: Research topic: protein folding.
- Wed, Mar 29: Research topic: protein folding.
- Fri, Mar 31: Research topic: subsequence packing.
- Mon, Apr 3: Research topic: subsequence packing.
- Wed, Apr 5: Research topic: subsequence packing.
- Fri, Apr 7: Research topic: maximum-score segments.
- Mon, Apr 10: Research topic: syntenic distance.
- Wed, Apr 12: Research topic: syntenic distance.
- Fri, Apr 14: Research topic: syntenic distance.
- Mon, Apr 17: Project demonstration. (Lu)
- Wed, Apr 19: Project demonstration. (Qi)
- Fri, Apr 21: Project demonstration. (Anderson)
- Mon, Apr 24: Project demonstration. (Phillips)
- Wed, Apr 26: Project demonstration. (Wahal)
- Fri, Apr 28: Research topic: protein structure alignment.
- Registration policy
- The last day to add this class is January 30.
- The last day to drop this class without notation on your transcript is
January 30.
- Attending this class beyond January 30 without being officially registered
will not be approved by the Dean's Office. Students must be officially
registered for this course. No assignments or tests of any kind will be
graded for students whose names do not appear on the class list.
- Cheating policy:
- Students are encouraged to discuss and exchange ideas on homework assignments,
but must write up the solutions independently;
neither discussion nor cheat sheets are allowed during tests and exams.
- Students who are caught cheating will immediately receive a "Fail" grade.
- ADA compliance:
If a student has any disability that will likely require some accommodation by
the instructor, the student must contact the instructor and document the
disability through the Disability Resource Center, preferably during the first
week of the course. Any requests for special considerations relating to
attendance, method of instruction, taking of examinations, etc., must be
discussed with and approved by the Disability Resource Center and the
instructor. In cooperation with the Disability Resource Center, course
materials can be provided in alternative formats such as large print, audio,
diskette, or Braille.