The aim of the Bioinformatics and Artificial Intelligence (BAI) workshop is to bring together active scholars and practitioners at the frontier of Artificial Intelligence (AI) and Bioinformatics. AI offers a vast repertoire of algorithms and methods that form the core of many topics in bioinformatics and computational biology research. The goals of BAI are twofold: how can AI techniques contribute to bioinformatics research, and how can bioinformatics research raise new fundamental questions in AI? Contributions should clearly address one of these goals, focusing on AI techniques as well as on biological problems.

Important dates:

  • Deadline for Paper Submission: May 11th, 2015 (extended from April 27th, 2015)
  • Author Notification: June 8th, 2015
  • Camera Ready Deadline: June 22nd, 2015
  • Workshop: July 27th, 2015

Dr. Thomas SCHIEX, PhD.
Research Director
INRA Toulouse, France

Thomas Schiex is a computer scientist who shares his time between extending AI core technology and developing computational biology tools. On the AI side, he is interested in pushing the limits of problem modeling and solving using Constraint Satisfaction (CSP) technology. One of his most visible contributions is the framework of Valued CSP (IJCAI 95), which generalizes Constraint Networks to Cost Function Networks and thus defines an algebraic Graphical Modeling framework for optimization. He later equipped it with generalized local consistency properties and algorithms. These notions are at the core of the successful toolbar and toulbar2 solvers and have also been incorporated in MaxSAT solvers such as MiniMaxSat.

When he started to work for the French National Institute for Agricultural Research (INRA), he was amazed to see that Biology offers a variety of challenging discrete or mixed optimization problems. In genetics, Thomas developed CarthaGene, a radiation hybrid and genetic mapping tool combining TSP optimization with a maximum likelihood approach. CarthaGene has been cited hundreds of times and has produced maps for a number of plants and animals. For DNA analysis, he developed the integrative gene finders FrameD and EuGene, which have been used to annotate various bacterial, plant and animal genomes (his two favorites being tomato and cocoa, yum!). These tools rely on Cost Function Network models, which also underlie the RNA gene finders MilPat and DARN!. Thomas is now increasingly interested in synthetic structural biology and Computational Protein Design, where biophysicists have defined exponentially sized factored discrete search spaces. A new space to explore!

Thomas is also an (Associate) Editor of the Journal of Artificial Intelligence Research, of the Constraints journal and of the Artificial Intelligence Journal. For a decade, he has been a member of the Executive Committee of the Association for Constraint Programming.

Title: Computational Protein Design as an Optimization Problem

In less than two decades, an increasing number of new proteins have been designed following a semi-rational design process. The ultimate aim of Protein Design is to produce an amino-acid sequence (a protein) that will fold in 3D space according to a desired scaffold. In most cases, the aim is to obtain a new enzyme catalyzing a new reaction, improving an existing catalysis or creating affinity for new partners. The design may also have nanotechnological purposes. Applications are numerous, including in medicine, bioenergy, food and cosmetics.

With 20 amino acids, the space of all amino-acid sequences grows exponentially with protein length (20^n sequences for a protein of n residues), and its systematic exploration, even when directed through experimental selection, is out of reach of experimental approaches. To focus this search, the rational design approach consists in modeling the protein as a 3D object subject to various forces (internal torsions, van der Waals, electrostatic, hydrogen bonds and solvation) and seeking an optimal sequence, with criteria that include stability and affinity for a chosen partner.

Even with strong simplifying assumptions, this defines very complex combinatorial optimization problems, from both a modeling and a solving perspective. At the core of most existing stability approaches lies a simple formulation of this problem, with a rigid scaffold, flexible side-chains represented by a discrete library of conformations (rotamers) and a decomposable energy field. We will see how this NP-hard problem can be modeled using a variety of standard discrete optimization frameworks from both Artificial Intelligence (Constraint Programming, Satisfiability, Machine Learning) and Operations Research (Integer Linear and Quadratic Programming and Optimization). On a benchmark of CPD problems, the efficiency of these different approaches varies tremendously. We will briefly detail how the most successful of these approaches works.
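
To make the rigid-scaffold formulation above concrete, here is a minimal Python sketch, assuming the decomposable energy is a sum of one term per position and one term per pair of interacting positions. The function names, the toy energy tables and the exhaustive search are illustrative assumptions, not the method presented in the talk: enumeration is only viable for toy instances, which is precisely why dedicated optimization frameworks are needed.

```python
from itertools import product


def total_energy(assignment, unary, pairwise):
    """Energy of one rotamer assignment under a pairwise-decomposable model.

    assignment[i] is the index of the rotamer chosen at position i.
    unary[i][r] is the (hypothetical) self-energy of rotamer r at position i.
    pairwise[(i, j)][r][s] is the interaction energy between rotamer r at
    position i and rotamer s at position j.
    """
    energy = sum(unary[i][r] for i, r in enumerate(assignment))
    for (i, j), table in pairwise.items():
        energy += table[assignment[i]][assignment[j]]
    return energy


def find_min_energy_assignment(unary, pairwise):
    """Enumerate every rotamer assignment and return (best, best_energy)."""
    domains = [range(len(rotamers)) for rotamers in unary]
    best = min(product(*domains),
               key=lambda a: total_energy(a, unary, pairwise))
    return best, total_energy(best, unary, pairwise)


# Toy instance (made-up integer energies): 3 positions with 2, 3 and 2 rotamers.
unary = [
    [3, 1],
    [0, 4, 2],
    [2, 0],
]
pairwise = {
    (0, 1): [[0, 5, 1], [6, 0, 2]],
    (1, 2): [[1, 4], [0, 3], [5, 0]],
}

best, best_energy = find_min_energy_assignment(unary, pairwise)
print(best, best_energy)  # (1, 2, 1) 5
```

On this toy instance, picking the rotamer with the lowest self-energy at each position independently gives the assignment (1, 0, 1) with total energy 11, whereas the optimum is (1, 2, 1) with energy 5: the pairwise terms couple the choices, which is what makes the problem combinatorial.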

Note: The second invited talk has been cancelled for family reasons.
Dr. Anshul Kundaje, PhD.
Assistant Professor
Stanford University, USA

Anshul Kundaje is an Assistant Professor of Genetics and Computer Science at Stanford University. Before joining the Stanford faculty, he obtained his PhD from Columbia University (2008), was a postdoc at Stanford (2012) and a Research Scientist at MIT/Broad Institute (2013). His lab develops novel statistical and machine learning methods for integrating large-scale genomic and epigenomic data to identify context-specific regulatory elements and their functions; characterize epigenomic variation across individuals, cell types and species; learn models of transcriptional regulation; and decipher the genetic and molecular basis of complex diseases. He has led the computational analysis efforts of two of the largest functional genomics consortia, the Encyclopedia of DNA Elements (ENCODE) Project and the Roadmap Epigenomics Project, to obtain the most comprehensive reference set of non-coding elements in the human genome.