Haplotype inference is described in the paper "Comparing ASP, CP, ILP on Haplotype Inference by Pure Parsimony", as follows:
A
genotype is the specific genetic makeup of an individual. It is in the form of
a
Haplotype Inference by Pure Parsimony (HIPP) is a slight modification of the Haplotype Inference problem, which asks for a minimal set of haplotypes that explain the given genotypes.
We consider two problems: an optimization problem (HIPP) and a decision problem that corresponds to it (HIPP-DEC).
HIPP Given a set G of n genotypes each with m sites, find a minimal set H of haplotypes such that each genotype in G is explained by two haplotypes in H.
HIPP-DEC Given a set G of n genotypes each with m sites, and a positive integer k, decide whether there is a set H of k haplotypes such that each genotype in G is explained by two haplotypes in H.
HIPP-DEC is NP-hard.
Both problems depend on the concept of two haplotypes “explaining” a genotype. This concept is defined in two ways, the standard definition by D. Gusfield and an alternative one by D. Brown & I. Harrower; essentially they are equivalent. We will mainly consider Gusfield’s definition to formulate the problem, and then compare it to Brown & Harrower’s.
In the following we present formulations of HIPP-DEC for each of the programming paradigms: Constraint Programming, Integer Linear Programming, and Answer Set Programming. First, we formalized the HIPP-DEC in a straightforward way. These formulations of HIPP-DEC are minimally modified to obtain formulations for HIPP, as explained in the paper above. In order to test elaboration tolerance and observe performance changes, we can add two kinds of symmetry breaking constraints to these formulations. Formulations are in the language of OPL (in the case of CP and ILP), and in the language of lparse (in the case of ASP).
Formulation |
CP |
ILP |
ASP |
HIPP-DEC |
|||
HIPP-DEC with S1 |
|||
HIPP-DEC with S2 |
|||
HIPP (from HIPP-DEC) |
|||
HIPP with S1 |
|||
HIPP with S2 |
|||
Alternative HIPP |
|||
Alternative HIPP-DEC |
All formulations above are available in one file: hipp-models-gus.zip
HIPP can also be formulated according to Dan Brown and Ian Harrower’s definition. These formulations are available in this file: hipp-models-bro.zip.
With these files, we can
solve problem instances of HIPP-DEC using Clasp for ASP, and using
lparse -c n=8 -c m=9 –c k=8 -d none --true-negation hipp-dec-s1-gus-asp.lp P2.happ | clasp
Here are the input files (*.happ files) describing problems P0-P3 in lparse (same for both definitions): hipp-dec-input-gus-lparse.zip
In OPL, we run programs (*.mod files) with data files (*.dat files) and different options for CP and ILP. Here are data files describing problems P0-P3 in OPL, relative to Gusfield’s definition: hipp-dec-input-gus-opl.zip, and relative to Brown’s definition: hipp-dec-input-bro-opl.zip.
The options -conflict and -relax are included only for CP formulations. For instance, a solution to problem P2 can be computed by the following command in OPL, for CP:
oplrun –v –conflict -relax -D n=8 -D m=9 –D k=8 hipp-dec-gus-cp.mod P2.gus.dat
and for ILP:
oplrun –v -D n=8 -D m=9 –D k=8 hipp-dec-gus-ilp.mod P2.gus.dat
If OPL is running through
an IDE, such as
In order to solve the original problem HIPP, we can run these commands with HIPP-DEC formulations for different values of k (search for minimal k that terminates with decision “YES”), or we can use the HIPP formulations (obtained by modifying HIPP-DEC formulations) to find the minimal value of k. When running HIPP formulations, we discard the parameter k and add the parameter 0 in the case of ASP; for example, a solution to problem P2 can be computed as follows:
lparse -c n=8 -c m=9 –d none -–true-negation hipp-opt-s1-gus-asp.lp P2.happ | clasp 0
In the case of OPL, we discard the lines in data files that assign values to parameter k, and run as the following:
oplrun -v -D n=8 -D m=9 hipp-opt-gus-ilp.mod P2-opt.gus.dat
Here are the modified data files: hipp-opt-input-gus-opl.zip and hipp-opt-input-bro-opl.zip.
With the formulations
above, we tried to solve four problem instances of HIPP, using the answer
set solver clasp (Version 1.0.4) and
The experimental results can be found here: hipp-results.pdf