We
view a genotype as a vector of sites, each site having a value from the domain
{0, 1, 2}; and a haplotype as a vector of sites, each
site having a value from the domain {0, 1}. According to Gusfield,
a site of a genotype is ambiguous if its value
is 2, and resolved otherwise. Two haplotypes h1
and h2 form (or explain) a genotype g if
for every site j the following hold:
if g[j]=2 then either h1[j]=0 and h2[j]=1, or h1[j]=1 and h2[j]=0;
if g[j]=1 then h1[j]=1 and h2[j]=1; and
if g[j]=0 then h1[j]=0 and h2[j]=0.
For
instance, the genotype 20110 can be explained by the haplotypes 10110 and 00110:
10110
00110
-------
20110
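The site-by-site definition can be checked mechanically. The following is a minimal sketch, not part of the original formulation: it assumes genotypes and haplotypes are given as strings over the digits above, and the function name explains is hypothetical.

```python
def explains(h1: str, h2: str, g: str) -> bool:
    """Return True if haplotypes h1 and h2 form (explain) genotype g.

    Assumption: h1 and h2 are strings over {0, 1} and g is a string
    over {0, 1, 2}; this encoding is chosen for illustration only.
    """
    if not (len(h1) == len(h2) == len(g)):
        return False
    for a, b, s in zip(h1, h2, g):
        if s == "2":
            # Ambiguous site: the two haplotypes must differ here.
            if a == b:
                return False
        elif a != s or b != s:
            # Resolved site: both haplotypes must carry the genotype's value.
            return False
    return True


# The example from the text: 10110 and 00110 explain 20110.
print(explains("10110", "00110", "20110"))
```

The order of the two haplotypes does not matter, so explains("00110", "10110", "20110") gives the same answer.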
Consider a
set H of k haplotypes. For the problem above, H is a solution to HIPP-DEC if the
following constraints are satisfied:
C1 Every genotype g in G is mapped to two haplotypes in H.
C2 For every genotype g in G, for every
ambiguous site j of g, the values of the j'th sites of these haplotypes are different.
C3 For every genotype g in G, for every
resolved site j of g, the j'th sites of these haplotypes both have the value g[j].
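As an illustration of constraints C1 to C3, the sketch below tests whether a candidate set H satisfies them for a set G of genotypes. It is a brute-force check under stated assumptions, not the method of the paper: sequences are strings over the digits used above, the names pair_satisfies_c2_c3 and satisfies_constraints are hypothetical, and a genotype may be mapped to the same haplotype twice (which is needed when it has no ambiguous site).

```python
from itertools import combinations_with_replacement


def pair_satisfies_c2_c3(h1: str, h2: str, g: str) -> bool:
    """Check C2 and C3 for one genotype g and one candidate pair (h1, h2)."""
    if not (len(h1) == len(h2) == len(g)):
        return False
    for a, b, s in zip(h1, h2, g):
        if s == "2":
            if a == b:          # C2: values must differ at ambiguous sites
                return False
        elif a != s or b != s:  # C3: both values equal g[j] at resolved sites
            return False
    return True


def satisfies_constraints(H, G) -> bool:
    """Return True if every genotype in G can be mapped to two haplotypes
    of H (C1) such that C2 and C3 hold for that pair.

    Assumption: the two haplotypes may coincide, so pairs are drawn
    with replacement.
    """
    haplotypes = sorted(set(H))
    for g in G:
        if not any(
            pair_satisfies_c2_c3(h1, h2, g)
            for h1, h2 in combinations_with_replacement(haplotypes, 2)
        ):
            return False
    return True


G = ["20110", "00110", "22110"]
print(satisfies_constraints({"10110", "00110", "01110"}, G))
print(satisfies_constraints({"10110", "00110"}, G))
```

The check tries every pair, so it takes time quadratic in the size of H for each genotype; it is meant to make the constraints concrete rather than to solve HIPP-DEC efficiently.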