Introduction
One of the - confusing - topics in Algorithm design is the concept of NP
completeness. If you search the topic on the internet you will probably
find tons of articles and lectures on the subject however in this short
article I will summarize it so that it is easy to remember by the
average student or software engineer.
It is all about Running Time
Computational complexity analysis refers to the study of computer
algorithms in terms of efficiency. Given a problem of some input size,
we need to know how fast or slow it runs as the input size grows to
large values. Consider the following examples:
- Looking
an item up in a perfect hash table takes a constant time, theoretically
speaking, no matter how big the hash table is, the item can be found
instantly (assuming it exists), the algorithm is said to have a running
time of O(1). This is indeed very fast.
- The
running time to look up an item in a sorted binary search tree of (n)
elements is O(Log(n)) for example if the number of items is (32) then
the number of comparisons needed to find an element is (5) in the worst
case. As you can see this is a fast algorithm
- Searching
for an element in an unsorted array runs in linear time O(n). If the
array contains 100 elements then you need to make 100 comparisons to
find the element in the worst case. This means, the algorithm will slow
down linearly as the input size grows. To be exact, the running time
linearly grows as the input size becomes larger.
Polynomial versus Exponential
Algorithms with a running time function of the form O(n^k) where (k) is a
constant are said to run (or solved) in polynomial time. Theoretically
speaking, they are fast even if (k) is large. On the other hand,
algorithms with a running time of the form O(k^n) where (k) is a
constant are said to run in exponential time. These algorithms are very
slow. Actually it may take a computer years to finish running the
algorithm.
P, NP, NP-Complete, NP-Hard
Now we know what running time means, based on that we can classify
problems into classes depending on how complex they are. I will be using
plain English as opposed to mathematical terms in order to make it easy
to understand.
P Problem
P means that there exists an algorithm to solve the problem which runs
in polynomial time. If you are still not sure what running time or
polynomial means then please read the introduction one more time.
NP Problem
NP problem means that there exists an algorithm to verify a given
solution is correct in polynomial time. Note that we are talking about
solution verification. We are not talking about the solution for the
problem because no one has ever yet discovered a fast (polynomial)
solution for NP problems nor proved the solution does not exist in the
first place. NP does not mean None Polynomial, please do not say that in
front of people because it is embarrassing, NP stands for non
deterministic which means the problem can be solved in polynomial time
BUT using none deterministic machine (of course it is not the regular
computer we use every day, this computer is in the mind of some weird
computer scientists. If you are interested you can research it on your
own. Try to search for Turing Machine)
NP-Hard Problem
Now we know what NP means, NP-Hard problem is AT LEAST as hard as any NP
problem. It could be harder, who knows. Again, when thinking about
problem difficulty we are still referring to the running time of the
algorithm. A harder problem takes more time to finish running.
Reduction
It feels bad when you try to solve a problem and it turns to be very
hard to the extent that many people have tried it without success. In
order to claim that the problem cannot be solved by many people then you
need to prove that it is equivalent to some known hard problem.
Reduction refers to transforming a problem to another well known (hard)
problem so that people won’t dare to call you stupid. This conversion
process should run in polynomial time and can be used to prove a problem
is NP-Complete as we will indicate later.
NP-Complete Problem
NP-Complete problem is both NP and NP-Hard. You know what NP and NP-Hard
means, so NP-Complete means a problem that is easy to verify a given
solution and every problem in NP can be reduced to our problem. The hard
part is proving the first example of an NP-complete problem but our
friend Steve Cook did that in the 1970s. Thanks to him because he did a
great job so that we are not called stupid. In few words, in order to
prove that a given problem is NP-Complete then
- Show it is NP: given a solution, show that it can be verified in polynomial time
- Show
it is NP-Hard: pick an already known NP-Complete problem and show that
you can reduce it back to our problem in polynomial time
What is the big deal about the question P = NP?
- If P is the same as NP then this is good news, there are many interesting real life problems that could be solved very quickly.
- It
is going to be really embarrassing for the folks working in computer
science field because it is believed (I am assuming someone got the
$1000000 prize) for long time by many people that they are not the same.
- In
reality many believe they are not the same because many people knocked
their heads against the wall for so long without any success and because
crazy scientists need a proof a solution does not exist, it is going to
be an open question.
Class NP
- The class of decision problems for which there is a polynomially bounded
nondeterministic algorithm.
- The class of problems that are "slightly harder" than P.
- In general, an exponential number of subproblems, each of which can be solved
or verified in polynomial time.
- Being in class NP constitutes an upper bound on problem complexity.
Nondeterministic Turing Machine (NDTM)
Model a problem as a tree in which the nodes represent states, with time flowing
from the root to the children. A leaf represents a solution.
In an NDTM, the computation splits whenever a guess is needed.
- The tree has polynomial depth but is exponentially bushy.
"Guess" nodes are like OR nodes: if either of the children returns T,
the "guess" node returns T.
NP-hard
A problem Q is NP-hard if every problem in NP is reducible to Q.
A lower bound on a problem: "at least as hard as any problem in NP."
NP-Complete
The hardest decision problems in NP.
If there were a polynomially bounded algorithm for an NP-complete problem,
then there would be a polynomially bounded algorithm for each problem in NP.
Establishes both a lower and an upper bound (although it may be that P=NP).
If any NP-complete problem is in P, then P=NP.
Satisfiability (SAT)
Instance: Given a set U = {u1, ..., um}
of variables and a collection C = {c1, ..., cn} of clauses
over U.
Question: Is there a truth assignment to U that satisfies C?
A logical formula in conjunctive normal form (CNF):
a conjunction of disjunctions
e.g., (x1 + x2 + ¬x3) ·
(x1 + x3) · (¬x2)
- easy to falsify: make all terms in one clause fail.
- hard to satisfy: terms in clauses interact.
Cook's Theorem: SAT is NP-complete.
- uses SAT to simulate a machine
Once the primordial NP-complete problem is found, we can prove other
problems NP-complete without reference to machines:
- To prove that problem Q is NP-complete, reduce SAT
(or any other NP-complete problem) to Q:
SAT <=P Q
- E.g., 3SAT: like SAT, but | ci | = 3 for 1 <= i <= n.
- Need to show that 3SAT is NP-hard:
Since X <=P SAT for any problem X in NP,
we need only show SAT <=P 3SAT.
- Proof:
General form of clauses in SAT is
cj = (z1 + z2 + ... + zk),
where each zi is either an input or the negation of an input.
- k = 1
- {z1 + z1 + z1}
- k = 2
- {z1 + z2 + z2}
- k = 3
- {z1 + z2 + z3}
- k >= 4
- Add auxiliary variables that transmit information between pieces.
E.g., {z1 + z2 + z3 + z4 + z5}
becomes
{(z1 + z2 + y1)
(¬y1 + z3 + y2)
(¬y2 + z4 + z5)}
The transformation is polynomial in the length of the SAT instance.
Note: 2SAT is solvable in polynomial time.
Other NP-Complete Problems
-
- Graph coloring
- A coloring of a graph is the assignment of a "color" to each vertex
of the graph such that adjacent vertices are not assigned the same color.
The chromatic number of a graph G, X(G) [Chi(G)],
is the smallest number of colors needed to color G.
Optimization problem: Given an undirected graph G, determine X(G)
(and produce an optimal coloring).
Decision problem: Given G and a positive integer k,
is there a coloring of G using at most k colors
(i.e., is G k-colorable)?
-
- Job scheduling with penalties
- Suppose we are given n jobs to be executed one at a time and,
for each job, an execution time, deadline, and penalty for missing the deadline.
Optimization problem: Determine the minimum possible penalty (and find an
optimal schedule).
Decision problem: Given a nonnegative integer k,
is there a schedule with penalty < k?
Note: Job scheduling without penalties such that at most k jobs
miss their deadlines is in P.
-
- Multiprocessor scheduling
- Suppose we are given a deadline and a set of tasks of varying length to be performed
on two identical processors.
Optimization problem: Determine an allocation of the tasks to the two
processors such that the deadline can be met.
Decision problem: Can the tasks be arranged so that the deadline is met?
-
- Bin packing
- Optimization problem: Given an unlimited number of bins, each of size 1,
and n objects with sizes s1, ..., sn
where 0 < si < 1,
determine the smallest number of bins into which the objects can be packed (and find
an optimal packing).
Decision problem: Given, in addition to the inputs described above,
an integer k, do the objects fit in k bins?
-
- Knapsack problem
- Optimization problem: Given a knapsack of capacity C and
n objects with sizes s1, ..., sn
and "profits" p1, ..., pn,
find the largest total profit of any subset of the objects that fits in the knapsack
( and find a subset that achieves the maximum profit).
Decision problem: Given k, is there a subset of the objects
that fits in the knapsack and has total profit at least k?
-
- Subset sum problem
- Optimization problem: Given a positive integer C and n objects
with positive sizes s1, ..., sn, among subsets
of the objects with sum at most C, what is the largest subset sum?
Decision problem: Is there a subset of the objects whose sizes add up to
exactly C?
The subset sum problem is a simpler version of the knapsack problem, in which the
profit for each object is the same as its size.
-
- Partition
- Suppose we are given a set of integers.
Decision problem: Can the integers be partitioned into two sets whose sum is equal?
-
- Hamiltonian cycles and Hamiltonian paths
- A Hamiltonian cycle (path) is a simple cycle (path)
that passes through every vertex of a graph exactly once.
Decision problem: Does a given undirected graph have a Hamiltonian cycle (path)?
Note: An Euler tour -- a cycle that traverses each edge of a graph
exactly once -- can be found in O(e) time.
-
- Traveling salesperson problem (TSP) or minimum tour problem
- Optimization problem: Given a complete, weighted graph,
find a minimum-weight Hamiltonian cycle.
Decision problem: Given a complete, weighted graph and an integer k,
is there a Hamiltonian cycle with total weight at most k?
A complete graph has an edge between each pair of vertices.
-
- Vertex cover
- A vertex cover for an undirected graph G is a subset V' of vertices
such that each edge in G is incident upon some vertex in V'.
Optimization problem: Find a vertex cover for G with as few vertices as possible.
Decision problem: Given an integer k, does G have a vertex cover
consisting of k vertices?
Note: The edge cover problem is in P.
-
- Clique
- A clique is a subset V' of vertices in an undirected graph G such that
every pair of distinct vertices in V' is joined by an edge in G (i.e., the subgraph
induced by V' is complete). A clique with k vertices is called a k-clique.
Optimization problem: Find a clique with as many vertices as possible.
Decision problem: Given an integer k, does G have a k-clique?
-
- Independent set
- An independent set is a subset V' of vertices in an undirected graph G
such that no pair of vertices in V' is joined by an edge of G.
Optimization problem: Find an independent set with as many vertices as possible.
Decision problem: Given an integer k, does G have an independent set
consisting of k vertices?
-
- Feedback edge set
- A feedback edge set in a digraph G is a subset E' of edges such that
every cycle in G has an edge in E'.
Optimization problem: Find a feedback edge set with as few edges as possible.
Decision problem: Given an integer k, does G have a feedback edge set
consisting of k edges?
Note: The feedback edge set for undirected graphs is in P.
-
- Longest simple path problem
- Optimization problem: What is the longest simple path between any
two vertices in graph G?
Decision problem: Given an integer k, is there a path in graph G longer
than k?
Note: The shortest path between any two vertices can be found in O(ne) time.
-
- Subgraph isomorphism
- Decision problem: Given two graphs G1 and G2,
determine if G1 is isomorphic to a subgraph of G2.
-
- Integer linear programming
- Linear programming is a method of analyzing systems of linear inequalities
to arrive at optimal solutions to problems
Decision problem: Given a linear program, is there an integral solution?
Class NP
- The class of decision problems for which there is a polynomially bounded
nondeterministic algorithm.
- The class of problems that are "slightly harder" than P.
- In general, an exponential number of subproblems, each of which can be solved
or verified in polynomial time.
- Being in class NP constitutes an upper bound on problem complexity.
Nondeterministic Turing Machine (NDTM)
Model a problem as a tree in which the nodes represent states, with time flowing
from the root to the children. A leaf represents a solution.
In an NDTM, the computation splits whenever a guess is needed.
- The tree has polynomial depth but is exponentially bushy.
"Guess" nodes are like OR nodes: if either of the children returns T,
the "guess" node returns T.
NP-hard
A problem Q is NP-hard if every problem in NP is reducible to Q.
A lower bound on a problem: "at least as hard as any problem in NP."
NP-Complete
The hardest decision problems in NP.
If there were a polynomially bounded algorithm for an NP-complete problem,
then there would be a polynomially bounded algorithm for each problem in NP.
Establishes both a lower and an upper bound (although it may be that P=NP).
If any NP-complete problem is in P, then P=NP.
Satisfiability (SAT)
Instance: Given a set U = {u1, ..., um}
of variables and a collection C = {c1, ..., cn} of clauses
over U.
Question: Is there a truth assignment to U that satisfies C?
A logical formula in conjunctive normal form (CNF):
a conjunction of disjunctions
e.g., (x1 + x2 + ¬x3) ·
(x1 + x3) · (¬x2)
- easy to falsify: make all terms in one clause fail.
- hard to satisfy: terms in clauses interact.
Cook's Theorem: SAT is NP-complete.
- uses SAT to simulate a machine
Once the primordial NP-complete problem is found, we can prove other
problems NP-complete without reference to machines:
- To prove that problem Q is NP-complete, reduce SAT
(or any other NP-complete problem) to Q:
SAT <=P Q
- E.g., 3SAT: like SAT, but | ci | = 3 for 1 <= i <= n.
- Need to show that 3SAT is NP-hard:
Since X <=P SAT for any problem X in NP,
we need only show SAT <=P 3SAT.
- Proof:
General form of clauses in SAT is
cj = (z1 + z2 + ... + zk),
where each zi is either an input or the negation of an input.
- k = 1
- {z1 + z1 + z1}
- k = 2
- {z1 + z2 + z2}
- k = 3
- {z1 + z2 + z3}
- k >= 4
- Add auxiliary variables that transmit information between pieces.
E.g., {z1 + z2 + z3 + z4 + z5}
becomes
{(z1 + z2 + y1)
(¬y1 + z3 + y2)
(¬y2 + z4 + z5)}
The transformation is polynomial in the length of the SAT instance.
Note: 2SAT is solvable in polynomial time.
Other NP-Complete Problems
-
- Graph coloring
- A coloring of a graph is the assignment of a "color" to each vertex
of the graph such that adjacent vertices are not assigned the same color.
The chromatic number of a graph G, X(G) [Chi(G)],
is the smallest number of colors needed to color G.
Optimization problem: Given an undirected graph G, determine X(G)
(and produce an optimal coloring).
Decision problem: Given G and a positive integer k,
is there a coloring of G using at most k colors
(i.e., is G k-colorable)?
-
- Job scheduling with penalties
- Suppose we are given n jobs to be executed one at a time and,
for each job, an execution time, deadline, and penalty for missing the deadline.
Optimization problem: Determine the minimum possible penalty (and find an
optimal schedule).
Decision problem: Given a nonnegative integer k,
is there a schedule with penalty < k?
Note: Job scheduling without penalties such that at most k jobs
miss their deadlines is in P.
-
- Multiprocessor scheduling
- Suppose we are given a deadline and a set of tasks of varying length to be performed
on two identical processors.
Optimization problem: Determine an allocation of the tasks to the two
processors such that the deadline can be met.
Decision problem: Can the tasks be arranged so that the deadline is met?
-
- Bin packing
- Optimization problem: Given an unlimited number of bins, each of size 1,
and n objects with sizes s1, ..., sn
where 0 < si < 1,
determine the smallest number of bins into which the objects can be packed (and find
an optimal packing).
Decision problem: Given, in addition to the inputs described above,
an integer k, do the objects fit in k bins?
-
- Knapsack problem
- Optimization problem: Given a knapsack of capacity C and
n objects with sizes s1, ..., sn
and "profits" p1, ..., pn,
find the largest total profit of any subset of the objects that fits in the knapsack
( and find a subset that achieves the maximum profit).
Decision problem: Given k, is there a subset of the objects
that fits in the knapsack and has total profit at least k?
-
- Subset sum problem
- Optimization problem: Given a positive integer C and n objects
with positive sizes s1, ..., sn, among subsets
of the objects with sum at most C, what is the largest subset sum?
Decision problem: Is there a subset of the objects whose sizes add up to
exactly C?
The subset sum problem is a simpler version of the knapsack problem, in which the
profit for each object is the same as its size.
-
- Partition
- Suppose we are given a set of integers.
Decision problem: Can the integers be partitioned into two sets whose sum is equal?
-
- Hamiltonian cycles and Hamiltonian paths
- A Hamiltonian cycle (path) is a simple cycle (path)
that passes through every vertex of a graph exactly once.
Decision problem: Does a given undirected graph have a Hamiltonian cycle (path)?
Note: An Euler tour -- a cycle that traverses each edge of a graph
exactly once -- can be found in O(e) time.
-
- Traveling salesperson problem (TSP) or minimum tour problem
- Optimization problem: Given a complete, weighted graph,
find a minimum-weight Hamiltonian cycle.
Decision problem: Given a complete, weighted graph and an integer k,
is there a Hamiltonian cycle with total weight at most k?
A complete graph has an edge between each pair of vertices.
-
- Vertex cover
- A vertex cover for an undirected graph G is a subset V' of vertices
such that each edge in G is incident upon some vertex in V'.
Optimization problem: Find a vertex cover for G with as few vertices as possible.
Decision problem: Given an integer k, does G have a vertex cover
consisting of k vertices?
Note: The edge cover problem is in P.
-
- Clique
- A clique is a subset V' of vertices in an undirected graph G such that
every pair of distinct vertices in V' is joined by an edge in G (i.e., the subgraph
induced by V' is complete). A clique with k vertices is called a k-clique.
Optimization problem: Find a clique with as many vertices as possible.
Decision problem: Given an integer k, does G have a k-clique?
-
- Independent set
- An independent set is a subset V' of vertices in an undirected graph G
such that no pair of vertices in V' is joined by an edge of G.
Optimization problem: Find an independent set with as many vertices as possible.
Decision problem: Given an integer k, does G have an independent set
consisting of k vertices?
-
- Feedback edge set
- A feedback edge set in a digraph G is a subset E' of edges such that
every cycle in G has an edge in E'.
Optimization problem: Find a feedback edge set with as few edges as possible.
Decision problem: Given an integer k, does G have a feedback edge set
consisting of k edges?
Note: The feedback edge set for undirected graphs is in P.
-
- Longest simple path problem
- Optimization problem: What is the longest simple path between any
two vertices in graph G?
Decision problem: Given an integer k, is there a path in graph G longer
than k?
Note: The shortest path between any two vertices can be found in O(ne) time.
-
- Subgraph isomorphism
- Decision problem: Given two graphs G1 and G2,
determine if G1 is isomorphic to a subgraph of G2.
-
- Integer linear programming
- Linear programming is a method of analyzing systems of linear inequalities
to arrive at optimal solutions to problems
Decision problem: Given a linear program, is there an integral solution?
Class NP
- The class of decision problems for which there is a polynomially bounded
nondeterministic algorithm.
- The class of problems that are "slightly harder" than P.
- In general, an exponential number of subproblems, each of which can be solved
or verified in polynomial time.
- Being in class NP constitutes an upper bound on problem complexity.
Nondeterministic Turing Machine (NDTM)
Model a problem as a tree in which the nodes represent states, with time flowing
from the root to the children. A leaf represents a solution.
In an NDTM, the computation splits whenever a guess is needed.
- The tree has polynomial depth but is exponentially bushy.
"Guess" nodes are like OR nodes: if either of the children returns T,
the "guess" node returns T.
NP-hard
A problem Q is NP-hard if every problem in NP is reducible to Q.
A lower bound on a problem: "at least as hard as any problem in NP."
NP-Complete
The hardest decision problems in NP.
If there were a polynomially bounded algorithm for an NP-complete problem,
then there would be a polynomially bounded algorithm for each problem in NP.
Establishes both a lower and an upper bound (although it may be that P=NP).
If any NP-complete problem is in P, then P=NP.
Satisfiability (SAT)
Instance: Given a set U = {u1, ..., um}
of variables and a collection C = {c1, ..., cn} of clauses
over U.
Question: Is there a truth assignment to U that satisfies C?
A logical formula in conjunctive normal form (CNF):
a conjunction of disjunctions
e.g., (x1 + x2 + ¬x3) ·
(x1 + x3) · (¬x2)
- easy to falsify: make all terms in one clause fail.
- hard to satisfy: terms in clauses interact.
Cook's Theorem: SAT is NP-complete.
- uses SAT to simulate a machine
Once the primordial NP-complete problem is found, we can prove other
problems NP-complete without reference to machines:
- To prove that problem Q is NP-complete, reduce SAT
(or any other NP-complete problem) to Q:
SAT <=P Q
- E.g., 3SAT: like SAT, but | ci | = 3 for 1 <= i <= n.
- Need to show that 3SAT is NP-hard:
Since X <=P SAT for any problem X in NP,
we need only show SAT <=P 3SAT.
- Proof:
General form of clauses in SAT is
cj = (z1 + z2 + ... + zk),
where each zi is either an input or the negation of an input.
- k = 1
- {z1 + z1 + z1}
- k = 2
- {z1 + z2 + z2}
- k = 3
- {z1 + z2 + z3}
- k >= 4
- Add auxiliary variables that transmit information between pieces.
E.g., {z1 + z2 + z3 + z4 + z5}
becomes
{(z1 + z2 + y1)
(¬y1 + z3 + y2)
(¬y2 + z4 + z5)}
The transformation is polynomial in the length of the SAT instance.
Note: 2SAT is solvable in polynomial time.
Other NP-Complete Problems
-
- Graph coloring
- A coloring of a graph is the assignment of a "color" to each vertex
of the graph such that adjacent vertices are not assigned the same color.
The chromatic number of a graph G, X(G) [Chi(G)],
is the smallest number of colors needed to color G.
Optimization problem: Given an undirected graph G, determine X(G)
(and produce an optimal coloring).
Decision problem: Given G and a positive integer k,
is there a coloring of G using at most k colors
(i.e., is G k-colorable)?
-
- Job scheduling with penalties
- Suppose we are given n jobs to be executed one at a time and,
for each job, an execution time, deadline, and penalty for missing the deadline.
Optimization problem: Determine the minimum possible penalty (and find an
optimal schedule).
Decision problem: Given a nonnegative integer k,
is there a schedule with penalty < k?
Note: Job scheduling without penalties such that at most k jobs
miss their deadlines is in P.
-
- Multiprocessor scheduling
- Suppose we are given a deadline and a set of tasks of varying length to be performed
on two identical processors.
Optimization problem: Determine an allocation of the tasks to the two
processors such that the deadline can be met.
Decision problem: Can the tasks be arranged so that the deadline is met?
-
- Bin packing
- Optimization problem: Given an unlimited number of bins, each of size 1,
and n objects with sizes s1, ..., sn
where 0 < si < 1,
determine the smallest number of bins into which the objects can be packed (and find
an optimal packing).
Decision problem: Given, in addition to the inputs described above,
an integer k, do the objects fit in k bins?
-
- Knapsack problem
- Optimization problem: Given a knapsack of capacity C and
n objects with sizes s1, ..., sn
and "profits" p1, ..., pn,
find the largest total profit of any subset of the objects that fits in the knapsack
( and find a subset that achieves the maximum profit).
Decision problem: Given k, is there a subset of the objects
that fits in the knapsack and has total profit at least k?
-
- Subset sum problem
- Optimization problem: Given a positive integer C and n objects
with positive sizes s1, ..., sn, among subsets
of the objects with sum at most C, what is the largest subset sum?
Decision problem: Is there a subset of the objects whose sizes add up to
exactly C?
The subset sum problem is a simpler version of the knapsack problem, in which the
profit for each object is the same as its size.
-
- Partition
- Suppose we are given a set of integers.
Decision problem: Can the integers be partitioned into two sets whose sum is equal?
-
- Hamiltonian cycles and Hamiltonian paths
- A Hamiltonian cycle (path) is a simple cycle (path)
that passes through every vertex of a graph exactly once.
Decision problem: Does a given undirected graph have a Hamiltonian cycle (path)?
Note: An Euler tour -- a cycle that traverses each edge of a graph
exactly once -- can be found in O(e) time.
-
- Traveling salesperson problem (TSP) or minimum tour problem
- Optimization problem: Given a complete, weighted graph,
find a minimum-weight Hamiltonian cycle.
Decision problem: Given a complete, weighted graph and an integer k,
is there a Hamiltonian cycle with total weight at most k?
A complete graph has an edge between each pair of vertices.
-
- Vertex cover
- A vertex cover for an undirected graph G is a subset V' of vertices
such that each edge in G is incident upon some vertex in V'.
Optimization problem: Find a vertex cover for G with as few vertices as possible.
Decision problem: Given an integer k, does G have a vertex cover
consisting of k vertices?
Note: The edge cover problem is in P.
-
- Clique
- A clique is a subset V' of vertices in an undirected graph G such that
every pair of distinct vertices in V' is joined by an edge in G (i.e., the subgraph
induced by V' is complete). A clique with k vertices is called a k-clique.
Optimization problem: Find a clique with as many vertices as possible.
Decision problem: Given an integer k, does G have a k-clique?
-
- Independent set
- An independent set is a subset V' of vertices in an undirected graph G
such that no pair of vertices in V' is joined by an edge of G.
Optimization problem: Find an independent set with as many vertices as possible.
Decision problem: Given an integer k, does G have an independent set
consisting of k vertices?
-
- Feedback edge set
- A feedback edge set in a digraph G is a subset E' of edges such that
every cycle in G has an edge in E'.
Optimization problem: Find a feedback edge set with as few edges as possible.
Decision problem: Given an integer k, does G have a feedback edge set
consisting of k edges?
Note: The feedback edge set for undirected graphs is in P.
-
- Longest simple path problem
- Optimization problem: What is the longest simple path between any
two vertices in graph G?
Decision problem: Given an integer k, is there a path in graph G longer
than k?
Note: The shortest path between any two vertices can be found in O(ne) time.
-
- Subgraph isomorphism
- Decision problem: Given two graphs G1 and G2,
determine if G1 is isomorphic to a subgraph of G2.
-
- Integer linear programming
- Linear programming is a method of analyzing systems of linear inequalities
to arrive at optimal solutions to problems
Decision problem: Given a linear program, is there an integral solution?
Class NP
- The class of decision problems for which there is a polynomially bounded
nondeterministic algorithm.
- The class of problems that are "slightly harder" than P.
- In general, an exponential number of subproblems, each of which can be solved
or verified in polynomial time.
- Being in class NP constitutes an upper bound on problem complexity.
Nondeterministic Turing Machine (NDTM)
Model a problem as a tree in which the nodes represent states, with time flowing
from the root to the children. A leaf represents a solution.
In an NDTM, the computation splits whenever a guess is needed.
- The tree has polynomial depth but is exponentially bushy.
"Guess" nodes are like OR nodes: if either of the children returns T,
the "guess" node returns T.
NP-hard
A problem Q is NP-hard if every problem in NP is reducible to Q.
A lower bound on a problem: "at least as hard as any problem in NP."
NP-Complete
The hardest decision problems in NP.
If there were a polynomially bounded algorithm for an NP-complete problem,
then there would be a polynomially bounded algorithm for each problem in NP.
Establishes both a lower and an upper bound (although it may be that P=NP).
If any NP-complete problem is in P, then P=NP.
Satisfiability (SAT)
Instance: Given a set U = {u1, ..., um}
of variables and a collection C = {c1, ..., cn} of clauses
over U.
Question: Is there a truth assignment to U that satisfies C?
A logical formula in conjunctive normal form (CNF):
a conjunction of disjunctions
e.g., (x1 + x2 + ¬x3) ·
(x1 + x3) · (¬x2)
- easy to falsify: make all terms in one clause fail.
- hard to satisfy: terms in clauses interact.
Cook's Theorem: SAT is NP-complete.
- uses SAT to simulate a machine
Once the primordial NP-complete problem is found, we can prove other
problems NP-complete without reference to machines:
- To prove that problem Q is NP-complete, reduce SAT
(or any other NP-complete problem) to Q:
SAT <=P Q
- E.g., 3SAT: like SAT, but | ci | = 3 for 1 <= i <= n.
- Need to show that 3SAT is NP-hard:
Since X <=P SAT for any problem X in NP,
we need only show SAT <=P 3SAT.
- Proof:
General form of clauses in SAT is
cj = (z1 + z2 + ... + zk),
where each zi is either an input or the negation of an input.
- k = 1
- {z1 + z1 + z1}
- k = 2
- {z1 + z2 + z2}
- k = 3
- {z1 + z2 + z3}
- k >= 4
- Add auxiliary variables that transmit information between pieces.
E.g., {z1 + z2 + z3 + z4 + z5}
becomes
{(z1 + z2 + y1)
(¬y1 + z3 + y2)
(¬y2 + z4 + z5)}
The transformation is polynomial in the length of the SAT instance.
Note: 2SAT is solvable in polynomial time.
Other NP-Complete Problems
-
- Graph coloring
- A coloring of a graph is the assignment of a "color" to each vertex
of the graph such that adjacent vertices are not assigned the same color.
The chromatic number of a graph G, X(G) [Chi(G)],
is the smallest number of colors needed to color G.
Optimization problem: Given an undirected graph G, determine X(G)
(and produce an optimal coloring).
Decision problem: Given G and a positive integer k,
is there a coloring of G using at most k colors
(i.e., is G k-colorable)?
-
- Job scheduling with penalties
- Suppose we are given n jobs to be executed one at a time and,
for each job, an execution time, deadline, and penalty for missing the deadline.
Optimization problem: Determine the minimum possible penalty (and find an
optimal schedule).
Decision problem: Given a nonnegative integer k,
is there a schedule with penalty < k?
Note: Job scheduling without penalties such that at most k jobs
miss their deadlines is in P.
-
- Multiprocessor scheduling
- Suppose we are given a deadline and a set of tasks of varying length to be performed
on two identical processors.
Optimization problem: Determine an allocation of the tasks to the two
processors such that the deadline can be met.
Decision problem: Can the tasks be arranged so that the deadline is met?
-
- Bin packing
- Optimization problem: Given an unlimited number of bins, each of size 1,
and n objects with sizes s1, ..., sn
where 0 < si < 1,
determine the smallest number of bins into which the objects can be packed (and find
an optimal packing).
Decision problem: Given, in addition to the inputs described above,
an integer k, do the objects fit in k bins?
-
- Knapsack problem
- Optimization problem: Given a knapsack of capacity C and
n objects with sizes s1, ..., sn
and "profits" p1, ..., pn,
find the largest total profit of any subset of the objects that fits in the knapsack
( and find a subset that achieves the maximum profit).
Decision problem: Given k, is there a subset of the objects
that fits in the knapsack and has total profit at least k?
-
- Subset sum problem
- Optimization problem: Given a positive integer C and n objects
with positive sizes s1, ..., sn, among subsets
of the objects with sum at most C, what is the largest subset sum?
Decision problem: Is there a subset of the objects whose sizes add up to
exactly C?
The subset sum problem is a simpler version of the knapsack problem, in which the
profit for each object is the same as its size.
-
- Partition
- Suppose we are given a set of integers.
Decision problem: Can the integers be partitioned into two sets whose sum is equal?
-
- Hamiltonian cycles and Hamiltonian paths
- A Hamiltonian cycle (path) is a simple cycle (path)
that passes through every vertex of a graph exactly once.
Decision problem: Does a given undirected graph have a Hamiltonian cycle (path)?
Note: An Euler tour -- a cycle that traverses each edge of a graph
exactly once -- can be found in O(e) time.
-
- Traveling salesperson problem (TSP) or minimum tour problem
- Optimization problem: Given a complete, weighted graph,
find a minimum-weight Hamiltonian cycle.
Decision problem: Given a complete, weighted graph and an integer k,
is there a Hamiltonian cycle with total weight at most k?
A complete graph has an edge between each pair of vertices.
-
- Vertex cover
- A vertex cover for an undirected graph G is a subset V' of vertices
such that each edge in G is incident upon some vertex in V'.
Optimization problem: Find a vertex cover for G with as few vertices as possible.
Decision problem: Given an integer k, does G have a vertex cover
consisting of k vertices?
Note: The edge cover problem is in P.
-
- Clique
- A clique is a subset V' of vertices in an undirected graph G such that
every pair of distinct vertices in V' is joined by an edge in G (i.e., the subgraph
induced by V' is complete). A clique with k vertices is called a k-clique.
Optimization problem: Find a clique with as many vertices as possible.
Decision problem: Given an integer k, does G have a k-clique?
-
- Independent set
- An independent set is a subset V' of vertices in an undirected graph G
such that no pair of vertices in V' is joined by an edge of G.
Optimization problem: Find an independent set with as many vertices as possible.
Decision problem: Given an integer k, does G have an independent set
consisting of k vertices?
-
- Feedback edge set
- A feedback edge set in a digraph G is a subset E' of edges such that
every cycle in G has an edge in E'.
Optimization problem: Find a feedback edge set with as few edges as possible.
Decision problem: Given an integer k, does G have a feedback edge set
consisting of k edges?
Note: The feedback edge set for undirected graphs is in P.
-
- Longest simple path problem
- Optimization problem: What is the longest simple path between any
two vertices in graph G?
Decision problem: Given an integer k, is there a path in graph G longer
than k?
Note: The shortest path between any two vertices can be found in O(ne) time.
-
- Subgraph isomorphism
- Decision problem: Given two graphs G1 and G2,
determine if G1 is isomorphic to a subgraph of G2.
-
- Integer linear programming
- Linear programming is a method of analyzing systems of linear inequalities
to arrive at optimal solutions to problems
Decision problem: Given a linear program, is there an integral solution?