Monday 14 April 2014

Traveling Salesman Problem

The traveling salesman problem consists of a collection of n cities, a symmetric distance function d(i,j) and a number k. The question is whether there is an ordering of the cities so that a tour through those cities and back to the start can be done in total distance at most k. If d takes on rational values, the problem is NP-complete in general and for many restrictions of the possible functions d, such as requiring d to satisfy the triangle inequality.

Consider the case where the cities are given as points (x,y) in the plane and the distance function is the regular Euclidean distance, i.e.,

d((r,s),(u,v)) = ((r - u)² + (s - v)²)^(1/2)
 
It is open whether this traveling salesman problem on the plane is NP-complete. In most cases, it is easy to show the problem is in NP and more difficult to show it NP-hard. For TSP on the plane, Garey, Graham and Johnson showed it NP-hard way back in 1976; it remains open whether the problem is in NP.

To show membership in NP one can guess the ordering of the cities; the problem lies in checking whether the sum of the distances is at most k. Since we can approximate each square root to a polynomial number of digits, the issue is how many digits of accuracy we need. So it boils down to the following purely mathematical question: given an expression of length m over the integers using addition, subtraction, squares and square roots (but no nested squares or square roots), what is the smallest positive number that can be expressed? If the answer is at least 2^(-m^k) for some constant k, then TSP on the plane is in NP and NP-complete.
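To make the verification step concrete, here is a minimal sketch in C of checking a guessed tour against the bound k using ordinary double-precision arithmetic; the open question above is exactly whether some fixed polynomial number of digits always suffices to make this comparison exact. The function and point names are just for illustration.

#include <math.h>
#include <stdio.h>

struct point { double x, y; };

/* Returns 1 if the tour visiting the points in the given order (and returning
   to the start) has total Euclidean length at most k, 0 otherwise. */
int tour_within_bound(const struct point *p, int n, double k) {
    double total = 0.0;
    for (int i = 0; i < n; i++) {
        const struct point *a = &p[i], *b = &p[(i + 1) % n];
        total += hypot(a->x - b->x, a->y - b->y);   /* Euclidean distance */
    }
    return total <= k;
}

int main(void) {
    struct point square[4] = { {0, 0}, {0, 1}, {1, 1}, {1, 0} };
    printf("%d\n", tour_within_bound(square, 4, 4.0));   /* prints 1: the unit-square tour has length 4 */
    return 0;
}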

Let me also mention that one can well approximate traveling salesman on the plane.

C Operator Precedence Table

This page lists C operators in order of precedence (highest to lowest). Their associativity indicates in what order operators of equal precedence in an expression are applied.

Operator                      Description                                                 Associativity

( )                           Parentheses (function call) (see Note 1)                    left-to-right
[ ]                           Brackets (array subscript)
.                             Member selection via object name
->                            Member selection via pointer
++  --                        Postfix increment/decrement (see Note 2)

++  --                        Prefix increment/decrement                                  right-to-left
+  -                          Unary plus/minus
!  ~                          Logical negation/bitwise complement
(type)                        Cast (convert value to temporary value of type)
*                             Dereference
&                             Address (of operand)
sizeof                        Determine size in bytes on this implementation

*  /  %                       Multiplication/division/modulus                             left-to-right

+  -                          Addition/subtraction                                        left-to-right

<<  >>                        Bitwise shift left, bitwise shift right                     left-to-right

<  <=  >  >=                  Relational less than/less than or equal to,                 left-to-right
                              greater than/greater than or equal to

==  !=                        Relational is equal to/is not equal to                      left-to-right

&                             Bitwise AND                                                 left-to-right

^                             Bitwise exclusive OR                                        left-to-right

|                             Bitwise inclusive OR                                        left-to-right

&&                            Logical AND                                                 left-to-right

||                            Logical OR                                                  left-to-right

? :                           Ternary conditional                                         right-to-left

=                             Assignment                                                  right-to-left
+=  -=                        Addition/subtraction assignment
*=  /=                        Multiplication/division assignment
%=  &=                        Modulus/bitwise AND assignment
^=  |=                        Bitwise exclusive/inclusive OR assignment
<<=  >>=                      Bitwise shift left/right assignment

,                             Comma (separate expressions)                                left-to-right
Note 1:
Parentheses are also used to group sub-expressions to force a different precedence; such parenthetical expressions can be nested and are evaluated from inner to outer.
Note 2:
Postfix increment/decrement have high precedence, but the actual increment or decrement of the operand is delayed (to be accomplished sometime before the statement completes execution). So in the statement y = x * z++; the current value of z is used to evaluate the expression (i.e., z++ evaluates to z), and z is only incremented after the rest of the statement has been evaluated.
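For reference, a small C program illustrating the precedence and postfix rules described in the notes above (the values are arbitrary examples):

#include <stdio.h>

int main(void) {
    int x = 2, z = 5;
    int y = x * z++;        /* z++ evaluates to 5, so y = 10; z becomes 6 afterwards (Note 2) */
    int a = 1 + 2 * 3;      /* * has higher precedence than +, so a = 7 */
    int b = (1 + 2) * 3;    /* parentheses force the addition first, so b = 9 (Note 1) */
    int c = 8 >> 1 + 1;     /* + binds tighter than >>, so c = 8 >> 2 = 2 */
    printf("%d %d %d %d %d\n", y, z, a, b, c);   /* prints: 10 6 7 9 2 */
    return 0;
}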

Semaphores

Problem 1. The following is a set of three interacting processes that can access two shared semaphores:
semaphore U = 3;
semaphore V = 0;

[Process 1]     [Process 2]    [Process 3]

L1:wait(U)      L2:wait(V)     L3:wait(V)
   type("C")       type("A")      type("D")
   signal(V)       type("B")      goto L3
   goto L1         signal(V)
                   goto L2
Within each process the statements are executed sequentially, but statements from different processes can be interleaved in any order that's consistent with the constraints imposed by the semaphores. When answering the questions below assume that once execution begins, the processes will be allowed to run until all 3 processes are stuck in a wait() statement, at which point execution is halted.
  1. Discussed in section Assuming execution is eventually halted, how many C's are printed when the set of processes runs?
    Exactly 3. Each time Process 1 executes the "wait(U)" statement, the value of semaphore U is decremented by 1. Since there are no "signal(U)" statements, the loop in Process 1 will execute only 3 times (i.e., the initial value of U) and then stall the fourth time "wait(U)" is executed.
  1. Discussed in section Assuming execution is eventually halted, how many D's are printed when this set of processes runs?
    Exactly 3. Process 1 will execute its loop three times (see the answer to the previous question), executing "signal(V)" and thus incrementing V each time through the loop. This will permit "wait(V)" to complete three times. For every "wait(V)" Process 2 executes, it also executes a "signal(V)", so there is no net change in the value of semaphore V caused by Process 2. Process 3 does decrement the value of semaphore V, typing out "D" each time it does so. So Process 3 will eventually loop as many times as Process 1.
  1. Discussed in section What is the smallest number of A's that might be printed when this set of processes runs?
    0. If Process 3 is scheduled immediately after Process 1 executes "signal(V)", then Process 2 might continue being stalled at its "wait(V)" statement and hence never execute its "type" statements.
  1. Discussed in section Is CABABDDCABCABD a possible output sequence when this set of processes runs?
    No. Here are the events implied by the sequence above:
    start: U=3 V=0
    type C: U=2 V=1
    type A: U=2 V=0
    type B: U=2 V=1
    type A: U=2 V=0
    type B: U=2 V=1
    type D: U=2 V=0
    type D: oops, impossible since V=0
    
  1. Discussed in section Is CABACDBCABDD a possible output sequence when this set of processes runs?
    Yes:
    start: U=3 V=0
    type C: U=2 V=1
    type A: U=2 V=0
    type B: U=2 V=1
    type A: U=2 V=0
    type C: U=1 V=1
    type D: U=1 V=0
    type B: U=1 V=1
    type C: U=0 V=2
    type A: U=0 V=1
    type B: U=0 V=2
    type D: U=0 V=1
    type D: U=0 V=0
    
  1. Discussed in section Is it possible for execution to be halted with either U or V having a non-zero value?
    No. If U has a non-zero value, Process 1 will be able to run. If V has a non-zero value, Process 3 will be able to run.
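    For the curious, here is a hedged sketch of the three processes from this problem expressed with POSIX threads and unnamed semaphores, so the behavior described in the answers above can be observed directly. The thread scaffolding and the one-second pause are illustrative, not part of the original problem:

    #include <pthread.h>
    #include <semaphore.h>
    #include <stdio.h>
    #include <unistd.h>

    static sem_t U, V;

    static void *proc1(void *arg) {            /* Process 1 */
        (void)arg;
        for (;;) { sem_wait(&U); putchar('C'); sem_post(&V); }
    }
    static void *proc2(void *arg) {            /* Process 2 */
        (void)arg;
        for (;;) { sem_wait(&V); putchar('A'); putchar('B'); sem_post(&V); }
    }
    static void *proc3(void *arg) {            /* Process 3 */
        (void)arg;
        for (;;) { sem_wait(&V); putchar('D'); }
    }

    int main(void) {
        pthread_t t1, t2, t3;
        sem_init(&U, 0, 3);                    /* semaphore U = 3 */
        sem_init(&V, 0, 0);                    /* semaphore V = 0 */
        pthread_create(&t1, NULL, proc1, NULL);
        pthread_create(&t2, NULL, proc2, NULL);
        pthread_create(&t3, NULL, proc3, NULL);
        sleep(1);                              /* by now all three threads are stuck in sem_wait */
        return 0;                              /* the output contains exactly three C's and three D's */
    }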


Problem 2. The following pair of processes share a common variable X:
Process A         Process B
    int Y;             int Z;
A1: Y = X*2;      B1:  Z = X+1;
A2: X = Y;        B2:  X = Z;
X is set to 5 before either process begins execution. As usual, statements within a process are executed sequentially, but statements in process A may execute in any order with respect to statements in process B.
  1. Discussed in section How many different values of X are possible after both processes finish executing?
    There are four possible values for X. Here are the possible ways in which statements from A and B can be interleaved.
    A1 A2 B1 B2: X = 11
    A1 B1 A2 B2: X = 6
    A1 B1 B2 A2: X = 10
    B1 A1 B2 A2: X = 10
    B1 A1 A2 B2: X = 6
    B1 B2 A1 A2: X = 12
    
  1. Discussed in section Suppose the programs are modified as follows to use a shared binary semaphore S:
    Process A        Process B
        int Y;           int Z;
        wait(S);         wait(S);
    A1: Y = X*2;     B1: Z = X+1;
    A2: X = Y;       B2: X = Z;
        signal(S);       signal(S);
    
    S is set to 1 before either process begins execution and, as before, X is set to 5.
    Now, how many different values of X are possible after both processes finish executing?
    The semaphore S ensures that, once begun, the statements from either process execute without interrupts. So now the possible ways in which statements from A and B can be interleaved are:
    A1 A2 B1 B2: X = 11
    B1 B2 A1 A2: X = 12
    
  1. Discussed in section Finally, suppose the programs are modified as follows to use a shared binary semaphore T:
    Process A         Process B
        int Y;            int Z;
    A1: Y = X*2;      B1: wait(T);
    A2: X = Y;        B2: Z = X+1;
        signal(T);        X = Z;
    
    T is set to 0 before either process begins execution and, as before, X is set to 5.
    Now, how many different values of X are possible after both processes finish executing?
    The semaphore T ensures that all the statements from A finish execution before B begins. So now there is only one way in which statements from A and B can be interleaved:
    A1 A2 B1 B2: X = 11
    


Problem 3. The following pair of processes share a common set of variables: "counter", "tempA" and "tempB":
Process A                            Process B
...                                  ...
A1: tempA = counter + 1;             B1: tempB = counter + 2;
A2: counter = tempA;                 B2: counter = tempB;
...                                  ...
The variable "counter" initially has the value 10 before either process begins to execute.
  1. Discussed in section What different values of "counter" are possible when both processes have finished executing? Give an order of execution of statements from processes A and B that would yield each of the values you give. For example, execution order A1, A2, B1, B2 would yield the value 13.
    There are three possible values for counter. Here are the possible ways in which statements from A and B can be interleaved.
    A1 A2 B1 B2: counter = 13
    A1 B1 A2 B2: counter = 12
    A1 B1 B2 A2: counter = 11
    B1 A1 B2 A2: counter = 11
    B1 A1 A2 B2: counter = 12
    B1 B2 A1 A2: counter = 13
    
  1. Discussed in section Modify the above programs for processes A and B by adding appropriate signal and wait operations on the binary semaphore "sync" such that the only possible final value of "counter" is 13. Indicate what should be the initial value of the semaphore "sync".
    We need to ensure that A and B run uninterrupted, but it doesn't matter which runs first.
    semaphore sync = 1;
    
    Process A                            Process B
        wait(sync);                          wait(sync);
    A1: tempA = counter + 1;             B1: tempB = counter + 2;
    A2: counter = tempA;                 B2: counter = tempB;
        signal(sync);                        signal(sync);
    
  1. Discussed in section Draw a precedence graph that describes all the possible orderings of executions of statements A1, A2, B1 and B2 that yield a final value of 11 for "counter".
       A1     B1
         \  /
          B2
          |
          A2
    
  1. Discussed in section Modify the original programs for processes A and B by adding binary semaphores and signal and wait operations to guarantee that the final result of executing the two processes will be "counter" = 11. Give the initial values for every semaphore you introduce. Try to put the minimum number of constraints on the ordering of statements. In other words, don't just pick one ordering that will yield 11 and enforce that one by means of semaphores; instead, enforce only the essential precedence constraints marked in your solution to question 3.
    semaphore s1 = 0;
    semaphore s2 = 0;
    
    Process A                            Process B
    A1: tempA = counter + 1;             B1: tempB = counter + 2;
        signal(s1);                          wait(s1);
        wait(s2);                        B2: counter = tempB;
    A2: counter = tempA;                     signal(s2);
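    A hedged sketch of this two-semaphore solution using POSIX threads, so the final value of counter can be checked to always come out as 11 (the pthread scaffolding and thread names are illustrative, not part of the original problem):

    #include <pthread.h>
    #include <semaphore.h>
    #include <stdio.h>

    static int counter = 10, tempA, tempB;
    static sem_t s1, s2;                      /* both initialized to 0, as above */

    static void *proc_A(void *arg) {
        (void)arg;
        tempA = counter + 1;                  /* A1 */
        sem_post(&s1);
        sem_wait(&s2);
        counter = tempA;                      /* A2 */
        return NULL;
    }
    static void *proc_B(void *arg) {
        (void)arg;
        tempB = counter + 2;                  /* B1 */
        sem_wait(&s1);
        counter = tempB;                      /* B2 */
        sem_post(&s2);
        return NULL;
    }

    int main(void) {
        pthread_t a, b;
        sem_init(&s1, 0, 0);
        sem_init(&s2, 0, 0);
        pthread_create(&a, NULL, proc_A, NULL);
        pthread_create(&b, NULL, proc_B, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        printf("counter = %d\n", counter);    /* always prints counter = 11 */
        return 0;
    }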
    


Problem 4. The figure below shows two processes that must cooperate in computing N2 by taking the sum of the first N odd integers.
  Process P:
    N = 5;
    Sqr = 0;
   loopP:
    if (N == 0)
      goto endP;
    N = N - 1;
    goto loopP;
   endP:
    print(Sqr);

  Process Q:
   loopQ:
    Sqr = Sqr + 2*N + 1;
    goto loopQ;
  1. Add appropriate semaphore declarations and signal and wait statements to these programs so that the proper value of Sqr (i.e., 25) will be printed out. Indicate the initial value of every semaphore you add. Insert the semaphore operations so as to preserve the maximum degree of concurrency between the two processes; do not put any nonessential constraints on the ordering of operations. Hint: Two semaphores suffice for a simple and elegant solution.
    Looking at the code for process Q, we see that we want it to execute with N = 4, 3, 2, 1, and 0. So we want the decrement of N to happen before the first execution of Q. We can use two semaphores to control the production and consumption of "N values" by the two loops. To achieve maximum concurrency, we'll signal the availability of a new N value as soon as it's ready and then do as much as possible (i.e., branch back to the beginning of the loop and check the end condition) before waiting for the value to be consumed:
      Semaphore P = 1;  // P gets first shot at execution
      Semaphore Q = 0;  // Q has to wait for first N value
      int N, Sqr;
    
      Process P:
        N = 5;
        Sqr = 0;
       loopP:
        if (N == 0)
          goto endP;
        wait(P);
        N = N - 1;
        signal(Q);
        goto loopP;
       endP:
        wait(P);     // wait for last Q iteration!
        print(Sqr);

      Process Q:
       loopQ:
        wait(Q);
        Sqr = Sqr + 2*N + 1;
        signal(P);
        goto loopQ;
    
    Optionally, you could also split Sqr = Sqr + 2*N + 1 into Sqr = Sqr + 1 and Sqr = Sqr + 2*N to slightly improve concurrency.
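    A hedged sketch of this solution using POSIX threads and semaphores: process P produces the N values 4, 3, 2, 1, 0 and process Q consumes each exactly once, so 25 is always printed. The thread scaffolding is illustrative, and Q loops five times here instead of forever:

    #include <pthread.h>
    #include <semaphore.h>
    #include <stdio.h>

    static int N, Sqr;
    static sem_t P, Q;                        /* P = 1, Q = 0 initially */

    static void *proc_P(void *arg) {
        (void)arg;
        N = 5;
        Sqr = 0;
        while (N != 0) {
            sem_wait(&P);                     /* previous N value has been consumed */
            N = N - 1;
            sem_post(&Q);                     /* a new N value is ready */
        }
        sem_wait(&P);                         /* wait for last Q iteration! */
        printf("%d\n", Sqr);                  /* prints 25 */
        return NULL;
    }
    static void *proc_Q(void *arg) {
        (void)arg;
        for (int i = 0; i < 5; i++) {
            sem_wait(&Q);
            Sqr = Sqr + 2 * N + 1;
            sem_post(&P);
        }
        return NULL;
    }

    int main(void) {
        pthread_t tp, tq;
        sem_init(&P, 0, 1);
        sem_init(&Q, 0, 0);
        pthread_create(&tp, NULL, proc_P, NULL);
        pthread_create(&tq, NULL, proc_Q, NULL);
        pthread_join(tp, NULL);
        pthread_join(tq, NULL);
        return 0;
    }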


Problem 5. A computer has three commonly used resources designated A, B and C. Up to three processes designated X, Y and Z run on the computer and each makes periodic use of two of the three resources.
  • Process X acquires A, then B, uses both and then releases both.
  • Process Y acquires B, then C, uses both and then releases both.
  • Process Z acquires C, then A, uses both and then releases both.
  1. If two of these processes are running simultaneously on the machine, can a deadlock occur? If so, describe the deadlock scenario.
    No deadlock is possible since one of the two processes will always be able to run to completion.
  1. Describe a scenario in which deadlock occurs if all three processes are running simultaneously on the machine.
    All three processes make their first acquisition then hang waiting for a resource that will never become available.
  1. Modify the algorithm for acquiring resources so that deadlock cannot occur with three processes running.
    If we change Z to acquire A then C, no deadlock can occur.
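    A hedged sketch of the fix, with the resources modeled as pthread mutexes: every process acquires its resources in one global order (A before B before C), so a circular wait can never form. The thread names and the use of mutexes for the resources are illustrative:

    #include <pthread.h>
    #include <stdio.h>

    static pthread_mutex_t A = PTHREAD_MUTEX_INITIALIZER;
    static pthread_mutex_t B = PTHREAD_MUTEX_INITIALIZER;
    static pthread_mutex_t C = PTHREAD_MUTEX_INITIALIZER;

    static void *proc_X(void *arg) {          /* needs A and B */
        (void)arg;
        pthread_mutex_lock(&A);
        pthread_mutex_lock(&B);
        puts("X uses A and B");
        pthread_mutex_unlock(&B);
        pthread_mutex_unlock(&A);
        return NULL;
    }
    static void *proc_Y(void *arg) {          /* needs B and C */
        (void)arg;
        pthread_mutex_lock(&B);
        pthread_mutex_lock(&C);
        puts("Y uses B and C");
        pthread_mutex_unlock(&C);
        pthread_mutex_unlock(&B);
        return NULL;
    }
    static void *proc_Z(void *arg) {          /* needs C and A: acquire A first! */
        (void)arg;
        pthread_mutex_lock(&A);
        pthread_mutex_lock(&C);
        puts("Z uses A and C");
        pthread_mutex_unlock(&C);
        pthread_mutex_unlock(&A);
        return NULL;
    }

    int main(void) {
        pthread_t x, y, z;
        pthread_create(&x, NULL, proc_X, NULL);
        pthread_create(&y, NULL, proc_Y, NULL);
        pthread_create(&z, NULL, proc_Z, NULL);
        pthread_join(x, NULL);
        pthread_join(y, NULL);
        pthread_join(z, NULL);
        return 0;
    }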


Problem 6. The following is a question about dining computer scientists. There are 6 computer scientists seated at a circular table. There are 3 knives at the table and 3 forks. The knives and forks are placed alternately between the computer scientists. A large bowl of food is placed at the center of the table. The computer scientists are quite hungry, but require both a fork and a knife to eat.
Consider the following policies for eating and indicate for each policy if it can result in deadlock.
    • Attempt to grab the fork that sits between you and your neighbor until you are successful.
    • Attempt to grab the knife that sits between you and your neighbor until you are successful.
    • Eat
    • Return the fork.
    • Return the knife.
    No deadlock is possible since the number of available forks limits the number of knives that can be acquired, i.e., if you have a fork, a knife is guaranteed to be available.
    • Attempt to grab any fork on the table until you are successful (if there are many forks grab the closest one).
    • Attempt to grab any knife on the table until you are successful (if there are many knives grab the closest one).
    • Eat
    • Return the knife.
    • Return the fork.
    No deadlock is possible for the same reason as above.
    • Flip a coin to decide if you are going to first try for a fork or a knife.
    • Attempt to grab your choice until you are successful (if there are many of that utensil grab the closest one).
    • Attempt to grab the other type of utensil until you are successful (if there are many of that utensil grab the closest one).
    • Eat
    • Return the knife.
    • Return the fork.
    Deadlock is possible since it is now possible for all the computer scientists to acquire one utensil and then stall indefinitely waiting for the other type of utensil to become available.


Problem 7. Gerbitrail is a manufacturer of modular gerbil cages. A Gerbitrail cage is assembled using a catalog of modules, including gerbil "rooms" of various sizes and shapes as well as sections of tubing whose diameter neatly accommodates a single gerbil. A typical cage contains several interconnected rooms, and may house a community of gerbils:

The Gerbitrail cages are immensely successful, except for one tragic flaw: the dreaded GERBILOCK. Gerbilock is a situation that arises when multiple gerbils enter a tube going in opposite directions, meeting within the tube as shown below:

Since each gerbil is determined to move forward and there is insufficient room to pass, both gerbils remain gerbilocked forever.
There is, however, hope on the horizon. Gerbitrail has developed little mechanical ggates that can be placed at the ends of each tube, and which can both sense and lock out gerbil crossings. All ggates are controlled by a single computer. Each ggate X has two routines, X_Enter and X_Leave, which are called when a gerbil tries to enter or leave respectively the tube via that ggate; the gerbil is not allowed to proceed (i.e., to enter or leave) until the Enter or Leave call returns. Gerbitrail engineers speculate that these routines can solve Gerbilock using semaphores.
They perform the following experiment:

where each of the ggates A and B are controlled by the following code:
semaphore S=???;        /* Shared semaphore. */

A_Enter()              /* Handle gerbil entering tube via A */
{ wait(S); }
A_Leave()              /* Handle gerbil leaving tube via A */
{ signal(S); }
B_Enter()              /* Handle gerbil entering tube via B */
{ wait(S); }
B_Leave()              /* Handle gerbil leaving tube via B */
{ signal(S); }
  1. What is the proper initial value of the semaphore S?
    S = 1 since one gerbil is allowed in the tube.
  1. An argument among the Gerbitrail technical staff develops about how the above solution should be extended to more complex cages with multiple tubes. The question under debate is whether separate semaphores should be allocated to each ggate, to each tube, or whether a single semaphore should be shared for all ggates. Help resolve the argument. For each proposal, indicate OK if it works, SLOW if it works but becomes burdensome in complex cages, and BAD if it doesn't prevent gerbilock.
    Single semaphore shared among all ggates: OK -- SLOW -- BAD
    Semaphore for each tube: OK -- SLOW -- BAD
    Semaphore for each ggate: OK -- SLOW -- BAD
    Single semaphore shared among all ggates: SLOW
    Semaphore for each tube: OK
    Semaphore for each ggate: BAD
  1. Gerbitrail management decides to invest heavily in the revolutionary technology which promises to wipe out the gerbilock threat forever. They hire noted computer expert Nickles Worth to evaluate their ggate approach and suggest improvements. Nickles looks at the simple demonstration above, involving only 2 gerbils and a single tube, and immediately objects. "You are enforcing non-essential constraints on the behavior of these two gerbils", he exclaims.
    What non-essential constraint is imposed by the ggate solution involving only 2 gerbils and one tube? Give a specific scenario.
    Since only 1 gerbil is allowed in the tube at a time, both gerbils can't go through the tube simultaneously in the same direction even though that would work out okay.
  1. Nickles proposes that a new synchronization mechanism, the gerbiphore, be defined to handle the management of Gerbitrail tubes. His proposal involves the implementation of a gerbiphore as a C data structure, and the allocation of a single gerbiphore to each tube in a Gerbitrail cage configuration.
    Nickles's proposed implementation is:
    semaphore mutex=1;
    
    struct gerbiphore {          /* definition of "gerbiphore" structure*/
     int dir;                    /* Direction: 0 means unspecified. */
     int count;                  /* Number of Gerbils in tube */
    } A, B, ...;                 /* one gerbiphore for each tube */
    
    int Fwd=1, Bkwd=2;           /* Direction codes for tube travel */
    
    /* genter(g, dir) called with gerbiphore pointer g and direction dir
       whenever a gerbil wants to enter the tube attached to g
       in direction dir */
    
    genter(struct gerbiphore *g, int dir) {
     loop:
       wait(mutex);
       if (g->dir == 0)
         g->dir = dir;    /* If g->dir unassigned, grab it! */
       if (g->dir == dir) {
         g->count = 1 + g->count;   /* One more gerbil in tube. */
         *********;                 /* MISSING LINE! */
         return;
       }
       signal(mutex);
       goto loop;
    }
    
    /* gleave(g, dir) is called whenever a gerbil leaves the tube
       attached to gerbiphore g in direction dir. */
    
    gleave(struct gerbiphore *g, int dir) {
       wait(mutex);
       g->count = g->count - 1;
       if (g->count == 0) g->dir = 0;
       signal(mutex);
    }
    
    Unfortunately a blotch of red wine obscures one line of Nickles's code. The team of stain analysts hired to decode the blotch eventually respond with a sizable bill and the report that "it appears to be Ch. Petrus, 1981". Again, your services are needed.
    What belongs in place of ********* at the line commented "MISSING LINE"?
    signal(mutex). We have to release the mutual exclusion lock before returning from genter.
  1. Ben Bitdiddle has been moonlighting at Gerbitrail Research Laboratories, home of the world's largest gerbil population. Ben's duties include computer-related research and shoveling. He is anxious to impress his boss with his research skills, in the hopes that demonstrating such talent will allow him to spend less of his time with a shovel.
    Ben observes that the GRL Gerbitrail cage, housing millions of gerbils and involving tens of millions of tubes, is a little sluggish despite its use of Nickles Worth's fancy gerbiphore implementation. Ben studies the problem, and finds that each gerbil spends a surprisingly large amount of time in genter and gleave calls, despite the fact that tubes are infrequently used due to their large number. Ben suspects some inefficiency in the code. Ben focuses on the gleave code, and collects statistics about which line of code in the gleave definition is taking the most CPU time.
    Which line of the (4-line) gleave body do you expect to be the most time consuming?
    wait(mutex) since that may hang while waiting for other instances of genter or gleave to complete.
  1. Identify a class of inessential precedence constraints whose imposition by Nickles's code causes the performance bottleneck observed by Ben.
    The mutex semaphore implements a global lock, so no two genter or gleave operations can proceed at the same time, even when they operate on different gerbiphores (different tubes).
  1. Briefly sketch a change to Nickles's code which mitigates the performance bottleneck.
    If we use a separate mutex for each gerbiphore, then the mutual exclusion each mutex provides will only apply to operations on that gerbiphore (see the sketch after this problem).
  1. Encouraged by his success (due to your help), Ben decides to try more aggressive performance improvements. He edits the code to gleave, eliminating entirely the calls to wait and signal on the mutex semaphore. He then tries the new code on a 2-gerbil, 1-tube cage.
    Will Ben's change work on a 2-gerbil, 1-tube cage? Choose the best answer.

    1. It still works fine. Nice work, Ben!
    2. Gerbilock may be caused by two nearly simultaneous genter calls.
    3. Gerbilock may be caused by two nearly simultaneous gleave calls.
    4. Gerbilock may be caused by nearly simultaneous gleave and genter calls, followed by another genter call.
    5. Gerbilock can't happen, although the system may fail in other ways.
    D. Without the mutex, a gleave call can be interrupted between decrementing count and resetting dir; a genter in the old direction can then slip in, after which a second genter in the opposite direction also succeeds, producing gerbilock.
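    Here is a hedged sketch of the per-gerbiphore mutex suggested two questions back, with POSIX semaphores standing in for the problem's semaphore type: each gerbiphore carries its own mutex (assumed to be initialized to 1 with sem_init when the tube is created), so genter and gleave calls on different tubes no longer serialize against each other. Everything else mirrors Nickles's code:

    #include <semaphore.h>

    struct gerbiphore {
        sem_t mutex;                          /* per-tube lock, initialized to 1 */
        int dir;                              /* direction: 0 means unspecified */
        int count;                            /* number of gerbils in tube */
    };

    void genter(struct gerbiphore *g, int dir) {
        for (;;) {
            sem_wait(&g->mutex);
            if (g->dir == 0)
                g->dir = dir;                 /* if g->dir unassigned, grab it */
            if (g->dir == dir) {
                g->count = g->count + 1;      /* one more gerbil in tube */
                sem_post(&g->mutex);          /* the formerly missing line */
                return;
            }
            sem_post(&g->mutex);              /* wrong direction: release and retry */
        }
    }

    void gleave(struct gerbiphore *g, int dir) {
        (void)dir;                            /* direction is not needed on exit */
        sem_wait(&g->mutex);
        g->count = g->count - 1;
        if (g->count == 0) g->dir = 0;
        sem_post(&g->mutex);
    }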


Problem 8. Similar Software is a software startup whose 24 employees include you and 23 lawyers. Your job is to finish the initial product: the Unicks timesharing system for a single-processor Beta system. Unicks is very similar to the OS code we've seen in lecture, and to a popular workstation OS. To avoid legal entanglements, it incorporates a distinguishing feature: an inter-process communication mechanism called the tube.
A tube provides a flow-controlled communication channel among processes. The system supports at most 100 different tubes, each identified by a unique integer between 0 and 99. The system primitives for communicating via tubes are the Beta SVCs WTube, which writes the nonzero integer in R1 to the tube whose number is in R0, and RTube, which reads a nonzero datum from the tube whose number is passed in R0. Note that tubes can only be used to pass nonzero integers between processes.
The Unicks handlers for WTube and RTube are shown below:
int Tubes[100]; /* max of 100 tubes */

WTube_handler() {
  int TubeNumber = User.R0; /* which tube to write */
  int Datum = User.R1; /* the data to write */

  if (Tubes[TubeNumber] != 0) { /* wait until Tube is empty */
    User.XP = User.XP - 4;
    return;
  } else { 
    Tubes[TubeNumber] = Datum; /* tube empty, fill it up! */
  }
}

RTube_handler() {
  int TubeNumber = User.R0; /* which tube to read */

  if (Tubes[TubeNumber] == 0) { /* wait until there's data */
    User.XP = User.XP - 4;
    return;
  } else {
    User.R1 = Tubes[TubeNumber]; /* read the datum */
    Tubes[TubeNumber] = 0; /* mark the tube as empty */
  }
}
The handlers run as part of the Unicks kernel and will not be interrupted, i.e., handling of interrupts is postponed until the current process returns to user mode. Note that the initial values in the Tubes array are zero, and keep in mind that only nonzero data is to be written to (and read from) each tube.
  1. Let Wi be the ith write on a tube, and Rj be the jth read. What precedence constraint(s) does the above implementation enforce between completion of the Wi and Rj?
    The code requires that the write Wi must precede the read Ri, and that the read Ri must precede the write Wi+1. All other precedence relationships can be derived from these.
  1. You observe that a process that waits once will waste the remainder of its quantum looping. Suggest a one-line improvement to each of the handlers which will waste less time synchronizing the communication processes.
    Each handler could call the scheduler before returning on a read/write fail.
  1. Assume, for the remaining questions, that your improvement HAS NOT been implemented; the original code, as shown, is being used.
    Since tubes are advertised as a general mechanism for communication among a set of Unicks processes, it is important that they work reliably when several processes attempt to read and/or write them simultaneously. S. Quire, the ex-lawyer CEO, has been unable to figure out just what the semantics of tubes are under these circumstances. He finally asks you for help.
    Describe what will happen if a process writes an empty tube while multiple processes are waiting to read it. How many processes will read the new value?
    Since RTube_handler() is not interruptible, exactly one process will get the new value.
  1. Describe what will happen if multiple processes write different values to a tube at about the same time that another process is doing successive reads from that tube. Will each value be read once? Will at least one value be read, but some may be lost? Will garbage (values other than those written) be read?
    Since WTube_handler() is not interruptible, no values will be lost. Each value will be read correctly.
  1. S. Quire suggests that the interrupt hardware be modified so that timer interrupts (which mark the end of the quantum for the currently running process) are allowed to happen in both user and kernel mode. How would this modification change your answers to the previous two questions?
    If the handlers are interruptible, multiple processes may read the same value or read garbage (zero). Writers may overwrite each other's values before they are read, so some values may be lost, but no value will be read twice if there is only a single reader.
  1. Customers have observed that Unicks seems to favor certain processes under some circumstances. In particular, when one process writes data to a tube while processes A, B, and C are waiting to read it, it is typically the case that the amount of data read by A, B, and C will be dramatically different.
    Briefly explain the cause for this phenomenon.
    If process D writes data and the scheduler calls A, B, C, and D round-robin (in that order), process A has the best chance of having its Rtube call succeed since it runs right after process D has finished calling Wtube.
  1. Sketch, in a sentence or two, a plausible strategy for treating the processes more equitably.
    If processes have numbers, each tube could remember the last process that read it. When a process writes to the tube again, it could reorder scheduling so that the waiting process with the next higher process number is called next. Another approach would be to use a randomized scheduler, but this creates the possibility that some processes will not be run for a long period of time.
  1. Nickles Worth, a consultant to Similar Software, claims that tubes can be used to implement mutual exclusion, that is, to guarantee that at most one process is executing code within a critical section at any time. He offers an example template:
    int TubeNumber = 37; /* tube to be used for mutual exclusion */
    
    WTube(TubeNumber,1); /* one-time-only initialization */
    
    while (1) { /* loop with critical section */
       
        /* LOCK ACCESS */
       
       <critical section>
       
        /* UNLOCK ACCESS */
       
    }
    
    where the regions marked LOCK ACCESS and UNLOCK ACCESS use the tube TubeNumber to ensure exclusive access to the code marked <critical section>. These two regions may contain the forms datum=RTube(TubeNumber) and Wtube(TubeNumber,datum) as a C interface to the tube SVCs.
    Fill in the LOCK and UNLOCK code above.
    LOCK: datum=RTube(TubeNumber)
    UNLOCK: Wtube(TubeNumber,datum)
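    For concreteness, here is a sketch of the template with the LOCK and UNLOCK regions filled in. RTube() and WTube() are the C wrappers for the tube SVCs described in the problem statement (not standard library functions), and the one-time WTube(TubeNumber, 1) initialization is assumed to happen before any process enters the loop:

    extern int RTube(int tube);              /* assumed interface: blocks until a datum is read */
    extern void WTube(int tube, int datum);  /* assumed interface: blocks until the datum is written */

    void worker(void) {
        int TubeNumber = 37;                 /* tube to be used for mutual exclusion */

        while (1) {                          /* loop with critical section */
            int datum = RTube(TubeNumber);   /* LOCK: take the token; other processes now block in RTube */

            /* <critical section> */

            WTube(TubeNumber, datum);        /* UNLOCK: put the token back for the next process */
        }
    }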

NP Complete Class

Introduction
One of the most confusing topics in algorithm design is the concept of NP-completeness. If you search for the topic on the internet you will probably find tons of articles and lectures on the subject; in this short article I will summarize it so that it is easy to remember by the average student or software engineer.
It is all about Running Time
Computational complexity analysis refers to the study of computer algorithms in terms of efficiency. Given a problem of some input size, we need to know how fast or slow it runs as the input size grows to large values. Consider the following examples:
  1. Looking an item up in a perfect hash table takes constant time: theoretically speaking, no matter how big the hash table is, the item can be found instantly (assuming it exists). The algorithm is said to have a running time of O(1). This is indeed very fast.
  2. The running time to look up an item in a sorted binary search tree of n elements is O(log n). For example, if the number of items is 32, then the number of comparisons needed to find an element is about 5 in the worst case. As you can see, this is a fast algorithm.
  3. Searching for an element in an unsorted array runs in linear time, O(n). If the array contains 100 elements then you need to make 100 comparisons to find the element in the worst case. In other words, the running time grows linearly as the input size becomes larger.
Polynomial versus Exponential
Algorithms with a running time of the form O(n^k), where k is a constant, are said to run (or be solvable) in polynomial time. Theoretically speaking, they are fast even if k is large. On the other hand, algorithms with a running time of the form O(k^n), where k is a constant, are said to run in exponential time. These algorithms are very slow; in practice it may take a computer years to finish running such an algorithm.
P, NP, NP-Complete, NP-Hard
Now we know what running time means, based on that we can classify problems into classes depending on how complex they are. I will be using plain English as opposed to mathematical terms in order to make it easy to understand.
P Problem
P means that there exists an algorithm to solve the problem which runs in polynomial time. If you are still not sure what running time or polynomial means then please read the introduction one more time.
NP Problem
An NP problem is one for which there exists an algorithm to verify that a given solution is correct in polynomial time. Note that we are talking about solution verification; we are not talking about solving the problem, because no one has yet discovered a fast (polynomial-time) algorithm that solves every NP problem, nor proved that such an algorithm cannot exist. NP does not mean "Non-Polynomial" (please do not say that in front of people, because it is embarrassing); it stands for nondeterministic polynomial time, which means the problem can be solved in polynomial time BUT on a nondeterministic machine. Of course that is not the regular computer we use every day; it is a theoretical model that exists only in the minds of computer scientists. If you are interested, you can research it on your own (try searching for "nondeterministic Turing machine").
NP-Hard Problem
Now that we know what NP means, an NP-Hard problem is AT LEAST as hard as any NP problem. It could be harder, who knows. Again, when thinking about problem difficulty we are still referring to the running time of the algorithm: a harder problem takes more time to solve.
Reduction
It feels bad when you try to solve a problem and it turns out to be very hard, to the extent that many people have tried it without success. In order to justify the claim that the problem cannot be solved easily, you need to prove that it is equivalent to some known hard problem. Reduction refers to transforming one problem into another well-known (hard) problem so that people won't dare to call you stupid. This conversion process should run in polynomial time, and it can be used to prove that a problem is NP-Complete, as we will indicate later.
NP-Complete Problem
An NP-Complete problem is both NP and NP-Hard. You know what NP and NP-Hard mean, so an NP-Complete problem is one for which a given solution is easy to verify and to which every problem in NP can be reduced. The hard part is proving the first example of an NP-Complete problem, but our friend Steve Cook did that back in 1971. Thanks to him, because he did a great job so that we are not called stupid. In a few words, in order to prove that a given problem is NP-Complete:
  1. Show it is NP: given a solution, show that it can be verified in polynomial time (see the sketch below).
  2. Show it is NP-Hard: pick an already known NP-Complete problem and show that you can reduce it to our problem in polynomial time.
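To make step 1 concrete, here is a minimal sketch of a polynomial-time verifier for the subset sum problem (defined in the notes below): given the instance and a proposed certificate, a single linear pass checks the claimed solution. The function and variable names are just for illustration.

  #include <stdio.h>

  /* Returns 1 if the chosen objects sum exactly to target, 0 otherwise. */
  int verify_subset_sum(const int *sizes, const int *chosen, int n, int target) {
      long long sum = 0;
      for (int i = 0; i < n; i++)
          if (chosen[i]) sum += sizes[i];
      return sum == target;
  }

  int main(void) {
      int sizes[]  = {3, 34, 4, 12, 5, 2};
      int chosen[] = {0, 0, 1, 0, 1, 0};     /* proposed certificate: the subset {4, 5} */
      printf("%d\n", verify_subset_sum(sizes, chosen, 6, 9));   /* prints 1 */
      return 0;
  }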
What is the big deal about the question P = NP?
  1. If P is the same as NP, this is good news: many interesting real-life problems could then be solved very quickly.
  2. It would also be rather embarrassing for the folks working in the computer science field, because it has long been believed by many people that the two classes are not the same (and whoever settles the question either way earns a $1,000,000 prize).
  3. In reality, many believe they are not the same, because so many people have knocked their heads against the wall for so long without any success; and since scientists require a proof that a polynomial-time solution does not exist, it remains an open question.

Class NP


  • The class of decision problems for which there is a polynomially bounded nondeterministic algorithm.


  • The class of problems that are "slightly harder" than P.


  • In general, an exponential number of subproblems, each of which can be solved or verified in polynomial time.


  • Being in class NP constitutes an upper bound on problem complexity.

Nondeterministic Turing Machine (NDTM)



  • Model a problem as a tree in which the nodes represent states, with time flowing from the root to the children. A leaf represents a solution.


  • In an NDTM, the computation splits whenever a guess is needed.
    • The tree has polynomial depth but is exponentially bushy.


  • "Guess" nodes are like OR nodes: if either of the children returns T, the "guess" node returns T.


  • NP-hard



  • A problem Q is NP-hard if every problem in NP is reducible to Q.


  • A lower bound on a problem: "at least as hard as any problem in NP."


  • NP-Complete



  • The hardest decision problems in NP.


  • If there were a polynomially bounded algorithm for an NP-complete problem, then there would be a polynomially bounded algorithm for each problem in NP.


  • Establishes both a lower and an upper bound (although it may be that P=NP).


  • If any NP-complete problem is in P, then P=NP.


  • Satisfiability (SAT)



  • Instance: Given a set U = {u1, ..., um} of variables and a collection C = {c1, ..., cn} of clauses over U.


  • Question: Is there a truth assignment to U that satisfies C?


  • A logical formula in conjunctive normal form (CNF): a conjunction of disjunctions
    e.g., (x1 + x2 + ¬x3) · (x1 + x3) · (¬x2)

    • easy to falsify: make all terms in one clause fail.


    • hard to satisfy: terms in clauses interact.


  • Cook's Theorem: SAT is NP-complete.

    • uses SAT to simulate a machine


  • Once the primordial NP-complete problem is found, we can prove other problems NP-complete without reference to machines:

    • To prove that problem Q is NP-complete, reduce SAT (or any other NP-complete problem) to Q:
            SAT <=P Q


    • E.g., 3SAT: like SAT, but | ci | = 3 for 1 <= i <= n.


    • Need to show that 3SAT is NP-hard:
      Since X <=P SAT for any problem X in NP, we need only show SAT <=P 3SAT.


    • Proof:
      The general form of a clause in SAT is cj = (z1 + z2 + ... + zk), where each zi is either an input or the negation of an input. Each clause is rewritten according to its length k:
      k = 1:  {z1} becomes {z1 + z1 + z1}
      k = 2:  {z1 + z2} becomes {z1 + z2 + z2}
      k = 3:  {z1 + z2 + z3} is kept as is
      k >= 4: add auxiliary variables that transmit information between the pieces;
              e.g., {z1 + z2 + z3 + z4 + z5} becomes {(z1 + z2 + y1) (¬y1 + z3 + y2) (¬y2 + z4 + z5)}
              (a small code sketch of this splitting appears at the end of these notes).
      The transformation is polynomial in the length of the SAT instance.

    • Note: 2SAT is solvable in polynomial time.


  • Other NP-Complete Problems


    • Graph coloring
      A coloring of a graph is the assignment of a "color" to each vertex of the graph such that adjacent vertices are not assigned the same color. The chromatic number of a graph G, X(G) [Chi(G)], is the smallest number of colors needed to color G.
      Optimization problem: Given an undirected graph G, determine X(G) (and produce an optimal coloring).
      Decision problem: Given G and a positive integer k, is there a coloring of G using at most k colors (i.e., is G k-colorable)?


    • Job scheduling with penalties
      Suppose we are given n jobs to be executed one at a time and, for each job, an execution time, deadline, and penalty for missing the deadline.
      Optimization problem: Determine the minimum possible penalty (and find an optimal schedule).
      Decision problem: Given a nonnegative integer k, is there a schedule with penalty < k?
      Note: Job scheduling without penalties such that at most k jobs miss their deadlines is in P.


    • Multiprocessor scheduling
      Suppose we are given a deadline and a set of tasks of varying length to be performed on two identical processors.
      Optimization problem: Determine an allocation of the tasks to the two processors such that the deadline can be met.
      Decision problem: Can the tasks be arranged so that the deadline is met?


    • Bin packing
      Optimization problem: Given an unlimited number of bins, each of size 1, and n objects with sizes s1, ..., sn where 0 < si < 1, determine the smallest number of bins into which the objects can be packed (and find an optimal packing).
      Decision problem: Given, in addition to the inputs described above, an integer k, do the objects fit in k bins?


    • Knapsack problem
      Optimization problem: Given a knapsack of capacity C and n objects with sizes s1, ..., sn and "profits" p1, ..., pn, find the largest total profit of any subset of the objects that fits in the knapsack ( and find a subset that achieves the maximum profit).
      Decision problem: Given k, is there a subset of the objects that fits in the knapsack and has total profit at least k?


    • Subset sum problem
      Optimization problem: Given a positive integer C and n objects with positive sizes s1, ..., sn, among subsets of the objects with sum at most C, what is the largest subset sum?
      Decision problem: Is there a subset of the objects whose sizes add up to exactly C?
      The subset sum problem is a simpler version of the knapsack problem, in which the profit for each object is the same as its size.


    • Partition
      Suppose we are given a set of integers.
      Decision problem: Can the integers be partitioned into two sets whose sum is equal?


    • Hamiltonian cycles and Hamiltonian paths
      A Hamiltonian cycle (path) is a simple cycle (path) that passes through every vertex of a graph exactly once.
      Decision problem: Does a given undirected graph have a Hamiltonian cycle (path)?
      Note: An Euler tour -- a cycle that traverses each edge of a graph exactly once -- can be found in O(e) time.


    • Traveling salesperson problem (TSP) or minimum tour problem
      Optimization problem: Given a complete, weighted graph, find a minimum-weight Hamiltonian cycle.
      Decision problem: Given a complete, weighted graph and an integer k, is there a Hamiltonian cycle with total weight at most k?
      A complete graph has an edge between each pair of vertices.


    • Vertex cover
      A vertex cover for an undirected graph G is a subset V' of vertices such that each edge in G is incident upon some vertex in V'.
      Optimization problem: Find a vertex cover for G with as few vertices as possible.
      Decision problem: Given an integer k, does G have a vertex cover consisting of k vertices?
      Note: The edge cover problem is in P.


    • Clique
      A clique is a subset V' of vertices in an undirected graph G such that every pair of distinct vertices in V' is joined by an edge in G (i.e., the subgraph induced by V' is complete). A clique with k vertices is called a k-clique.
      Optimization problem: Find a clique with as many vertices as possible.
      Decision problem: Given an integer k, does G have a k-clique?


    • Independent set
      An independent set is a subset V' of vertices in an undirected graph G such that no pair of vertices in V' is joined by an edge of G.
      Optimization problem: Find an independent set with as many vertices as possible.
      Decision problem: Given an integer k, does G have an independent set consisting of k vertices?


    • Feedback edge set
      A feedback edge set in a digraph G is a subset E' of edges such that every cycle in G has an edge in E'.
      Optimization problem: Find a feedback edge set with as few edges as possible.
      Decision problem: Given an integer k, does G have a feedback edge set consisting of k edges?
      Note: The feedback edge set for undirected graphs is in P.


    • Longest simple path problem
      Optimization problem: What is the longest simple path between any two vertices in graph G?
      Decision problem: Given an integer k, is there a path in graph G longer than k?
      Note: The shortest path between any two vertices can be found in O(ne) time.


    • Subgraph isomorphism
      Decision problem: Given two graphs G1 and G2, determine if G1 is isomorphic to a subgraph of G2.


    • Integer linear programming
      Linear programming is a method of analyzing systems of linear inequalities to arrive at optimal solutions to problems.
      Decision problem: Given a linear program, is there an integral solution?
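      As promised above, here is a small sketch of the k >= 4 clause-splitting step from the SAT-to-3SAT reduction: a clause (z1 + z2 + ... + zk) is replaced by a chain of 3-literal clauses linked by fresh auxiliary variables. Literals are represented as integers (a negative value denotes a negated variable); the names split_clause and next_aux are illustrative, and this is not a full SAT-to-3SAT converter.

      #include <stdio.h>

      /* Prints the 3-literal clauses that replace the clause lits[0..k-1], k >= 4. */
      void split_clause(const int *lits, int k, int *next_aux) {
          int prev = (*next_aux)++;                 /* y1 */
          printf("(%d %d %d)\n", lits[0], lits[1], prev);
          for (int i = 2; i < k - 2; i++) {
              int cur = (*next_aux)++;              /* next auxiliary variable */
              printf("(%d %d %d)\n", -prev, lits[i], cur);
              prev = cur;
          }
          printf("(%d %d %d)\n", -prev, lits[k - 2], lits[k - 1]);
      }

      int main(void) {
          int clause[] = {1, 2, 3, 4, 5};           /* (z1 + z2 + z3 + z4 + z5) */
          int next_aux = 6;                         /* first unused variable number */
          split_clause(clause, 5, &next_aux);
          /* prints (1 2 6) (-6 3 7) (-7 4 5), matching the example above with y1 = 6, y2 = 7 */
          return 0;
      }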

    Class NP


    • The class of decision problems for which there is a polynomially bounded nondeterministic algorithm.


    • The class of problems that are "slightly harder" than P.


    • In general, an exponential number of subproblems, each of which can be solved or verified in polynomial time.


    • Being in class NP constitutes an upper bound on problem complexity.

    Nondeterministic Turing Machine (NDTM)



  • Model a problem as a tree in which the nodes represent states, with time flowing from the root to the children. A leaf represents a solution.


  • In an NDTM, the computation splits whenever a guess is needed.
    • The tree has polynomial depth but is exponentially bushy.


  • "Guess" nodes are like OR nodes: if either of the children returns T, the "guess" node returns T.


  • NP-hard



  • A problem Q is NP-hard if every problem in NP is reducible to Q.


  • A lower bound on a problem: "at least as hard as any problem in NP."


  • NP-Complete



  • The hardest decision problems in NP.


  • If there were a polynomially bounded algorithm for an NP-complete problem, then there would be a polynomially bounded algorithm for each problem in NP.


  • Establishes both a lower and an upper bound (although it may be that P=NP).


  • If any NP-complete problem is in P, then P=NP.


  • Satisfiability (SAT)



  • Instance: Given a set U = {u1, ..., um} of variables and a collection C = {c1, ..., cn} of clauses over U.


  • Question: Is there a truth assignment to U that satisfies C?


  • A logical formula in conjunctive normal form (CNF): a conjunction of disjunctions
    e.g., (x1 + x2 + ¬x3) · (x1 + x3) · (¬x2)

    • easy to falsify: make all terms in one clause fail.


    • hard to satisfy: terms in clauses interact.


  • Cook's Theorem: SAT is NP-complete.

    • uses SAT to simulate a machine


  • Once the primordial NP-complete problem is found, we can prove other problems NP-complete without reference to machines:

    • To prove that problem Q is NP-complete, reduce SAT (or any other NP-complete problem) to Q:
            SAT <=P Q


    • E.g., 3SAT: like SAT, but | ci | = 3 for 1 <= i <= n.


    • Need to show that 3SAT is NP-hard:
      Since X <=P SAT for any problem X in NP, we need only show SAT <=P 3SAT.


    • Proof:
      General form of clauses in SAT is cj = (z1 + z2 + ... + zk), where each zi is either an input or the negation of an input.
      k = 1
      {z1 + z1 + z1}
      k = 2
      {z1 + z2 + z2}
      k = 3
      {z1 + z2 + z3}
      k >= 4
      Add auxiliary variables that transmit information between pieces.
      E.g., {z1 + z2 + z3 + z4 + z5} becomes {(z1 + z2 + y1) (¬y1 + z3 + y2) (¬y2 + z4 + z5)}
      The transformation is polynomial in the length of the SAT instance.

    • Note: 2SAT is solvable in polynomial time.


  • Other NP-Complete Problems


    • Graph coloring
      A coloring of a graph is the assignment of a "color" to each vertex of the graph such that adjacent vertices are not assigned the same color. The chromatic number of a graph G, X(G) [Chi(G)], is the smallest number of colors needed to color G.
      Optimization problem: Given an undirected graph G, determine X(G) (and produce an optimal coloring).
      Decision problem: Given G and a positive integer k, is there a coloring of G using at most k colors (i.e., is G k-colorable)?


    • Job scheduling with penalties
      Suppose we are given n jobs to be executed one at a time and, for each job, an execution time, deadline, and penalty for missing the deadline.
      Optimization problem: Determine the minimum possible penalty (and find an optimal schedule).
      Decision problem: Given a nonnegative integer k, is there a schedule with penalty < k?
      Note: Job scheduling without penalties such that at most k jobs miss their deadlines is in P.


    • Multiprocessor scheduling
      Suppose we are given a deadline and a set of tasks of varying length to be performed on two identical processors.
      Optimization problem: Determine an allocation of the tasks to the two processors such that the deadline can be met.
      Decision problem: Can the tasks be arranged so that the deadline is met?


    • Bin packing
      Optimization problem: Given an unlimited number of bins, each of size 1, and n objects with sizes s1, ..., sn where 0 < si < 1, determine the smallest number of bins into which the objects can be packed (and find an optimal packing).
      Decision problem: Given, in addition to the inputs described above, an integer k, do the objects fit in k bins?


    • Knapsack problem
      Optimization problem: Given a knapsack of capacity C and n objects with sizes s1, ..., sn and "profits" p1, ..., pn, find the largest total profit of any subset of the objects that fits in the knapsack ( and find a subset that achieves the maximum profit).
      Decision problem: Given k, is there a subset of the objects that fits in the knapsack and has total profit at least k?


    • Subset sum problem
      Optimization problem: Given a positive integer C and n objects with positive sizes s1, ..., sn, among subsets of the objects with sum at most C, what is the largest subset sum?
      Decision problem: Is there a subset of the objects whose sizes add up to exactly C?
      The subset sum problem is a simpler version of the knapsack problem, in which the profit for each object is the same as its size.


    • Partition
      Suppose we are given a set of integers.
      Decision problem: Can the integers be partitioned into two sets whose sum is equal?


    • Hamiltonian cycles and Hamiltonian paths
      A Hamiltonian cycle (path) is a simple cycle (path) that passes through every vertex of a graph exactly once.
      Decision problem: Does a given undirected graph have a Hamiltonian cycle (path)?
      Note: An Euler tour -- a cycle that traverses each edge of a graph exactly once -- can be found in O(e) time.


    • Traveling salesperson problem (TSP) or minimum tour problem
      Optimization problem: Given a complete, weighted graph, find a minimum-weight Hamiltonian cycle.
      Decision problem: Given a complete, weighted graph and an integer k, is there a Hamiltonian cycle with total weight at most k?
      A complete graph has an edge between each pair of vertices.


    • Vertex cover
      A vertex cover for an undirected graph G is a subset V' of vertices such that each edge in G is incident upon some vertex in V'.
      Optimization problem: Find a vertex cover for G with as few vertices as possible.
      Decision problem: Given an integer k, does G have a vertex cover consisting of k vertices?
      Note: The edge cover problem is in P.


    • Clique
      A clique is a subset V' of vertices in an undirected graph G such that every pair of distinct vertices in V' is joined by an edge in G (i.e., the subgraph induced by V' is complete). A clique with k vertices is called a k-clique.
      Optimization problem: Find a clique with as many vertices as possible.
      Decision problem: Given an integer k, does G have a k-clique?


    • Independent set
      An independent set is a subset V' of vertices in an undirected graph G such that no pair of vertices in V' is joined by an edge of G.
      Optimization problem: Find an independent set with as many vertices as possible.
      Decision problem: Given an integer k, does G have an independent set consisting of k vertices?


    • Feedback edge set
      A feedback edge set in a digraph G is a subset E' of edges such that every cycle in G has an edge in E'.
      Optimization problem: Find a feedback edge set with as few edges as possible.
      Decision problem: Given an integer k, does G have a feedback edge set consisting of k edges?
      Note: The feedback edge set for undirected graphs is in P.


    • Longest simple path problem
      Optimization problem: What is the longest simple path between any two vertices in graph G?
      Decision problem: Given an integer k, is there a path in graph G longer than k?
      Note: The shortest path between any two vertices can be found in O(ne) time.


    • Subgraph isomorphism
      Decision problem: Given two graphs G1 and G2, determine if G1 is isomorphic to a subgraph of G2.


    • Integer linear programming
      Linear programming is a method of analyzing systems of linear inequalities to arrive at optimal solutions to problems
      Decision problem: Given a linear program, is there an integral solution?

      Class NP

    • The class of decision problems for which there is a polynomially bounded nondeterministic algorithm.
    • The class of problems that are "slightly harder" than P.
    • In general, an exponential number of subproblems, each of which can be solved or verified in polynomial time.
    • Being in class NP constitutes an upper bound on problem complexity.

    Nondeterministic Turing Machine (NDTM)



  • Model a problem as a tree in which the nodes represent states, with time flowing from the root to the children. A leaf represents a solution.


  • In an NDTM, the computation splits whenever a guess is needed.
    • The tree has polynomial depth but is exponentially bushy.


  • "Guess" nodes are like OR nodes: if either of the children returns T, the "guess" node returns T.


  • NP-hard



  • A problem Q is NP-hard if every problem in NP is reducible to Q.


  • A lower bound on a problem: "at least as hard as any problem in NP."


  • NP-Complete



  • The hardest decision problems in NP.


  • If there were a polynomially bounded algorithm for an NP-complete problem, then there would be a polynomially bounded algorithm for each problem in NP.


  • Establishes both a lower and an upper bound (although it may be that P=NP).


  • If any NP-complete problem is in P, then P=NP.


  • Satisfiability (SAT)



  • Instance: Given a set U = {u1, ..., um} of variables and a collection C = {c1, ..., cn} of clauses over U.


  • Question: Is there a truth assignment to U that satisfies C?


  • A logical formula in conjunctive normal form (CNF): a conjunction of disjunctions
    e.g., (x1 + x2 + ¬x3) · (x1 + x3) · (¬x2)

    • easy to falsify: make all terms in one clause fail.


    • hard to satisfy: terms in clauses interact.


  • Cook's Theorem: SAT is NP-complete.

    • uses SAT to simulate a machine


  • Once the primordial NP-complete problem is found, we can prove other problems NP-complete without reference to machines:

    • To prove that problem Q is NP-complete, reduce SAT (or any other NP-complete problem) to Q:
            SAT <=P Q


    • E.g., 3SAT: like SAT, but | ci | = 3 for 1 <= i <= n.


    • Need to show that 3SAT is NP-hard:
      Since X <=P SAT for any problem X in NP, we need only show SAT <=P 3SAT.


    • Proof:
      General form of clauses in SAT is cj = (z1 + z2 + ... + zk), where each zi is either an input or the negation of an input.
      k = 1
      {z1 + z1 + z1}
      k = 2
      {z1 + z2 + z2}
      k = 3
      {z1 + z2 + z3}
      k >= 4
      Add auxiliary variables that transmit information between pieces.
      E.g., {z1 + z2 + z3 + z4 + z5} becomes {(z1 + z2 + y1) (¬y1 + z3 + y2) (¬y2 + z4 + z5)}
      The transformation is polynomial in the length of the SAT instance.

    • Note: 2SAT is solvable in polynomial time.


  • Other NP-Complete Problems


    • Graph coloring
      A coloring of a graph is the assignment of a "color" to each vertex of the graph such that adjacent vertices are not assigned the same color. The chromatic number of a graph G, X(G) [Chi(G)], is the smallest number of colors needed to color G.
      Optimization problem: Given an undirected graph G, determine X(G) (and produce an optimal coloring).
      Decision problem: Given G and a positive integer k, is there a coloring of G using at most k colors (i.e., is G k-colorable)?


    • Job scheduling with penalties
      Suppose we are given n jobs to be executed one at a time and, for each job, an execution time, deadline, and penalty for missing the deadline.
      Optimization problem: Determine the minimum possible penalty (and find an optimal schedule).
      Decision problem: Given a nonnegative integer k, is there a schedule with penalty < k?
      Note: Job scheduling without penalties such that at most k jobs miss their deadlines is in P.


    • Multiprocessor scheduling
      Suppose we are given a deadline and a set of tasks of varying length to be performed on two identical processors.
      Optimization problem: Find an allocation of the tasks to the two processors that minimizes the overall completion time.
      Decision problem: Can the tasks be allocated to the two processors so that all of them finish by the deadline?


    • Bin packing
      Optimization problem: Given an unlimited number of bins, each of size 1, and n objects with sizes s1, ..., sn where 0 < si < 1, determine the smallest number of bins into which the objects can be packed (and find an optimal packing).
      Decision problem: Given, in addition to the inputs described above, an integer k, do the objects fit in k bins?
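
      NP-hardness of the exact problem does not rule out fast heuristics. The sketch below (mine, not from the notes) is the classic first-fit heuristic: place each object in the first open bin with enough room, opening a new bin only when necessary. The list of object sizes is a made-up example.

      /* Sketch (not from the notes): the first-fit heuristic for bin packing.
       * Fast and simple, but only an approximation -- finding the true minimum
       * number of bins is NP-hard. */
      #include <stdio.h>

      int main(void)
      {
          double size[] = { 0.5, 0.7, 0.5, 0.2, 0.4, 0.2, 0.5, 0.1, 0.6 };
          int n = 9;

          double space[9];              /* remaining space in each open bin */
          int bins = 0;

          for (int i = 0; i < n; i++) {
              int b = 0;
              while (b < bins && space[b] < size[i])  /* first bin that still fits */
                  b++;
              if (b == bins)                          /* none fits: open a new bin */
                  space[bins++] = 1.0;
              space[b] -= size[i];
          }

          /* Real code would guard these floating-point comparisons against rounding. */
          printf("first fit used %d bins\n", bins);
          return 0;
      }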


    • Knapsack problem
      Optimization problem: Given a knapsack of capacity C and n objects with sizes s1, ..., sn and "profits" p1, ..., pn, find the largest total profit of any subset of the objects that fits in the knapsack (and find a subset that achieves the maximum profit).
      Decision problem: Given k, is there a subset of the objects that fits in the knapsack and has total profit at least k?


    • Subset sum problem
      Optimization problem: Given a positive integer C and n objects with positive sizes s1, ..., sn, among subsets of the objects with sum at most C, what is the largest subset sum?
      Decision problem: Is there a subset of the objects whose sizes add up to exactly C?
      The subset sum problem is a simpler version of the knapsack problem, in which each object's profit equals its size. (A pseudo-polynomial dynamic program is sketched below.)
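
      Although the decision problem is NP-complete, a classic dynamic program answers it in O(nC) time; the sketch below is mine, not from the notes. This is only pseudo-polynomial -- C can be exponential in the number of bits needed to write it down -- so it does not place subset sum in P. The same table-filling idea underlies standard knapsack and partition algorithms.

      /* Sketch (not from the notes): pseudo-polynomial dynamic program for the
       * subset sum decision problem.  Runs in O(nC) time and O(C) space. */
      #include <stdbool.h>
      #include <stdio.h>

      static bool subset_sum(const int *s, int n, int C)
      {
          bool reach[C + 1];            /* C99 VLA; reach[t]: some subset sums to t */
          for (int t = 0; t <= C; t++) reach[t] = false;
          reach[0] = true;

          for (int i = 0; i < n; i++)
              for (int t = C; t >= s[i]; t--)   /* descend so each object is used once */
                  if (reach[t - s[i]]) reach[t] = true;

          return reach[C];
      }

      int main(void)
      {
          int sizes[] = { 3, 34, 4, 12, 5, 2 };   /* made-up example instance */
          printf("%s\n", subset_sum(sizes, 6, 9) ? "yes" : "no");  /* 4 + 5 = 9 */
          return 0;
      }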


    • Partition
      Suppose we are given a set of integers.
      Decision problem: Can the integers be partitioned into two sets whose sums are equal?


    • Hamiltonian cycles and Hamiltonian paths
      A Hamiltonian cycle (path) is a simple cycle (path) that passes through every vertex of a graph exactly once.
      Decision problem: Does a given undirected graph have a Hamiltonian cycle (path)?
      Note: An Euler tour -- a cycle that traverses each edge of a graph exactly once -- can be found in O(e) time.


    • Traveling salesperson problem (TSP) or minimum tour problem
      Optimization problem: Given a complete, weighted graph, find a minimum-weight Hamiltonian cycle.
      Decision problem: Given a complete, weighted graph and an integer k, is there a Hamiltonian cycle with total weight at most k?
      A complete graph has an edge between each pair of vertices.
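
      A brute-force sketch (mine, not from the notes) for the optimization version: fix city 0 as the start and try all (n-1)! orderings of the remaining cities, keeping the cheapest tour. Exact but exponential, which is all one expects for an NP-complete problem; the 4-city weight matrix is a made-up example.

      /* Sketch (not from the notes): brute-force TSP on a complete weighted graph. */
      #include <stdio.h>

      #define N 4   /* number of cities in the made-up example */

      static int w[N][N] = {       /* symmetric weight matrix (an assumption) */
          {  0, 10, 15, 20 },
          { 10,  0, 35, 25 },
          { 15, 35,  0, 30 },
          { 20, 25, 30,  0 }
      };

      static int best;

      static void search(int *tour, int depth, int cost)
      {
          if (depth == N) {                        /* close the cycle back to city 0 */
              int total = cost + w[tour[N - 1]][0];
              if (total < best) best = total;
              return;
          }
          for (int i = depth; i < N; i++) {        /* choose the next city */
              int t = tour[depth]; tour[depth] = tour[i]; tour[i] = t;
              search(tour, depth + 1, cost + w[tour[depth - 1]][tour[depth]]);
              t = tour[depth]; tour[depth] = tour[i]; tour[i] = t;   /* undo the swap */
          }
      }

      int main(void)
      {
          int tour[N] = { 0, 1, 2, 3 };
          best = 1 << 30;
          search(tour, 1, 0);                      /* city 0 is fixed as the start */
          printf("minimum tour weight: %d\n", best);
          return 0;
      }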


    • Vertex cover
      A vertex cover for an undirected graph G is a subset V' of vertices such that each edge in G is incident upon some vertex in V'.
      Optimization problem: Find a vertex cover for G with as few vertices as possible.
      Decision problem: Given an integer k, does G have a vertex cover consisting of k vertices?
      Note: The edge cover problem is in P.
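
      NP-hardness does not rule out provably good approximations. The sketch below (mine, not from the notes) is the standard 2-approximation for vertex cover: repeatedly pick an edge with neither endpoint chosen and take both endpoints. The chosen edges form a matching, and any cover must contain at least one endpoint of each matched edge, so the result is at most twice optimal. The edge list is a made-up example.

      /* Sketch (not from the notes): maximal-matching 2-approximation for vertex cover. */
      #include <stdbool.h>
      #include <stdio.h>

      #define N 5   /* vertices in the made-up example graph */

      int main(void)
      {
          int edges[][2] = { {0,1}, {0,2}, {1,3}, {2,3}, {3,4} };   /* example graph */
          int m = 5;

          bool in_cover[N] = { false };
          for (int e = 0; e < m; e++) {
              int u = edges[e][0], v = edges[e][1];
              if (!in_cover[u] && !in_cover[v]) {   /* edge not yet covered */
                  in_cover[u] = true;               /* take both endpoints  */
                  in_cover[v] = true;
              }
          }

          printf("cover:");
          for (int v = 0; v < N; v++)
              if (in_cover[v]) printf(" %d", v);
          printf("\n");
          return 0;
      }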


    • Clique
      A clique is a subset V' of vertices in an undirected graph G such that every pair of distinct vertices in V' is joined by an edge in G (i.e., the subgraph induced by V' is complete). A clique with k vertices is called a k-clique.
      Optimization problem: Find a clique with as many vertices as possible.
      Decision problem: Given an integer k, does G have a k-clique?


    • Independent set
      An independent set is a subset V' of vertices in an undirected graph G such that no pair of vertices in V' is joined by an edge of G.
      Optimization problem: Find an independent set with as many vertices as possible.
      Decision problem: Given an integer k, does G have an independent set consisting of k vertices?


    • Feedback edge set
      A feedback edge set in a digraph G is a subset E' of edges such that every cycle in G has an edge in E'.
      Optimization problem: Find a feedback edge set with as few edges as possible.
      Decision problem: Given an integer k, does G have a feedback edge set consisting of k edges?
      Note: The feedback edge set problem for undirected graphs is in P (take all edges outside a spanning forest).


    • Longest simple path problem
      Optimization problem: What is the longest simple path between any two vertices in graph G?
      Decision problem: Given an integer k, is there a simple path in graph G of length greater than k?
      Note: The shortest path between any two vertices can be found in O(ne) time.


    • Subgraph isomorphism
      Decision problem: Given two graphs G1 and G2, is G1 isomorphic to a subgraph of G2?


    • Integer linear programming
      Linear programming optimizes a linear objective function subject to a system of linear inequality constraints; with real-valued variables it is solvable in polynomial time.
      Decision problem: Given a linear program, does it have a feasible solution in which every variable takes an integer value?
