PART 1: Pipelining...
Consider a machine with the following machine cycle:

- Fetch: instruction is fetched
- Decode: instruction is decoded
- Execute: instruction is executed
- Write-back: result is written back to memory

Therefore, a normal (non-pipelined) execution will look like the following:

F D E WB F D E WB F D E WB F D E WB F D E WB F D E WB ...

Since both fetch and write-back access memory, they cannot be performed simultaneously. Show how a pipelined machine cycle will look like using 9 instructions. How much speedup is gained over the non-pipelined execution?

Here’s one possibility

F D E WB
  F D E WB
     F D E WB
        F D E WB
           F D E WB
              F D E WB
                 F D E WB
                    F D E WB

The speedup is $36/18=2$. 
Here’s another:

\[
\begin{array}{cccc}
F & D & E & WB \\
F & D & E & WB \\
F & D & E & WB \\
F & D & E & WB \\
F & D & E & WB \\
F & D & E & WB \\
F & D & E & WB \\
F & D & E & WB \\
\end{array}
\]

The speedup is $36/20 = 1.8$.

In both cases, when the number of instruction is large, the speedup approaches $2$.

**PART 2: Cache memory**

The CSCI 120 teacher collected all the tests and went to his office in HN 1090F. Few seconds after being in the office, he received a call on his cell phone from his wife to come home, so he left the building carrying the tests with him. He got into a cab on 68th street and Lexington avenue and went home. At home, and while talking to his wife, he suddenly realized that the tests are no longer with him! He figured that he must have forgotten them in the cab, and this must have happened because he placed them on the seat when he was paying the driver. Unfortunately, while he always makes sure to remember the four digit cab number, this time he didn’t. Therefore, all the tests are gone. The next day, when he got into his office, he saw the tests on his desk.

(a) Explain how sometimes we can be tricked by our memory and why inconsistencies such as the one described above could occur.

**ANSWER:** I will not answer this, but I am interested in seeing how you answered it.

(b) Describe the above scenario from a machine’s perspective using memory and cache memory and how this inconsistency could occur because of a failure to invalidate cache. Use your imagination.

**ANSWER:** Here’s an example: There are two processors, one is performing the task of entering to the office and placing the tests of the desk, and the other is performing the task of answering the phone and acting based on the conversation. The first processor loads main memory into cache, which basically says that tests are in teacher’s hands. The second processor does the same. The first processor then places the tests on the desk, updates its cache, and then copies cache back to main memory, and stops, failing to invalidate the cache of the second processor. The second processor, having now invalid cache, thinks that the tests are still in teacher’s hands (that’s its local view of the memory). It answers the phone, leaves the office, grabs a cab, places the tests on the seat, pays the driver, and due to a bug, forgets the tests on the seat. When entering home, it attempts to get access to the tests, but realizes that they have been forgotten in the cab. The rest is history as they say...
PART 3: Computing the absolute value
Assume the machine uses 2’s complement representation of numbers. Write a
program using the instruction set presented in class to perform the following:

if memory location 103 contains a positive number, keep it the same;
otherwise if negative, change it to a positive.

Example: if memory location 103 contains 00100101 (that’s 37) then nothing is
to be done, but if it contains 10010010 (that’s -110), then memory location 103
should be changed to 01101110 (that’s 110).

The idea is the following: Compare 0 to the number (using the CMP instruc-
tion). If 0 is greater than the number (i.e. number is negative), jump to a
location in memory where you negate the number. The jump instruction must
rely on the result of the compare; therefore, it must use the register that holds
the result of the comparison.

ANSWER:

200: LOAD R0 103MEM
202: LOAD R1 0VAL
204: CMP R1 R0 R2 (R2 is 1 if R0>0)
206: JUMP R2 210 (if R2 is 1 jump)
208: HALT
210: NOT R0 R1 (flip bits of R0)
212: ADD R1 R2 R0 (add 1, because R2 now contains 1)
214: STORE R0 103 (now R0 contains the negation)
216: HALT