Tuesday, April 12, 2016

Computer Architecture

Structs and unions in C
- A union is a user-defined data type whose members all share the same memory location, so it holds only one member's value at a time. A struct, by contrast, gives each member its own storage.
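
A minimal sketch of the difference, assuming a 32-bit IEEE-754 float. The type names number_s and number_u and the helper float_bits are hypothetical, made up for this note:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types, for illustration only. */
struct number_s { uint32_t bits; float value; };  /* members stored one after another */
union  number_u { uint32_t bits; float value; };  /* members overlap at offset 0 */

/* Write the float member, then read the same bytes back through the integer member. */
static uint32_t float_bits(float x) {
    union number_u u;
    u.value = x;
    return u.bits;
}
```

On a typical platform sizeof(struct number_s) is 8 while sizeof(union number_u) is 4, because the union only needs room for its largest member.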

MESI vs MOESI
- The O (Owned) state reduces memory accesses: the owning cache holds the most recent (dirty) copy of a line and supplies it directly to the other caches, so the line does not have to be written back to memory before it is shared.


How to swap two bits in an integer
- Use XOR: if the two bits differ, flip both of them; if they are equal, the swap changes nothing.
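
One way to write the XOR trick, as a sketch for a 32-bit unsigned integer. The function name swap_bits is just for illustration:

```c
#include <stdint.h>

/* Swap bit i and bit j of x (bit 0 is the least significant bit).
 * Assumes i and j are both less than 32. */
static uint32_t swap_bits(uint32_t x, unsigned i, unsigned j) {
    uint32_t diff = ((x >> i) ^ (x >> j)) & 1u;  /* 1 only if the two bits differ */
    return x ^ ((diff << i) | (diff << j));      /* flip both bits when they differ */
}
```

For example, swap_bits(0xA, 1, 2) turns binary 1010 into 1100.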


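How to reverse the bits of an X-bit number
- Use a lookup table indexed by a small chunk of the number (a nibble or a byte, for example) that returns that chunk with its bits reversed.
- Repeatedly swap halves using mask, shift, and OR operations: swap adjacent bits, then pairs, then nibbles, and so on.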
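
Both approaches can be sketched as follows, assuming X = 32. The function names and the 16-entry nibble table are choices made for this example; a 256-entry byte table works the same way with fewer lookups:

```c
#include <stdint.h>

/* Method 1: mask, shift, and OR, swapping progressively larger groups of bits. */
static uint32_t reverse32_swap(uint32_t x) {
    x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);  /* adjacent bits */
    x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);  /* bit pairs */
    x = ((x >> 4) & 0x0F0F0F0Fu) | ((x & 0x0F0F0F0Fu) << 4);  /* nibbles */
    x = ((x >> 8) & 0x00FF00FFu) | ((x & 0x00FF00FFu) << 8);  /* bytes */
    return (x >> 16) | (x << 16);                             /* 16-bit halves */
}

/* Method 2: table of bit-reversed nibbles, one lookup per nibble. */
static const uint8_t rev4[16] = {
    0x0, 0x8, 0x4, 0xC, 0x2, 0xA, 0x6, 0xE,
    0x1, 0x9, 0x5, 0xD, 0x3, 0xB, 0x7, 0xF
};

static uint32_t reverse32_table(uint32_t x) {
    uint32_t r = 0;
    for (int i = 0; i < 8; i++) {
        r = (r << 4) | rev4[x & 0xFu];  /* lowest nibble ends up in the highest position */
        x >>= 4;
    }
    return r;
}
```

The table version trades a little memory for fewer operations per chunk, while the mask-and-shift version needs no memory accesses at all.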


Why is a branch delay slot necessary?
- A delay slot is an instruction slot that gets executed without the effects of the preceding instruction. For a branch, this means the instruction right after it executes whether or not the branch is taken.
- The point of the delay slot is to do useful work with an instruction that has already made it partway through the pipeline and sits in a slot that would otherwise have to be thrown away.

What happens if you forward data to the execute stage rather than the decode stage?
- The pipeline does not have to stall: the next instruction receives the data just before it executes, instead of waiting for the value to be available at decode.

4-way set associative vs 16-way set associative: which consumes more power, and which is slower?
- The fewer ways you need to search in each set, the less overall hardware is needed.
- An N-way set-associative cache allows N cache lines per set, which reduces conflict misses.
- So, for the same total cache size, a 16-way set-associative cache uses more hardware for tag comparison, and thus more power, than a 4-way set-associative cache. The extra comparators and the wider way-select mux also tend to make it slower.
- But a 16-way set-associative cache has fewer collisions, as there are more slots in each set to pick from.
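
To make the trade-off concrete, here is a small sketch that computes the number of sets and the tag comparisons per lookup for a given geometry. The struct cache_geom and the helper names are hypothetical, and the sizes in the example below (32 KiB, 64-byte lines) are assumed values, not taken from any particular CPU:

```c
#include <stdint.h>

/* Hypothetical cache geometry, used only for illustration. */
struct cache_geom {
    uint32_t size_bytes;  /* total data capacity */
    uint32_t line_bytes;  /* bytes per cache line */
    uint32_t ways;        /* associativity */
};

/* Number of sets = capacity / (line size * ways). */
static uint32_t num_sets(struct cache_geom g) {
    return g.size_bytes / (g.line_bytes * g.ways);
}

/* Set index for an address: drop the line-offset bits, then wrap by the set count. */
static uint32_t set_index(struct cache_geom g, uint32_t addr) {
    return (addr / g.line_bytes) % num_sets(g);
}

/* Each lookup compares the address tag against every way of the selected set. */
static uint32_t tag_compares_per_lookup(struct cache_geom g) {
    return g.ways;
}
```

With a 32 KiB cache and 64-byte lines, the 4-way organization has 128 sets and 4 tag comparisons per lookup, while the 16-way organization has 32 sets and 16 comparisons per lookup.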

Architectural register file vs physical register file
- The architected registers and the rename registers can be pooled together to form a single physical register file.


Out-of-order (OoO) execution vs in-order execution, and the differences between them
Circuit topology when one of the inputs is faster than the other: pass gate, CMOS, domino, etc.

What is the problem with delay slots?
- It is hard to find independent instructions to fill them.
- They expose the pipeline to software: if the pipeline changes in a future design, the software has to change too.

What is predicated execution?
- Predicated execution avoids branches and simplifies compiler optimizations by converting a control dependence into a data dependence.
- For the branches it converts, it replaces branch prediction: the CPU executes the instructions from both paths, and only those whose predicate is true commit their results.
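
C has no way to write predicated instructions directly, so the sketch below is only an analogy: a branchless select shows how a control dependence (which path runs) becomes a data dependence (which value a mask keeps). The function names are made up for this note, and a compiler is free to turn either version into a branch or a conditional move:

```c
#include <stdint.h>

/* Branching version: which return statement runs depends on the outcome of (a > b). */
static uint32_t max_branch(uint32_t a, uint32_t b) {
    if (a > b) {
        return a;
    }
    return b;
}

/* Predicated-style version: the predicate is computed as data and used to select. */
static uint32_t max_select(uint32_t a, uint32_t b) {
    uint32_t p = 0u - (uint32_t)(a > b);  /* all ones if a > b, else zero */
    return (a & p) | (b & ~p);            /* both values are read, one is kept */
}
```

Both functions return the same result for every input; the second one simply has no control-flow decision to predict.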

Memory
http://www.barrgroup.com/Embedded-Systems/How-To/Memory-Types-RAM-ROM-Flash

