
Intel Core Microarchitecture 2/4
Bluetooth 23 Apr 2006

Five major points of the Intel Core microarchitecture

The new microarchitecture has five major improvements over prior microprocessor designs. The first is Wide Dynamic Execution, a combination of techniques: data flow analysis, speculative execution, out-of-order execution and superscalar execution.

With the Intel Core microarchitecture, Intel enhances this capability by delivering more instructions per clock cycle, which improves both execution time and energy efficiency. Each core can now handle up to 4 full instructions per clock cycle, in contrast to 3 in the NetBurst architecture.

Not only can it execute 4 instructions at once, it also has a feature known as Macro-Fusion. Macro-Fusion, as its name implies, fuses or joins a common instruction pair, for example a CMP followed by a JNE (compare and conditional jump), into a single instruction. This reduces the number of clock cycles needed: e.g. a sequence of 5 instructions is reduced to 4, which can then be handled in 1 clock cycle.
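The 5-to-4 example above can be sketched as a toy model in Python. This is only an illustration under stated assumptions, not a description of Intel's decoder: the instruction names, the set of fusible pairs and the helper functions fuse and decode_cycles are all hypothetical, and real hardware places further restrictions on which pairs can fuse.

```python
import math

# Assumption: an illustrative subset of fusible pairs, namely a compare
# instruction immediately followed by a conditional jump.
COMPARES = {"CMP", "TEST"}
COND_JUMPS = {"JE", "JNE", "JL", "JGE"}

def fuse(instrs):
    """Merge each compare + conditional-jump pair into one fused entry."""
    out = []
    i = 0
    while i < len(instrs):
        is_pair = (
            i + 1 < len(instrs)
            and instrs[i] in COMPARES
            and instrs[i + 1] in COND_JUMPS
        )
        if is_pair:
            out.append(instrs[i] + "+" + instrs[i + 1])
            i += 2  # both instructions consumed by the fused entry
        else:
            out.append(instrs[i])
            i += 1
    return out

def decode_cycles(instrs, width=4):
    """Cycles needed if at most `width` entries are handled per clock."""
    return math.ceil(len(instrs) / width)

stream = ["MOV", "ADD", "MOV", "CMP", "JNE"]  # 5 instructions
fused = fuse(stream)                           # 4 entries after fusion

print(fused)                  # ['MOV', 'ADD', 'MOV', 'CMP+JNE']
print(decode_cycles(stream))  # 2
print(decode_cycles(fused))   # 1
```

In this model the unfused stream needs two cycles at a width of 4, while the fused stream fits into one.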

The second improvement is Intel's Intelligent Power Capability. It reduces power consumption through advanced power gating, which allows ultra fine-grained logic control that turns on individual processor logic subsystems only if and when they are needed. Intel has ensured that this technique does not sacrifice the system's responsiveness when subsystems have to be powered up from a lower power state.

Advanced Smart Cache is the third feature of the new architecture. It provides an L2 cache that is shared among the cores and allocated dynamically. In the current design, each core uses its own independent L2 cache, which cannot be shared. Information in one core's L2 cache that is required by the other core cannot be read directly into that core's L2 cache; it has to be swapped out to memory first. With the Advanced Smart Cache, this problem is resolved. Furthermore, the shared L2 cache can be dynamically allocated between the cores, so if Core 0 is idle, Core 1 can use all of the L2 cache.
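The benefit of dynamic allocation can be illustrated with a small simulation. This is a rough sketch, assuming a plain LRU replacement policy and made-up cache sizes (4 lines per private cache, 8 lines shared); the LRUCache class and the run helper are hypothetical and say nothing about how Intel's cache is actually organised.

```python
from collections import OrderedDict

class LRUCache:
    """Toy cache that tracks which line addresses are resident (LRU eviction)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        if addr in self.lines:
            self.lines.move_to_end(addr)  # mark as most recently used
            self.hits += 1
        else:
            self.misses += 1
            self.lines[addr] = True
            if len(self.lines) > self.capacity:
                self.lines.popitem(last=False)  # evict least recently used

def run(cache, working_set, passes):
    """Loop over the working set several times and return the hit count."""
    for _ in range(passes):
        for addr in working_set:
            cache.access(addr)
    return cache.hits

# Core 1 repeatedly touches 6 cache lines while Core 0 is idle.
working_set = list(range(6))

# Split design: Core 1 is limited to its own 4-line private L2.
private_hits = run(LRUCache(4), working_set, passes=3)

# Shared design: Core 1 may use all 8 lines because Core 0 needs none.
shared_hits = run(LRUCache(8), working_set, passes=3)

print(private_hits)  # 0
print(shared_hits)   # 12
```

With the private cache the 6-line working set never fits, so every access misses; with the full shared capacity only the first pass misses.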

Smart Memory Access improves system performance by optimising the use of the available data bandwidth from the memory subsystem and hiding its latency. It includes a new capability known as memory disambiguation, which increases the efficiency of out-of-order processing by speculatively loading data for instructions that are about to be executed before all previous store instructions have been executed. Usually a load cannot be executed ahead of a preceding store, because the store might write to the same location and the load would then read stale data. Intel's memory disambiguation uses special intelligent algorithms to evaluate whether or not a load can be executed ahead of a preceding store. If it can, the load is executed early rather than waiting, which increases instruction-level parallelism and speeds up execution. In the event that the load turns out to be invalid, the built-in algorithm detects the conflict, reloads the correct data and re-executes the instruction. To ensure data is where each execution core needs it, the design also uses two prefetchers per L1 cache and two prefetchers per L2 cache.
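The speculate, check and re-execute sequence can be sketched in a few lines of Python. This is a deliberately simplified model, assuming a single older store and a single younger load; the function speculative_load and the dictionary standing in for memory are hypothetical, and the prediction step that decides whether to speculate at all is left out.

```python
def speculative_load(memory, store_addr, store_value, load_addr):
    """Run a load ahead of an older store, then verify the speculation.

    Returns (value seen by the load, whether the load had to be re-executed).
    """
    # 1. The load executes early, before the older store's address is known.
    speculated = memory.get(load_addr, 0)

    # 2. The store's address resolves and the store writes its value.
    memory[store_addr] = store_value

    # 3. Conflict check: the speculation was wrong if the store wrote
    #    the very location the load has already read.
    if store_addr == load_addr:
        return memory[load_addr], True   # reload the correct data

    return speculated, False             # speculation was safe

initial = {0x10: 1, 0x20: 2}

# Different addresses: the early load was safe.
print(speculative_load(dict(initial), 0x10, 99, 0x20))  # (2, False)

# Same address: the conflict is detected and the load is redone.
print(speculative_load(dict(initial), 0x20, 99, 0x20))  # (99, True)
```

In the common case the load and store do not overlap, so the load finishes early; only on a conflict is the extra work of re-executing paid.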

Advanced Digital Media Boost, the fifth improvement, speeds up execution of SSE instructions with 128-bit execution per clock cycle. Previous generations of processors had to process the lower and upper 64 bits separately, taking 2 clock cycles.
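The difference between a 64-bit and a 128-bit datapath can be shown with a toy packed add. This is only a sketch: the function packed_add is hypothetical, it assumes four 32-bit lanes per 128-bit vector, and it counts one pass through the datapath as one clock cycle.

```python
def packed_add(a, b, datapath_bits, lane_bits=32, vector_bits=128):
    """Add two packed vectors lane by lane.

    Returns (result lanes, number of passes through the datapath).
    """
    total_lanes = vector_bits // lane_bits        # 4 lanes of 32 bits
    lanes_per_pass = datapath_bits // lane_bits   # lanes handled per pass
    result = []
    passes = 0
    for start in range(0, total_lanes, lanes_per_pass):
        end = start + lanes_per_pass
        result.extend(x + y for x, y in zip(a[start:end], b[start:end]))
        passes += 1
    return result, passes

a = [1, 2, 3, 4]
b = [10, 20, 30, 40]

# 64-bit datapath: lower and upper halves are processed one after the other.
print(packed_add(a, b, datapath_bits=64))   # ([11, 22, 33, 44], 2)

# 128-bit datapath: all four lanes are processed in a single pass.
print(packed_add(a, b, datapath_bits=128))  # ([11, 22, 33, 44], 1)
```

Both calls produce the same result; only the number of passes changes, which is the effect the article describes.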


(C) Copyright 1998-2009 OCWorkbench.com