Pipeline Hazards

INTRODUCTION

The first question that comes to mind is: what is pipelining?

Pipelining is a way of feeding instructions through the processor in a pipeline. It lets us store and execute instructions in an orderly way. It is also known as pipeline processing.

It is a technique in which several instructions are overlapped during execution. The pipeline is divided into stages, and these stages are connected to one another to form a pipe-like structure. Instructions enter at one end and exit at the other.

In pipelining, situations can arise in which some part of the pipeline must stall because conditions do not allow continuous execution, i.e. the next instruction that should execute in a particular clock cycle is delayed or cannot be executed. Such a situation is called a pipeline hazard.

Hazards are categorized into 3 types:

Structural hazards: the hardware cannot support certain combinations of instructions (two instructions in the pipeline require the same resource).
Data hazards: an instruction depends on the result of a preceding instruction that is still in the pipeline.
Control hazards: these arise from the delay between fetching instructions and deciding on changes in control flow.

Let’s discuss each hazard in detail.

Structural Hazards

Structural hazards occur because of hardware resource conflicts among the instructions in the pipeline. A resource here could be memory, the ALU, or a register in the GPRs. A resource conflict arises when more than one instruction in the pipe requires access to the same resource in the same clock cycle. It is a situation in which the hardware cannot handle all possible combinations of instructions in overlapped pipeline execution.

Fig.1 : Structural Hazard

The solution to structural hazards

We introduce a bubble, as shown in the figure. At t4, I4 is not allowed to proceed and is delayed instead. It could have been allowed at t5, but it again conflicts with I2 RW. For the same reason, I4 is not allowed at t6 either. Finally, I4 is allowed to proceed in the pipe only at t7, after being stalled.

Fig. 2: Solution of Structural Hazard

This delay propagates to all the subsequent instructions too. Thus, while an ideal 4-stage system would have taken 8 timing states to execute 5 instructions, it now takes 11 timing states because of the structural dependency. And that is not all. By now you will have guessed that this hazard is likely to arise at every 4th instruction. Stalling is not at all a good solution for a heavy load on the CPU.
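
The arithmetic behind these numbers can be sketched as follows. This is a minimal illustration, assuming one instruction is issued per cycle and that each bubble costs exactly one timing state; the function names are made up for this example.

```python
def ideal_cycles(stages: int, instructions: int) -> int:
    """Timing states for a hazard-free pipeline.

    The first instruction needs `stages` cycles to pass through the pipe;
    each later instruction then completes one cycle after the previous one.
    """
    return stages + (instructions - 1)


def cycles_with_stalls(stages: int, instructions: int, stall_cycles: int) -> int:
    """Each bubble pushes every later instruction back by one cycle (assumption)."""
    return ideal_cycles(stages, instructions) + stall_cycles


# 4-stage pipeline, 5 instructions, I4 delayed from t4 to t7 (3 bubbles).
ideal = ideal_cycles(stages=4, instructions=5)
stalled = cycles_with_stalls(stages=4, instructions=5, stall_cycles=3)
print(f"ideal={ideal}, stalled={stalled}, slowdown={stalled / ideal:.3f}")
```

With the values from the example above, this gives 8 timing states for the ideal case and 11 with the three bubbles.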

A better solution is to increase the structural resources in the system, using one of the options below:

The pipeline can be extended to 5 or more stages, with the function of each stage redefined and the clock frequency adjusted accordingly. This removes the hazard at every 4th instruction seen in the 4-stage pipeline.
The memory can be physically separated into Instruction memory and Data memory. A better choice might be to design these as cache memories in the CPU rather than splitting main memory. IF uses the Instruction memory and result writing uses the Data memory. These become two separate resources, avoiding the dependency.
It is also possible to have multiple levels of cache in the CPU.
The ALU may also be subject to resource dependency. One instruction may need the ALU in its IE machine cycle while another needs the ALU in the IF stage to calculate an effective address based on the addressing mode. The solution is either stalling or having a dedicated ALU for address calculation.
Register files are used in place of GPRs. Register files are multiported, with separate read and write ports. This enables simultaneous read and write access to the registers.

The last two methods are implemented in modern CPUs. Beyond these, if a dependency still arises, stalling is the best alternative. Keep in mind that adding resources increases cost, so the trade-off is the designer’s choice.

Data Hazards

A data hazard occurs when an instruction’s execution depends on the result of a prior instruction that is still being processed in the pipeline.

Fig. 3: Data Hazards

In the case above, the ADD instruction writes its result into register R3 at t5. If bubbles are not introduced, all three following instructions would use the wrong data from R3, i.e. the value from before the ADD result. Bubbles are introduced to stall the subsequent instructions.

Without them, the program goes wrong! We can try to solve this with the following methods:

Solutions to data hazards

One way is to introduce three bubbles at the IF stage of the SUB instruction. This lets SUB-ID operate at t6. As a result, all the following instructions are also delayed in the pipe. The other way is data forwarding. Forwarding means passing the result directly to the functional unit that needs it.

In this case, the ADD result is available at the output of the ALU at the end of ADD-IE, i.e. the end of t3. If the control unit can capture it and forward it to the SUB-IE stage at t4, before it is written to the destination register R3, the pipeline moves ahead without stalling. This requires extra logic to detect the data hazard and act on it. Note that although the operand fetch normally happens in the ID stage, the operand is used only in the IE stage. Hence the forwarded value is supplied as an input to the IE stage. Similar forwarding can be done for the OR and AND instructions as well.

Fig. 4: Solutions to Data Hazard
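
The choice between stalling and forwarding described above can be sketched in a few lines. This is only an illustration under stated assumptions: the operand registers other than R3 are hypothetical (the figure is not reproduced here), only two back-to-back ALU instructions are considered, and the three-bubble cost is the one quoted in the text.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    op: str
    dest: str   # register written by the instruction
    src1: str   # first register read
    src2: str   # second register read


def depends_on(producer: Instr, consumer: Instr) -> bool:
    """True if the consumer reads the register that the producer writes."""
    return producer.dest in (consumer.src1, consumer.src2)


def bubbles_needed(producer: Instr, consumer: Instr,
                   forwarding: bool, gap: int = 3) -> int:
    """Bubbles for a consumer issued right after the producer.

    `gap` is the stall cost without forwarding (3 in the example above).
    With forwarding, the ALU output is passed straight to the consumer's
    IE stage, so no bubble is assumed for ALU-to-ALU dependencies.
    """
    if not depends_on(producer, consumer):
        return 0
    return 0 if forwarding else gap


# R1, R2, R4, R5 are made-up operands; only R3 comes from the text.
add = Instr("ADD", dest="R3", src1="R1", src2="R2")
sub = Instr("SUB", dest="R5", src1="R3", src2="R4")

print(bubbles_needed(add, sub, forwarding=False))
print(bubbles_needed(add, sub, forwarding=True))
```

The `depends_on` check stands in for the extra detection logic that the hardware needs before it can forward a result.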

In yet another approach, the compiler can play a role by detecting data dependencies and reordering (resequencing) the instructions accordingly while generating executable code. This eases the burden on the hardware.

Data hazards are classified into three categories based on the order of read and write operations on a register:

1. RAW (Read after Write) [Flow/True data dependency]

This is the case where an instruction uses data produced by a preceding one. Example:

ADD R0, R1, R2

SUB R4, R3, R0

2. WAR (Write after Read) [Anti-Data dependency]

This is the case in which the second instruction writes to a register before the first instruction has read it. This is uncommon in a simple pipeline structure. However, in some systems with complex and special instructions, WAR can occur.

ADD R2, R1, R0

SUB R0, R3, R4

3. WAW (Write after Write) [Output data dependency]

This is the case where two instructions executing in parallel write the same register and must do so in the order in which they were issued.

ADD R0, R1, R2

SUB R0, R4, R5

WAW and WAR hazards can arise only when instructions are executed in parallel or out of order. They arise because the compiler has assigned the same register numbers to different instructions, even though this is avoidable. The situation is fixed by having the compiler rename one of the registers, or by delaying the update of a register until the appropriate value has been produced.
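
The three categories can be checked mechanically by comparing the registers each instruction reads and writes. The sketch below is illustrative only: it assumes the three-operand format used in the examples above, with the first operand as the destination, and the helper names are invented for this post.

```python
def parse(text: str):
    """Split 'ADD R0, R1, R2' into (op, dest, sources).

    Assumes the first operand is the destination register.
    """
    op, operands = text.split(maxsplit=1)
    dest, *srcs = [reg.strip() for reg in operands.split(",")]
    return op, dest, tuple(srcs)


def classify_hazards(first: str, second: str) -> set:
    """Return the hazard types the second instruction has on the first."""
    _, dest1, srcs1 = parse(first)
    _, dest2, srcs2 = parse(second)
    hazards = set()
    if dest1 in srcs2:
        hazards.add("RAW")  # second reads what first writes
    if dest2 in srcs1:
        hazards.add("WAR")  # second writes what first reads
    if dest1 == dest2:
        hazards.add("WAW")  # both write the same register
    return hazards


print(classify_hazards("ADD R0, R1, R2", "SUB R4, R3, R0"))
print(classify_hazards("ADD R2, R1, R0", "SUB R0, R3, R4"))
print(classify_hazards("ADD R0, R1, R2", "SUB R0, R4, R5"))
```

Run on the three instruction pairs from this section, it reports RAW, WAR and WAW respectively.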

Control Hazards

Control hazards are also called branch hazards and are caused by branch instructions. Branch instructions control the flow of program execution. Recall that in a high-level language we use conditional statements, either for iterative loops or for condition checks (think of for, while, if, and case statements). These are translated into one of the variants of BRANCH instructions. The value of the condition being checked must be known to determine the program flow. Life gets complicated for you, and so it does for the CPU!

The branch and jump instructions decide the program flow by loading the appropriate location into the Program Counter (PC). The PC holds the address of the next instruction to be fetched and executed by the CPU. Consider the following sequence of instructions.

Fig. 5: Control Hazards

In this case, there is no point in fetching I3. What happens in the pipeline? The fetch of I3 needs to be stopped while I2 is being processed. This can be known only after I2 is decoded as a JMP, and not before. So the pipeline cannot proceed at its normal pace, and this is a control dependency (hazard). If I3 is fetched in the meantime, it is not only wasted work, but some data in registers may have been altered and would need to be undone.

Similar cases arise with conditional JMP or BRANCH.

Solutions for Control Hazards

Stall the pipeline as soon as any kind of branch instruction is decoded. Simply do not allow any further IF. As always, stalling reduces throughput. Statistics cited for this say that at least 30% of the instructions in a program are BRANCH instructions, so with stalling taken into account the pipeline operates at about 50% capacity.
Prediction - Imagine a for or while loop that is executed 100 times. For 100 iterations the program runs without the exit condition being met, and only on the 101st pass does the program leave the loop. So it is wiser to let the pipeline proceed and undo/flush when the branch condition is met. The throughput of the pipeline is not affected.
Dynamic Branch Prediction - A history record is maintained with the help of a Branch Target Buffer (BTB). The BTB is a kind of cache holding a set of entries, each with the PC address of a branch instruction and the corresponding effective branch address. An entry is maintained for every branch instruction encountered. So whenever a conditional branch instruction is encountered, the BTB is searched for the matching branch instruction address. On a hit, the corresponding target branch address is used for fetching the next instruction. This is called dynamic branch prediction.
Reordering instructions - Delayed branch, i.e. reordering the instructions to place the branch instruction later in the sequence, so that safe and useful instructions which are not affected by the outcome of the branch are moved ahead of it, thus delaying the branch instruction fetch. If no such instructions are available, a NOP is inserted. The delayed branch is implemented with the help of the compiler.
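
The BTB lookup described in the list above can be sketched as a small table keyed by the branch's PC. This is a toy model, not a description of any real CPU: the capacity, the fixed 4-byte instruction size, the oldest-first eviction, and the choice to drop an entry when its branch is not taken are all assumptions made for illustration.

```python
from collections import OrderedDict


class BranchTargetBuffer:
    """Toy BTB: maps a branch instruction's PC to its last taken target."""

    def __init__(self, capacity: int = 4):
        self.capacity = capacity
        self.entries = OrderedDict()  # branch PC -> target address

    def predict_next_pc(self, pc: int, instr_size: int = 4) -> int:
        """On a hit, fetch from the stored target; on a miss, fall through."""
        target = self.entries.get(pc)
        return target if target is not None else pc + instr_size

    def update(self, pc: int, taken: bool, target: int) -> None:
        """Record the real outcome once the branch has been resolved."""
        if taken:
            self.entries[pc] = target
            self.entries.move_to_end(pc)
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)  # evict the oldest entry
        else:
            self.entries.pop(pc, None)  # assumed policy: forget untaken branches


btb = BranchTargetBuffer()
first_guess = btb.predict_next_pc(0x40)    # miss: predicts the next sequential PC
btb.update(0x40, taken=True, target=0x10)  # branch resolved as taken
second_guess = btb.predict_next_pc(0x40)   # hit: predicts the stored target
print(hex(first_guess), hex(second_guess))
```

The first lookup misses and falls through to the next sequential address; after the branch has been seen once, the second lookup returns the stored target.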

Since control hazards are also known as branch hazards, we have to deal with the branches.

Ways of dealing with branches:

Multiple Streams
Prefetch Branch target
Delayed branch
Loop Buffer
Branch Prediction

Multiple Streams:

A simple pipeline suffers a penalty for a branch instruction because it must choose one of two instructions to fetch next and may make the wrong choice.

A brute-force approach is to duplicate the initial portions of the pipeline and allow the pipeline to fetch both instructions, making use of two streams.

Drawbacks: With multiple pipelines there are contention delays for access to the registers and to memory. Additional branch instructions may also enter the pipeline before the original branch decision is resolved.

Prefetch Branch Target:

When a conditional branch is recognized, the target of the branch is prefetched, in addition to the instruction following the branch.

The target is then saved until the branch instruction is executed.

If the branch is taken, the target has already been prefetched.

Delayed Branch:

Do not take the jump until you have to. Rearrange the instructions.

Loop buffer:

A small, very-high-speed memory maintained by the instruction fetch stage of the pipeline and containing the most recently fetched instructions, in sequence.

Benefits: Instructions fetched in sequence will be available without the usual memory access time. If a branch occurs to a target just a few locations ahead of the address of the branch instruction, the target will already be in the buffer. This method is particularly well suited to handling loops.

Similar in principle to a cache dedicated to instructions.

Differences:

The loop buffer only retains instructions in sequence.

It is much smaller in size and therefore lower in cost.

Branch Prediction:

A taken/not taken switch, based on previous history. Good for loops.
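
A taken/not taken switch based on history is often realized as a 2-bit saturating counter, and that is what the sketch below assumes; the post itself does not say how the switch is built. The class and function names are invented, and the loop is the 100-iteration example used earlier.

```python
class TwoBitPredictor:
    """2-bit saturating counter: states 0-1 predict not taken, 2-3 predict taken."""

    def __init__(self, state: int = 0):
        self.state = state

    def predict(self) -> bool:
        return self.state >= 2

    def update(self, taken: bool) -> None:
        # Move one step toward the observed outcome, saturating at 0 and 3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def count_mispredictions(outcomes, predictor) -> int:
    """Replay a sequence of branch outcomes and count wrong predictions."""
    misses = 0
    for taken in outcomes:
        if predictor.predict() != taken:
            misses += 1
        predictor.update(taken)
    return misses


# Loop-closing branch: taken for 100 iterations, then not taken on exit.
loop_outcomes = [True] * 100 + [False]
misses = count_mispredictions(loop_outcomes, TwoBitPredictor())
print(f"{misses} mispredictions out of {len(loop_outcomes)} branches")
```

Starting from the "strongly not taken" state, the predictor is wrong only while warming up and once more at loop exit, which is why this kind of switch suits loops.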

Conclusion

To conclude, here is a summary of how pipeline hazards affect computers.

Hazards limit the performance of computers:

Structural: need more hardware resources

Data (RAW, WAR, WAW): need forwarding, compiler scheduling

Control: delayed branch, prediction

Thank You for reading Our Blog…

Blog by — Parth Fulsoundar | Tejal Gadad | Shruti Ghulaxe | Asmita Gitte

Vishwakarma Institute of Technology, Pune.
