Pipelining and parallel execution of multiple load instructions are performed within a load store unit. When a first load instruction incurs a cache miss and proceeds to retrieve the load data from the system memory hierarchy, a second load instruction addressing the same load data is merged into the first load instruction, so that the data returned from the system memory hierarchy is sent to the register files associated with both the first and second load instructions. As a result, the second load instruction does not have to wait until the load data has been written and validated in the data cache.
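The merging behaviour described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the class names, the 128-byte line size, and the dictionary-based register file are hypothetical and do not come from the patent, and "the same load data" is approximated as "the same cache line".

```python
from dataclasses import dataclass, field


@dataclass
class MissEntry:
    """One outstanding cache-line miss and the registers waiting on it."""
    line_addr: int
    waiting_regs: list = field(default_factory=list)


class LoadMissQueue:
    """Toy model of merging loads that miss on the same cache line."""

    LINE_BYTES = 128  # assumed line size, not taken from the abstract

    def __init__(self):
        self.entries = {}          # line address -> MissEntry
        self.memory_requests = 0   # requests sent to the memory hierarchy

    def load_miss(self, addr, dest_reg):
        line = addr // self.LINE_BYTES
        entry = self.entries.get(line)
        if entry is None:
            # First miss on this line: issue a single memory request.
            entry = MissEntry(line)
            self.entries[line] = entry
            self.memory_requests += 1
        # A later load to the same line merges into the existing entry.
        entry.waiting_regs.append(dest_reg)

    def data_returned(self, addr, value, regfile):
        # Forward the returned data to every merged load's destination
        # register, without waiting for the data cache to be updated.
        entry = self.entries.pop(addr // self.LINE_BYTES)
        for reg in entry.waiting_regs:
            regfile[reg] = value
        return entry.waiting_regs
```

In this model, two loads that miss on the same line cause one memory request, and both destination registers are written when the data comes back.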
System And Method For Executing Store Instructions
Hung Qui Le - Austin TX Robert Greg McDonald - Austin TX David James Shippy - Austin TX Larry Edward Thatcher - Austin TX
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
G06F 9/312
US Classification:
712/225, 712/208, 712/221, 712/222
Abstract:
In a processor, store instructions are divided or cracked into store data and store address generation portions for separate and parallel execution within two execution units. The address generation portion of the store instruction is executed within the load store unit, while the store data portion of the instruction is executed in an execution unit other than the load store unit. If the store instruction is a fixed point store instruction, then the store data portion is executed within the fixed point unit. If the store instruction is a floating point store instruction, then the store data portion of the store instruction is executed within the floating point unit.
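As a rough illustration of the cracking step, the sketch below splits a store into two internal operations and routes each to a unit. The names InternalOp, crack_store, and the unit labels "LSU", "FXU", and "FPU" are assumptions made for this example, not terms defined by the patent.

```python
from typing import NamedTuple


class InternalOp(NamedTuple):
    kind: str   # "store_agen" or "store_data"
    unit: str   # execution unit that runs this portion


def crack_store(is_floating_point: bool):
    """Split one store into its address-generation and data portions."""
    # Address generation always goes to the load store unit.
    agen = InternalOp("store_agen", "LSU")
    # The data portion goes to the unit that holds the source register:
    # floating point unit for FP stores, fixed point unit otherwise.
    data_unit = "FPU" if is_floating_point else "FXU"
    data = InternalOp("store_data", data_unit)
    return [agen, data]
```

Because the two portions target different units, they can be scheduled independently, which is the parallelism the abstract refers to.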
System And Method For Invalidating An Entry In A Translation Unit
Albert Chang - Yorktown Heights NY Edward John Silha - Austin TX Larry Edward Thatcher - Austin TX Gus Wai-Yan Yeung - Austin TX
Assignee:
International Business Machines Corp. - Armonk NY
International Classification:
G06F 12/00
US Classification:
711/203, 711/133, 711/200, 711/205
Abstract:
As a program is replaced by the operating system running within a microprocessor, only those entries associated with the replaced program and resident within effective-to-real address translation units will be replaced. Those entries within the effective-to-real address translation units associated with the operating system and shared libraries, and any other software units operating within the microprocessor will not be invalidated.
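A minimal sketch of selective invalidation follows. It assumes each translation entry carries an owner tag identifying the software unit it belongs to; that tag, and all names below, are hypothetical and serve only to show entries for the replaced program being dropped while the others survive.

```python
class TranslationUnit:
    """Toy effective-to-real address translation table with owner tags."""

    def __init__(self):
        self.entries = {}  # effective page -> (real page, owner)

    def insert(self, eff_page, real_page, owner):
        self.entries[eff_page] = (real_page, owner)

    def invalidate_owner(self, owner):
        """Drop only the entries that belong to the replaced program."""
        stale = [ep for ep, (_, o) in self.entries.items() if o == owner]
        for ep in stale:
            del self.entries[ep]
        return len(stale)
```

Entries tagged for the operating system or shared libraries remain valid across the program switch, so they do not need to be reloaded.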
Optimization Of Instruction Stream Execution That Includes A Vliw Dispatch Group
Larry Edward Thatcher - Austin TX John Edward Derrick - Round Rock TX
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
G06F 15/00
US Classification:
712/24, 712/218, 712/219, 712/228
Abstract:
A method and system for optimizing execution of an instruction stream which includes a very long instruction word (VLIW) dispatch group in which ordering is not maintained is disclosed. The method and system comprises examining an access which initiated a flush operation; capturing an indice related to the flush operation; and causing all storage access instructions related to this indice to be dispatched as single-IOP groups until the indice is updated. Storage access to address space which is safe such as Guarded (G=1) or Direct Store (E=DS) must be handled in a non-speculative manner such that operations which could potentially go to volatile I/O devices or control locations do not get processed out of order. Since the address is not known in the front end of the processor, this can only be determined by the load store unit or functional block which performs translation. Therefore, if a flush occurs for these conditions, in accordance with the present invention the value of the base register (RA) is latched and subsequent loads and stores which use this base register are decoded in a "safe" manner until an instruction is decoded which would change the base register value (safe means an internal instruction sequence which can be executed in order without repeating any accesses).
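The latching scheme can be sketched as a small decode-time tracker. The class and method names, and the "single-IOP" and "normal" mode strings, are assumptions for illustration; the sketch also assumes the tracked indice is simply the base register number.

```python
class SafeDispatchTracker:
    """Toy model of latching a base register after a flush."""

    def __init__(self):
        self.latched_ra = None  # base register captured at the last flush

    def on_flush(self, ra):
        # A guarded or direct-store access caused a flush: remember its RA.
        self.latched_ra = ra

    def decode(self, is_storage_access, base_reg=None, writes_reg=None):
        """Return the dispatch mode chosen for one decoded instruction."""
        safe = (
            is_storage_access
            and self.latched_ra is not None
            and base_reg == self.latched_ra
        )
        mode = "single-IOP" if safe else "normal"
        # An instruction that changes the latched register ends safe mode
        # for the instructions that follow it.
        if writes_reg is not None and writes_reg == self.latched_ra:
            self.latched_ra = None
        return mode
```

Only accesses through the latched base register pay the in-order cost; other loads and stores continue to dispatch in normal groups.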
System And Method For High Performance Execution Of Locked Memory Instructions In A System With Distributed Memory And A Restrictive Memory Model
Bryan D. Boatright - Austin TX Rajesh Bhikhubhai Patel - Austin TX Larry Edward Thatcher - Austin TX
Assignee:
Intel Corporation - Santa Clara CA
International Classification:
G06F 12/00
US Classification:
711/145, 711/146
Abstract:
The present invention relates to locked memory instructions, and more specifically to a system and method for the high performance execution of locked memory instructions in a system with distributed memory and a restrictive memory model. In accordance with an embodiment of the present invention, a method for executing locked-memory instructions includes decoding a locked-memory instruction, obtaining exclusive ownership of a cacheline to be used by a load-lock operation, setting a bit to indicate the load-lock operation's ownership of the cacheline, and activating a snoop checking process. The method also includes modifying a load data value and storing the modified load data value. The method further includes determining that the cacheline is still exclusively owned, storing the load data value, determining that the cacheline is unsnooped, merging the modified load data value with the load data value, and releasing the locked-memory instruction to be retired.
Data Processing System Including Load/Store Unit Having A Real Address Tag Array And Method For Correcting Effective Address Aliasing
James Allan Kahle - Austin TX George McNeil Lattimore - Austin TX Jose Angel Paredes - Austin TX Larry Edward Thatcher - Austin TX
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
G06F 12/10
US Classification:
711/210, 711/146, 711/220
Abstract:
A data processing system including a processor having a load/store unit and a method for correcting effective address aliasing. In the load/store unit within the processor, load and store instructions are executed out of order. The load and store instructions are assigned tags in a predetermined manner, and then assigned to load and store reorder queues for keeping track of the program order of the load and store instructions. A real address tag is utilized to correct for effective address aliasing within the load/store unit.
Method And System For Optimally Issuing Dependent Instructions Based On Speculative L2 Cache Hit In A Data Processing System
Robert Alan Cargnoni - Austin TX Bruce Joseph Ronchetti - Austin TX David James Shippy - Austin TX Larry Edward Thatcher - Austin TX
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
G06F 12/08
US Classification:
711/125, 711/122, 711/137
Abstract:
A method for optimally issuing instructions that are related to a first instruction in a data processing system is disclosed. The processing system includes a primary and secondary cache. The method and system comprises speculatively indicating a hit of the first instruction in a secondary cache and releasing the dependent instructions. The method and system includes determining if the first instruction is within the secondary cache. The method and system further includes providing data related to the first instruction from the secondary cache to the primary cache when the instruction is within the secondary cache. A method and system in accordance with the present invention causes instructions that create dependencies (such as a load instruction) to signal an issue queue (which is responsible for issuing instructions with resolved conflicts) in advance, that the instruction will complete in a predetermined number of cycles. In an embodiment, a core interface unit (CIU) will signal an execution unit such as the Load Store Unit (LSU) that it is assumed that the instruction will hit in the L2 cache. An issue queue uses the signal to issue dependent instructions at an optimal time.
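The early-wakeup idea can be sketched as follows. The six-cycle latency, the class and method names, and the handling of a mispredicted hit (cancelling the wakeup) are all assumptions for illustration; the abstract does not give a cycle count or describe the recovery path.

```python
L2_HIT_LATENCY = 6  # assumed "predetermined number of cycles"


class IssueQueue:
    """Toy issue queue that wakes dependents on a speculative L2 hit."""

    def __init__(self):
        self.wakeup_cycle = {}  # load tag -> cycle when dependents may issue

    def speculative_hit(self, load_tag, now):
        # Signal sent in advance: assume the load will hit in the L2 cache
        # and its data will be ready after the fixed latency.
        self.wakeup_cycle[load_tag] = now + L2_HIT_LATENCY

    def may_issue_dependents(self, load_tag, now):
        cycle = self.wakeup_cycle.get(load_tag)
        return cycle is not None and now >= cycle

    def l2_result(self, load_tag, hit):
        if not hit:
            # Assumed recovery: the speculation was wrong, so cancel the
            # wakeup and leave dependents waiting for a later signal.
            self.wakeup_cycle.pop(load_tag, None)
```

Scheduling the wakeup ahead of the confirmed hit lets dependent instructions reach the execution units just as the data arrives, rather than one issue delay later.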
James Allan Kahle - Austin TX Hung Qui Le - Austin TX Kevin F. Reick - Austin TX David James Shippy - Austin TX Larry Edward Thatcher - Austin TX
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
G06F 11/00
US Classification:
714/10, 714/47, 714/55, 712/229
Abstract:
A processor and an associated method and data processing system are disclosed. The processor includes an issue unit (ISU), a completion unit, and a hang detect unit. The ISU is configured to issue instructions to an execution unit. The completion unit is adapted to produce a completion valid signal responsive to the issue unit completing an instruction. The hang detect unit is configured to receive the completion valid signal from the ISU and adapted to determine the interval since the most recent assertion of the completion valid signal. The hang detect unit is adapted to initiate a hang recovery sequence upon determining that the interval since the most recent assertion of the completion valid signal exceeds a predetermined maximum interval. In one embodiment, the hang recovery sequence includes the hang recovery unit asserting a stop completion signal to a completion unit and a stop dispatch signal to a dispatch unit to suspend instruction completion and dispatch. The hang recovery unit then asserts a force reject signal to an execution unit to reject all instructions pending in the execution unit's pipeline and a flush signal to the execution unit that results in the processor flushing a set of instructions.
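The hang detection logic amounts to a watchdog counter that is reset by the completion valid signal. The sketch below is a simplified cycle-level model; the class name, the signal name strings, and the choice to restart the counter after recovery are assumptions, not details from the patent.

```python
RECOVERY_SEQUENCE = ["stop_completion", "stop_dispatch", "force_reject", "flush"]


class HangDetectUnit:
    """Toy watchdog that fires when no instruction completes for too long."""

    def __init__(self, max_interval):
        self.max_interval = max_interval    # predetermined maximum interval
        self.cycles_since_completion = 0

    def tick(self, completion_valid):
        """Advance one cycle; return the signals asserted this cycle."""
        if completion_valid:
            # An instruction completed: restart the interval measurement.
            self.cycles_since_completion = 0
            return []
        self.cycles_since_completion += 1
        if self.cycles_since_completion > self.max_interval:
            # Interval exceeded: start hang recovery and (by assumption)
            # restart the counter.
            self.cycles_since_completion = 0
            return list(RECOVERY_SEQUENCE)
        return []
```

With max_interval set to 3, the recovery signals are returned on the fourth consecutive cycle without a completion.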