A processor includes a thread switching control logic that performs a fast thread-switching operation in response to an L cache miss stall. The fast thread-switching operation implements one or more of several thread-switching methods. A first thread-switching operation is "oblivious" thread-switching for every N cycle in which the individual flip-flops locally determine a thread-switch without notification of stalling. The oblivious technique avoids usage of an extra global interconnection between threads for thread selection. A second thread-switching operation is "semi-oblivious" thread-switching for use with an existing "pipeline stall" signal (if any). The pipeline stall signal operates in two capacities, first as a notification of a pipeline stall, and second as a thread select signal between threads so that, again, usage of an extra global interconnection between threads for thread selection is avoided. A third thread-switching operation is an "intelligent global scheduler" thread-switching in which a thread switch decision is based on a plurality of signals including: (1) an L data cache miss stall signal, (2) an instruction buffer empty signal, (3) an L cache miss signal, (4) a thread priority signal, (5) a thread timer signal, (6) an interrupt signal, or other sources of triggering. In some embodiments, the thread select signal is broadcast as fast as possible, similar to a clock tree distribution.
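The three thread-switching methods in the abstract above can be compared with a small software model. This is only an illustrative sketch, not the patented logic: the function names, the `Triggers` fields, the round-robin order, and the "switch to the highest-priority other thread on any trigger" policy are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Sequence


def oblivious_select(cycle: int, n: int, num_threads: int = 2) -> int:
    """Oblivious: the thread is derived from the cycle count alone,
    so a switch happens every n cycles with no stall notification."""
    return (cycle // n) % num_threads


def semi_oblivious_select(current: int, pipeline_stall: bool,
                          num_threads: int = 2) -> int:
    """Semi-oblivious: the existing pipeline stall signal doubles as the
    thread select signal, so a stall moves to the next thread."""
    return (current + 1) % num_threads if pipeline_stall else current


@dataclass
class Triggers:
    """Hypothetical bundle of the trigger signals listed in the abstract."""
    dcache_miss_stall: bool = False
    ibuf_empty: bool = False
    cache_miss: bool = False
    timer_expired: bool = False
    interrupt: bool = False


def scheduler_select(current: int, triggers: Triggers,
                     priorities: Sequence[int]) -> int:
    """Global scheduler (assumed policy): if any trigger is asserted,
    switch to the other thread with the highest priority value."""
    asserted = any((triggers.dcache_miss_stall, triggers.ibuf_empty,
                    triggers.cache_miss, triggers.timer_expired,
                    triggers.interrupt))
    others = [t for t in range(len(priorities)) if t != current]
    if not asserted or not others:
        return current
    return max(others, key=lambda t: priorities[t])
```

In this model only the scheduler needs information gathered from several sources; the first two methods decide from a local counter or from a signal that already exists, which is the interconnect saving the abstract describes.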
Vertically And Horizontally Threaded Processor With Multidimensional Storage For Storing Thread Data
William N. Joy - Aspen CO Marc Tremblay - Menlo Park CA Gary Lauterbach - Los Altos CA Joseph I. Chamdani - Santa Clara CA
Assignee:
Sun Microsystems, Inc. - Palo Alto CA
International Classification:
G06F 946
US Classification:
712228, 709107, 709108, 712229
Abstract:
A processor includes a "four-dimensional" register structure in which register file structures are replicated by N for vertical threading in combination with a three-dimensional storage circuit. The multi-dimensional storage is formed by constructing a storage, such as a register file or memory, as a plurality of two-dimensional storage planes.
A method of reordering instructions. Barrier instructions are determined. The method determines when a processor stall may occur, and hoists subsequent instructions to fill in the stall time. However, instructions are not hoisted above the barrier instructions. Barrier instructions include branch instructions, store and load instructions, and instructions which, if hoisted, cause the number of available registers to be exceeded. The method produces a reordered instruction trace and statistics regarding the effectiveness of the reordering.
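A rough sketch of the hoisting idea in the abstract above, under simplifying assumptions that are not from the source: a stall is modeled as an instruction reading the result of the immediately preceding multi-cycle instruction, at most one instruction is hoisted per stall, barriers are recognized by opcode only (the register-pressure barrier is omitted), and the `Insn` type and `fill_stalls` name are hypothetical.

```python
from typing import List, NamedTuple, Optional, Sequence, Tuple

# Assumption: barriers are identified by opcode class only.
BARRIER_OPS = {"branch", "load", "store"}


class Insn(NamedTuple):
    op: str
    dst: Optional[str]
    srcs: Tuple[str, ...]
    latency: int = 1


def independent(cand: Insn, earlier: Sequence[Insn]) -> bool:
    """Conservative check: no RAW, WAW, or WAR hazard between the
    candidate and any of the given earlier instructions."""
    for s in earlier:
        if s.dst is not None and (s.dst in cand.srcs or s.dst == cand.dst):
            return False
        if cand.dst is not None and cand.dst in s.srcs:
            return False
    return True


def fill_stalls(trace: Sequence[Insn]) -> Tuple[List[Insn], int]:
    """Return the reordered trace and the number of hoisted instructions."""
    out = list(trace)
    hoisted = 0
    i = 0
    while i + 1 < len(out):
        producer, consumer = out[i], out[i + 1]
        stalls = (producer.latency > 1 and producer.dst is not None
                  and producer.dst in consumer.srcs)
        # If the stalled consumer is itself a barrier, nothing may pass it.
        if stalls and consumer.op not in BARRIER_OPS:
            for j in range(i + 2, len(out)):
                cand = out[j]
                if cand.op in BARRIER_OPS:
                    break  # barriers are not moved, nor is anything hoisted above them
                # Checked against the producer and everything the candidate jumps over.
                if independent(cand, out[i:j]):
                    out.insert(i + 1, out.pop(j))  # fill the stall slot
                    hoisted += 1
                    break
        i += 1
    return out, hoisted
```

The returned count stands in for the "statistics regarding the effectiveness of the reordering"; a fuller model would also track how many stall cycles remain unfilled.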
William N. Joy - Aspen CO Marc Tremblay - Menlo Park CA Gary Lauterbach - Los Altos CA Joseph I. Chamdani - Santa Clara CA
Assignee:
Sun Microsystems, Inc. - Santa Clara CA
International Classification:
G06F 946
US Classification:
709107, 709108, 712228, 712229, 712219
Abstract:
A processor includes logic for attaining a very fast exception handling functionality while executing non-threaded programs by invoking a multithreaded-type functionality in response to an exception condition. The processor, while operating in multithreaded conditions or while executing non-threaded programs, progresses through multiple machine states during execution. The very fast exception handling logic includes connection of an exception signal line to thread select logic, causing an exception signal to evoke a switch in thread and machine state. The switch in thread and machine state causes the processor to enter and to exit the exception handler immediately, without waiting to drain the pipeline or queues and without the inherent timing penalty of the operating systems software saving and restoring of registers.
Multiple-Thread Processor With Single-Thread Interface Shared Among Threads
William N. Joy - Aspen CO Marc Tremblay - Menlo Park CA Gary Lauterbach - Los Altos CA Joseph I. Chamdani - Santa Clara CA
Assignee:
Sun Microsystems, Inc. - Santa Clara CA
International Classification:
G06F 1212
US Classification:
712228, 709108
Abstract:
A processor includes logic for tagging a thread identifier (TID) for usage with processor blocks that are not stalled. Pertinent non-stalling blocks include caches, translation look-aside buffers (TLB), a load buffer asynchronous interface, an external memory management unit (MMU) interface, and others. A processor includes a cache that is segregated into a plurality of N cache parts. Cache segregation avoids interference, "pollution", or "cross-talk" between threads. One technique for cache segregation utilizes logic for storing and communicating thread identification (TID) bits. The cache utilizes cache indexing logic. For example, the TID bits can be inserted at the most significant bits of the cache index.
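As a minimal sketch of the indexing example in the abstract above, the TID bits can be placed at the top of the cache index so that each thread maps to its own part of the cache. The function name and the default bit widths (32-byte lines, 8 index bits, 1 TID bit for N = 2 parts) are assumptions chosen for illustration, not values from the patent.

```python
def cache_index(addr: int, tid: int, *, line_bits: int = 5,
                index_bits: int = 8, tid_bits: int = 1) -> int:
    """Build a cache index whose most significant bits are the thread ID.

    The remaining low (index_bits - tid_bits) bits come from the address,
    so each thread only ever selects sets within its own cache part.
    """
    if not 0 <= tid < (1 << tid_bits):
        raise ValueError("tid does not fit in tid_bits")
    set_bits = index_bits - tid_bits
    low = (addr >> line_bits) & ((1 << set_bits) - 1)  # address-derived bits
    return (tid << set_bits) | low                     # TID in the MSBs


# The same address lands in different halves of the cache per thread.
print(cache_index(0x1234, tid=0))  # 17
print(cache_index(0x1234, tid=1))  # 145
```

Because the two threads can never select the same set, one thread cannot evict the other's lines, which is the "pollution" the abstract refers to; the cost in this sketch is that each thread sees only half of the sets.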
William N. Joy - Aspen CO Marc Tremblay - Menlo Park CA Gary Lauterbach - Los Altos CA Joseph I. Chamdani - Santa Clara CA
Assignee:
Sun Microsystems, Inc. - Palo Alto CA
International Classification:
G06F 900
US Classification:
709108, 712244, 712228
Abstract:
A processor includes logic for attaining a very fast exception handling functionality while executing non-threaded programs by invoking a multithreaded-type functionality in response to an exception condition. The processor, while operating in multithreaded conditions or while executing non-threaded programs, progresses through multiple machine states during execution. The very fast exception handling logic includes connection of an exception signal line to thread select logic, causing an exception signal to evoke a switch in thread and machine state. The switch in thread and machine state causes the processor to enter and to exit the exception handler immediately, without waiting to drain the pipeline or queues and without the inherent timing penalty of the operating systems software saving and restoring of registers.
Multiple-Thread Processor With Single-Thread Interface Shared Among Threads
William N. Joy - Aspen CO Marc Tremblay - Menlo Park CA Gary Lauterbach - Los Altos CA Joseph I. Chamdani - Santa Clara CA
Assignee:
Sun Microsystems, Inc. - Santa Clara CA
International Classification:
G06F 938
US Classification:
712229
Abstract:
A processor includes logic for tagging a thread identifier (TID) for usage with processor blocks that are not stalled. Pertinent non-stalling blocks include caches, translation look-aside buffers (TLB), a load buffer asynchronous interface, an external memory management unit (MMU) interface, and others. A processor includes a cache that is segregated into a plurality of N cache parts. Cache segregation avoids interference, "pollution", or "cross-talk" between threads. One technique for cache segregation utilizes logic for storing and communicating thread identification (TID) bits. The cache utilizes cache indexing logic. For example, the TID bits can be inserted at the most significant bits of the cache index.
Integrated Circuit Assembly Module That Supports Capacitive Communication Between Semiconductor Dies
Ivan E. Sutherland - Santa Monica CA, US Robert J. Drost - Mountain View CA, US Gary R. Lauterbach - Los Altos Hills CA, US Howard L. Davidson - San Carlos CA, US
Assignee:
Sun Microsystems, Inc. - Santa Clara CA
International Classification:
H01L023/52 H05K007/00
US Classification:
257777, 257691, 257723, 361735
Abstract:
One embodiment of the present invention provides an integrated circuit assembly module, including a first semiconductor die and a second semiconductor die, each semiconductor die with an active face upon which active circuitry and signal pads reside and a back face opposite the active face. The first and second semiconductor dies are positioned face-to-face within the assembly module so that signal pads on the first semiconductor die overlap with signal pads on the second semiconductor die, thereby facilitating capacitive communication between the first and second semiconductor dies. Additionally, the first and second semiconductor dies are pressed together between a first substrate and a second substrate so that a front side of the first substrate is in contact with the back face of the first semiconductor die and a front side of the second substrate is in contact with the back face of the second semiconductor die.
Consultant Aug 2014 - Jan 2016
Chief Technology Officer
Cerebras Systems Aug 2014 - Jan 2016
Chief Technology Officer and Co-Founder
AMD Mar 2012 - Aug 2014
Corporate Vice President and DCSS Chief Technology Officer
SeaMicro Dec 2007 - Apr 2012
Chief Technology Officer and Co-Founder
Networkfab Jan 2007 - Aug 2007
Vice President Engineering
Education:
New Jersey Institute of Technology 1974 - 1978
Bachelors, Electronics Engineering, Electronics
Skills:
ASIC, Embedded Systems, Processors, System Architecture, SoC, Semiconductors, Computer Architecture, Distributed Systems, Debugging, Cloud Computing, Software Engineering, Microprocessors, Linux, Product Management, Verilog, High Performance Computing, Hardware, EDA, Perl, Scalability, Storage, FPGA, Electronics
Gary Lauterbach 1972 graduate of Bay View High School in Milwaukee, WI is on Classmates.com. See pictures, plan your class reunion and get caught up with Gary and other high school ...
Googleplus
Gary Lauterbach
Work:
Community Christian School - Teacher
Education:
University of Northern Iowa - Science
Youtube
SCW18_Keynote_AI Design Forum
So the title of this panel is is the memory wall overcoming the memory...
Duration:
3h 8m 12s
Shop Tour with Larry Lauterbach of Lauterbach...
Larry Lauterbach recently gave PropTalk a shop tour of Lauterbach Cust...
Duration:
5m 10s
Simon Cowell NEW FACE | Plastic Surgery Analy...
Subscribe to our hair newsletter: Which plastic surgery procedures d...
Duration:
9m 7s
FUNKY SLAP BASS - ULI LAUTERBACH | BassTheWor...
Uli played thousands of gigs as a freelance bass player backing up hun...
Duration:
5m 36s
HC23-S7: Server
SeaMicro SM10000-64 Server: Building Data Center Servers Using "Cell P...
Duration:
1h 34m 19s
Gary Halbert - Direct Marketing Secrets Seminar
A direct marketing seminar by info-guru, the late Gary Halbert, called...