A method for loop reformulation is provided in which a single-exit ill-formed loop (SEIFL) is reformulated into a code block that contains a transformed well-formed loop (TWFL). A SEIFL is a loop that can exit from within its loop body. After reformulation, the TWFL can exit only from the end of the loop. The reformulated code block replaces the SEIFL in the compiler's internal representation (IR), so that more efficient executable machine code can be generated by optimizing the reformulated IR.
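The idea in this abstract can be illustrated at source level. The sketch below is not the patented method. It is an ordinary loop rotation, shown only to make the SEIFL/TWFL distinction concrete. The function names and the prefix-sum task are hypothetical.

```python
def sum_prefix_seifl(values):
    """Ill-formed shape: the only exit is a break in the middle of the body."""
    total, i = 0, 0
    while True:
        stop = i >= len(values) or values[i] < 0  # part A: compute exit test
        if stop:
            break                                  # exit from inside the body
        total += values[i]                         # part B: work after the test
        i += 1
    return total


def sum_prefix_twfl(values):
    """Rotated shape: part A is peeled once, so the exit test ends each pass."""
    total, i = 0, 0
    stop = i >= len(values) or values[i] < 0      # peeled copy of part A
    while not stop:
        total += values[i]                         # part B
        i += 1
        stop = i >= len(values) or values[i] < 0  # part A of the next pass
    return total
```

Both functions sum the leading non-negative values of a list. In the second, the exit condition is the last thing evaluated in each pass, which is the shape that conventional loop optimizations usually expect.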
Compiler Framework For Speculative Automatic Parallelization With Transactional Memory
Yonghong Song - Palo Alto CA, US Xiangyun Kong - Union City CA, US Spiros Kalogeropulos - Los Gatos CA, US Partha P. Tirumalai - Fremont CA, US
Assignee:
Oracle America, Inc. - Redwood City CA
International Classification:
G06F 9/45
US Classification:
717140, 717141, 717143
Abstract:
A computer program is speculatively parallelized with transactional memory by scoping program variables at compile time, and inserting code into the program at compile time. Determinations of the scoping can be based on whether scalar variables being scoped are involved in inter-loop non-reduction data dependencies, are used outside loops in which they were defined, and at what point in a loop a scalar variable is defined. The inserted code can include instructions for execution at a run time of the program to determine loop boundaries of the program, and issue checkpoint instructions and commit instructions that encompass transaction regions in the program. A transaction region can include an original function of the program and a spin-waiting loop with a non-transactional load, wherein the spin-waiting loop is configured to wait for a previous thread to commit before the current transaction commits.
Value Predictable Variable Scoping For Speculative Automatic Parallelization With Transactional Memory
Yonghong Song - Palo Alto CA, US Xiangyun Kong - Union City CA, US Spiros Kalogeropulos - Los Gatos CA, US Partha P. Tirumalai - Fremont CA, US
Assignee:
Oracle America, Inc. - Redwood Shores CA
International Classification:
G06F 9/45
US Classification:
717149, 717140, 717151, 717160
Abstract:
Parallelize a computer program by scoping program variables at compile time and inserting code into the program. Identify as value predictable variables those variables that are: defined only once in a loop of the program; not defined in any inner loop of the loop; and used in the loop. Optionally also: identify a code block in the program that contains a variable assignment, and then traverse a path backwards from the block through a control flow graph of the program. Collect in a first set all blocks along the path up to a loop header block. For each block in the first set, determine the program blocks that logically succeed the block and are not in the first set. Identify all paths between the block and the determined blocks as failure paths, and insert code into the failure paths. When executed at run time of the program, the inserted code fails the corresponding path.
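The failure-path step can be sketched on a toy control flow graph. This is one reading of the abstract, not the patent's implementation. The graph encoding (a dict from block name to successor list), the function name `failure_edges`, and the block names are all hypothetical. One further assumption: the block holding the assignment is not scanned for outgoing edges, since any path through it has already executed the assignment.

```python
def failure_edges(cfg, header, def_block):
    """Return (first_set, edges) for a CFG given as {block: [successors]}."""
    # Build predecessor lists so the graph can be walked backwards.
    preds = {}
    for block, succs in cfg.items():
        for succ in succs:
            preds.setdefault(succ, []).append(block)

    # Walk backwards from the assignment block, stopping at the loop header.
    first_set = {def_block}
    worklist = [def_block]
    while worklist:
        block = worklist.pop()
        if block == header:
            continue
        for pred in preds.get(block, []):
            if pred not in first_set:
                first_set.add(pred)
                worklist.append(pred)

    # Edges leaving the first set bypass the assignment: treat them as
    # failure paths (assumption: the assignment block itself is skipped).
    edges = sorted(
        (block, succ)
        for block in first_set if block != def_block
        for succ in cfg.get(block, [])
        if succ not in first_set
    )
    return first_set, edges


# Hypothetical loop: header -> a -> {def, skip} -> latch -> header.
cfg = {
    "header": ["a", "exit"],
    "a": ["def", "skip"],
    "def": ["latch"],
    "skip": ["latch"],
    "latch": ["header"],
    "exit": [],
}
first_set, edges = failure_edges(cfg, "header", "def")
```

In this toy graph the edge from `a` to `skip` is where a compiler following the abstract would insert code that fails the speculative path, because an iteration taking that edge never defines the variable.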
Yonghong Song - Sunnyvale CA, US Xiangyun Kong - Fremont CA, US Jian-Zhong Wang - Fremont CA, US
International Classification:
G06F009/45 G06F009/44
US Classification:
717/150000, 717/119000
Abstract:
Index association based dependence analysis accurately determines lack of dependence for complex memory subscript references to allow greater use of loop transformation and automatic parallelization at compile of an application. Index association functions that map an original i index space to a dependence analysis j index space are analyzed at compile to determine one-to-one mapping or many-to-one mapping. For dependence analysis of two references with a one-to-one mapping determination, lack of dependence in the dependence analysis index space confirms lack of dependence in the original index space. For many-to-one mapping, both a lack of dependence in the dependence analysis index space and a check that no two iterations in the original index space could map to the two references in the dependence analysis index space confirms no dependence for the two references.
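The one-to-one versus many-to-one distinction can be shown with a brute-force check. A compiler would decide this symbolically, so the enumeration below is only an illustration. The function name `classify_mapping` and the two sample index association functions are assumptions.

```python
def classify_mapping(f, i_values):
    """Classify an index association function j = f(i) over a finite i range."""
    seen = set()
    for i in i_values:
        j = f(i)
        if j in seen:
            # Two different original iterations land on the same j.
            return "many-to-one"
        seen.add(j)
    return "one-to-one"


linear = classify_mapping(lambda i: 2 * i + 1, range(10))
halved = classify_mapping(lambda i: i // 2, range(10))
```

For the first function, absence of a dependence in the j index space would carry over directly to the i index space. For the second, the extra check described in the abstract is needed, because distinct iterations such as `i = 0` and `i = 1` share `j = 0`.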
Method For Convergence Analysis Based On Thread Variance Analysis
Vinod GROVER - Mercer Island WA, US Yunsup LEE - Fremont CA, US Xiangyun KONG - Union City CA, US Gautam CHAKRABARTI - Sunnyvale CA, US Ronny M. KRASHINSKY - San Francisco CA, US
International Classification:
G06F 9/312
US Classification:
712214, 712E09033
Abstract:
Basic blocks within a thread program are characterized for convergence based on variance analysis of corresponding instructions. Each basic block is marked as divergent based on transitive control dependence on a block that is either divergent or comprising a variant branch condition. Convergent basic blocks that are defined by invariant instructions are advantageously identified as candidates for scalarization by a thread program compiler.
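The marking rule in this abstract amounts to a fixed-point propagation over control dependences. The sketch below assumes the control-dependence relation and the set of variant branches have already been computed. The data layout and all names are hypothetical.

```python
def divergent_blocks(control_deps, variant_branch_blocks):
    """control_deps maps each block to the blocks it is control dependent on."""
    divergent = set()
    changed = True
    while changed:  # iterate until no new block is marked
        changed = False
        for block, deps in control_deps.items():
            if block in divergent:
                continue
            # Divergent if it depends on a divergent block or on a block
            # whose branch condition varies across threads.
            if any(d in divergent or d in variant_branch_blocks for d in deps):
                divergent.add(block)
                changed = True
    return divergent


# Hypothetical program: "entry" ends in a variant branch; "then" ends in an
# invariant branch that guards "inner"; "join" runs for every thread.
control_deps = {
    "entry": set(),
    "then": {"entry"},
    "else": {"entry"},
    "inner": {"then"},
    "join": set(),
}
marked = divergent_blocks(control_deps, {"entry"})
convergent = set(control_deps) - marked
```

Here `inner` is marked even though its guarding branch is invariant, because the dependence on `entry` is transitive. The blocks left in `convergent` are the ones a compiler could consider for scalarization, provided their instructions are also invariant.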
Yonghong Song - South San Francisco CA, US Xiangyun Kong - Union City CA, US
Assignee:
Sun Microsystems, Inc. - Santa Clara CA
International Classification:
G06F 9/45
US Classification:
717160, 717159, 717161
Abstract:
We present a technique to perform dependence analysis on array subscripts that are more complex than linear forms of the enclosing loop indices. For such complex array subscripts, we decouple the original iteration space from the dependence test iteration space and link the two through index-association functions. The dependence analysis is performed in the dependence test iteration space to determine whether a dependence exists in the original iteration space. The dependence distance in the original iteration space is determined by the distance in the dependence test iteration space and the properties of the index-association functions. For certain non-linear expressions, we show how to transform them into an equivalent set of linear expressions, which can then be used in dependence tests with traditional techniques. We also show how our advanced dependence analysis technique can help parallelize some otherwise hard-to-parallelize loops.
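One familiar instance of turning a non-linear subscript into linear ones is integer division by a constant. The sketch below is an assumption about the kind of rewrite meant, not a reproduction of the patent. It substitutes `i = 2*j + k` with `k` in `{0, 1}`, so that the subscript `i // 2` becomes the linear subscript `j`. Both function names are hypothetical.

```python
def subscripts_original(n):
    """Subscripts touched by: for i in range(n): A[i // 2]."""
    return [i // 2 for i in range(n)]


def subscripts_linearized(n):
    """Same accesses with i = 2*j + k, so the subscript is simply j."""
    touched = []
    for j in range((n + 1) // 2):
        for k in range(2):
            i = 2 * j + k
            if i < n:            # guard the last, possibly partial, pair
                touched.append(j)
    return touched
```

Both functions produce the same sequence of subscripts for any `n`, but the second exposes a subscript that is linear in the new loop index, which a traditional dependence test can handle.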
System And Method For Inserting Synchronization Statements Into A Program File To Mitigate Race Conditions
- Santa Clara CA, US Xiangyun Kong - Santa Clara CA, US Jae-Woo Lee - West Lafayette IN, US Manjunath Kudlur - Santa Clara CA, US Jian-Zhong Wang - Santa Clara CA, US
Assignee:
Nvidia Corporation - Santa Clara CA
International Classification:
G06F 9/44
US Classification:
717110
Abstract:
A system and method are provided for inserting synchronization statements into a program file to mitigate race conditions. The method includes reading a program file and determining one or more convergent statements in the program file. The method also includes inserting one or more synchronization statements in the program file between the determined convergent statements. The method further includes removing one or more of the inserted synchronization statements and writing the modified program file. The method may include, after removing the inserted synchronization statements, identifying to a user any remaining inserted synchronization statements.