Bo Wu
Title
Cited by
Year
Complexity Analysis and Algorithm Design for Reorganizing Data to Minimize Non-Coalesced GPU Memory Accesses
B Wu, Z Zhao, E Zhang, Y Jiang, X Shen
ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2013
109* 2013
Can PCM Benefit GPU? Reconciling Hybrid Memory Design with GPU Massive Parallelism for Energy Efficiency
B Wang, B Wu, D Li, X Shen, W Yu, Y Jiao, J Vetter
The 22nd International Conference on Parallel Architectures and Compilation …, 2013
68* 2013
PORPLE: An Extensible Optimizer for Portable Data Placement on GPU
G Chen, B Wu, D Li, X Shen
The 47th Annual IEEE/ACM International Symposium on Microarchitecture, 2014
67 2014
Enabling and Exploiting Flexible Task Assignment on GPU through SM-Centric Program Transformations
B Wu, G Chen, D Li, X Shen, J Vetter
The 29th International Conference on Supercomputing, 2015
65 2015
Flep: Enabling flexible and efficient preemption on gpus
B Wu, X Liu, X Zhou, C Jiang
ACM SIGPLAN Notices 52 (4), 483-496, 2017
45 2017
ScaAnalyzer: A Tool to Identify Memory Scalability Bottlenecks in Parallel Programs
X Liu, B Wu
The International Conference for High Performance Computing, Networking …, 2015
44 2015
Finepar: Irregularity-aware fine-grained workload partitioning on integrated architectures
F Zhang, B Wu, J Zhai, B He, W Chen
2017 IEEE/ACM International Symposium on Code Generation and Optimization …, 2017
35 2017
Challenging the "embarrassingly sequential": parallelizing finite state machine-based computations through principled speculation
Z Zhao, B Wu, X Shen
ACM SIGARCH Computer Architecture News 42 (1), 543-558, 2014
34 2014
Graphie: Large-scale asynchronous graph traversals on just a GPU
W Han, D Mawhirter, B Wu, M Buland
2017 26th International Conference on Parallel Architectures and Compilation …, 2017
33 2017
Enhancing Data Locality for Dynamic Simulations through Asynchronous Data Transformations and Adaptive Control
B Wu, EZ Zhang, X Shen
The Twentieth International Conference on Parallel Architectures and …, 2011
21 2011
Grnn: Low-latency and scalable rnn inference on gpus
C Holmes, D Mawhirter, Y He, F Yan, B Wu
Proceedings of the Fourteenth EuroSys Conference 2019, 1-16, 2019
20 2019
Automine: harmonizing high-level abstraction and high performance for graph mining
D Mawhirter, B Wu
Proceedings of the 27th ACM Symposium on Operating Systems Principles, 509-523, 2019
18 2019
Exploiting inter-sequence correlations for program behavior prediction
B Wu, Z Zhao, X Shen, Y Jiang, Y Gao, R Silvera
ACM SIGPLAN Notices 47 (10), 851-866, 2012
15 2012
Simple profile rectifications go a long way
B Wu, M Zhou, X Shen, Y Gao, R Silvera, G Yiu
European Conference on Object-Oriented Programming, 654-678, 2013
14 2013
Optimizing data placement on GPU memory: A portable approach
G Chen, X Shen, B Wu, D Li
IEEE Transactions on Computers 66 (3), 473-487, 2016
13 2016
Co-run scheduling with power cap on integrated cpu-gpu systems
Q Zhu, B Wu, X Shen, L Shen, Z Wang
2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2017
12 2017
Graphphi: efficient parallel graph processing on emerging throughput-oriented architectures
Z Peng, A Powell, B Wu, T Bicer, B Ren
Proceedings of the 27th International Conference on Parallel Architectures …, 2018
11 2018
Enabling scalability-sensitive speculative parallelization for fsm computations
J Qiu, Z Zhao, B Wu, A Vishnu, SL Song
Proceedings of the International Conference on Supercomputing, 1-10, 2017
11 2017
Laius: Towards latency awareness and improved utilization of spatial multitasking accelerators in datacenters
W Zhang, W Cui, K Fu, Q Chen, DE Mawhirter, B Wu, C Li, M Guo
Proceedings of the ACM International Conference on Supercomputing, 58-68, 2019
10 2019
Speculative parallelization needs rigor: probabilistic analysis for optimal speculation of finite-state machine applications
Z Zhao, B Wu, X Shen
Proceedings of the 21st international conference on Parallel architectures …, 2012
10 2012