... works efficiently in both single-threaded and multithreaded cases.

1 Introduction

With SMP machines being commonly available and multicore chips becoming the norm, the mixing of the message-passing programming ...

J. Liu, J. Wu, S. P. Kini, P. Wyckoff, and D. K. Panda. High performance RDMA-based MPI implementation over InfiniBand. In ICS ’03: Proceedings of the 17th annual international conference on Supercomputing, pages ...

... software and hardware vendors, including practical tips and tricks on how to use them for performance tuning.

When designing, developing, or using a performance tool, one has to decide on which instrumentation...