. . "RIV/00216305:26230/14:PU111935!RIV15-MSM-26230___" . "RIV/00216305:26230/14:PU111935" . "18518" . . "GPU-accelerated Evolutionary Design of the Complete Exchange Communication on Wormhole Networks" . "2014-07-12+02:00"^^ . . "http://dl.acm.org/citation.cfm?id=2576768.2598315" . . "2"^^ . . "978-1-4503-2662-9" . . "The communication overhead is one of the main challenges in the exascale era, where millions of compute cores are expected to collaborate on solving complex jobs. However, many algorithms will not scale since they require complex global communication and synchronisation. In order to perform the communication as fast as possible, contentions, blocking and deadlock must be avoided. Recently, we have developed an evolutionary tool producing fast and safe communication schedules reaching the lower bound of the theoretical time complexity. Unfortunately, the execution time associated with the evolution process raises up to tens of hours, even when being run on a multi-core processor. In this paper, we propose a revised implementation accelerated by a single Graphic Processing Unit (GPU) delivering speed-up of 5 compared to a quad-core CPU. Subsequently, we introduce an extended version employing up to 8 GPUs in a shared memory environment offering a speed-up of almost 30. This significantly" . "GPU-accelerated Evolutionary Design of the Complete Exchange Communication on Wormhole Networks"@en . "10.1145/2576768.2598315" . "GPU-accelerated Evolutionary Design of the Complete Exchange Communication on Wormhole Networks" . "Tyrala, Radek" . "2"^^ . "The communication overhead is one of the main challenges in the exascale era, where millions of compute cores are expected to collaborate on solving complex jobs. However, many algorithms will not scale since they require complex global communication and synchronisation. In order to perform the communication as fast as possible, contentions, blocking and deadlock must be avoided. Recently, we have developed an evolutionary tool producing fast and safe communication schedules reaching the lower bound of the theoretical time complexity. Unfortunately, the execution time associated with the evolution process raises up to tens of hours, even when being run on a multi-core processor. In this paper, we propose a revised implementation accelerated by a single Graphic Processing Unit (GPU) delivering speed-up of 5 compared to a quad-core CPU. Subsequently, we introduce an extended version employing up to 8 GPUs in a shared memory environment offering a speed-up of almost 30. This significantly"@en . . . . "Sheraton Wall Centre Vancouver" . . "[E8E0B2C670EC]" . "Complete exchange communication, Collective communications, communication scheduling, evolutionary design, GPU-based acceleration, multi-GPU systems."@en . "Association for Computing Machinery" . . "GECCO '14 Proceedings of the 2014 conference on Genetic and evolutionary computation" . . "New York, NY" . . . . "8"^^ . "GPU-accelerated Evolutionary Design of the Complete Exchange Communication on Wormhole Networks"@en . . . "26230" . "Jaro\u0161, Ji\u0159\u00ED" . "S" . .