


First, I want a paradigm that is architecture agnostic. As a research scientist I don't have the time or funding to independently port my codes to AMD, Nvidia, and Intel architectures using a different paradigm for each.

Second, I want a paradigm that natively supports Fortran, instead of begrudgingly supporting Fortran as an afterthought. This paradigm should also support common Fortran features. For example, the Fortran sum command should map to a simple reduction, but I know its usage in OpenACC is problematic. Another example is array reductions, which thankfully are coming in OpenACC 2.7. The US government is spending big bucks supporting the development of GPU accelerators and their programming paradigms. Many government funded research codes are written in Fortran. Yet I constantly hear developers, both government funded and commercial, decry that they have to support Fortran. This is a common complaint, for example, from ASCR funded scientists, especially those running the exascale project.

Third, I need access to efficient numerical math libraries that work across multiple GPUs. I know there are libraries like cuBLAS and cuSOLVER, but my experience is that it can be hit or miss with the functions in these libraries: some work well in general, others work well when restricted to a single GPU, and others outright don't work.

My preference would be to use directive based approaches. Where they work, they allow for decent utilization on accelerated systems with reduced development effort.
