New Release of the AMP Algorithms Library
If you are fond of high-performance algorithms, you will be pleased to find out that our friend Ade Miller has just issued a new iteration of the AMP Algorithms Library. As usual, Ade's work is top-notch, and it brings notable improvements across the board; in his own words:
Finally, there is a new release of the C++ AMP Algorithms Library! It has taken a while, largely due to other things, like CppCon taking up my time. This release contains the following:
New C++ AMP features:
AMP and STL algorithms no longer depend on DirectX scan implementation.
New implementation of amp_algorithms::scan that does not have a direct dependency on the ID3DX11Scan and ID3DX11SegmentedScan interfaces.
The amp_stl_algorithms::copy_if and remove_if algorithms use the new scan implementation now, for improved performance.
Implementation of radix sort amp_algorithms::radix_sort.
New utility functions: log2, is_power_of_two, count_bits, padded_read, padded_write, pack_byte and unpack_byte.
New namespace added for DirectX dependent features, amp_algorithms::direct3d. All DirectX code now in a separate header file amp_algorithms_direct3d.h.
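To illustrate why copy_if and remove_if are tied to the scan implementation, here is a minimal sequential sketch in standard C++ of stream compaction built on an exclusive prefix sum. It is not taken from the library: the names exclusive_scan_ref and copy_if_via_scan are hypothetical, and a GPU version would run each of the three steps in parallel rather than in plain loops.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sequential exclusive prefix sum: out[i] = in[0] + ... + in[i-1], out[0] = 0.
// (Hypothetical reference helper, not the library's amp_algorithms::scan.)
std::vector<int> exclusive_scan_ref(const std::vector<int>& in) {
    std::vector<int> out(in.size());
    int running = 0;
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = running;
        running += in[i];
    }
    return out;
}

// copy_if expressed as flag -> scan -> scatter (hypothetical illustration).
template <typename Pred>
std::vector<int> copy_if_via_scan(const std::vector<int>& in, Pred pred) {
    const std::size_t n = in.size();

    // Step 1: mark each element that satisfies the predicate.
    std::vector<int> flags(n);
    for (std::size_t i = 0; i < n; ++i) {
        flags[i] = pred(in[i]) ? 1 : 0;
    }

    // Step 2: the exclusive scan of the flags gives each kept element
    // its position in the output.
    const std::vector<int> offsets = exclusive_scan_ref(flags);
    const std::size_t kept =
        (n == 0) ? 0 : static_cast<std::size_t>(offsets[n - 1] + flags[n - 1]);

    // Step 3: scatter the kept elements to their computed positions.
    std::vector<int> out(kept);
    for (std::size_t i = 0; i < n; ++i) {
        if (flags[i] != 0) {
            const std::size_t dst = static_cast<std::size_t>(offsets[i]);
            assert(dst < kept);
            out[dst] = in[i];
        }
    }
    return out;
}
```

Because every output position is known after the scan, the scatter step has no dependencies between elements, which is what makes this formulation attractive on a GPU.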
New C++ AMP STL features:
inner_product
minmax
pair<T1, T2>
rotate_copy
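As a reminder of what these names do in the C++ standard library, the following sketch uses the std:: versions. It assumes the amp_stl_algorithms counterparts mirror the standard semantics, which the release notes do not spell out; the wrapper names dot, ordered, and rotated are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <numeric>
#include <utility>
#include <vector>

// Dot product via std::inner_product; both ranges must have the same length.
int dot(const std::vector<int>& a, const std::vector<int>& b) {
    assert(a.size() == b.size());
    return std::inner_product(a.begin(), a.end(), b.begin(), 0);
}

// std::minmax returns a pair holding the smaller value first, the larger second.
std::pair<int, int> ordered(int a, int b) {
    return std::minmax(a, b);
}

// Copies `in` rotated left so that in[middle] becomes the first element.
std::vector<int> rotated(const std::vector<int>& in, std::size_t middle) {
    assert(middle <= in.size());
    std::vector<int> out;
    out.reserve(in.size());
    std::rotate_copy(in.begin(),
                     in.begin() + static_cast<std::ptrdiff_t>(middle),
                     in.end(),
                     std::back_inserter(out));
    return out;
}
```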
New SAXPY example.
Reorganized unit tests, consistent names and test categories.
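For readers unfamiliar with SAXPY ("single-precision a times x plus y"), the operation mentioned in the list above can be sketched as a sequential reference in standard C++. This is not the library's example; in C++ AMP the loop body would typically become the lambda passed to concurrency::parallel_for_each over array_view objects.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sequential reference for SAXPY: y[i] = a * x[i] + y[i] for every i.
// Each element is independent, so the loop maps directly onto a parallel kernel.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```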
As usual, you can download the latest iteration from https://ampalgorithms.codeplex.com/ and enjoy the benefits of heterogeneous parallelism.
Comments
Anonymous
November 24, 2014
You really should create something comparable to Intel IPP and MKL to really drive this to the market.
Anonymous
February 11, 2015
I have a simple question: is there any plan to rebuild the C++ AMP runtime using DirectX 12?
Anonymous
July 08, 2015
Do you really have any plan to develop this project? I've optimized matrix multiplication, and my version gives the following results in managed code:
- The sequential implementation runs better than the parallel one in Random Traveler!
- My parallel code runs 2-5% faster than the C++ AMP WARP version. Does C++ AMP WARP really use SSE?
Anonymous
July 23, 2015
Concerning Random Traveler again... I have tested C++ compiler vectorization and the new C# compiler in Visual Studio 2015. The vectorized tiled version of matrix multiplication (C++ .dll, the best version without any transposition) on a 2900*2900 matrix loads only one core and runs at approximately 5 GFlops on my computer; a C# Parallel loop with a partitioner and unsafe pointers on Visual Studio 2015 reaches approximately 4 GFlops. The WARP version reaches 1.1 GFlops. The first test of naive C# ran at 0.04 GFlops, and your version of the parallel loop, where you decided to use pointers, was about 0.35 GFlops. A GPU is needed only with bigger matrices. It seems you are giving unfair information.
Anonymous
July 24, 2015
Hi, VS 2015 and Win 10 are out. Do you have any plans for DirectX 12 or is C++ AMP dead? Mike
Anonymous
August 12, 2015
Is C++ AMP dead?
Anonymous
September 08, 2015
Any news on C++ AMP with Windows 10 / WDDM 2.0?
Anonymous
November 18, 2015
Guys, what's the state of C++ AMP?
Anonymous
January 02, 2016
Unfortunately, after 2 years without any update or news, we can now safely say that this project is dead. RIP.