What features do you want in C++ AMP V3?

A little over a year ago, we released the first version of C++ AMP as part of Visual Studio 2012. We are on track to deliver the next version of C++ AMP in Visual Studio 2013. Hopefully, by now, you have had a chance to learn about what is new for C++ AMP in Visual Studio 2013. We are delighted to see how C++ AMP has been received by the community. C++ AMP is being used in demos such as the BUILD 2013 keynote demo, and by various customers including Aviary and Kinect Fusion. Additionally, it's encouraging to see partners like Intel building a proof-of-concept implementation of the C++ AMP open specification.

With the second version nearly ready for release, the team is actively planning new features and improvements for the next version of C++ AMP. We want to reach out to the community for suggestions and feature requests, which play an integral role in our planning. If you have a feature request or suggestion, we would love to hear about it. Please let us know either in the comments on this blog or on the MSDN forum.

Comments

  • Anonymous
    September 18, 2013
    Support for handling int8, uint8, int16, uint16, int32, and uint32, vectorized on the CPU and/or GPU, e.g. for uint8 image processing.

  • Anonymous
    September 18, 2013
    Unified memory textures.

  • Anonymous
    September 19, 2013
    Just make it like CUDA in terms of classes/structs support

  • Anonymous
    September 20, 2013
    Jim mentions there that future versions of the autovectorizing C++ compiler would target not just many conventional cores and the vector units across those cores but also GPGPU cores. channel9.msdn.com/.../Compilerpp- How does AMP fit into this picture? If we can have a general mapping of regular C++ code onto GPGPU cores, does that mean we won't have to decorate C++ functions with the 'amp' highlight in the future?

  • Anonymous
    September 21, 2013
    As C++ AMP is an open spec and is already a great abstraction, why not propose that the C++ standards committee set up a working group to roll it into C++17 proper?

  • Anonymous
    September 21, 2013
    Just allow the same feature set as CUDA and OpenCL 2.0: nested parallelism (i.e. kernels calling kernels), the pipes concept in OpenCL 2.0 for passing data between kernels, and pointers, which have shipped since CUDA 1.0. Also recursion, as supported in CUDA since Fermi, and function pointers. New GPU instructions such as the intra-warp shuffle instructions in Kepler. 64-bit atomics and float atomics. Named barriers, i.e. threads in a thread block synchronize only with threads that have the same barrier ID (see the CudaDMA project for a use case). Graphics features such as support for the new tiled sparse textures in DX 11.2, depth textures, MSAA textures, mipmaps, cubemap textures, etc. I also hope we get bindless textures in DX12 and that AMP exposes them too. Also, last year you announced a spec with a roadmap; I hope next year we get what the spec calls 2.0.

  • Anonymous
    September 21, 2013
    Just more synchronization, please :)

  • Anonymous
    September 24, 2013
    Some manner - other than launching a new kernel - of waiting for all device threads to complete before continuing.

  • Anonymous
    September 24, 2013
    Complex numbers? I haven't really tried the FFT library yet, though.

  • Anonymous
    October 01, 2013
    A compiler option to generate warnings on all usage of doubles in AMP lambdas would be very useful. For example, manually searching for double literals when porting code is a pain, and if you miss even a single one, colleagues developing on Win7 + WARP will get a crash. It would also help avoid inadvertently running (parts of) some lambdas at <5% of possible speed on many cards. The ability to query whether ECC is supported on an accelerator (and enabled in the driver). Please distinguish full ECC support (memory, registers, etc.) from the partial support available in some recent cards. A "C++ AMP standard library" of highly optimized utility functions, available directly as part of VS: common reductions, prefix sum, sort, etc. I know parts of this are available on the web; I just think it would be reasonable to have it as part of VS. Also, with respect to quality/regulatory issues, getting it as part of VS is much nicer than "something-I-found-on-the-web". 64-bit integers and atomic operations (already mentioned). (My usage: there are no atomic operations for floats, and even if there were, they would not be deterministic. Sometimes you can use fixed-point math and int atomics instead, but 32-bit ints are often too limited.) 8/16-bit datatypes (already mentioned); the lack of such datatypes is actually limiting real use cases involving 3D data due to out-of-memory issues. Some kind of automatic lambda capture clause (everything by value except arrays). I have noticed that lambdas using WARP are sometimes slower than (very, very similar) lambdas executed by PPL. Maybe the JIT compiler can recognize such cases and only use WARP when beneficial.

    I am already very happy with C++ AMP. Looking forward to v3...
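
    The fixed-point workaround mentioned in this comment can be sketched roughly as follows. This is plain standard C++ rather than C++ AMP code: `std::atomic` stands in for a device atomic, and the names `kScale`, `to_fixed`, `from_fixed`, and `fixed_sum` are hypothetical, chosen only for illustration. It assumes a 64-bit accumulator with 16 fractional bits, which is the kind of headroom the commenter is asking for. Each value is rounded to an integer once, and integer addition gives the same total in any order, which is why the result is deterministic where a floating-point accumulation need not be.

```cpp
#include <atomic>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scale factor: 16 fractional bits (an assumption for this sketch).
constexpr std::int64_t kScale = std::int64_t{1} << 16;

// Convert a floating-point value to fixed point, rounding to nearest.
inline std::int64_t to_fixed(double x) {
    return static_cast<std::int64_t>(std::llround(x * static_cast<double>(kScale)));
}

// Convert a fixed-point value back to floating point.
inline double from_fixed(std::int64_t q) {
    return static_cast<double>(q) / static_cast<double>(kScale);
}

// Accumulate values through an integer atomic. Because each element is
// rounded once and integer addition is associative, the total does not
// depend on the order in which the additions happen.
inline std::int64_t fixed_sum(const std::vector<double>& values) {
    std::atomic<std::int64_t> total{0};
    for (std::size_t i = 0; i < values.size(); ++i) {
        total.fetch_add(to_fixed(values[i]), std::memory_order_relaxed);
    }
    return total.load();
}
```

    Calling `fixed_sum` on a vector and on the same vector reversed yields the same integer, and `from_fixed` recovers an approximation of the real-valued sum. The cost is a fixed range and resolution, which is why a 32-bit accumulator is often too limited for this approach.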

  • Anonymous
    October 02, 2013
    2D and 3D (and higher!) texture and array/array_view interpolation à la CUDA tex2D/tex3D. Cubic/Hermite spline interpolation in 3D or higher, up to 6D, would be amazing.

  • Anonymous
    October 07, 2013
    Thanks, everyone, for your feedback. Keep it coming.

  • Anonymous
    November 12, 2013
    Remove the ban on zero-length Concurrency::array. It seems arbitrary and impedes the readability of code where such arrays are needed (e.g. varying-topology neural networks).

  • Anonymous
    January 24, 2014
    APIs to monitor device performance, throughput, and state? How can we deploy C++ AMP applications as compute 'services' if we can't monitor and measure their performance remotely or determine when they hang?

  • Anonymous
    February 14, 2014
    It would be nice if I could write pixel/vertex/geometry/hull/domain shaders with C++ AMP. I really love that the C++ AMP runtime takes care of copying data between the CPU and GPU for me. Right now C++ AMP is limited to compute shaders. It would be nice if I could do the same for the other shaders mentioned above.

  • Anonymous
    June 08, 2014
    It'd be really cool to see some HLSL intrinsics like normalize(), mul(), dot() and so on.

  • Anonymous
    July 11, 2014
    I would really like native uint support. I would still love to see the API contain Perlin noise in 2 and 3 dimensions (it already has 1 dimension). I would still love to see native support for random numbers.

  • Anonymous
    July 18, 2014
    Sorry, should have read "native uchar"

  • Anonymous
    July 12, 2018
    Agreed, it would help a lot to see more HLSL intrinsics like normalize, mul, dot, cross, mag, etc.