_mm_hadds_epi16

Microsoft Specific

Emits the Supplemental Streaming SIMD Extensions 3 (SSSE3) instruction phaddsw. This instruction horizontally adds adjacent pairs of signed 16-bit integers in two 128-bit parameters and saturates each sum to the signed 16-bit range.

__m128i _mm_hadds_epi16( 
   __m128i a,
   __m128i b
);

Parameters

  • [in] a
    A 128-bit parameter that contains eight 16-bit signed integers.

  • [in] b
    A 128-bit parameter that contains eight 16-bit signed integers.

Return value

A 128-bit value that contains eight 16-bit signed integers. Each integer is the saturated sum of one adjacent pair of elements from the input parameters: the lower four results come from a, and the upper four come from b.

The result can be expressed with the following equations:

r0 := SATURATE_16(a0 + a1)
r1 := SATURATE_16(a2 + a3)
r2 := SATURATE_16(a4 + a5)
r3 := SATURATE_16(a6 + a7)
r4 := SATURATE_16(b0 + b1)
r5 := SATURATE_16(b2 + b3)
r6 := SATURATE_16(b4 + b5)
r7 := SATURATE_16(b6 + b7)

Requirements

Intrinsic          Architecture
_mm_hadds_epi16    x86, x64

Header file <tmmintrin.h>

Remarks

r0-r7, a0-a7, and b0-b7 are the sequentially ordered 16-bit components of return value r and parameters a and b. r0, a0, and b0 are the least significant 16 bits.

SATURATE_16(x) is ((x > 32767) ? 32767 : ((x < -32768) ? -32768 : x))
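The equations and the SATURATE_16 definition above can be modeled in plain C, which may help when checking results on a machine without SSSE3. This is a minimal scalar sketch, not the compiler's implementation; the names saturate_16 and hadds_epi16_ref are hypothetical helpers introduced here for illustration, and index 0 of each array is assumed to hold the least significant 16 bits.

```c
#include <stdint.h>

/* Hypothetical helper: clamp a 32-bit intermediate sum to the
   signed 16-bit range, mirroring SATURATE_16(x). */
static int16_t saturate_16(int32_t x)
{
    if (x > INT16_MAX) return INT16_MAX;   /*  32767 */
    if (x < INT16_MIN) return INT16_MIN;   /* -32768 */
    return (int16_t)x;
}

/* Hypothetical scalar model of the result equations:
   r[0..3] are the pair sums of a, r[4..7] are the pair sums of b. */
static void hadds_epi16_ref(const int16_t a[8], const int16_t b[8],
                            int16_t r[8])
{
    for (int i = 0; i < 4; ++i) {
        /* Widen to 32 bits before adding so the sum cannot overflow. */
        r[i]     = saturate_16((int32_t)a[2 * i] + a[2 * i + 1]);
        r[i + 4] = saturate_16((int32_t)b[2 * i] + b[2 * i + 1]);
    }
}
```

The addition is done in 32 bits because two 16-bit operands can sum to a value outside the 16-bit range; clamping afterwards is what distinguishes this saturating form from a wrapping horizontal add.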

Before using this intrinsic, ensure that the underlying processor supports SSSE3. CPUID reports this in leaf 1, ECX bit 9.
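One way to perform that check at run time is to query CPUID leaf 1 and test ECX bit 9, the SSSE3 feature flag. The following is a sketch under stated assumptions: cpu_has_ssse3 is a hypothetical helper name, the Microsoft path uses the __cpuid intrinsic from <intrin.h>, and the GCC/Clang path uses __get_cpuid from <cpuid.h>. On any other target the function conservatively reports no support.

```c
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>   /* __cpuid */
#elif defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>    /* __get_cpuid (GCC/Clang) */
#endif

/* Hypothetical helper: returns 1 if CPUID leaf 1 reports SSSE3
   (ECX bit 9), otherwise 0. */
static int cpu_has_ssse3(void)
{
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
    int info[4];                 /* EAX, EBX, ECX, EDX */
    __cpuid(info, 1);
    return (info[2] >> 9) & 1;
#elif defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;                /* leaf 1 not available */
    return (int)((ecx >> 9) & 1);
#else
    return 0;                    /* not an x86 target */
#endif
}
```

A caller would typically test cpu_has_ssse3() once at startup and fall back to a scalar code path when it returns 0.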

Example

#include <stdio.h>
#include <tmmintrin.h>

int main ()
{
    __m128i a, b;

    a.m128i_i16[0] = -1;
    a.m128i_i16[1] = 2;
    a.m128i_i16[2] = 0;
    a.m128i_i16[3] = 128;
    a.m128i_i16[4] = -32768;
    a.m128i_i16[5] = -10000;
    a.m128i_i16[6] = 32000;
    a.m128i_i16[7] = 10000;
    b.m128i_i16[0] = 128;
    b.m128i_i16[1] = -64;
    b.m128i_i16[2] = 52;
    b.m128i_i16[3] = -200;
    b.m128i_i16[4] = 0;
    b.m128i_i16[5] = 0;
    b.m128i_i16[6] = 4;
    b.m128i_i16[7] = 2;

    __m128i res = _mm_hadds_epi16(a, b);

    printf_s("Original a:\t%6d\t%6d\t%6d\t%6d\n\t\t%6d\t%6d\t%6d\t%6d\n",
                a.m128i_i16[0], a.m128i_i16[1], a.m128i_i16[2], a.m128i_i16[3],
                a.m128i_i16[4], a.m128i_i16[5], a.m128i_i16[6], a.m128i_i16[7]);
    printf_s("Original b:\t%6d\t%6d\t%6d\t%6d\n\t\t%6d\t%6d\t%6d\t%6d\n",
                b.m128i_i16[0], b.m128i_i16[1], b.m128i_i16[2], b.m128i_i16[3],
                b.m128i_i16[4], b.m128i_i16[5], b.m128i_i16[6], b.m128i_i16[7]);
    printf_s("Result res:\t%6d\t%6d\t%6d\t%6d\n\t\t%6d\t%6d\t%6d\t%6d\n",
                res.m128i_i16[0], res.m128i_i16[1], res.m128i_i16[2], res.m128i_i16[3],
                res.m128i_i16[4], res.m128i_i16[5], res.m128i_i16[6], res.m128i_i16[7]);

    return 0;
}

Output

Original a:         -1       2       0     128
                -32768  -10000   32000   10000
Original b:        128     -64      52    -200
                     0       0       4       2
Result res:          1     128  -32768   32767
                    64    -148       0       6

See Also

Concepts

Compiler Intrinsics