_mm_nmacc_sd
Visual Studio 2010 SP1 is required.
Microsoft Specific
Generates the FMA4 XMM instruction vfnmaddsd to perform a single-round double-precision floating-point negative multiply-add of the low-order floating-point values of its sources.
__m128d _mm_nmacc_sd (
   __m128d src1,
   __m128d src2,
   __m128d src3
);
Parameters
[in] src1
A 128-bit parameter that contains a 64-bit floating-point value in the low quadword.
[in] src2
A 128-bit parameter that contains a 64-bit floating-point value in the low quadword.
[in] src3
A 128-bit parameter that contains a 64-bit floating-point value in the low quadword.
Return value
A 128-bit result r that contains two 64-bit floating-point values.
r[0] := -(src1[0] * src2[0]) + src3[0];
r[1] := 0.;
Requirements
Intrinsic | Architecture |
---|---|
_mm_nmacc_sd | FMA4 |
Header file <intrin.h>
Remarks
The low-order double-precision floating-point value in src1 is multiplied by the low-order value in src2. The product is negated and added to the low-order value in src3, and the sum is stored in the low-order element of the destination. The high-order values in src1, src2, and src3 are ignored, and the high-order double-precision floating-point value of the result is set to 0. The multiply-negate-add is performed with a single rounding at the end, as if the intermediate results were computed to infinite precision.
The vfnmaddsd instruction is part of the FMA4 family of instructions. Before you use this intrinsic, you must ensure that the processor supports this instruction. To determine hardware support for this instruction, call the __cpuid intrinsic with InfoType = 0x80000001 and check bit 16 of CPUInfo[2] (ECX). This bit is 1 when the instruction is supported, and 0 otherwise.
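The support check described above might be sketched as follows. The function name fma4_supported is hypothetical. The query of leaf 0x80000000 (to confirm that leaf 0x80000001 exists) and the non-Microsoft branch that uses <cpuid.h> are additions for illustration and are not taken from this topic. The sketch tests only the CPUID bit; it does not verify operating system support for the extended register state.

```c
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>
#elif defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>
#endif

/* Hypothetical helper: returns 1 if CPUID reports FMA4 support, 0 otherwise. */
static int fma4_supported(void)
{
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
    int CPUInfo[4];
    /* Leaf 0x80000000 returns the highest extended leaf in EAX. */
    __cpuid(CPUInfo, 0x80000000);
    if ((unsigned int)CPUInfo[0] < 0x80000001u)
        return 0;
    __cpuid(CPUInfo, 0x80000001);
    /* CPUInfo[2] is ECX; bit 16 is the FMA4 feature flag. */
    return (CPUInfo[2] >> 16) & 1;
#elif defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;
    /* __get_cpuid returns 0 when the requested leaf is not available. */
    if (!__get_cpuid(0x80000001u, &eax, &ebx, &ecx, &edx))
        return 0;
    return (int)((ecx >> 16) & 1u);
#else
    /* Not an x86 target: the instruction cannot be present. */
    return 0;
#endif
}
```

A caller would typically test fma4_supported() once at startup and fall back to a non-FMA4 code path when it returns 0.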
Example
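#include <stdio.h>
#include <intrin.h>

int main()
{
    __m128d a, b, c, d;
    int i;

    // Fill both elements of each source: a = {0, 1}, b = {2, 2}, c = {3, 3}.
    for (i = 0; i < 2; i++) {
        a.m128d_f64[i] = i;
        b.m128d_f64[i] = 2.;
        c.m128d_f64[i] = 3.;
    }

    // Only the low elements are used: d[0] = -(0 * 2) + 3; d[1] is set to 0.
    d = _mm_nmacc_sd(a, b, c);

    for (i = 0; i < 2; i++) printf_s(" %.3lf", d.m128d_f64[i]);
    printf_s("\n");
}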
3.000 0.000