OpenMP Clauses

Article
08/04/2024

Provides links to the clauses used in the OpenMP API.

Visual C++ supports the following OpenMP clauses.

For general attributes:

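Clause	Description
if	Specifies whether a parallel region should be executed in parallel or in serial.
num_threads	Sets the number of threads in a thread team.
ordered	Required on a parallel for statement if an ordered directive is to be used in the loop.
schedule	Specifies how loop iterations are divided among threads. Applies to the for directive.
nowait	Overrides the barrier implicit in a directive.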

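For data-sharing attributes:

Clause	Description
private	Specifies that each thread should have its own instance of a variable.
firstprivate	Specifies that each thread should have its own instance of a variable, initialized with the value the variable has immediately before the parallel construct.
lastprivate	Specifies that the enclosing context's version of the variable is set equal to the private version of whichever thread executes the final iteration (for-loop construct) or last section (#pragma sections).
shared	Specifies that one or more variables should be shared among all threads.
default	Specifies the behavior of unscoped variables in a parallel region.
reduction	Specifies that one or more variables that are private to each thread are the subject of a reduction operation at the end of the parallel region.
copyin	Allows threads to access the main thread's value for a threadprivate variable.
copyprivate	Specifies that the values one thread assigns to one or more private variables in a single block should be copied to the corresponding variables of all other threads.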

copyin

Allows threads to access the main thread's value for a threadprivate variable.

copyin(var)

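Parameters

var
The threadprivate variable that will be initialized with the value of the variable in the main thread, as it exists before the parallel construct.

Remarks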

copyin applies to the parallel directive, including the combined parallel for and parallel sections forms.

For more information, see 2.7.2.7 copyin.

Example

For an example of using copyin, see threadprivate.

copyprivate

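Specifies that the values one thread assigns to one or more private variables in a single block should be copied to the corresponding variables of all other threads.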

copyprivate(var)

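Parameters

var
One or more variables whose values are to be shared with the other threads. If more than one variable is specified, separate variable names with a comma.

Remarks

copyprivate applies to the single directive.

For more information, see 2.7.2.8 copyprivate.

Example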

// omp_copyprivate.cpp
// compile with: /openmp
#include <stdio.h>
#include <omp.h>

float x, y, fGlobal = 1.0;
#pragma omp threadprivate(x, y)

float get_float() {
   fGlobal += 0.001;
   return fGlobal;
}

void use_float(float f, int t) {
   printf_s("Value = %f, thread = %d\n", f, t);
}

void CopyPrivate(float a, float b) {
   #pragma omp single copyprivate(a, b, x, y)
   {
      a = get_float();
      b = get_float();
      x = get_float();
      y = get_float();
   }

   use_float(a, omp_get_thread_num());
   use_float(b, omp_get_thread_num());
   use_float(x, omp_get_thread_num());
   use_float(y, omp_get_thread_num());
}

int main() {
   float a = 9.99, b = 123.456;

   printf_s("call CopyPrivate from a single thread\n");
   CopyPrivate(9.99, 123.456);

   printf_s("call CopyPrivate from a parallel region\n");
   #pragma omp parallel
   {
      CopyPrivate(a, b);
   }
}

call CopyPrivate from a single thread
Value = 1.001000, thread = 0
Value = 1.002000, thread = 0
Value = 1.003000, thread = 0
Value = 1.004000, thread = 0
call CopyPrivate from a parallel region
Value = 1.005000, thread = 0
Value = 1.005000, thread = 1
Value = 1.006000, thread = 0
Value = 1.006000, thread = 1
Value = 1.007000, thread = 0
Value = 1.007000, thread = 1
Value = 1.008000, thread = 0
Value = 1.008000, thread = 1

default

Specifies the behavior of unscoped variables in a parallel region.

default(shared | none)

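Remarks

shared, which is in effect if the default clause is unspecified, means that any variable in a parallel region is treated as if it were specified with the shared clause. none means that any variable used in a parallel region that isn't scoped with the private, shared, reduction, firstprivate, or lastprivate clause causes a compiler error.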

default applies to the parallel directive, including the combined parallel for and parallel sections forms.

For more information, see 2.7.2.5 default.

Example

For an example of using default, see private.

firstprivate

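Specifies that each thread should have its own instance of a variable, and that the instance should be initialized with the value the variable has immediately before the parallel construct.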

firstprivate(var)

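Parameters

var
The variable that gets an instance in each thread. Each instance is initialized with the value the variable has immediately before the parallel construct. If more than one variable is specified, separate variable names with a comma.

Remarks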

firstprivate applies to the following directives:

for
parallel
sections
single

For more information, see 2.7.2.2 firstprivate.

Example

For an example of using firstprivate, see the example in private.

if(OpenMP)

Specifies whether a parallel region should be executed in parallel or in serial.

if(expression)

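Parameters

expression
An integral expression that, if it evaluates to true (nonzero), causes the code in the parallel region to execute in parallel. If the expression evaluates to false (zero), the parallel region is executed in serial by a single thread.

Remarks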

if applies to the parallel directive, including the combined parallel for and parallel sections forms.

For more information, see 2.3 parallel construct.

Example

// omp_if.cpp
// compile with: /openmp
#include <stdio.h>
#include <omp.h>

void test(int val)
{
    #pragma omp parallel if (val)
    if (omp_in_parallel())
    {
        #pragma omp single
        printf_s("val = %d, parallelized with %d threads\n",
                 val, omp_get_num_threads());
    }
    else
    {
        printf_s("val = %d, serialized\n", val);
    }
}

int main( )
{
    omp_set_num_threads(2);
    test(0);
    test(2);
}

val = 0, serialized
val = 2, parallelized with 2 threads

lastprivate

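Specifies that the enclosing context's version of the variable is set equal to the private version of whichever thread executes the final iteration (for-loop construct) or last section (#pragma sections).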

lastprivate(var)

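Parameters

var
The variable that's set equal to the private version of whichever thread executes the final iteration (for-loop construct) or last section (#pragma sections).

Remarks

lastprivate applies to the following directives: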

for
sections

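For more information, see 2.7.2.3 lastprivate.

Example

For an example of using the lastprivate clause, see the example for private, which applies lastprivate to its for directive.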

nowait

Overrides the barrier implicit in a directive.

nowait

Remarks

nowait applies to the following directives:

for
sections
single

For more information, see 2.4.1 for construct, 2.4.2 sections construct, and 2.4.3 single construct.

Example

// omp_nowait.cpp
// compile with: /openmp /c
#include <stdio.h>

#define SIZE 5

void test(int *a, int *b, int *c, int size)
{
    int i;
    #pragma omp parallel
    {
        #pragma omp for nowait
        for (i = 0; i < size; i++)
            b[i] = a[i] * a[i];

        #pragma omp for nowait
        for (i = 0; i < size; i++)
            c[i] = a[i]/2;
    }
}

int main( )
{
    int a[SIZE], b[SIZE], c[SIZE];
    int i;

    for (i=0; i<SIZE; i++)
        a[i] = i;

    test(a,b,c, SIZE);

    for (i=0; i<SIZE; i++)
        printf_s("%d, %d, %d\n", a[i], b[i], c[i]);
}

0, 0, 0
1, 1, 0
2, 4, 1
3, 9, 1
4, 16, 2

num_threads

Sets the number of threads in a thread team.

num_threads(num)

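Parameters

num
The number of threads.

Remarks

The num_threads clause has the same functionality as the omp_set_num_threads function, but it applies only to the parallel region on which it appears and takes precedence over omp_set_num_threads for that region.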

num_threads applies to the parallel directive, including the combined parallel for and parallel sections forms.

For more information, see 2.3 parallel construct.

Example

For an example of using the num_threads clause, see parallel.

ordered

Required on a parallel for statement if an ordered directive is to be used in the loop.

ordered

Remarks

ordered applies to the for directive.

For more information, see 2.4.1 for construct.

Example

For an example of using the ordered clause, see the ordered directive.

private

Specifies that each thread should have its own instance of a variable.

private(var)

Parameters

var
The variable to have instances in each thread.

Remarks

private applies to the following directives:

for
parallel
sections
single

For more information, see 2.7.2.1 private.

Example

// openmp_private.c
// compile with: /openmp
#include <windows.h>
#include <assert.h>
#include <stdio.h>
#include <omp.h>

#define NUM_THREADS 4
#define SLEEP_THREAD 1
#define NUM_LOOPS 2

enum Types {
   ThreadPrivate,
   Private,
   FirstPrivate,
   LastPrivate,
   Shared,
   MAX_TYPES
};

int nSave[NUM_THREADS][MAX_TYPES][NUM_LOOPS] = {{0}};
int nThreadPrivate;

#pragma omp threadprivate(nThreadPrivate)
#pragma warning(disable:4700)

int main() {
   int nPrivate = NUM_THREADS;
   int nFirstPrivate = NUM_THREADS;
   int nLastPrivate = NUM_THREADS;
   int nShared = NUM_THREADS;
   int nRet = 0;
   int i;
   int j;
   int nLoop = 0;

   nThreadPrivate = NUM_THREADS;
   printf_s("These are the variables before entry "
           "into the parallel region.\n");
   printf_s("nThreadPrivate = %d\n", nThreadPrivate);
   printf_s("      nPrivate = %d\n", nPrivate);
   printf_s(" nFirstPrivate = %d\n", nFirstPrivate);
   printf_s("  nLastPrivate = %d\n", nLastPrivate);
   printf_s("       nShared = %d\n\n", nShared);
   omp_set_num_threads(NUM_THREADS);

   #pragma omp parallel copyin(nThreadPrivate) private(nPrivate) shared(nShared) firstprivate(nFirstPrivate)
   {
      #pragma omp for schedule(static) lastprivate(nLastPrivate)
      for (i = 0 ; i < NUM_THREADS ; ++i) {
         for (j = 0 ; j < NUM_LOOPS ; ++j) {
            int nThread = omp_get_thread_num();
            assert(nThread < NUM_THREADS);

            if (nThread == SLEEP_THREAD)
               Sleep(100);
            nSave[nThread][ThreadPrivate][j] = nThreadPrivate;
            nSave[nThread][Private][j] = nPrivate;
            nSave[nThread][Shared][j] = nShared;
            nSave[nThread][FirstPrivate][j] = nFirstPrivate;
            nSave[nThread][LastPrivate][j] = nLastPrivate;
            nThreadPrivate = nThread;
            nPrivate = nThread;
            nShared = nThread;
            nLastPrivate = nThread;
            --nFirstPrivate;
         }
      }
   }

   for (i = 0 ; i < NUM_LOOPS ; ++i) {
      for (j = 0 ; j < NUM_THREADS ; ++j) {
         printf_s("These are the variables at entry of "
                  "loop %d of thread %d.\n", i + 1, j);
         printf_s("nThreadPrivate = %d\n",
                  nSave[j][ThreadPrivate][i]);
         printf_s("      nPrivate = %d\n",
                  nSave[j][Private][i]);
         printf_s(" nFirstPrivate = %d\n",
                  nSave[j][FirstPrivate][i]);
         printf_s("  nLastPrivate = %d\n",
                  nSave[j][LastPrivate][i]);
         printf_s("       nShared = %d\n\n",
                  nSave[j][Shared][i]);
      }
   }

   printf_s("These are the variables after exit from "
            "the parallel region.\n");
   printf_s("nThreadPrivate = %d (The last value in the "
            "main thread)\n", nThreadPrivate);
   printf_s("      nPrivate = %d (The value prior to "
            "entering parallel region)\n", nPrivate);
   printf_s(" nFirstPrivate = %d (The value prior to "
            "entering parallel region)\n", nFirstPrivate);
   printf_s("  nLastPrivate = %d (The value from the "
            "last iteration of the loop)\n", nLastPrivate);
   printf_s("       nShared = %d (The value assigned, "
            "from the delayed thread, %d)\n\n",
            nShared, SLEEP_THREAD);
}

These are the variables before entry into the parallel region.
nThreadPrivate = 4
      nPrivate = 4
nFirstPrivate = 4
  nLastPrivate = 4
       nShared = 4

These are the variables at entry of loop 1 of thread 0.
nThreadPrivate = 4
      nPrivate = 1310720
nFirstPrivate = 4
  nLastPrivate = 1245104
       nShared = 3

These are the variables at entry of loop 1 of thread 1.
nThreadPrivate = 4
      nPrivate = 4488
nFirstPrivate = 4
  nLastPrivate = 19748
       nShared = 0

These are the variables at entry of loop 1 of thread 2.
nThreadPrivate = 4
      nPrivate = -132514848
nFirstPrivate = 4
  nLastPrivate = -513199792
       nShared = 4

These are the variables at entry of loop 1 of thread 3.
nThreadPrivate = 4
      nPrivate = 1206
nFirstPrivate = 4
  nLastPrivate = 1204
       nShared = 2

These are the variables at entry of loop 2 of thread 0.
nThreadPrivate = 0
      nPrivate = 0
nFirstPrivate = 3
  nLastPrivate = 0
       nShared = 0

These are the variables at entry of loop 2 of thread 1.
nThreadPrivate = 1
      nPrivate = 1
nFirstPrivate = 3
  nLastPrivate = 1
       nShared = 1

These are the variables at entry of loop 2 of thread 2.
nThreadPrivate = 2
      nPrivate = 2
nFirstPrivate = 3
  nLastPrivate = 2
       nShared = 2

These are the variables at entry of loop 2 of thread 3.
nThreadPrivate = 3
      nPrivate = 3
nFirstPrivate = 3
  nLastPrivate = 3
       nShared = 3

These are the variables after exit from the parallel region.
nThreadPrivate = 0 (The last value in the main thread)
      nPrivate = 4 (The value prior to entering parallel region)
nFirstPrivate = 4 (The value prior to entering parallel region)
  nLastPrivate = 3 (The value from the last iteration of the loop)
       nShared = 1 (The value assigned, from the delayed thread, 1)

reduction

Specifies that one or more variables that are private to each thread are the subject of a reduction operation at the end of the parallel region.

reduction(operation:var)

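Parameters

operation
The operator for the operation to do on the variables var at the end of the parallel region.

var
One or more variables on which to do scalar reduction. If more than one variable is specified, separate variable names with a comma.

Remarks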

reduction applies to the following directives:

parallel
for
sections

For more information, see 2.7.2.6 reduction.

Example

// omp_reduction.cpp
// compile with: /openmp
#include <stdio.h>
#include <omp.h>

#define NUM_THREADS 4
#define SUM_START   1
#define SUM_END     10
#define FUNC_RETS   {1, 1, 1, 1, 1}

int bRets[5] = FUNC_RETS;
int nSumCalc = ((SUM_START + SUM_END) * (SUM_END - SUM_START + 1)) / 2;

int func1( ) {return bRets[0];}
int func2( ) {return bRets[1];}
int func3( ) {return bRets[2];}
int func4( ) {return bRets[3];}
int func5( ) {return bRets[4];}

int main( )
{
    int nRet = 0,
        nCount = 0,
        nSum = 0,
        i,
        bSucceed = 1;

    omp_set_num_threads(NUM_THREADS);

    #pragma omp parallel reduction(+ : nCount)
    {
        nCount += 1;

        #pragma omp for reduction(+ : nSum)
        for (i = SUM_START ; i <= SUM_END ; ++i)
            nSum += i;

        #pragma omp sections reduction(&& : bSucceed)
        {
            #pragma omp section
            {
                bSucceed = bSucceed && func1( );
            }

            #pragma omp section
            {
                bSucceed = bSucceed && func2( );
            }

            #pragma omp section
            {
                bSucceed = bSucceed && func3( );
            }

            #pragma omp section
            {
                bSucceed = bSucceed && func4( );
            }

            #pragma omp section
            {
                bSucceed = bSucceed && func5( );
            }
        }
    }

    printf_s("The parallel section was executed %d times "
             "in parallel.\n", nCount);
    printf_s("The sum of the consecutive integers from "
             "%d to %d, is %d\n", SUM_START, SUM_END, nSum);

    if (bSucceed)
        printf_s("All of the functions, func1 through "
                 "func5 succeeded!\n");
    else
        printf_s("One or more of the functions, func1 "
                 "through func5 failed!\n");

    if (nCount != NUM_THREADS)
    {
        printf_s("ERROR: For %d threads, %d were counted!\n",
                 NUM_THREADS, nCount);
        nRet |= 0x1;
    }

    if (nSum != nSumCalc)
    {
        printf_s("ERROR: The sum of %d through %d should be %d, "
                "but %d was reported!\n",
                SUM_START, SUM_END, nSumCalc, nSum);
        nRet |= 0x10;
    }

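    if (bSucceed != (bRets[0] && bRets[1] &&
                     bRets[2] && bRets[3] && bRets[4]))
    {
        // Report the && reduction mismatch here, not the sum.
        printf_s("ERROR: The && reduction of func1 through func5 "
                 "should be %d, but %d was reported!\n",
                 (bRets[0] && bRets[1] && bRets[2] &&
                  bRets[3] && bRets[4]), bSucceed);
        nRet |= 0x100;
    }

    return nRet;  // nonzero if any check failed
}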

The parallel section was executed 4 times in parallel.
The sum of the consecutive integers from 1 to 10, is 55
All of the functions, func1 through func5 succeeded!

schedule

Specifies how loop iterations are divided among threads. Applies to the for directive.

schedule(type[,size])

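Parameters

type
The kind of scheduling: dynamic, guided, runtime, or static.

size
(Optional) Specifies the size of iterations, that is, the number of iterations in each chunk. size must be an integer. Not valid when type is runtime.

Remarks

For more information, see 2.4.1 for construct.

Example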

// omp_schedule.cpp
// compile with: /openmp
#include <windows.h>
#include <stdio.h>
#include <omp.h>

#define NUM_THREADS 4
#define STATIC_CHUNK 5
#define DYNAMIC_CHUNK 5
#define NUM_LOOPS 20
#define SLEEP_EVERY_N 3

int main( )
{
    int nStatic1[NUM_LOOPS],
        nStaticN[NUM_LOOPS];
    int nDynamic1[NUM_LOOPS],
        nDynamicN[NUM_LOOPS];
    int nGuided[NUM_LOOPS];

    omp_set_num_threads(NUM_THREADS);

    #pragma omp parallel
    {
        #pragma omp for schedule(static, 1)
        for (int i = 0 ; i < NUM_LOOPS ; ++i)
        {
            if ((i % SLEEP_EVERY_N) == 0)
                Sleep(0);
            nStatic1[i] = omp_get_thread_num( );
        }

        #pragma omp for schedule(static, STATIC_CHUNK)
        for (int i = 0 ; i < NUM_LOOPS ; ++i)
        {
            if ((i % SLEEP_EVERY_N) == 0)
                Sleep(0);
            nStaticN[i] = omp_get_thread_num( );
        }

        #pragma omp for schedule(dynamic, 1)
        for (int i = 0 ; i < NUM_LOOPS ; ++i)
        {
            if ((i % SLEEP_EVERY_N) == 0)
                Sleep(0);
            nDynamic1[i] = omp_get_thread_num( );
        }

        #pragma omp for schedule(dynamic, DYNAMIC_CHUNK)
        for (int i = 0 ; i < NUM_LOOPS ; ++i)
        {
            if ((i % SLEEP_EVERY_N) == 0)
                Sleep(0);
            nDynamicN[i] = omp_get_thread_num( );
        }

        #pragma omp for schedule(guided)
        for (int i = 0 ; i < NUM_LOOPS ; ++i)
        {
            if ((i % SLEEP_EVERY_N) == 0)
                Sleep(0);
            nGuided[i] = omp_get_thread_num( );
        }
    }

    printf_s("------------------------------------------------\n");
    printf_s("| static | static | dynamic | dynamic | guided |\n");
    printf_s("|    1   |    %d   |    1    |    %d    |        |\n",
             STATIC_CHUNK, DYNAMIC_CHUNK);
    printf_s("------------------------------------------------\n");

    for (int i=0; i<NUM_LOOPS; ++i)
    {
        printf_s("|    %d   |    %d   |    %d    |    %d    |"
                 "    %d   |\n",
                 nStatic1[i], nStaticN[i],
                 nDynamic1[i], nDynamicN[i], nGuided[i]);
    }

    printf_s("------------------------------------------------\n");
}

------------------------------------------------
| static | static | dynamic | dynamic | guided |
|    1   |    5   |    1    |    5    |        |
------------------------------------------------
|    0   |    0   |    0    |    2    |    1   |
|    1   |    0   |    3    |    2    |    1   |
|    2   |    0   |    3    |    2    |    1   |
|    3   |    0   |    3    |    2    |    1   |
|    0   |    0   |    2    |    2    |    1   |
|    1   |    1   |    2    |    3    |    3   |
|    2   |    1   |    2    |    3    |    3   |
|    3   |    1   |    0    |    3    |    3   |
|    0   |    1   |    0    |    3    |    3   |
|    1   |    1   |    0    |    3    |    2   |
|    2   |    2   |    1    |    0    |    2   |
|    3   |    2   |    1    |    0    |    2   |
|    0   |    2   |    1    |    0    |    3   |
|    1   |    2   |    2    |    0    |    3   |
|    2   |    2   |    2    |    0    |    0   |
|    3   |    3   |    2    |    1    |    0   |
|    0   |    3   |    3    |    1    |    1   |
|    1   |    3   |    3    |    1    |    1   |
|    2   |    3   |    3    |    1    |    1   |
|    3   |    3   |    0    |    1    |    3   |
------------------------------------------------

shared

Specifies that one or more variables should be shared among all threads.

shared(var)

Parameters

var
One or more variables to share. If more than one variable is specified, separate variable names with a comma.

Remarks

Another way to share values among threads is with the copyprivate clause.

shared applies to the parallel directive, including the combined parallel for and parallel sections forms.

For more information, see 2.7.2.4 shared.

Example

For an example of using shared, see private.
