Multithreaded array processing

__Heaven__

Jedi: mentor to all

Offline

Posts: 2130

Re: Multithreaded array processing

« Reply #30: December 01, 2014, 15:03 »

Quote: Igors, December 01, 2014, 12:54

An element with 4 nodes is a quadrangle (quadrangle/quad)

No. It is a tetrahedron. You are describing a two-dimensional quadrilateral.

Quote: Igors, December 01, 2014, 14:35

Because a vector (or any other container) requires a copy constructor.

Thanks. I had never run into that before.



__Heaven__

Jedi: mentor to all

Offline

Posts: 2130

Re: Multithreaded array processing

« Reply #31: December 01, 2014, 16:53 »

Hmm... I measured the normals computation with a single thread and got 21 ms. And that was on a netbook.

Quote: xokc, November 29, 2014, 23:00

And again I agree that, to feel any effect from parallelizing this algorithm, you need to process VERY long arrays.

I thought mine was such an array.


« Last edit: December 01, 2014, 17:05 by __Heaven__ »

xokc

Talking bird

Offline

Posts: 976

Re: Multithreaded array processing

« Reply #32: December 01, 2014, 18:59 »

Add an OpenMP pragma before the loop in Igors's example, run it on something multi-core, and compare the results. It would be interesting to see!



__Heaven__

Jedi: mentor to all

Offline

Posts: 2130

Re: Multithreaded array processing

« Reply #33: December 02, 2014, 10:16 »

Quote: xokc, December 01, 2014, 18:59

OK. But I will need some help with this.
As I understand it, OpenMP is already installed on my machine; I enabled it the way it is described here:

Quote: zlogic, January 20, 2010, 23:33

In the project settings, add the following (in the QMake build arguments field):

"QMAKE_LIBS+=-static -lgomp -lpthread" "QMAKE_CXXFLAGS+=-msse3 -fopenmp" QMAKE_CXXFLAGS+=-U_WIN32

However, for some reason, when I call omp_get_num_threads() or any other OpenMP function from a QThread rather than from main, the program crashes with a Segmentation Fault.

My current function looks like this:

Code

C++ (Qt)
    while (src != srcEnd)
    {
        // give the nine input coordinates temporary names
        const float& x0 = *(src++);
        const float& y0 = *(src++);
        const float& z0 = *(src++);

        const float& x1 = *(src++);
        const float& y1 = *(src++);
        const float& z1 = *(src++);

        const float& x2 = *(src++);
        const float& y2 = *(src++);
        const float& z2 = *(src++);

        // compute the two edge vectors
        float a[3] = {x1 - x0, y1 - y0, z1 - z0},
              b[3] = {x2 - x0, y2 - y0, z2 - z0};

        const int x = 0,
                  y = 1,
                  z = 2;
        // compute and store the perpendicular (cross product)
        *dest = a[y] * b[z] - a[z] * b[y];
        dest[1] = a[z] * b[x] - a[x] * b[z];
        dest[2] = a[x] * b[y] - a[y] * b[x];

        // compute the length and normalize
        float inv_length = 1.0 / sqrt(*dest * *dest +
                                      dest[y] * dest[y] +
                                      dest[z] * dest[z]);
        *dest *= inv_length;
        dest[y] *= inv_length;
        dest[z] *= inv_length;

        // copy the normal to all three vertices
        memcpy(dest + 3, dest, 3 * sizeof(float));
        memcpy(dest + 6, dest, 3 * sizeof(float));
 
        dest += 9;
    }

What should I add, and where?



Igors

Jedi: mentor to all

Offline

Posts: 11445

Re: Multithreaded array processing

« Reply #34: December 02, 2014, 11:42 »

Quote: __Heaven__, December 02, 2014, 10:16

What should I add, and where?

Nothing for now; first tidy up the original code. If you don't feel like writing your own class (e.g. Vec3f), then at least use QVector3D. It is admittedly not very well made, but it still beats slogging through everything by hand each time. For example:

Code

C++ (Qt)
void CalcNormals( const float * srcF, int numF, float * dstF )
{
 const QVector3D * src = (QVector3D *) srcF;
 QVector3D * dst = (QVector3D *) dstF;
 int N = numF / 3 / 3;
 
// #pragma omp parallel for
 for (int i = 0; i < N; ++i) {
   int j = i * 3;
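   // normalized() returns the unit vector; normalize() returns void and cannot be used in an expression
   dst[j] = dst[j + 1] = dst[j + 2] = QVector3D::crossProduct(src[j + 1] - src[j], src[j + 2] - src[j]).normalized();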
 }
}
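
For the "write your own class" option mentioned above, a minimal sketch might look like the following. The names Vec3f, cross, normalized and calcNormals are hypothetical and are not taken from the poster's project; the sketch assumes the input is a tightly packed array of triangles, three vertices per triangle, and that a degenerate triangle should simply keep a zero normal.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical minimal 3-component vector: three consecutive floats.
struct Vec3f {
    float x, y, z;
};

inline Vec3f operator-(const Vec3f& a, const Vec3f& b) {
    return {a.x - b.x, a.y - b.y, a.z - b.z};
}

inline Vec3f cross(const Vec3f& a, const Vec3f& b) {
    return {a.y * b.z - a.z * b.y,
            a.z * b.x - a.x * b.z,
            a.x * b.y - a.y * b.x};
}

inline Vec3f normalized(const Vec3f& v) {
    const float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    if (len == 0.0f) return v;  // degenerate triangle: keep the zero vector
    const float inv = 1.0f / len;
    return {v.x * inv, v.y * inv, v.z * inv};
}

// src holds numTriangles * 3 vertices; dst receives one normal per vertex.
void calcNormals(const Vec3f* src, std::size_t numTriangles, Vec3f* dst) {
    // Signed loop index: older OpenMP versions require it.
    const long n = static_cast<long>(numTriangles);
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        const long j = i * 3;
        const Vec3f nrm =
            normalized(cross(src[j + 1] - src[j], src[j + 2] - src[j]));
        dst[j] = dst[j + 1] = dst[j + 2] = nrm;
    }
}
```

Each iteration writes only to its own three output vertices, so the loop needs no synchronization. Without -fopenmp the pragma is ignored and the same code runs single-threaded.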


« Last edit: December 02, 2014, 11:45 by Igors »

__Heaven__

Jedi: mentor to all

Offline

Posts: 2130

Re: Multithreaded array processing

« Reply #35: December 02, 2014, 12:27 »

I did it my own way anyway (laziness).
The parallel version:

Code

C++ (Qt)
    int times = (srcEnd - src) / sizeof(float) / 9;
    #pragma omp parallel for
    for(int i = 0; i < times; i++)
    {
        // give the nine input coordinates temporary names
        int j = i * 9;
        const float& x0 = src[j];
        const float& y0 = src[j + 1];
        const float& z0 = src[j + 2];

        const float& x1 = src[j + 3];
        const float& y1 = src[j + 4];
        const float& z1 = src[j + 5];

        const float& x2 = src[j + 6];
        const float& y2 = src[j + 7];
        const float& z2 = src[j + 8];

        // compute the two edge vectors
        float a[3] = {x1 - x0, y1 - y0, z1 - z0},
              b[3] = {x2 - x0, y2 - y0, z2 - z0};

        const int x = 0,
                  y = 1,
                  z = 2;
        // compute and store the perpendicular (cross product)
        float& xNorm = dest[j] = a[y] * b[z] - a[z] * b[y];
        float& yNorm = dest[j + 1] = a[z] * b[x] - a[x] * b[z];
        float& zNorm = dest[j + 2] = a[x] * b[y] - a[y] * b[x];

        // compute the length and normalize
        float inv_length = 1.0 / sqrt(xNorm * xNorm +
                                      yNorm * yNorm +
                                      zNorm * zNorm);
        xNorm *= inv_length;
        yNorm *= inv_length;
        zNorm *= inv_length;

        // copy the normal to all three vertices
        memcpy(&xNorm + 3, &xNorm, 3 * sizeof(float));
        memcpy(&xNorm + 6, &xNorm, 3 * sizeof(float));
 
    }

Versus the single-threaded version:

Code

C++ (Qt)
    while (src != srcEnd)
    {
        // give the nine input coordinates temporary names
        const float& x0 = *(src++);
        const float& y0 = *(src++);
        const float& z0 = *(src++);

        const float& x1 = *(src++);
        const float& y1 = *(src++);
        const float& z1 = *(src++);

        const float& x2 = *(src++);
        const float& y2 = *(src++);
        const float& z2 = *(src++);

        // compute the two edge vectors
        float a[3] = {x1 - x0, y1 - y0, z1 - z0},
              b[3] = {x2 - x0, y2 - y0, z2 - z0};

        const int x = 0,
                  y = 1,
                  z = 2;
        // compute and store the perpendicular (cross product)
        *dest = a[y] * b[z] - a[z] * b[y];
        dest[1] = a[z] * b[x] - a[x] * b[z];
        dest[2] = a[x] * b[y] - a[y] * b[x];

        // compute the length and normalize
        float inv_length = 1.0 / sqrt(*dest * *dest +
                                      dest[y] * dest[y] +
                                      dest[z] * dest[z]);
        *dest *= inv_length;
        dest[y] *= inv_length;
        dest[z] *= inv_length;

        // copy the normal to all three vertices
        memcpy(dest + 3, dest, 3 * sizeof(float));
        memcpy(dest + 6, dest, 3 * sizeof(float));
 
        dest += 9;
    }

The result is rather interesting. I can roughly formulate the answer myself, but I will ask those who know: what happens on the first iteration that makes the parallel computation slower than the single-threaded one?

Quote

Normals calculated with 1 thread in 14 ms
Normals calculated with multithreading in 27 ms
Normals calculated with 1 thread in 13 ms
Normals calculated with multithreading in 8 ms
Normals calculated with 1 thread in 14 ms
Normals calculated with multithreading in 7 ms
Normals calculated with 1 thread in 13 ms
Normals calculated with multithreading in 7 ms
Normals calculated with 1 thread in 13 ms
Normals calculated with multithreading in 6 ms
Normals calculated with 1 thread in 13 ms
Normals calculated with multithreading in 7 ms
Normals calculated with 1 thread in 13 ms
Normals calculated with multithreading in 7 ms
Normals calculated with 1 thread in 13 ms
Normals calculated with multithreading in 7 ms
Normals calculated with 1 thread in 14 ms
Normals calculated with multithreading in 6 ms
Normals calculated with 1 thread in 13 ms
Normals calculated with multithreading in 7 ms
Normals calculated with 1 thread in 13 ms
Normals calculated with multithreading in 7 ms
Normals calculated with 1 thread in 14 ms
Normals calculated with multithreading in 7 ms
Normals calculated with 1 thread in 13 ms
Normals calculated with multithreading in 7 ms
Normals calculated with 1 thread in 13 ms
Normals calculated with multithreading in 7 ms
Normals calculated with 1 thread in 13 ms
Normals calculated with multithreading in 7 ms
Normals calculated with 1 thread in 14 ms
Normals calculated with multithreading in 7 ms
Normals calculated with 1 thread in 13 ms
Normals calculated with multithreading in 7 ms
Normals calculated with 1 thread in 14 ms
Normals calculated with multithreading in 7 ms
Normals calculated with 1 thread in 15 ms
Normals calculated with multithreading in 6 ms
Normals calculated with 1 thread in 14 ms
Normals calculated with multithreading in 8 ms
Normals calculated with 1 thread in 14 ms
Normals calculated with multithreading in 7 ms
Normals calculated with 1 thread in 13 ms
Normals calculated with multithreading in 7 ms
Normals calculated with 1 thread in 13 ms
Normals calculated with multithreading in 8 ms
Normals calculated with 1 thread in 14 ms
Normals calculated with multithreading in 7 ms
Normals calculated with 1 thread in 14 ms
Normals calculated with multithreading in 7 ms



Old

Jedi: mentor to all

Offline

Posts: 4350

Re: Multithreaded array processing

« Reply #36: December 02, 2014, 12:31 »

It is starting the worker threads.
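
One way to keep that one-time startup cost out of a measurement is to run an empty parallel region before the timed loop. The sketch below is only an illustration under assumptions: warmUpThreadPool, scaleAll and timeScaleAllMs are hypothetical names, scaleAll merely stands in for the normals loop, and the approach relies on the OpenMP runtime keeping its worker threads alive between parallel regions (GCC's libgomp generally does, but this is implementation behaviour rather than a guarantee).

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: an empty parallel region gives the OpenMP runtime
// a chance to create its worker threads before any timed work starts.
void warmUpThreadPool() {
    #pragma omp parallel
    {
        // intentionally empty
    }
}

// Hypothetical workload standing in for the normals loop.
void scaleAll(std::vector<float>& data, float factor) {
    const long n = static_cast<long>(data.size());
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        data[i] *= factor;
    }
}

// Returns the wall-clock time of one scaleAll call in milliseconds.
double timeScaleAllMs(std::vector<float>& data, float factor) {
    const auto start = std::chrono::steady_clock::now();
    scaleAll(data, factor);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling warmUpThreadPool() once at program start, and then timing several runs rather than one, should make the first measured value comparable with the rest.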



__Heaven__

Jedi: mentor to all

Offline

Posts: 2130

Re: Multithreaded array processing

« Reply #37: December 02, 2014, 12:46 »

On Linux the build reports
cannot find -lQt5OpenGL
-lQt5Widgets
and so on.
What is wrong?
With OpenMP disabled everything works, and the second algorithm turned out to be faster than the first. I need to re-measure.



__Heaven__

Jedi: mentor to all

Offline

Posts: 2130

Re: Multithreaded array processing

« Reply #38: December 02, 2014, 12:48 »

Quote: __Heaven__, December 02, 2014, 12:46

the second algorithm turned out to be faster than the first. I need to re-measure.

Code:

Normals calculated with 1 thread in  7  ms
Normals calculated with multithreading in  28  ms
Normals calculated with 1 thread in  7  ms
Normals calculated with multithreading in  7  ms
Normals calculated with 1 thread in  6  ms
Normals calculated with multithreading in  6  ms
Normals calculated with 1 thread in  7  ms
Normals calculated with multithreading in  6  ms
Normals calculated with 1 thread in  7  ms
Normals calculated with multithreading in  6  ms
Normals calculated with 1 thread in  7  ms
Normals calculated with multithreading in  7  ms
Normals calculated with 1 thread in  7  ms
Normals calculated with multithreading in  7  ms
Normals calculated with 1 thread in  6  ms
Normals calculated with multithreading in  7  ms
Normals calculated with 1 thread in  6  ms
Normals calculated with multithreading in  7  ms
Normals calculated with 1 thread in  7  ms
Normals calculated with multithreading in  6  ms
Normals calculated with 1 thread in  7  ms
Normals calculated with multithreading in  6  ms
Normals calculated with 1 thread in  6  ms
Normals calculated with multithreading in  7  ms
Normals calculated with 1 thread in  7  ms
Normals calculated with multithreading in  7  ms
Normals calculated with 1 thread in  7  ms
Normals calculated with multithreading in  7  ms
Normals calculated with 1 thread in  7  ms
Normals calculated with multithreading in  7  ms
Normals calculated with 1 thread in  7  ms
Normals calculated with multithreading in  7  ms
Normals calculated with 1 thread in  7  ms
Normals calculated with multithreading in  7  ms
Normals calculated with 1 thread in  7  ms
Normals calculated with multithreading in  7  ms
Normals calculated with 1 thread in  6  ms
Normals calculated with multithreading in  7  ms
Normals calculated with 1 thread in  7  ms
Normals calculated with multithreading in  7  ms
Normals calculated with 1 thread in  6  ms
Normals calculated with multithreading in  7  ms
Normals calculated with 1 thread in  7  ms
Normals calculated with multithreading in  7  ms
Normals calculated with 1 thread in  6  ms
Normals calculated with multithreading in  7  ms
Normals calculated with 1 thread in  6  ms
Normals calculated with multithreading in  6  ms
Normals calculated with 1 thread in  7  ms
Normals calculated with multithreading in  6  ms



Old

Jedi: mentor to all

Offline

Posts: 4350

Re: Multithreaded array processing

« Reply #39: December 02, 2014, 12:49 »

Quote: __Heaven__, December 02, 2014, 12:46

On Linux the build reports
cannot find -lQt5OpenGL
-lQt5Widgets
and so on.
What is wrong?

Please show the linker command line that precedes this message.



__Heaven__

Jedi: mentor to all

Offline

Posts: 2130

Re: Multithreaded array processing

« Reply #40: December 02, 2014, 12:53 »

Quote

12:51:51: Configuration unchanged, skipping qmake step.
12:51:51: Starting: "/usr/bin/make"
g++ -Wl,-rpath,/opt/Qt/5.3/gcc_64 -Wl,-rpath,/opt/Qt/5.3/gcc_64/lib -o ProCASTil main.o mainwindow.o solver.o positioner.o geometry.o axis.o progressdialog.o complexexportdialog.o fractionsolidcurve.o normalsolver.o qrc_icons.o qrc_shaders.o qrc_templates.o moc_mainwindow.o moc_solver.o moc_positioner.o moc_axis.o moc_progressdialog.o moc_complexexportdialog.o moc_fractionsolidcurve.o moc_normalsolver.o -static -lgomp -L/opt/Qt/5.3/gcc_64/lib -lQt5OpenGL -lQt5Widgets -lQt5Gui -lQt5Core -lGL -lpthread
/usr/bin/ld: cannot find -lQt5OpenGL
/usr/bin/ld: cannot find -lQt5Widgets
/usr/bin/ld: cannot find -lQt5Gui
/usr/bin/ld: cannot find -lQt5Core
/usr/bin/ld: cannot find -lGL
collect2: error: ld returned 1 exit status
make: *** [ProCASTil] Error 1
12:51:52: The process "/usr/bin/make" exited with code 2.
Error while building/deploying project ProCASTil (kit: Desktop Qt 5.3 GCC 64bit)
When executing step "Build"



Old

Jedi: mentor to all

Offline

Posts: 4350

Re: Multithreaded array processing

« Reply #41: December 02, 2014, 12:56 »

Tell me, is it you who adds -static? Try without it.
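
With -static on the link line, ld looks only for static archives (lib*.a) for every library named after it, which would explain "cannot find" errors if that Qt installation ships only shared libraries. A possible project-file fragment that enables OpenMP with GCC without forcing a static link is sketched below; it is an assumption about what fits this project, not a verified configuration for it.

```
# Hypothetical .pro fragment: OpenMP with GCC, no -static
QMAKE_CXXFLAGS += -fopenmp
QMAKE_LFLAGS   += -fopenmp
```

Passing -fopenmp at link time lets GCC pull in its OpenMP runtime itself, so -lgomp does not have to be listed by hand.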



Igors

Jedi: mentor to all

Offline

Posts: 11445

Re: Multithreaded array processing

« Reply #42: December 02, 2014, 12:58 »

Quote: __Heaven__, December 02, 2014, 12:27

Code

C++ (Qt)
int times = (srcEnd - src) / sizeof(float) / 9;

Is your pointer arithmetic all right? The difference of two pointers is the number of floats (not the number of bytes). And by the way, do not assume that memcpy is always the fastest option (in this case, for example, it is not).
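
A small sketch of the point about pointer arithmetic, with hypothetical helper names (triangleCount, replicateNormal) that do not come from the thread. It assumes nine floats per triangle, as in the posted loops. The second function only shows what a memcpy-free copy could look like; whether it is actually faster would have to be measured.

```cpp
#include <cstddef>

// Number of triangles in [begin, end), nine floats (3 vertices x 3
// coordinates) each. Subtracting two float pointers already yields a count
// of float elements, so there is no division by sizeof(float).
std::ptrdiff_t triangleCount(const float* begin, const float* end) {
    return (end - begin) / 9;
}

// Copies the normal stored in dest[0..2] into the next two vertices
// using plain assignments instead of memcpy.
void replicateNormal(float* dest) {
    for (int k = 0; k < 3; ++k) {
        dest[3 + k] = dest[k];
        dest[6 + k] = dest[k];
    }
}
```

Dividing by sizeof(float) as well would make the loop process only a quarter of the triangles on platforms where a float is four bytes.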

Quote: __Heaven__, December 02, 2014, 12:27

I did it my own way anyway (laziness).

As in "I just can't fall out of love with copy-paste".



__Heaven__

Jedi: mentor to all

Offline

Posts: 2130

Re: Multithreaded array processing

« Reply #43: December 02, 2014, 13:09 »

Quote: Igors, December 02, 2014, 12:58

Quote: __Heaven__, December 02, 2014, 12:27

Code

C++ (Qt)
int times = (srcEnd - src) / sizeof(float) / 9;

Thanks. My eyes had glazed over: while hunting down another bug I had been computing addresses on a calculator, so I wrote this on autopilot...

Quote: Old, December 02, 2014, 12:56

Tell me, is it you who adds -static? Try without it.

Thanks. That helped.

Windows, 4 cores

Quote

Normals calculated with 1 thread in 14 ms
Normals calculated with multithreading in 30 ms
Normals calculated with 1 thread in 15 ms
Normals calculated with multithreading in 10 ms
Normals calculated with 1 thread in 14 ms
Normals calculated with multithreading in 11 ms
Normals calculated with 1 thread in 13 ms
Normals calculated with multithreading in 10 ms



__Heaven__

Jedi: mentor to all

Offline

Posts: 2130

Re: Multithreaded array processing

« Reply #44: December 02, 2014, 13:18 »

Linux, 2 cores, netbook

Quote

Normals calculated with 1 thread in 21 ms
Normals calculated with multithreading in 53 ms
Normals calculated with 1 thread in 20 ms
Normals calculated with multithreading in 16 ms
Normals calculated with 1 thread in 20 ms
Normals calculated with multithreading in 17 ms
Normals calculated with 1 thread in 19 ms
Normals calculated with multithreading in 17 ms
Normals calculated with 1 thread in 20 ms
Normals calculated with multithreading in 16 ms
Normals calculated with 1 thread in 19 ms
Normals calculated with multithreading in 17 ms
Normals calculated with 1 thread in 19 ms
Normals calculated with multithreading in 15 ms
Normals calculated with 1 thread in 19 ms
Normals calculated with multithreading in 24 ms
Normals calculated with 1 thread in 20 ms
Normals calculated with multithreading in 23 ms
Normals calculated with 1 thread in 20 ms
Normals calculated with multithreading in 23 ms
Normals calculated with 1 thread in 23 ms
Normals calculated with multithreading in 21 ms
Normals calculated with 1 thread in 20 ms
Normals calculated with multithreading in 20 ms
Normals calculated with 1 thread in 21 ms
Normals calculated with multithreading in 20 ms
Normals calculated with 1 thread in 20 ms
Normals calculated with multithreading in 16 ms
Normals calculated with 1 thread in 19 ms
Normals calculated with multithreading in 23 ms
Normals calculated with 1 thread in 19 ms
Normals calculated with multithreading in 21 ms
Normals calculated with 1 thread in 20 ms
Normals calculated with multithreading in 20 ms
Normals calculated with 1 thread in 19 ms
Normals calculated with multithreading in 23 ms
Normals calculated with 1 thread in 20 ms
Normals calculated with multithreading in 21 ms
Normals calculated with 1 thread in 19 ms
Normals calculated with multithreading in 20 ms
Normals calculated with 1 thread in 19 ms
Normals calculated with multithreading in 20 ms
Normals calculated with 1 thread in 19 ms
Normals calculated with multithreading in 21 ms
Normals calculated with 1 thread in 19 ms
Normals calculated with multithreading in 16 ms
Normals calculated with 1 thread in 22 ms
Normals calculated with multithreading in 26 ms
Normals calculated with 1 thread in 19 ms
Normals calculated with multithreading in 20 ms


