34 Doxygen - 41:41 - 8 years ago
30 C++ Templates And Exceptions - 44:35
32 Vim - 1:05:03 - 8 years ago
33 Basic Git - 54:25 - 8 years ago
31 C++ Exceptions - 31:07 - 8 years ago
26 Nbody and Modern Fortran - 1:00:23 - 8 years ago
27 Code Organization - 45:40 - 8 years ago
29 Genetic Algorithms - 29:58 - 8 years ago
22 Correct Memory Handling In C++ - 39:16
25 N Body Simulation - 23:05 - 8 years ago
24 Correct Memory Management Part 2 - 24:23
23 Debugging MPI Programs - 6:57 - 8 years ago
21 Debugging Your Code - 57:01 - 8 years ago
20 Ghost Nodes and Efficient MPI - 29:08
18 MPI Domain Decomposition - 39:20 - 8 years ago
19 MPI Jacobi and Coding Style - 20:50 - 8 years ago
17 Misc Items and more MPI - 55:20 - 8 years ago
15 A Game Of Telephone - 48:04 - 8 years ago
14 Distributed Computing and MPI - 26:52
10 OpenMP ccNUMA MatPlotLib - 43:06 - 8 years ago
9 Parallel Scalability - 40:19 - 8 years ago
1 Linux - 48:01 - 8 years ago
3 Programming In Linux - 56:04 - 8 years ago
0 Linux - 51:44 - 8 years ago
6 Basic Serial Optimizations - 42:36 - 8 years ago
4 Modern Processors - 38:58 - 8 years ago
Comments
@williamweller8100 - 7 years ago
Hi, I watched your video on running the Gauss-Seidel method in parallel (sadly, comments are disabled on that video). I'm currently working on my senior project, which is all about GPU acceleration. I was given a problem to solve: write a multigrid Poisson solver in parallel. I currently have a version working in C (translated from C++), but it's only single-threaded, so it takes a while to solve large grids. After failing to write a parallel function for the Gauss-Seidel method the first time, I decided to write something else to be passed off to the GPU. The funny thing is, the way the program is written, simply copying values from one 2D array to another takes longer in the GPU implementation (I can only test grids of size 4096 * 4096). The GPU is, of course, about 25x faster than the CPU when doing the copy itself, but I'm not sure how to pass 2D arrays directly to the GPU, so I have to flatten the arrays to one dimension before passing them off, then reconstruct the 2D arrays after the kernel returns. The whole process of flattening and reconstructing takes significantly longer than just letting the CPU copy the values from one 2D array to the other. I've tracked the time each method takes, and the smoothing process, which uses the Gauss-Seidel method, takes by far the longest. If, for example, a solution takes 11 seconds, the smoothing process will take 10 of those seconds (depending on how many times I specify the grids should be smoothed). I was wondering if you had an email I could use to bounce some ideas off you? I'm not sure if you have any experience with CUDA, but either way.
@DalonWork - 7 years ago
Unfortunately, I don't have any experience with GPUs. If your project is in C, then your 2D arrays are probably double pointers, e.g. `double **a`, yes? You can't pass them to the GPU in that format, because the data is not contiguous in memory. However, if you are using static arrays, e.g. `double a[4096][4096]`, you have a chance, because those ARE contiguous in memory. If that is not the case, then I suggest you use a 1D array and index it appropriately, as if it were a 2D array (see en.wikipedia.org/wiki/Row-_and_column-major_order). Then you can pass the array without the packing/unpacking.