I've built a fairly simple C program that reads a PGM image, splits it into sections, and sends them to various cores for processing.

To account for processing margins (each core has to read a larger area of the image than it writes to), I can't simply split the image; I first have to build an array that includes the aforementioned margins.

As a quick example: suppose the image is 1600x1200 (width x height), I have 2 cores, each output pixel needs a 3x3 area centered on it, and I split the image by horizontal lines. Then the first core gets the pixels from 0 to 601*1600, and the second core gets the pixels from 599*1600 to 1200*1600.

Now, I believe there is nothing wrong with how I implemented this in my program, yet I still get this error:

[ct1pt-tnode003:22389:0:22389] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x7ffe7f60ead8)
==== backtrace (tid:  22389) ====
 0 0x000000000004ee05 ucs_debug_print_backtrace()  ???:0
 1 0x0000000000402624 main()  ???:0
 2 0x0000000000022505 __libc_start_main()  ???:0
 3 0x0000000000400d99 _start()  ???:0

This is my code:

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <math.h>
#include <time.h>
#include "testlibscatter.h"
#include <mpi.h>

#define MSGLEN 2048


int main(int argc, char *argv[]){

MPI_Init(&argc, &argv);

int m = atoi(argv[1]), n = atoi(argv[2]), kern_type = atoi(argv[3]);
double kernel[m*n];
int i_rank, ranks;
int param, symm;

MPI_Comm_rank( MPI_COMM_WORLD, &i_rank);
MPI_Comm_size( MPI_COMM_WORLD, &ranks);

int xsize, ysize, maxval;
xsize = 0;
ysize = 0;
maxval = 0;

void * ptr;

switch (kern_type){
    case 1:
    meankernel(m, n, kernel);
    break;
    case 2:
    weightkernel(m, n, param, kernel);
    break;
    case 3:
    gaussiankernel(m, n, param, symm, kernel);
    break;
}

if (i_rank == 0){
    read_pgm_image(&ptr, &maxval, &xsize, &ysize, "check_me2.pgm");
}


MPI_Bcast(&xsize, 1, MPI_INT, 0, MPI_COMM_WORLD);
MPI_Bcast(&ysize, 1, MPI_INT, 0, MPI_COMM_WORLD);
MPI_Bcast(&maxval, 1, MPI_INT, 0, MPI_COMM_WORLD);

int flo, start, end, i;
flo = floor(ysize/ranks);

int first, last;

first = start - (m - 1)/2;
last = end + (m - 1)/2;

if (start == 0){
    first = 0;
}
if (end == ysize){
    last = ysize;
}

int sendcounts[ranks];
int displs[ranks];

int first2[ranks];
int last2[ranks];
int c_start2[ranks];
int c_end2[ranks];

int num;
num = (ranks - 1) * (m-1);
printf("num is %d\n", num);

unsigned short int bigpic[xsize*(ysize + num)];


if (i_rank == 0){
    for(i = 0; i < ranks; i++){
        c_start2[i] = i * flo;
        c_end2[i] = (i + 1) * flo; 
        if ( i == ranks - 1){
            c_end2[i] = ysize;
        }
        first2[i] = c_start2[i] - (m - 1)/2;
        last2[i] = c_end2[i] + (m - 1)/2;
        if (c_start2[i] == 0){
            first2[i] = 0;
        }
        if (c_end2[i] == ysize){
            last2[i] = ysize;
        }
        sendcounts[i] = (last2[i] - first2[i]) * xsize; 
    }

    int i, j, k, index, index_disp = 0;
    index = 0;
    displs[0] = 0;

    for (k = 0; k < ranks; k++){
        for (i = first2[k]*xsize; i < last2[k]*xsize; i++){
            bigpic[index] = ((unsigned short int *)ptr)[i];
            index++;
        }
        printf("%d\n", displs[index_disp]);
        index_disp++;
        displs[index_disp] = index;
    }

}

MPI_Bcast(displs, ranks, MPI_INT, 0, MPI_COMM_WORLD);
MPI_Bcast(sendcounts, ranks, MPI_INT, 0, MPI_COMM_WORLD);

unsigned short int minipic[xsize*(last-first)];
MPI_Barrier(MPI_COMM_WORLD);
MPI_Scatterv(&bigpic[0], sendcounts, displs, MPI_UNSIGNED_SHORT, minipic, (last-first)*xsize, MPI_UNSIGNED_SHORT, 0, MPI_COMM_WORLD);

MPI_Finalize();
}

The kernel functions simply produce an array of m*n doubles used to edit the image, while read_pgm_image returns a void pointer to the image values it read. I've tried printing the values of bigpic and they show no problem.


1 Answer

In the code shown here, start and end are used uninitialised in the computations of first and last:

int flo, start, end, i;
         ~~~~~~~~~~
flo = floor(ysize/ranks);

int first, last;

first = start - (m - 1)/2; // <---- start has a random value here
last = end + (m - 1)/2;    // <---- end has a random value here

If the resulting difference last - first is very large, minipic may be larger than the stack can hold:

unsigned short int minipic[xsize*(last-first)];
                                  ^^^^^^^^^^ random (possibly large) value

A strong indication that this is indeed the cause is the fact that the address of the fault 0x7ffe7f60ead8 is very close to the end of the positive part of the virtual address space, which is where most 64-bit OSes allocate the stack area of the main thread.

Always compile with -Wall to get more diagnostic messages from the compiler; warnings about uninitialised variables such as these are among them.

