
Convolution code for embedded system

Status
Not open for further replies.

bibhukalyan

Hi everyone,
I am trying to run a convolution function on an embedded system.
On my PC I tried it like this:

Code:
    int i,j,m,n,m1,n1;
    int halfH = filter_height >> 1;
    int halfW = filter_width >> 1;

    for(i = 0; i < height; i++)
    {
        for(j = 0; j < width; j++)
        {
            Output[i][j] = 0;

            for(m = -halfH,m1 = filter_height-1; m1 >= 0; m++,m1--)
            {
                if(i + m < 0 || i + m >= height)
                    continue;
                for(n = -halfW,n1 = filter_width-1; n1 >= 0; n++,n1--)
                {
                    if(j + n < 0 || j + n >= width)
                        continue;
                    Output[i][j] += input[i+m][j+n] * filter[m1][n1];
                }
            }
        }
    }

My input image is 240 x 272 (width x height) and my filter size is 3 x 3.
I tried the same code on an embedded system (ARM9), where it takes around 350 ms.
I want to reduce the execution time.

Can anyone tell me how to reduce the time?

with regards
bibhu.
 

Hello,

There are a lot of possible optimizations, but I have some questions first.
Does your filter matrix have a fixed 3x3 size?
What is the numerical range of the input array and the filter coefficients?
Do you need the output scaled to the same range as the input array?
What is your target time?

Regards.
 

Thanks for the reply.

My filter is a Sobel filter (1 2 1, 0 0 0, -1 -2 -1) and the input is a grayscale image (0 to 255).
I want to calculate the gradient of the image.
I do not have a target time. I want to reduce the time as much as possible.

Is convolution via FFT faster?

Regards.
 

Hello,

So your previous code computes a single step of the Sobel filter, applying one 3x3 kernel?
Are your coefficients for Sobel edge detection along the x-direction?

Either way, you can perform some simple optimizations (for a generic 3x3 kernel):

1) Put your image, starting at (1,1), in an array that is two pixels larger than the original image in each dimension, and fill the first and last row and the first and last column with 0.
This way you can eliminate all the image border tests, which saves time (it more than compensates for the additional processing of the border).

2) Unroll loops 3 and 4 (the two inner loops over the filter) and use local variables instead of the filter matrix array.

So the code can be similar to:

Code:
    short k00=filter[0][0]; 
    short k01=filter[0][1]; 
    short k02=filter[0][2]; 
    short k10=filter[1][0]; 
    short k11=filter[1][1]; 
    short k12=filter[1][2]; 
    short k20=filter[2][0]; 
    short k21=filter[2][1]; 
    short k22=filter[2][2]; 

    short acc=0;
    int i,j;

    /* input is the zero-padded array, (height+2) x (width+2) */
    for(i = 0; i < height; i++)
    {
        for(j = 0; j < width; j++)
        {
            acc =input[i][j]*k00;
            acc+=input[i][j+1]*k01;
            acc+=input[i][j+2]*k02;
            acc+=input[i+1][j]*k10;
            acc+=input[i+1][j+1]*k11;
            acc+=input[i+1][j+2]*k12;
            acc+=input[i+2][j]*k20;
            acc+=input[i+2][j+1]*k21;
            acc+=input[i+2][j+2]*k22;
            output[i][j]=acc; // output keeps its origin at 0,0 (no border)
        }
    }
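
The zero-padded array from step 1 could be built along these lines. This is only a sketch: `pad_image`, `WIDTH` and `HEIGHT` are hypothetical names, and it assumes the image is stored as 8-bit pixels with the 240 x 272 size from the first post.

```c
#include <string.h>

#define WIDTH  240   /* image width from the first post (assumed fixed) */
#define HEIGHT 272   /* image height from the first post (assumed fixed) */

/* Hypothetical helper: copy a WIDTH x HEIGHT 8-bit image into a buffer
   that is two pixels larger in each dimension, leaving a zero border. */
static void pad_image(unsigned char src[HEIGHT][WIDTH],
                      unsigned char dst[HEIGHT + 2][WIDTH + 2])
{
    int i;

    /* Zero the whole buffer, border included. */
    memset(dst, 0, (size_t)(HEIGHT + 2) * (WIDTH + 2));

    /* Copy each image row so that the image starts at (1,1). */
    for (i = 0; i < HEIGHT; i++)
        memcpy(&dst[i + 1][1], src[i], WIDTH);
}
```

If the same buffer is reused for every frame, the border only needs to be zeroed once and the `memset` can move out of the per-frame path.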

For the filter matrix you mentioned, you can hard-code the filter coefficients to save more time, in a similar way:

Code:
    short acc=0;
     
    for(i = 0; i < height; i++)
    {
        for(j = 0; j < width; j++)
        {
            acc =input[i][j];
            acc+=(input[i][j+1]<<1);
            acc+=input[i][j+2];
            acc-=input[i+2][j];
            acc-=(input[i+2][j+1]<<1);
            acc-=input[i+2][j+2];
            output[i][j]=acc; // output keeps its origin at 0,0 (no border)
        }
    }

(The code is only illustrative. Check the coefficient order and sign.)
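
One way to check the coefficient order and sign is to run the hard-coded loop and a generic 3x3 loop on a small padded image and compare the results. The sketch below does that. `W`, `H`, `filter3x3`, `sobel_hardcoded` and `flip_kernel` are hypothetical names, and the tiny image size is chosen only for testing. Note that the loop in the first post indexes the filter in reverse (`m1` and `n1` count down), which is a true convolution, while the unrolled loops read the coefficients in forward order. For this particular kernel a 180-degree flip just negates every coefficient, so the two results should differ only in sign.

```c
#define W 4   /* small hypothetical test size, not the 240 x 272 image */
#define H 3

/* Generic 3x3 pass over a zero-padded input, coefficients in forward order
   (same indexing as the unrolled loop). */
static void filter3x3(unsigned char in[H + 2][W + 2],
                      short k[3][3], short out[H][W])
{
    int i, j, m, n;

    for (i = 0; i < H; i++)
        for (j = 0; j < W; j++) {
            short acc = 0;
            for (m = 0; m < 3; m++)
                for (n = 0; n < 3; n++)
                    acc += in[i + m][j + n] * k[m][n];
            out[i][j] = acc;
        }
}

/* Hard-coded version for the kernel (1 2 1, 0 0 0, -1 -2 -1). */
static void sobel_hardcoded(unsigned char in[H + 2][W + 2], short out[H][W])
{
    int i, j;

    for (i = 0; i < H; i++)
        for (j = 0; j < W; j++) {
            short acc;
            acc  = in[i][j];
            acc += in[i][j + 1] << 1;
            acc += in[i][j + 2];
            acc -= in[i + 2][j];
            acc -= in[i + 2][j + 1] << 1;
            acc -= in[i + 2][j + 2];
            out[i][j] = acc;
        }
}

/* Rotate a 3x3 kernel by 180 degrees; passing the result to filter3x3
   reproduces the reversed filter indexing of the loop in the first post. */
static void flip_kernel(short k[3][3], short flipped[3][3])
{
    int m, n;

    for (m = 0; m < 3; m++)
        for (n = 0; n < 3; n++)
            flipped[m][n] = k[2 - m][2 - n];
}
```

Comparing `filter3x3` with the forward kernel against `sobel_hardcoded` checks the hard-coded coefficients. Comparing against the flipped kernel shows whether the sign needs to be inverted to match the original loop.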

Let me know how much speed gain you obtain, and whether you need more speed.

Regards.
 
