Need some Advice in VHDL

Adnan86 · Jul 27, 2014

Hi,
I want to write clocked code in which all operations run normally until the result is ready. Up to that point I have no problem. Once the result is ready, though, I want every operation in my code to stop and wait until the input data change. In other words, the code should compare the new data with the old, and only if the inputs have changed should everything run again as before.
Thanks

KlausST · Jul 28, 2014

Hi,

"till input data are change"

This has to be clocked. The input data are clocked in. The previous input data (the values used for the last calculation) are also kept in a register.
Compare both values to check whether they differ. If they differ, start the calculation and set a "busy" flag.
After the calculation, clear the busy flag.
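
A minimal sketch of the compare-and-busy idea above, assuming a single input vector called data_in and a one-clock done pulse coming from the calculation. Both names, and the entity itself, are hypothetical; nothing in this thread fixes them.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity change_detect is
  generic (WIDTH : positive := 32);
  port (
    clk     : in  std_logic;
    reset   : in  std_logic;                            -- synchronous, active high
    data_in : in  std_logic_vector(WIDTH-1 downto 0);   -- hypothetical input
    done    : in  std_logic;                            -- hypothetical: pulsed when the result is ready
    busy    : out std_logic
  );
end entity change_detect;

architecture rtl of change_detect is
  signal data_prev : std_logic_vector(WIDTH-1 downto 0) := (others => '0');
  signal busy_i    : std_logic := '0';
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if reset = '1' then
        data_prev <= (others => '0');
        busy_i    <= '0';
      elsif busy_i = '0' then
        -- Idle: compare the current input with the copy used last time.
        if data_in /= data_prev then
          data_prev <= data_in;   -- remember the data this run works on
          busy_i    <= '1';       -- start the calculation
        end if;
      elsif done = '1' then
        busy_i <= '0';            -- result ready: stop and wait for a change
      end if;
    end if;
  end process;

  busy <= busy_i;
end architecture rtl;
```

Because data_prev resets to all zeros, an all-zero first input would not start a run in this sketch. A change that arrives while busy is '1' is picked up by the comparison once the current run has finished. VHDL predefines = and /= for array types as well, so the same comparison should also work on a matrix-typed input.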

*****
But the usual (and recommended) way is to work with a flag that shows new input data.

Klaus

sid_27 · Jul 28, 2014

Hi,
Design/write a state machine that will control this operation. As Klaus suggested, have this FSM generate the flags that control the operation.
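
As a rough sketch of such a state machine, assuming the "new input data" flag Klaus recommends is available as a one-clock pulse. The entity name, the port names (new_data, calc_done, calc_en) and the active-high ready polarity are illustrative assumptions, not taken from any posted design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity calc_ctrl is
  port (
    clk       : in  std_logic;
    reset     : in  std_logic;  -- synchronous, active high
    new_data  : in  std_logic;  -- hypothetical: one-clock pulse, inputs were updated
    calc_done : in  std_logic;  -- hypothetical: one-clock pulse from the datapath
    calc_en   : out std_logic;  -- enables the datapath while a run is in progress
    ready     : out std_logic   -- '1' while a finished result is being held
  );
end entity calc_ctrl;

architecture rtl of calc_ctrl is
  type state_t is (IDLE, RUN, HOLD);
  signal state : state_t := IDLE;
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if reset = '1' then
        state <= IDLE;
      else
        case state is
          when IDLE =>              -- nothing calculated yet
            if new_data = '1' then
              state <= RUN;
            end if;
          when RUN =>               -- datapath is working
            if calc_done = '1' then
              state <= HOLD;
            end if;
          when HOLD =>              -- result held, everything else stopped
            if new_data = '1' then
              state <= RUN;
            end if;
        end case;
      end if;
    end if;
  end process;

  calc_en <= '1' when state = RUN  else '0';
  ready   <= '1' when state = HOLD else '0';
end architecture rtl;
```

The datapath would advance its counters only while calc_en is '1', so it stays frozen in HOLD until the next new_data pulse.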

Adnan86 · Jul 28, 2014

How do I compare the input values? For example, if my input is data_in, how do I compare data_in before and after the first result?

KlausST said:
Hi,

This has to be clocked. Input data are clocked in. Previous (calculated) input data is also clocked in.
Compare both values to check if they differ. If differ start calculation and set a "busy" flag.
After calculations clear busy flag.

*****
But the usual (and recommended) way is to work with a flag that shows new input data.

Klaus

- - - Updated - - -

sid_27 said:
Hi,
Design/Write a state-machine that will control this operation. As Klaus suggested generate some flags with this FSM which control this operation.

Using an FSM is a good idea, but as KlausST said, I should first stop all operations until the input data change, and use something to compare the input data. How can I compare the input data before and after the first result?

sid_27 · Jul 28, 2014

Hi,
Adnan86 said:
Using an FSM is a good idea, but as KlausST said, I should first stop all operations until the input data change, and use something to compare the input data. How can I compare the input data before and after the first result?

Can you elaborate on what exactly your design is and what its inputs and outputs are? A code snippet would also help in understanding the problem.

TrickyDicky · Jul 28, 2014

So far, the post is very vague. Why not spec what you are actually trying to do, or show some code that's not working?

Adnan86 · Jul 28, 2014

My code works and has no errors. Here it is:

Code:

ENTITY mac_mat IS
   
   GENERIC (M : integer:=3;
            K : integer:=1;
            N : integer:=2);
   
   PORT (
      a: IN matrix_32(1 to M, 1 to N); -- MxN matrix ;
      b: IN matrix_32(1 to N, 1 to K); -- NxK matrix ;
      clk: IN STD_LOGIC:='0';
      reset: IN STD_LOGIC:='1';
      ready : out std_logic := '1' ;
      c: OUT  matrix_32(1 to M, 1 to K):= (others=>(others => x"00000000"))
   ) ;
END  mac_mat;
ARCHITECTURE rtl OF  mac_mat IS 
SIGNAL x_c:  matrix_32(1 to M, 1 to K):= (others=>(others => x"00000000"));  
BEGIN
   PROCESS (clk)
    Variable mm,nn,kk : integer := 1 ;
    Variable sum : signed(63 downto 0) := x"0000000000000000" ;
    Variable x_sum : signed(31 downto 0) := x"00000000" ;
    Variable x_reg:  matrix_64(1 to M, 1 to K):=(others=>(others => x"0000000000000000"));
   BEGIN 
         IF (reset = '1') then
          x_reg := (others=>(others => x"0000000000000000"));
          x_c <= (others=>(others => x"00000000"));
          sum :=  x"0000000000000000";
          x_sum := x"00000000" ;
          mm := 1 ;
          nn := 1 ;
          kk := 1 ;
         ELSIF (rising_edge(clk)) THEN
---------------------------------------Given the matrices:
---------------------------------------MxN = 3x2
---------------------------------------NxK = 2x1                  
             IF (mm <= M and nn <= N and kk <= K) then
                x_reg(mm,kk) := a(mm,nn) * b(nn,kk);
                sum := sum + x_reg(mm,kk);
                nn := nn + 1 ;
                IF (nn >= N) THEN
                x_sum(31) := sum(63) ;
                x_sum(30 downto 0):= sum(30 downto 0) ;
                x_c(mm,kk) <= x_sum;
                END IF ;
             ELSIF (mm <= M and nn > N and kk <= K) then
                --mm <= mm + 1;
                kk := kk + 1 ;
                nn := 1 ;
                sum := x"0000000000000000" ;
                x_sum := x"00000000" ;
             ELSIF (mm <= M and kk > K) then
                mm := mm + 1;
                kk := 1 ;
                nn := 1 ;
                sum := x"0000000000000000" ;
                x_sum := x"00000000" ;
             ELSIF (mm > M ) then
                 ready <= '0' ;
                 c <= x_c ;
             ELSE
                 c <= x_c ;
             END IF ;
         END IF ;   
   END process;
  -- c <= x_c ;
END rtl;

It uses these types, which are declared in work.rom ... :

Code:

.
.
..
type matrix_32 is array (integer range <>, integer range <>) of signed(31 downto 0);
type matrix_64 is array (integer range <>, integer range <>) of signed(63 downto 0);
. 
.

My point is this: when the testbench gives a and b their values, the code works correctly, but for new a and b values afterwards the output still holds the previous answer.
I also know that if, at the bottom of the code, here:

Code:

ELSIF (mm > M ) then
                 ready <= '0' ;
                 c <= x_c ;
             ELSE

I set mm := 1 ; the code works again, but it loops and repeats continuously.
That is my point: once I have the result, I want the whole operation to stop until I have new a and b.
I hope you get my point.
Thanks

- - - Updated - - -

One more thing: can I use this code for implementation on an FPGA?

TrickyDicky · Jul 28, 2014

Your code only sets ready to '0'. Don't you ever want to set it to '1'? The assignment on the port gets overridden as you assign the signal internally, so you will get 'U' until it is set to '0'.
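
One way to avoid relying on the port default at all is to give ready an explicit value in the reset branch and to set it back to '1' whenever a new run starts. A minimal sketch, keeping the polarity of the posted code ('1' at first, '0' once the result is out); the entity and the finished and restart inputs are hypothetical stand-ins for conditions inside mac_mat.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity ready_flag is
  port (
    clk      : in  std_logic;
    reset    : in  std_logic;  -- asynchronous, active high
    finished : in  std_logic;  -- hypothetical: '1' once the last element is done (mm > M)
    restart  : in  std_logic;  -- hypothetical: '1' when new a/b values arrive
    ready    : out std_logic
  );
end entity ready_flag;

architecture rtl of ready_flag is
begin
  -- reset is in the sensitivity list because it is used asynchronously
  process (clk, reset)
  begin
    if reset = '1' then
      ready <= '1';            -- explicit reset value, no port default needed
    elsif rising_edge(clk) then
      if restart = '1' then
        ready <= '1';          -- a new run begins
      elsif finished = '1' then
        ready <= '0';          -- result is available
      end if;
    end if;
  end process;
end architecture rtl;
```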

Why not post your testbench code too?

As for use in an FPGA - it will probably work for very small values of M, K and N, but it's going to eat all the resources of the FPGA very quickly.

Do you have an architectural diagram of the code, on PAPER? drawn BEFORE you wrote any code?

Adnan86 · Jul 28, 2014

This is my testbench:

Code:

LIBRARY  IEEE;
USE IEEE.std_logic_1164.all;
USE IEEE.numeric_std.all ;
--USE work.matrix_pkg.all ;
USE work.ROM_SVM.all ;

ENTITY mac_mat_tb IS
END  mac_mat_tb;

ARCHITECTURE test_mac_behav OF  mac_mat_tb IS
 
 constant M : integer := 3;
 constant K : integer := 1;
 constant N : integer := 2;
 COMPONENT mac_mat IS 
  --   GENERIC (M : integer:=M;
  --            K : integer:=K;
  --            N : integer:=N);
   
   PORT (
      a: IN matrix_32(1 to M, 1 to N); -- MxN matrix ;
      b: IN matrix_32(1 to N, 1 to K); -- NxK matrix ;
      clk: IN STD_LOGIC;
      reset: IN STD_LOGIC;
      ready: out STD_LOGIC;
      c: OUT  matrix_32(1 to M, 1 to K)
   ) ;
 END COMPONENT;
 
SIGNAL a:  matrix_32(1 to M, 1 to N); -- MxN matrix ;
SIGNAL b:  matrix_32(1 to N, 1 to K); -- NxK matrix ;
SIGNAL clk   : std_logic := '0' ;
SIGNAL ready   : std_logic := '0' ;
SIGNAL reset:  STD_LOGIC := '1';
SIGNAL c:   matrix_32(1 to M, 1 to K);
       
BEGIN
	UUT : mac_mat
	--generic map(M=>M,
	--            N=>N,
	--            K=>K);
	PORT MAP (
		 a => a,
		 b => b,
		 clk=>clk,
		 reset => reset,
		 ready => ready,
		 c => c
		  );
   clk <= NOT clk AFTER 10 NS ;
   reset <= '0' AFTER 15 NS ;
--END ;

PROCESS --(clk)
BEGIN
WAIT FOR 20 NS;
a(1,1) <=x"00000001";
a(1,2) <=x"00000001";

a(2,1) <=x"00000010";
a(2,2) <=x"00000011";

a(3,1) <=x"00000010";
a(3,2) <=x"00000001";

b(1,1) <=x"00000010";
b(2,1) <=x"00000001";

WAIT FOR 10000 NS;
a(1,1) <=x"00000091";
a(1,2) <=x"00000001";

a(2,1) <=x"00000010";
a(2,2) <=x"00000011";

a(3,1) <=x"00000010";
a(3,2) <=x"00000001";

b(1,1) <=x"00000010";
b(2,1) <=x"00000001";


WAIT;

END PROCESS;
END ;

For the FPGA, the tool tells me that I use more than 100% of the device; I want to use a Virtex-4.
I want to use this code six times to get the final answer, and the maximum of mm and nn should be 20. With that I get the warning that I use more than 100% of the device. Any suggestions for solving this problem?
As for ready being assigned '0': ready starts at '1' and changes to '0' once the result is available, because I use it as the reset for the next stage.
I have no diagram on paper; I just wrote the code by myself.

TrickyDicky said:
Your code only sets ready to '0'. Don't you ever want to set it to '1'? The assignment on the port gets overridden as you assign the signal internally, so you will get 'U' until it is set to '0'.

Why not post your testbench code too?

As for use in an FPGA - it will probably work for very small values of M, K and N, but it's going to eat all the resources of the FPGA very quickly.

Do you have an architectural diagram of the code, on PAPER? drawn BEFORE you wrote any code?

TrickyDicky · Jul 28, 2014

The code you have written suggests you know some software, but VHDL is NOT a programming language; it's a description language.
I think you really need to go back to basics. Get your model and assess what you need to achieve. Then you need to draw an architectural diagram showing the hardware needed to implement your design. Only then should you write the VHDL.

Do you really need 32 bits/word? Could you do any processing serially rather than in parallel? Could you use memories instead of registers?
If none of this makes sense, you need to go back and start learning some basics.

mrflibble · Jul 28, 2014

Problem identified:

Adnan86 said:
i have no diagram or code or paper , i just wrote it by myself.

Solution suggested:

TrickyDicky said:
I think you really need to go back to basics. Get your model and assess what you need to achieve. Then you need to draw an architectural diagram showing the hardware needed to implement your design. Only then should you write the VHDL.

@ regular posters:
Do we have an existing post or sticky somewhere with ready-to-go book/website suggestions for this sort of case? It would save a lot of time, since this occurs quite often...

Adnan86 · Jul 28, 2014

TrickyDicky said:
Do you really need 32 bits/word? Could you do any processing serially rather than in parallel? Could you use memories instead of registers?

I really do have to use 32 bits.
You said to use memory instead of registers! Can you give me some references on this topic?
Thank you

TrickyDicky · Jul 28, 2014

Are you sure? 32 bit sounds very much like a hangover from any C support. FPGA applications rarely need the full 18 bits, let alone 32. Hence why you need to go back to the algorithm level and make a bit-true model, and do some investigation as to how many bits are acceptable.

I really suggest a good tutorial on digital logic and VHDL is in order. The Designer's Guide to VHDL is supposed to be a good reference.

Adnan86 · Jul 28, 2014

TrickyDicky said:
Are you sure? 32 bit sounds very much like a hangover from any C support. FPGA applications rarely need the full 18 bits, let alone 32. Hence why you need to go back to the algorithm level and make a bit-true model, and do some investigation as to how many bits are acceptable.

I really suggest a good tutorial on digital logic and VHDL is in order. The Designer's Guide to VHDL is supposed to be a good reference.

Thank you for the time you gave me.

ads-ee · Jul 28, 2014

Tricky, mrflibble, Adnan86

The points both Tricky and mrflibble brought up were things I identified in your other thread. https://www.edaboard.com/threads/319891/#post1367787

I already suggested using RAM to implement the matrices as any value of M,N,K that was much bigger than your original values would become an issue. Somewhere along the line I also suggested you should really think about the hardware you are trying to implement not just writing code.
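
For illustration, a common way to describe such a memory is a clocked array with a registered read, a pattern many synthesis tools can map to block RAM. This is only a sketch under assumptions: the entity name, the generics and the one-element-per-address layout are hypothetical, and the depth of 400 simply corresponds to a 20 x 20 matrix.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity matrix_ram is
  generic (
    DEPTH : positive := 400;  -- assumed: one 20 x 20 matrix, one element per address
    WIDTH : positive := 32
  );
  port (
    clk  : in  std_logic;
    we   : in  std_logic;                  -- write enable
    addr : in  natural range 0 to DEPTH-1;
    din  : in  signed(WIDTH-1 downto 0);
    dout : out signed(WIDTH-1 downto 0)
  );
end entity matrix_ram;

architecture rtl of matrix_ram is
  type ram_t is array (0 to DEPTH-1) of signed(WIDTH-1 downto 0);
  signal ram : ram_t := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if we = '1' then
        ram(addr) <= din;
      end if;
      -- Registered read: dout is valid one clock after addr is applied.
      -- On a simultaneous write it returns the word stored before the write.
      dout <= ram(addr);
    end if;
  end process;
end architecture rtl;
```

With a row-major layout, element (m, n) of an M x N matrix indexed from 1 would sit at address (m-1)*N + (n-1). The multiply-accumulate logic then reads one element per clock instead of seeing the whole matrix at once.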

As a free alternative to Tricky's suggestion for a book... There are a few books there that might be helpful. Personally I've only read parts of the DSP one so can't comment on the quality of any others.

Adnan86 · Jul 28, 2014

ads-ee said:
Tricky, mrflibble, Adnan86

The points both Tricky and mrflibble brought up were things I identified in your other thread. https://www.edaboard.com/threads/319891/#post1367787

I already suggested using RAM to implement the matrices as any value of M,N,K that was much bigger than your original values would become an issue. Somewhere along the line I also suggested you should really think about the hardware you are trying to implement not just writing code.

As a free alternative to Tricky's suggestion for a book... There are a few books there that might be helpful. Personally I've only read parts of the DSP one so can't comment on the quality of any others.

I really appreciate your answer and your patience. You gave me a lot of advice in my previous thread and I used almost all of it. I used ROM too, and as I said in this thread, my code works correctly in ModelSim and gives the correct answer; it just has the one problem with changing input data, which is not so important. What I want now is to reduce the number of IOBs: by using ROM I got it down to 261, but I can only use 240. I tried hard to reduce it further but I really can't. I have read many papers on this but couldn't find any advice or clue. All I understand is that using ROM reduces it, but I already use ROM wherever necessary.
For example, for the code above, how can I reduce the IOBs by using ROM or anything else?
I am asking so much only because I have limited time to finish my project and I am tired of trying several approaches and writing several versions without finishing. If it bothers you, I'm really sorry.
Thank you again

ads-ee · Jul 28, 2014

So your problem is with sending in the M x N and N x K matrices and getting a M x K result?

That is an architectural problem, which you should have addressed by up front design of how you get the data into an array from outside to make a usable module.

You implemented this in a brute force broadside input.

Depending on the requirements (which none of us are privy to) you might be able to do one of the following.
1. you could load each element one at a time using a demultiplexer and an address for each element.
2. you could load each element into a ram and change your logic to read from a ram and perform the operations, which are then loaded into an output ram.
3. you could load each element serially into each element's register (as you are still using flip-flops). If you daisy chain all the registers you can load all the registers with only 2 pins (data, shift_enable) thereby saving an enormous number of pins.
4. etc.
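
To make option 3 concrete: with M = 3, N = 2, K = 1 and 32-bit words, a and b together hold 6 + 2 = 8 elements, which is 8 x 32 = 256 input bits when presented broadside. A minimal sketch of the daisy-chained alternative follows, treating all of those bits as one long shift register. The entity name, the TOTAL_BITS generic and the q output are hypothetical; only data and shift_enable come from the list above.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity serial_loader is
  generic (TOTAL_BITS : positive := 256);  -- assumed: 32 * (M*N + N*K) for M=3, N=2, K=1
  port (
    clk          : in  std_logic;
    shift_enable : in  std_logic;
    data         : in  std_logic;
    q            : out std_logic_vector(TOTAL_BITS-1 downto 0)  -- internal, not routed to pins
  );
end entity serial_loader;

architecture rtl of serial_loader is
  signal sr : std_logic_vector(TOTAL_BITS-1 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if shift_enable = '1' then
        -- Shift towards the MSB; the new bit enters at bit 0.
        sr <= sr(TOTAL_BITS-2 downto 0) & data;
      end if;
    end if;
  end process;

  q <= sr;
end architecture rtl;
```

After TOTAL_BITS enabled clocks, the first bit sent sits in the MSB of q, and slices of q can feed the individual matrix elements inside the FPGA. The moment shift_enable goes low again could also serve as a "new data" indication for the calculation.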

Regards

- - - Updated - - -

my asking too much just because of my limit time to finish my project and tiered to try several way and write several code without finishing my project

I get the impression this project was a bit over your head. I suspect you are missing expertise in some very rudimentary basics and/or are not very adept at architectural design of a digital circuit.

I'm not actually sure how one teaches how to architect a design, I never needed to be taught that skill. I've always been good at coming up with creative solutions.

Adnan86 · Jul 28, 2014

ads-ee said:
So your problem is with sending in the M x N and N x K matrices and getting a M x K result?

That is an architectural problem, which you should have addressed by up front design of how you get the data into an array from outside to make a usable module.

You implemented this in a brute force broadside input.

Depending on the requirements (which none of us are privy to) you might be able to do one of the following.
1. you could load each element one at a time using a demultiplexer and an address for each element.
2. you could load each element into a ram and change your logic to read from a ram and perform the operations, which are then loaded into an output ram.
3. you could load each element serially into each element's register (as you are still using flip-flops). If you daisy chain all the registers you can load all the registers with only 2 pins (data, shift_enable) thereby saving an enormous number of pins.
4. etc.

Regards

Thanks for your solutions.

- - - Updated - - -

ads-ee said:

I get the impression this project was a bit over your head. I suspect you are missing expertise in some very rudimentary basics and/or are not very adept at architectural design of a digital circuit.

I'm not actually sure how one teaches how to architect a design, I never needed to be taught that skill. I've always been good at coming up with creative solutions.

Sadly, no one taught me; I learned this by myself, and that is why I have problems with the rudimentary basics ...
