Adnan86
until the input data changes
Hi,
This has to be clocked. The input data are clocked in. The previous input data (the values used for the last calculation) are also clocked in.
Compare both values to check whether they differ. If they differ, start the calculation and set a "busy" flag.
After calculations clear busy flag.
*****
But the usual (and recommended) way is to work with a flag that shows new input data.
Klaus
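A minimal sketch of the compare-and-busy scheme described above, assuming a plain std_logic_vector input and a "done" pulse coming back from the calculation unit. The entity name change_detect and all port names are made up for illustration; they are not from the original design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper: pulses 'start' for one clock when the registered
-- input differs from the value used for the previous calculation.
entity change_detect is
  generic (WIDTH : positive := 32);
  port (
    clk   : in  std_logic;
    reset : in  std_logic;  -- synchronous, active high
    din   : in  std_logic_vector(WIDTH-1 downto 0);
    done  : in  std_logic;  -- assumed pulse from the calculation unit
    start : out std_logic;  -- one-cycle pulse: begin a new calculation
    busy  : out std_logic
  );
end entity change_detect;

architecture rtl of change_detect is
  signal din_reg  : std_logic_vector(WIDTH-1 downto 0) := (others => '0');
  signal din_prev : std_logic_vector(WIDTH-1 downto 0) := (others => '0');
  signal busy_i   : std_logic := '0';
begin
  process (clk)
  begin
    if rising_edge(clk) then
      start <= '0';                      -- default, overridden below
      if reset = '1' then
        din_reg  <= (others => '0');
        din_prev <= (others => '0');
        busy_i   <= '0';
      else
        din_reg <= din;                  -- clock the input in
        if busy_i = '0' then
          if din_reg /= din_prev then    -- input changed since last run
            din_prev <= din_reg;         -- remember what we calculate on
            busy_i   <= '1';
            start    <= '1';
          end if;
        elsif done = '1' then
          busy_i <= '0';                 -- calculation finished
        end if;
      end if;
    end if;
  end process;

  busy <= busy_i;
end architecture rtl;
```

One limitation of this sketch: after reset, an all-zero input is treated as "unchanged", so no calculation starts until the input differs from zero. A separate "new data" flag, as recommended above, avoids that corner case.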
Hi,
Design/write a state machine (FSM) that will control this operation. As Klaus suggested, generate some flags with this FSM to control the operation.
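As a rough sketch of what such a controller could look like, assuming a "new_data" pulse from the data source and a fixed number of multiply-accumulate steps. The entity mac_ctrl, the STEPS generic and the calc_en output are hypothetical names, not part of the posted code.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical controller: waits for new data, enables the datapath for
-- STEPS clocks, then reports ready again.
entity mac_ctrl is
  generic (STEPS : positive := 6);  -- e.g. M*N*K accumulate steps
  port (
    clk      : in  std_logic;
    reset    : in  std_logic;  -- synchronous, active high
    new_data : in  std_logic;  -- assumed pulse: fresh a/b are valid
    calc_en  : out std_logic;  -- high while the datapath should run
    ready    : out std_logic   -- high when idle, low while calculating
  );
end entity mac_ctrl;

architecture rtl of mac_ctrl is
  type state_t is (IDLE, CALC, DONE);
  signal state : state_t := IDLE;
  signal count : natural range 0 to STEPS-1 := 0;
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if reset = '1' then
        state <= IDLE;
        count <= 0;
      else
        case state is
          when IDLE =>
            if new_data = '1' then
              count <= 0;
              state <= CALC;
            end if;
          when CALC =>
            if count = STEPS-1 then
              state <= DONE;
            else
              count <= count + 1;
            end if;
          when DONE =>
            -- one cycle in which the result can be copied to the output
            state <= IDLE;
        end case;
      end if;
    end if;
  end process;

  calc_en <= '1' when state = CALC else '0';
  ready   <= '1' when state = IDLE else '0';
end architecture rtl;
```

Note that ready is driven in every state here, so it goes low during a calculation and returns to '1' afterwards.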
ENTITY mac_mat IS
GENERIC (M : integer:=3;
K : integer:=1;
N : integer:=2);
PORT (
a: IN matrix_32(1 to M, 1 to N); -- MxN matrix ;
b: IN matrix_32(1 to N, 1 to K); -- NxK matrix ;
clk: IN STD_LOGIC:='0';
reset: IN STD_LOGIC:='1';
ready : out std_logic := '1' ;
c: OUT matrix_32(1 to M, 1 to K):= (others=>(others => x"00000000"))
) ;
END mac_mat;
ARCHITECTURE rtl OF mac_mat IS
SIGNAL x_c: matrix_32(1 to M, 1 to K):= (others=>(others => x"00000000"));
BEGIN
PROCESS (clk)
Variable mm,nn,kk : integer := 1 ;
Variable sum : signed(63 downto 0) := x"0000000000000000" ;
Variable x_sum : signed(31 downto 0) := x"00000000" ;
Variable x_reg: matrix_64(1 to M, 1 to K):=(others=>(others => x"0000000000000000"));
BEGIN
IF (reset = '1') then
x_reg := (others=>(others => x"0000000000000000"));
x_c <= (others=>(others => x"00000000"));
sum := x"0000000000000000";
x_sum := x"00000000" ;
mm := 1 ;
nn := 1 ;
kk := 1 ;
ELSIF (rising_edge(clk)) THEN
---------------------------------------Given the matrices:
---------------------------------------MxN = 3x2
---------------------------------------NxK = 2x1
IF (mm <= M and nn <= N and kk <= K) then
x_reg(mm,kk) := a(mm,nn) * b(nn,kk);
sum := sum + x_reg(mm,kk);
nn := nn + 1 ;
IF (nn >= N) THEN
x_sum(31) := sum(63) ;
x_sum(30 downto 0):= sum(30 downto 0) ;
x_c(mm,kk) <= x_sum;
END IF ;
ELSIF (mm <= M and nn > N and kk <= K) then
--mm <= mm + 1;
kk := kk + 1 ;
nn := 1 ;
sum := x"0000000000000000" ;
x_sum := x"00000000" ;
ELSIF (mm <= M and kk > K) then
mm := mm + 1;
kk := 1 ;
nn := 1 ;
sum := x"0000000000000000" ;
x_sum := x"00000000" ;
ELSIF (mm > M ) then
ready <= '0' ;
c <= x_c ;
ELSE
c <= x_c ;
END IF ;
END IF ;
END process;
-- c <= x_c ;
END rtl;
.
.
..
type matrix_32 is array (integer range <>, integer range <>) of signed(31 downto 0);
type matrix_64 is array (integer range <>, integer range <>) of signed(63 downto 0);
.
.
ELSIF (mm > M ) then
ready <= '0' ;
c <= x_c ;
ELSE
LIBRARY IEEE;
USE IEEE.std_logic_1164.all;
USE IEEE.numeric_std.all ;
--USE work.matrix_pkg.all ;
USE work.ROM_SVM.all ;
ENTITY mac_mat_tb IS
END mac_mat_tb;
ARCHITECTURE test_mac_behav OF mac_mat_tb IS
constant M : integer := 3;
constant K : integer := 1;
constant N : integer := 2;
COMPONENT mac_mat IS
-- GENERIC (M : integer:=M;
-- K : integer:=K;
-- N : integer:=N);
PORT (
a: IN matrix_32(1 to M, 1 to N); -- MxN matrix ;
b: IN matrix_32(1 to N, 1 to K); -- NxK matrix ;
clk: IN STD_LOGIC;
reset: IN STD_LOGIC;
ready: out STD_LOGIC;
c: OUT matrix_32(1 to M, 1 to K)
) ;
END COMPONENT;
SIGNAL a: matrix_32(1 to M, 1 to N); -- MxN matrix ;
SIGNAL b: matrix_32(1 to N, 1 to K); -- NxK matrix ;
SIGNAL clk : std_logic := '0' ;
SIGNAL ready : std_logic := '0' ;
SIGNAL reset: STD_LOGIC := '1';
SIGNAL c: matrix_32(1 to M, 1 to K);
BEGIN
UUT : mac_mat
--generic map(M=>M,
-- N=>N,
-- K=>K);
PORT MAP (
a => a,
b => b,
clk=>clk,
reset => reset,
ready => ready,
c => c
);
clk <= NOT clk AFTER 10 NS ;
reset <= '0' AFTER 15 NS ;
--END ;
PROCESS --(clk)
BEGIN
WAIT FOR 20 NS;
a(1,1) <=x"00000001";
a(1,2) <=x"00000001";
a(2,1) <=x"00000010";
a(2,2) <=x"00000011";
a(3,1) <=x"00000010";
a(3,2) <=x"00000001";
b(1,1) <=x"00000010";
b(2,1) <=x"00000001";
WAIT FOR 10000 NS;
a(1,1) <=x"00000091";
a(1,2) <=x"00000001";
a(2,1) <=x"00000010";
a(2,2) <=x"00000011";
a(3,1) <=x"00000010";
a(3,2) <=x"00000001";
b(1,1) <=x"00000010";
b(2,1) <=x"00000001";
WAIT;
END PROCESS;
END ;
Your code only sets ready to '0'. Don't you ever want to set it to '1'? The assignment on the port gets overridden as you assign the signal internally, so you will get 'U' until it is set to '0'.
Why not post your testbench code too?
As for use in an FPGA - it will probably work for very small values of M, K and N, but it's going to eat all the resources of the FPGA very quickly.
Do you have an architectural diagram of the code, on PAPER? drawn BEFORE you wrote any code?
I have no diagram, code, or paper; I just wrote it by myself.
I think you really need to go back to basics. Get your model and assess what you need to achieve. Then you need to draw an architectural diagram showing the hardware needed to implement your design. Only then should you write the VHDL.
I really had to use 32 bits.
Do you really need 32 bits/word? Could you do any processing serially rather than in parallel? Could you use memories instead of registers?
.
Are you sure? 32 bit sounds very much like a hangover from any C support. FPGA applications rarely need the full 18 bits, let alone 32. Hence why you need to go back to the algorithm level and make a bit-true model, and do some investigation as to how many bits is acceptable.
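To make the bit-width question concrete, here is a small sketch of the arithmetic involved, assuming signed operands. The package mac_width_pkg and its functions are illustrative names only. The product of two w-bit signed values fits in 2*w bits, and summing n_terms of them needs about clog2(n_terms) extra bits, so the width of the accumulator follows directly from the input width.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical helper package for sizing a multiply-accumulate datapath.
package mac_width_pkg is
  -- Smallest b such that 2**b >= n (intended for small n).
  function clog2 (n : positive) return natural;
  -- Bits sufficient to sum n_terms products of two w-bit signed values.
  function acc_width (w : positive; n_terms : positive) return positive;
  -- Narrow x to out_w bits, clamping to the limits instead of wrapping.
  function saturate (x : signed; out_w : positive) return signed;
end package mac_width_pkg;

package body mac_width_pkg is
  function clog2 (n : positive) return natural is
    variable bits : natural  := 0;
    variable cap  : positive := 1;
  begin
    while cap < n loop
      cap  := cap * 2;
      bits := bits + 1;
    end loop;
    return bits;
  end function;

  function acc_width (w : positive; n_terms : positive) return positive is
  begin
    return 2*w + clog2(n_terms);
  end function;

  function saturate (x : signed; out_w : positive) return signed is
    variable max_v : signed(out_w-1 downto 0) := (others => '1');
    variable min_v : signed(out_w-1 downto 0) := (others => '0');
  begin
    max_v(out_w-1) := '0';  -- most positive value: 0111...1
    min_v(out_w-1) := '1';  -- most negative value: 1000...0
    if x'length <= out_w then
      return resize(x, out_w);       -- widening, nothing to clamp
    elsif x > resize(max_v, x'length) then
      return max_v;
    elsif x < resize(min_v, x'length) then
      return min_v;
    else
      return resize(x, out_w);       -- value fits, low bits are exact
    end if;
  end function;
end package body mac_width_pkg;
```

With 32-bit inputs and N = 2 terms per output element, acc_width(32, 2) gives 65 bits; with 16-bit inputs it gives 33. Whether the result can then be narrowed back down safely is exactly what a bit-true model of the algorithm would tell you.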
I really suggest a good tutorial on digital logic and VHDL is in order. The Designer's Guide to VHDL is supposed to be a good reference.
Tricky, mrfibble, Adnan86
The points both Tricky and mrfibble brought up were things I identified in your other thread. https://www.edaboard.com/threads/319891/#post1367787
I already suggested using RAM to implement the matrices as any value of M,N,K that was much bigger than your original values would become an issue. Somewhere along the line I also suggested you should really think about the hardware you are trying to implement not just writing code.
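For reference, a RAM-based matrix store could be sketched roughly as below. This is a generic single-port synchronous RAM pattern, not code from either thread; the entity name matrix_ram and the address mapping mentioned afterwards are assumptions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical single-port RAM holding one matrix, one element per address.
entity matrix_ram is
  generic (
    ADDR_W : positive := 3;   -- 2**ADDR_W elements
    DATA_W : positive := 32   -- bits per element
  );
  port (
    clk  : in  std_logic;
    we   : in  std_logic;
    addr : in  unsigned(ADDR_W-1 downto 0);
    din  : in  signed(DATA_W-1 downto 0);
    dout : out signed(DATA_W-1 downto 0)
  );
end entity matrix_ram;

architecture rtl of matrix_ram is
  type ram_t is array (0 to 2**ADDR_W-1) of signed(DATA_W-1 downto 0);
  signal ram : ram_t := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if we = '1' then
        ram(to_integer(addr)) <= din;
      end if;
      -- registered read: returns the old content on a same-address write
      dout <= ram(to_integer(addr));
    end if;
  end process;
end architecture rtl;
```

One possible mapping is to store element (i, j) of an M x N matrix at address i*N + j, counting from zero. Synthesis tools can often map this style onto block RAM, but that depends on the tool and the device, so check the synthesis report.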
As a free alternative to Tricky's suggestion for a book... There are a few books there that might be helpful. Personally I've only read parts of the DSP one so can't comment on the quality of any others.
I get the impression this project was a bit over your head. I suspect you are missing expertise in some very rudimentary basics and/or are not very adept at architectural design of a digital circuit.
I am asking so much only because I have limited time to finish my project, and I am tired of trying several approaches and writing several versions of the code without finishing it.
Thanks for your solutions.
So your problem is with sending in the M x N and N x K matrices and getting an M x K result?
That is an architectural problem, which you should have addressed by up-front design of how you get the data into an array from outside to make a usable module.
You implemented this in a brute force broadside input.
Depending on the requirements (which none of us are privy to) you might be able to do one of the following.
1. You could load each element one at a time using a demultiplexer and an address for each element.
2. You could load each element into a RAM and change your logic to read from a RAM and perform the operations, which are then loaded into an output RAM.
3. You could load each element serially into each element's register (as you are still using flip-flops). If you daisy-chain all the registers you can load all the registers with only 2 pins (data, shift_enable), thereby saving an enormous number of pins.
4. etc.
Regards
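Option 3 above could look something like the following sketch, assuming one long bit-serial chain that holds all elements. The entity name shift_load and the flat output vector q are illustrative choices; only data and shift_enable come from the description.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical daisy-chained loader: all element registers form one
-- shift register that is filled one bit per clock.
entity shift_load is
  generic (
    DEPTH : positive := 6;    -- number of matrix elements, e.g. M*N
    WIDTH : positive := 32    -- bits per element
  );
  port (
    clk          : in  std_logic;
    shift_enable : in  std_logic;
    data         : in  std_logic;  -- serial input
    q            : out std_logic_vector(DEPTH*WIDTH-1 downto 0)
  );
end entity shift_load;

architecture rtl of shift_load is
  signal chain : std_logic_vector(DEPTH*WIDTH-1 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if shift_enable = '1' then
        -- shift towards the MSB, new bit enters at bit 0
        chain <= chain(chain'high-1 downto 0) & data;
      end if;
    end if;
  end process;

  q <= chain;
end architecture rtl;
```

After DEPTH*WIDTH enabled clocks the first bit shifted in sits at the top of q, and element i (counting from 0) occupies q((i+1)*WIDTH-1 downto i*WIDTH). The trade-off is load time: the default sizes need 192 clocks per matrix instead of one.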
Sadly, no one taught me; I learned this by myself, which is why I have problems with the rudimentary basics...
I'm not actually sure how one teaches how to architect a design, I never needed to be taught that skill. I've always been good at coming up with creative solutions.