CRC circuit question

Status
Not open for further replies.
Wait, in the long division process just above, why there is "no subtraction" when the most significant bit of dividend is 0 (not 1) ?
 

in polynomial long division, you want the highest power coef to become zero. if it is zero, this isn't hard to do.

For GF2 math, addition and subtraction are the same and both are xor.
 

Code:
100000000
100101 subtract
=========
 001010 (The 5 highest bits are the first row in the matrix)
 100101 no subtraction
=========
  010100 (The 5 highest bits are the second row in the matrix)
  100101 no subtraction
=========
   101000 (The 5 highest bits are the third row in the matrix)
   100101 subtract
=========
    01101 (No more dividend bits, this is the remainder (CRC) for the message "1000" and the fourth row in the matrix)

Your long division mechanism seems to be different from that described in wikipedia article

Code:
11010011101100 000 <--- input right padded by 3 bits
1011               <--- divisor
01100011101100 000 <--- result (note the first four bits are the XOR with the divisor beneath, the rest of the bits are unchanged)
 1011              <--- divisor ...
00111011101100 000
  1011
00010111101100 000
   1011
00000001101100 000 <--- note that the divisor moves over to align with the next 1 in the dividend (since quotient for that step was zero)
       1011             (in other words, it doesn't necessarily move one bit per iteration)
00000000110100 000
        1011
00000000011000 000
         1011
00000000001110 000
          1011
00000000000101 000
           101 1
-----------------
00000000000000 100 <--- remainder (3 bits).  Division algorithm stops here as dividend is equal to zero.
--- Updated ---

I think I know what the first matrix is doing. It is basically the long division but again storing all the in-between steps whether subtraction operation is performed or not. Refer to the wikipedia version for comparison.

In other words, the first matrix will possibly waste some LUT storage area since the second and third rows of the first matrix are simply shifted version of the previous row in your own crc example.

By the way , what do we need the second matrix (Nin = 0) ?
 
Last edited:

My long division is the same, but I show the "no subtraction" lines which don't do anything.

Each bit of the CRC is an XOR by a set of input bits and a set of "previous" (Mout) bits.

The first matrix shows the output bits that are affected by a certain input bit by calculating the CRC for all input words with only one bit set:
0001 (Nin[0] = 1]
0010 (Nin[1] = 1]
0100 (Nin[2] = 1]
1000 (Nin[3] = 1]

That you get all results as intermediate results when calculating the CRC for 1000 is just a "lucky" circumstance.

The second matrix shows the output bits that are affected by a certain "previous" output bit.
You have to initialize the shift register to all possible words with one bit set and then shifting in a full input word of only zeros. You will not get all results "for free" as intermediate results when doing only one of the calculations.
The CRC for input word 0000 must be calculated after initializing the shift register to the following words:
00001 (Min[0] = 1]
00010 (Min[1] = 1]
00100 (Min[2] = 1]
01000 (Min[3] = 1]
10000 (Min[4] = 1]

The "parallel" function for each output bit is simply the XOR of all input and "previous" bits that affect that particular output bit.
Combine the same column in both matrices to get the function for the corresponding output bit.
 

The second matrix shows the output bits that are affected by a certain "previous" output bit.

How to compute values for all 4 rows in the second matrix ?
 

The first matrix has N=4 rows.
The second matrix has M=5 rows.

It seems that all rows in the second matrix can be generated by two calculations.
The first row is the remainder when "dividing" a '1' + N zeros by the polynomial.
For the current example, a '1' followed by N zeros is 10000, which can't be divided by 100101, so it is the remainder (= the first row).

The last row can be generated by calculating the remainder when dividing the first row + M zeros by the polynomial.
The other rows can be seen in the intermediate results.

For the current example (N=4 M=5), this is exactly the same calculation as for the first matrix, so the last 4 rows in the second matrix are identical to the first 4 rows in the first matrix.


The above calculations will probably only confuse your understanding of what really happens.
The goal with the two matrices is to see which "input bits" (the input word + the previous remainder word) has an effect on each output bit (in the new remainder word).

The effect from the input bits can be seen in matrix 1, by beginning with all zeros in the shift register (the previous remainder bits) and process an input word with only one bit set.
The effect from the previous remainder bits can be seen in matrix 2, by setting one bit at a time in the shift register and then process an input word with only zeros.
 

The last row can be generated by calculating the remainder when dividing the first row + M zeros by the polynomial.

I suppose this second matrix is doing the exact same long division as the first matrix ?
 

but why is the first row of first matrix different from that of the second matrix ?

Note: All other rows are the same across first matrix and second matrix
 

When you really understand what is going on, generating the matrices is easy.
Each row is the result of a single '1' being processed N clock cycles in the serial CRC implementation.
The single '1 can be in the input word (matrix 1) or in the shift register/intermediate remainder/Min (matrix 2).

I don't think the following shortcut calculation will help you understand what is really going on,
but it may explain why the matrices can be very similar.
They can also be different in many lines when the input word size N is much larger than the CRC size M.

Both matrices are part of the same sequence, but with different starting points.
Take M-1 zeros, followed by a '1', followed by M+N-1 zeros. In this case 0000100000000.
Do a long division by the polynomial and record all intermediate remainders.
The first matrix will be N values starting at intermediate value number M
The second matrix will be M values starting at intermediate value number N

Code:
0000100000000
100101 no subtraction
=============
00010 intermediate value 1
000100 Get next bit from the dividend
100101 no subtraction
=============
  00100 intermediate value 2
  001000 Get next bit from the dividend
  100101 no subtraction
=============
   01000 intermediate value 3
   010000 Get next bit from the dividend
   100101 no subtraction
=============
    10000 intermediate value 4 (= N, so the first row in matrix 2)
    100000 Get next bit from the dividend
    100101 subtract
=============
     00101 intermediate value 5 (= M, so the first row in matrix 1)
     001010 Get next bit from the dividend
     100101 no subtraction
=============
      01010 intermediate value 6
      010100 Get next bit from the dividend
      100101 no subtraction
=============
       10100 intermediate value 7
       101000 Get last bit from the dividend
       100101 subtract
=============
        01101 Final remainder, always the last row in both matrices.
 
Last edited:

    promach

    Points: 2
    Helpful Answer Positive Rating
Just for your info, I have already understood how to generate first matrix.

I just do not know how to generate the second matrix.

From what I understand, first matrix alone is enough to compute CRC value.
Why still need the second matrix ?
 

From what I understand, first matrix alone is enough to compute CRC value.
Why still need the second matrix ?
How do you get the equations for a parallel CRC without matrix 2?
The equations for most output bits (probably all) contain terms from both matrices.
There is at least one Min[x] term (Min = from matrix 2) in all parallel equations in your post #13.
 

How do you get the equations for a parallel CRC without matrix 2?

Why parallel CRC needs matrix 2 ?
 

Why parallel CRC needs matrix 2 ?
Example from your post #13:
Mout[0] = Min[1]^Min[4]^Nin[0]^Nin[3]

You only know that Min 1 and 4 should be in the equation because the corresponding lines in matrix 2 have a '1' in the Mout[0] column.
The Nin terms come from the Mout[0] column in matrix 1.
 

    promach

    Points: 2
    Helpful Answer Positive Rating
You only know that Min 1 and 4 should be in the equation because the corresponding lines in matrix 2 have a '1' in the Mout[0] column.

Wait, what kind of logical reasoning behind this "parallel" CRC ?
 

Wait, what kind of logical reasoning behind this "parallel" CRC ?
The serial implementation can only process one input bit per clock cycle.
A parallel implementation can process N input bits per clock cycle.
The CRC result will stay the same.
One problem that can occur for a parallel implementation is that the input message is not an exact multiple of the input word size (N).
Extra logic is needed for that, so you normally choose N to be the smallest "chunk" size of the input message.

Example: if you want to calculate the CRC32 for a file, it would be logical to process 8 input bits (one byte) per clock cycle. N = 8.
 

One problem that can occur for a parallel implementation is that the input message is not an exact multiple of the input word size (N).
Extra logic is needed for that, so you normally choose N to be the smallest "chunk" size of the input message.

Extra logic ? As in matrix 2 ?
I am very confused on the purpose of matrix 2 and how to generate it.
 

in reply #30 , what does it exactly mean by 10000 intermediate value 4 (= N, so the first row in matrix 2) ?
 

Extra logic ? As in matrix 2 ?
I am very confused on the purpose of matrix 2 and how to generate it.
Forget about that special case and the needed extra logic for now. It has nothing to do with matrix 2.
in reply #30 , what does it exactly mean by 10000 intermediate value 4 (= N, so the first row in matrix 2) ?
The calculation in post #30 is a "shortcut" to get both matrix 1 and 2 with a minimum of work. It will not help you understand what is going on.
You need to understand your own post #13.

A serial implementation calculates in one clock cycle the new shift register value Mout from the old shift register Min and the input bit Nin (only one bit for the serial implementation).
We can write this as
Mout = CRC_serial(Nin, Min) where Nin is a single input bit and Min is Mout from the previous clock cycle
The calculation is repeated until all input bits are processed.

For a parallel implementation, we want the same result but we process N input bits per clock cycle:
Mout = CRC_parallel(Nin, Min) where Nin is N input bits and Min is Mout from the previous clock cycle

To get the equation for each individual output bit in Mout, we must investigate the relation between the CRC_parallel inputs bits Nin + Min and the output bits Mout. To do this we set only one input bit at a time and use the serial implementation to see what the result should be.

For N=4 and M=5 we want the results from the following calculations:
Mout = CRC_parallel("1000", "00000")
Mout = CRC_parallel("0100", "00000")
Mout = CRC_parallel("0010", "00000")
Mout = CRC_parallel("0001", "00000")
Mout = CRC_parallel("0000", "10000")
Mout = CRC_parallel("0000", "01000")
Mout = CRC_parallel("0000", "00100")
Mout = CRC_parallel("0000", "00010")
Mout = CRC_parallel("0000", "00001")

The results from having a bit set in Nin go to matrix 1. One Mout result per row.
The results from having a bit set in Min go to matrix 2. One Mout result per row.

Each of the results above is calculated by executing the serial implementation N times.
Examples:
CRC_parallel("1000", "00000")
is calculated by the following operations:
Mout = CRC_serial('1', "00000")
Mout = CRC_serial('0', Min) where Min is Mout from the previos clock cycle
Mout = CRC_serial('0', Min) where Min is Mout from the previos clock cycle
Mout = CRC_serial('0', Min) where Min is Mout from the previos clock cycle

CRC_parallel("0000", "10000")
is calculated by the following operations:
Mout = CRC_serial('0, "10000")
Mout = CRC_serial('0', Min) where Min is Mout from the previos clock cycle
Mout = CRC_serial('0', Min) where Min is Mout from the previos clock cycle
Mout = CRC_serial('0', Min) where Min is Mout from the previos clock cycle

Every row in matrix 1 and matrix 2 is calculated in the same way by executing CRC_serial N times.

When done, you have the relations between all input and output bits for CRC_parallel in matrix 1 and 2,
and you can write the equations for the individual bits of Mout.
Every '1' in matrix 1 and 2 means that the input bit corresponding to the line should be included when XOR-ing to get the output bit corresponding to the column.
 


Why do we need 9 instances of CRC_parallel() functions with different input parameters ?
 

Status
Not open for further replies.
Cookies are required to use this site. You must accept them to continue using the site. Learn more…