Continue to Site

Welcome to EDAboard.com

Welcome to our site! EDAboard.com is an international Electronics Discussion Forum focused on EDA software, circuits, schematics, books, theory, papers, asic, pld, 8051, DSP, Network, RF, Analog Design, PCB, Service Manuals... and a whole lot more! To participate you need to register. Registration is free. Click here to register now.

CRC circuit question

Status
Not open for further replies.
Why do we need 9 instances of CRC_parallel() functions with different input parameters ?
They are not instances. It is what you do in software or manually for steps 1-8 in your post #13 to generate matrix 1 and matrix 2.
The "instance" of CRC_parallel you use in verilog/VHDL is what you generate in step 9, from matrix 1 and matrix 2.
 

why "parallel" CRC requires doing cross-mapping across matrix 1 and matrix 2 ?
 

why "parallel" CRC requires doing cross-mapping across matrix 1 and matrix 2 ?
Bot the serial and the parallel implementations produce data that depends both on the previous state (Min = old Mout) and the input data (Nin).
The person who wrote the "instructions/steps" in post #13 decided to separate the dependencies in 2 matrices, matrix 1 for dependence from Nin, and matrix 2 for dependence from Min (old Mout).

It will be the same if you put all dependencies in one matrix, the "input data" (Nin + Min) on the left side, and the "output" (Mout) at the top.
 

    promach

    Points: 2
    Helpful Answer Positive Rating
Both the serial and the parallel implementations produce data that depends both on the previous state (Min = old Mout) and the input data (Nin).

but again, this still does not explain how using Min and Nin help to generate the "parallel" CRC equation
 

but again, this still does not explain how using Min and Nin help to generate the "parallel" CRC equation
You have a number of inputs (N + M) to the "unknown" parallel implementation and you can calculate the correct output by executing the serial implementaion N times. By flipping one input bit at a time and observing the correct result (from the serial implementation), you can tell if the flipped input bit should be included in the equation for a particular output bit.

When we flip Nin[3] (we set it to '1'), we see that Mout[0], Mout[2] and Mout[3] change (they become '1'). We then know that the term Nin[3] must be included in the equations for Mout[0], Mout[2] and Mout[3].
 

By flipping one input bit at a time and observing the correct result (from the serial implementation), you can tell if the flipped input bit should be included in the equation for a particular output bit.

What is the logical reasoning behind such operation ?

In your previous reply just above, you were explaining about the steps for parallel CRC operation, but not explaining why doing such operation resulted in parallel CRC ?
 

What is the logical reasoning behind such operation ?

In your previous reply just above, you were explaining about the steps for parallel CRC operation, but not explaining why doing such operation resulted in parallel CRC ?
Here is an attempt to explain "why" in small steps:

1. A CRC calculation uses only XOR operations to modify bits (the serial shift operation moves bits but doesn't change them)

2. A serial implementation processes one input bit per clock cycle

3. To process N input bits we can execute the serial implementation N times, where Min = Mout from the previous clock cycle.

4. Even if you execute the serial implementation N times, there are still only XOR operations that modify bits.

5. Combining XOR operatins results in a new XOR operation. You can't create an AND-operation or OR-operation from XOR-operations.

6. The parallel implementation for N input bits must use XOR operations

7. If an input bit is involved an even (zero is even) number of times in the serial XOR operations to produce a particular output bit, is has no effect. That particular output bit will have the same value regardless of the input bit value. That input bit shall not be used in the parallel XOR operation to produce that particular output bit.

8. If an input bit is involved an odd number of times in the serial XOR operations to produce a particular bit, is has the same effect as being used exactly once. If only that input bit is flipped, that particular output bit will also flip. The input bit must be used in the parallel XOR operation to produce that particular output bit.

9. To find if an input bit is used an even or odd number of times for the output bits, we flip it and see which output bits flip (by looking at the result after N iterations of the serial implementation). The output bits that flipped have used the input bit an odd number of times.

10. Every test with a flipped input bit according to the previous step will produce one row in matrix 1 or 2. A '1' in a column means that the corresponding output bit flipped when the input bit (corresponding to the row) was flipped.

11. When flipping of all input bits have been done (one at a time) we have all the rows in matrix 1 and 2 (a single matrix with all the rows is also OK).

12. To see which input bits shall be used in the equation for an output bit, we look in the column(s) corresponding to the output bit. Every '1' means that the input bit corresponding to the row must be included in the XOR operation.
 

    promach

    Points: 2
    Helpful Answer Positive Rating
@promach, are you familiar with lfsrs and forming the transition matrix to generate 2+ bits of output? This is an easier related problem. Or the general concept of how the connection polynomial works? This is a prerequisite to understand parallel CRC.
 

@std_match explanation about how the two matrices are inter-related is quite refreshing given the critical hint of relationship between flipping bits and the characteristics of xor gate used in crc.

However, as for generating matrix 2, I am still confused.

Note: I already know how to generate matrix 1
 

However, as for generating matrix 2, I am still confused.
For matrix 2, you flip bits in Min, which is Mout from the previous clock cycle (= the shift register in the serial implementation).
You need a way to set the shift register to a specific value. You set it to "00001". "00010", "00100" etc. and then execute N clock cycles while keeping all Nin bits to zero. For every such Min you will get one row in matrix 2, the row that corresponds to the flipped bit.

This is easy if you simulate the serial CRC implementation, but trickier if you only have access to a hardwired version.
 

Code:
100000000
100101 subtract
=========
 001010 (The 5 highest bits are the first row in the matrix)
 100101 no subtraction
=========
  010100 (The 5 highest bits are the second row in the matrix)
  100101 no subtraction
=========
   101000 (The 5 highest bits are the third row in the matrix)
   100101 subtract
=========
    01101 (No more dividend bits, this is the remainder (CRC) for the message "1000" and the fourth row in the matrix)

Could you show me how to compute matrix 2 from the above serial implementation ?
 

Code:
100000000
100101 subtract
=========
001010 (The 5 highest bits are the first row in the matrix)
100101 no subtraction
=========
  010100 (The 5 highest bits are the second row in the matrix)
  100101 no subtraction
=========
   101000 (The 5 highest bits are the third row in the matrix)
   100101 subtract
=========
    01101 (No more dividend bits, this is the remainder (CRC) for the message "1000" and the fourth row in the matrix)

Could you show me how to compute matrix 2 from the above serial implementation ?
What you have above is the mathematical way of doing the CRC calculation, not the serial implementation.
They of course give the same CRC result, but the "standard" serial implementation uses a smart trick to eliminate the need for appending M zero bits to the input message. The appended zeros in the first line above have nothing to do with Min bits for the "standard" serial implementation. It isn't obvious how Min maps to the mathematical calculation. We don't know which bit to flip in the mathematical calculation to compute a row in matrix 2 (there is a simple but not obvious mapping, so we leave that for now).

For the parallel implementation we need the same intermediate results (Min = old Mout) as the serial implementation has every N clock cycles. The changes (XOR operations) that the serial implementaion does in N clock cycles is done by the parallel implementation in one clock cycle.

We must first see how the serial implementation does the calculation above.
Note that no zero bits are appended to the input bits.
Code:
The XOR operation in the shift register in done when the input bit is different from the highest shift register bit (when Nin[x] XOR Min[4] = '1')
The highest polynomial bit isn't used for XORing the shift register, so "00101" is used.

Input message Nin = "1000", the highest bit Nin[3] is processed first
Initial shift register = "00000" (the shift register is Mout, which is also Min for the next clock cycle)
Nin[3] XOR Min[4] = '1', do shift and XOR with polynomial
new shift register = "00101"
Nin[2] XOR Min[4] = '0', only shift, no XOR
new shift register = "01010"
Nin[1] XOR Min[4] = '0', only shift, no XOR
new shift register = "10100"
Nin[0] XOR Min[4] = '1', do shift and XOR with polynomial
new shift register = "01101" = the CRC = the row in matrix 1 for Nin[3]

To generate a row in matrix 2, we need to have Nin = "0000" and flip a bit in Min, which means starting the calculation with a different initial value in the shift register.
To flip e.g. Min[4], we must start with "10000" in the shift register:
Nin[3] XOR Min[4] = '1', do shift and XOR with polynomial
new shift register = "00101"
Nin[2] XOR Min[4] = '0', only shift, no XOR
new shift register = "01010"
Nin[1] XOR Min[4] = '0', only shift, no XOR
new shift register = "10100"
Nin[0] XOR Min[4] = '1', do shift and XOR with polynomial
new shift register = "01101" = the row in matrix 2 for Min[4]

We can see that flipping the highest Nin bit gives the same result as flipping the highest Min bit.
This is also true for the second highest bits etc. so the last lines will always be identical in matrix 1 and matrix 2.
 
Last edited:

The XOR operation in the shift register in done when the input bit is different from the highest shift register bit (when Nin[x] XOR Min[4] = '1')

Huh ? I thought it should be the opposite : when Nin[x] AND Min[4] == '1'

remember that crc intermediate stages only do XOR operation between input message and polynomial Min whenever the msb of input message Nin is '1'
 

Huh ? I thought it should be the opposite : when Nin[x] AND Min[4] == '1'

remember that crc intermediate stages only do XOR operation between input message and polynomial Min whenever the msb of input message Nin is '1'
That is not correct. Do you mean the leftmost bit in the intermediate results in the "long division"?
Every time you do the XOR/subtraction in the long division, you modify the input bits, so the intermediate leftmost bit is not an input bit.

If you look at a serial CRC implementation, the first thing that happens with input bits is an XOR with the highest bit in the shift register.
The result from this XOR decides if the shift register is XORed with the polynomial.

The mathematical "long division" and the "standard" serial CRC implementation give the same result, but they don't do the bit processing in the same order.
It may sound confusing, but the standard serial CRC implementation subtracts the polynomial from input bits that haven't arrived at the input yet.
It is assumed that they are zero and it is compensated for by adding them when they arrive. This is why the serial CRC doesn't need M zero bits appended to the input message. It has already assumed that there are zero bits coming after the last input bit.

You can do a serial CRC implementation that works exactly as the "long division", but you then need to append M zero bits to the input message, as in the mathematical definition of the CRC calculation. Nobody wants that.
 

What does it exactly mean by the standard serial CRC implementation subtracts the polynomial from input bits that haven't arrived at the input yet. ?
 

What does it exactly mean by the standard serial CRC implementation subtracts the polynomial from input bits that haven't arrived at the input yet. ?
The polynomial is mathematically applied to M+1 (the polynomial size) input bits.
The long division first gets M+1 bits from the input and then starts processing. The first XOR can't happen until M+1 bits are in the "shift register".
The standard serial CRC looks at one input bit at a time and immediately decides if the XOR should be applied.
The XOR operation involves the following M bits. The long division already has those bits. The standard serial CRC implementation hasn't received them yet.
The order for XOR operations don't matter.
We can say that the long division first has the input bits and then XOR with the polynomial.
The standard CRC implementation first has the polynomial and then XOR with the input bits.
Because of this, the intermediate bits in the "shift register" are different, but the end result is the same.

You want a parallel implementation that works like the serial, with no zeros appended to the input.
This means that you must use the intermediate bits (Min = old Mout) from the serial CRC to generate the parallel equations.

If you convert the long division to a parallel version, you must also append M zero bits to the input message.
 

1. Understand the mathematical definition of the CRC calculation. It states that M zero bits must be appended to the input message before "dividing" by the polynomial. The remainder is the CRC.

2. The long division calculation does exactly what the definition says. M zero bits must be appended to the input message.

3. Understand how to make an efficient serial implementation from the definition, without the need to append zero bits.
This is probably the most difficult step.

4. Understand how to make an efficient N-bit parallel implementation from the efficient serial implementation. This is not difficult. Just flip one bit at a time and see what output bits have flipped after N clock cycles. Matrix 1 is for the flipped input bits. Matrix 2 is for the flipped shift register bits, which aren't the same as the intermediate result bits in the long division.

If you want a parallel implementation that works like the serial implementation, you must use the serial implementation to get the parallel equations.
 

    promach

    Points: 2
    Helpful Answer Positive Rating
I do not understand the following Verilog function implements the serial USB CRC5

Code:
/=============================================================
// Verilog function that implements serial USB CRC5
//=============================================================
function [4:0] crc5_serial;
input [4:0] crc;
input data;
begin
        crc5_serial[0] = crc[4] ^ data;
        crc5_serial[1] = crc[0];
        crc5_serial[2] = crc[1] ^ crc[4] ^ data;
        crc5_serial[3] = crc[2];
        crc5_serial[4] = crc[3];
end
endfunction 
//============================================================
 

I do not understand the following Verilog function implements the serial USB CRC5

Code:
/=============================================================
// Verilog function that implements serial USB CRC5
//=============================================================
function [4:0] crc5_serial;
input [4:0] crc;
input data;
begin
        crc5_serial[0] = crc[4] ^ data;
        crc5_serial[1] = crc[0];
        crc5_serial[2] = crc[1] ^ crc[4] ^ data;
        crc5_serial[3] = crc[2];
        crc5_serial[4] = crc[3];
end
endfunction
//============================================================
It is the normal serial implementation.
The polynomial is x^5 + x^2 + 1
The XOR operation (crc[4] ^ data) decides if the polynomial should be "subtracted" or not.
The polynomial is applied to bits [2] and [0].

As I mentioned earlier, it isn't obvious that this gives the same result as the CRC definition (the "long division").
 

Status
Not open for further replies.

Part and Inventory Search

Welcome to EDABoard.com

Sponsor

Back
Top