Continue to Site

Welcome to EDAboard.com

Welcome to our site! EDAboard.com is an international Electronics Discussion Forum focused on EDA software, circuits, schematics, books, theory, papers, asic, pld, 8051, DSP, Network, RF, Analog Design, PCB, Service Manuals... and a whole lot more! To participate you need to register. Registration is free. Click here to register now.

scaling factor. 2^n or (2^n - 1)

Status
Not open for further replies.

bravoegg

Member level 2
Member level 2
Joined
Mar 28, 2016
Messages
51
Helped
1
Reputation
2
Reaction score
1
Trophy points
8
Activity points
501
I planned to use 8 bits to represent a sine wave in 2's complement form. Ideally it uses 1bit as sign bit, 7 bit as fractional bit, which mean I scale the sine wave by 2^7 = 128.
But when sine wave equals positive 1, it'll cause overflow after scaling. Because 1*128 exceeds the range 8bit 2's complement can represent.

I could use instead 6 bits to scale the sine wave(Q2,6), but this would reduce the precision.

I do see some people scale by (2^n - 1), and in this case it'll be scaled by 2^7-1 = 127.
Scaled by 2^7 equals shifting the binary point right by 7. But scaled by 2^7 - 1 seems to violate that relation.


Questions:
1. is it true that after scaled by (2^n - 1), I no longer know the location of binary point?
2. following (1), since I lose track of binary point, I have to treat the numbers now as INTEGERS, how can I do addition without knowledge of binary point's location?
3. what modifications needed if I still want to use 2^n, in this case, n = 7(total bit width 8) which is not enough to represent whole range.

Thanks in advance.
Sy.
 

I planned to use 8 bits to represent a sine wave in 2's complement form. Ideally it uses 1bit as sign bit, 7 bit as fractional bit, which mean I scale the sine wave by 2^7 = 128.
But when sine wave equals positive 1, it'll cause overflow after scaling. Because 1*128 exceeds the range 8bit 2's complement can represent.

I could use instead 6 bits to scale the sine wave(Q2,6), but this would reduce the precision.
Or you could use 9 bits and maintain the precision. Which leads to a couple of points:
- What is your reason for planning to use 8 bits? Is there a specific requirement to do so?
- What are your actual requirements for how accurately you need to represent the sine wave?
- All the digital stuff is simply generating a number, so how is the actual sine wave generated?

I do see some people scale by (2^n - 1), and in this case it'll be scaled by 2^7-1 = 127.
Scaled by 2^7 equals shifting the binary point right by 7. But scaled by 2^7 - 1 seems to violate that relation.
Scaling by 2^7-1 is multiplying by 127, just like scaling by 2^7 is multiplying by 128. Nothing magic.


Questions:
1. is it true that after scaled by (2^n - 1), I no longer know the location of binary point?
No
2. following (1), since I lose track of binary point, I have to treat the numbers now as INTEGERS, how can I do addition without knowledge of binary point's location?
The point's location has not been lost.
3. what modifications needed if I still want to use 2^n, in this case, n = 7(total bit width 8) which is not enough to represent whole range.
Several solutions are immediately obvious:
- Use nine bits rather than 8
- Scale by 127 rather than 128
- Scale by 64 rather than 128.

I'm sure others will post other more creative solutions.

Kevin Jennings
 

Hi,

int8 has a range of -128 ... +127. Therefore i find it usesful to scale to 127 (instead of 128 = overflow on positive side).
It does not need more processing power nor do I see other drawbacks.

I think it rather is a problem of human imagination.

Klaus
 

there's no specific reason for using 8bit. I'm studying signal processing after work...

what confuses me is that where is the binary point after scaling by 2^n-1. Could you please exalain more on this?
I find it hard to imagine the location, and if there's another different QM.N format number coming in, it'll be impossible to do addition for now I have no idea where the binary point is.

Thanks
Sy.
 

what confuses me is that where is the binary point after scaling by 2^n-1. Could you please exalain more on this?
The point is in the same place where you said it was in your first post, right after the sign bit and before the 7 fractional bits (S.NNNNNNN)

Kevin
 

The choice is application dependent. using the maximum value -- (127*sin(x))/128 -- results in the highest dynamic range. (64*sin(x))/64 results in less precision, but does allow for 1.0 and -1.0 results.

It is another source of numerical error. In many cases, it is better to preserve the dynamic range until the very end of processing and then apply a final scaling factor.
 

Hi,

but does allow for 1.0 and -1.0 results.
If you scale itin one direction, then you have to scale it back.

There is no need to only consider 2^n values to be 1.
If you consider 127 to be 1 (127:1 scale), then the range includes +/-1.

The difference is, that using 2^n values may be scaledby bit shifts, while other values need multiplication.
***
With ADCs the same problem is more common:
A 10 bit ADC has a digital range of 0...1023.
Now it depend on VRef what voltage scale this means.
With a 3.00Vref the (re-) scale factor is: 3V/1024.
With using 5V supply voltage as Vref the factor is: 5V/1024.
If the 5V supply voltage is 1% too low = 4.95V, then the factor is 4.95V/1024.

--> here it is obvious and common to use scale factors other than 2^n.

Klaus
 

If you choose Q1.7 fixed point format the representable number range is -1.00 .. 0.99. The sine table can be either scaled with 127/128 (0.99) to fit the range, or with 1.0 and the maximum truncated (saturated) to 0.99. As already said, the decision is application dependent, the practical difference probably not so important. For least rounding error, you'll do the scaling in real arithmetic and round to nearest before converting to integer respectively fixed point.
 
so if I use Q1.7 and scale the sine wave by 127, what happens is:
sine * (127/128) * 128 = sine * 127.

the first (127/128) is used to scale the sine wave to fit in (-0.99, 0.99), to avoid later overflow.
the last 128(2^7) is used to shift binary point.
 

Status
Not open for further replies.

Part and Inventory Search

Welcome to EDABoard.com

Sponsor

Back
Top