Concatenating bytes into a 32 bit word

shaiko

Hello,

I'm comparing two methods for concatenating received UART bytes into 32-bit words.

Method #0 uses unions:
Code:
#include <stdbool.h>
#include <stdint.h>
// UARTCharsAvail ( ) , UARTCharGet ( ) and UART0_BASE are expected to come from the vendor's UART driver headers ( not shown )

union word_from_bytes
{
   uint8_t array_bytes [4] ;
   uint32_t concatenated_array_bytes ;
} ;
union word_from_bytes some_union ;

// Stores one received byte at array_bytes [ index ] and returns the next index ( 0 .. 3 )
int8_t read_and_concatenate ( bool data_ready , int8_t index , union word_from_bytes * some_union )
{
	if ( data_ready )
	{
		some_union -> array_bytes [ index ] = UARTCharGet ( UART0_BASE ) ;
		if ( index == 3 )
		{
			index = 0 ;
		}
		else
		{
			index ++ ;
		}
	}
	return index ;
}

int main(void)
{
    int8_t current_uart_byte = 0 ; // index of the next byte to fill
    bool flag = 0 ;

    while ( 1 )
    {
        current_uart_byte = read_and_concatenate ( UARTCharsAvail ( UART0_BASE ) , current_uart_byte , & some_union ) ;
        if ( some_union.concatenated_array_bytes == 0x20202020 ) // Equivalent to pressing 4 times on the Space button of the keyboard
        {
          flag = 1 ;
        }
    }
}


Method #1 - uses shifting:
Code:
#include <stdbool.h>
#include <stdint.h>
// UARTCharsAvail ( ) , UARTCharGet ( ) and UART0_BASE are expected to come from the vendor's UART driver headers ( not shown )

// Shifts * data by whole bytes : 'r' = right , 'l' = left
void byte_shift ( char direction , uint8_t locations , uint32_t * data )
{
	if ( direction == 'r' )
	{
		* data = ( ( * data ) >> 8 * locations ) ;
	}

	else if ( direction == 'l' )
	{
		* data = ( ( * data ) << 8 * locations ) ;
	}
}

// Shifts the received byte into the low end of * data and returns the next byte index ( 0 .. 3 )
uint8_t read_and_concatenate ( bool data_ready , uint8_t index , uint32_t * data )
{
	uint32_t x = 0 ;
	if ( data_ready )
	{
		x = ( uint32_t ) UARTCharGet ( UART0_BASE ) ;
		* data = ( ( * data ) <<  8 ) | x ;
		if ( index == 3 )
		{
			index = 0 ;
		}
		else
		{
			index ++ ;
		}
	}
	return index ;
}

int main(void)
{
    uint8_t current_uart_byte = 0 ; // index of the next byte
    bool flag = 0 ;
    uint32_t full_word = 0 ;

    while ( 1 )
    {
        current_uart_byte = read_and_concatenate ( UARTCharsAvail ( UART0_BASE ) , current_uart_byte , & full_word ) ;
        if ( full_word == 0x20202020 ) // Equivalent to pressing 4 times on the Space button of the keyboard
        {
            flag = 1 ;
        }
    }
}

Although both of the above work, I'd like to know which one you find better to use, and why.
If you'd do it a different way, please write an example.
 

Better in which sense? Whenever possible, I prefer an approach that does not use pointers. Pointers often make the code harder to understand, and they increase the chance of accessing a variable outside its bounds.
 
For a real comparison, you should compare the assembly code generated from both C versions and see which has fewer instructions.

Anyway, I'd go for the union method, because it leaves the pointer arithmetic to the compiler.

Also, I'd optimize your code a bit:
Code:
		if ( index == 3 )
		{
			index = 0 ;
		}
		else
		{
			index ++ ;
		}
can be replaced with a single line of code without any branching. Unnecessary ifs are bad. Think about it...
 
"Also, I'd optimize your code a bit:"

Optimize how?

Code:
		if ( index == 3 )
		{
			index = 0 ;
		}
		else
		{
			index ++ ;
		}

This is exactly what I wrote...
 

Well, I wanted you to try to figure it out yourself, but since you're still asking, I'll tell you: use the modulo operator to wrap the counter around.
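
For reference, here is a minimal sketch of what the modulo suggestion could look like next to the original if/else. The function names next_index_branch and next_index_modulo are made up for this illustration and do not appear in the posted code.

```c
#include <stdint.h>

/* Original style: wrap the index with an explicit comparison. */
uint8_t next_index_branch(uint8_t index)
{
    if (index == 3) {
        index = 0;
    } else {
        index++;
    }
    return index;
}

/* Suggested style: wrap the index with the modulo operator.
 * The 4 is the number of bytes in the 32-bit word. */
uint8_t next_index_modulo(uint8_t index)
{
    return (uint8_t)((index + 1u) % 4u);
}
```

For every index from 0 to 3 the two functions should return the same value; whether the second one is actually cheaper depends on the compiler and target.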
 
Hello!

Careful with the modulo operator: depending on the compiler, it can be extremely
inefficient. I ran into that problem once and had to fall back to the basic comparison
method.
Besides this, what I usually use to concatenate bytes is this:


Code:
uint32_t four_bytes_value;
uint8_t * one_byte_value_ptr;

one_byte_value_ptr = (uint8_t *)(&four_bytes_value);



Then you can fill in the individual bytes through the pointer.

Example: if you do this:


Code:
one_byte_value_ptr[0] = 1;
one_byte_value_ptr[1] = 2;
one_byte_value_ptr[2] = 3;
one_byte_value_ptr[3] = 4;



then the 4-byte value will be 0x04030201 on a little-endian processor or
0x01020304 on a big-endian one.

Dora.
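
To make the endianness dependence concrete, here is a small sketch that fills a 32-bit value through a byte pointer and checks which layout the host uses. The helper names word_from_memory_order and host_is_little_endian are assumptions for this illustration, and standard uint8_t / uint32_t are used in place of the uint8 / uint32 typedefs.

```c
#include <stdint.h>

/* Fill a 32-bit value byte by byte, in memory order, through a byte pointer. */
uint32_t word_from_memory_order(const uint8_t bytes[4])
{
    uint32_t four_bytes_value = 0;
    uint8_t *one_byte_value_ptr = (uint8_t *)&four_bytes_value;

    for (int i = 0; i < 4; i++) {
        one_byte_value_ptr[i] = bytes[i];
    }
    return four_bytes_value;
}

/* Returns 1 when the byte at the lowest address is the least significant one. */
int host_is_little_endian(void)
{
    const uint8_t probe[4] = {1, 0, 0, 0};
    return word_from_memory_order(probe) == 1u;
}
```

With the bytes 1, 2, 3, 4 the result should be 0x04030201 when host_is_little_endian() returns 1 and 0x01020304 otherwise, so code that compares the word against a constant needs to know which case applies.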
 
Code:
		if ( index == 3 )
		{
			index = 0 ;
		}
		else
		{
			index ++ ;
		}

Here is an optimized version of the code above, without test/if:

Code:
		index++;
		index &= 3;

Edit: maybe better:

Code:
		index = (index + 1) & 3;
 

That's indeed even better than modulo, but isn't it a nasty trap for newcomers? I mean, it works with "3" because the count (4) is a power of two, but it wouldn't work with an arbitrary count, while modulo would...
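
A short sketch of that trap, assuming two hypothetical helpers that take the number of slots as a parameter (neither is from the posted code):

```c
#include <stdint.h>

/* Wrap with a bit mask: only correct when count is a power of two. */
uint8_t next_index_mask(uint8_t index, uint8_t count)
{
    return (uint8_t)((index + 1u) & (count - 1u));
}

/* Wrap with modulo: correct for any count greater than zero. */
uint8_t next_index_mod(uint8_t index, uint8_t count)
{
    return (uint8_t)((index + 1u) % count);
}
```

With count = 4 both versions step 0, 1, 2, 3, 0. With count = 3 the modulo version still steps 0, 1, 2, 0, but the mask becomes binary 10, so next_index_mask(0, 3) stays at 0 and next_index_mask(2, 3) stays at 2 instead of wrapping.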
 

For the original question:

Method #0 is efficient but not portable (will not give the same result on little-endian and big-endian machines)
Method #1 is portable but moves data around more than necessary.
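
One possible middle ground, sketched here as an assumption rather than as either of the posted methods: collect the four bytes in a plain array and build the word once with explicit shifts. The byte order is then stated in the code instead of being inherited from the processor. The names word_msb_first and word_lsb_first are hypothetical.

```c
#include <stdint.h>

/* First received byte becomes the most significant byte of the word. */
uint32_t word_msb_first(const uint8_t b[4])
{
    return ((uint32_t)b[0] << 24) |
           ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |
            (uint32_t)b[3];
}

/* First received byte becomes the least significant byte of the word. */
uint32_t word_lsb_first(const uint8_t b[4])
{
    return  (uint32_t)b[0]        |
           ((uint32_t)b[1] << 8)  |
           ((uint32_t)b[2] << 16) |
           ((uint32_t)b[3] << 24);
}
```

For the bytes 1, 2, 3, 4 the first function gives 0x01020304 and the second gives 0x04030201 on any machine. How this compares in code size or speed with the two original methods would still have to be checked on the actual compiler and target.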
 
I thought doraemon's comment had obsoleted the modulo discussion.

At best, the compiler recognizes that (index+1) % 4 can be replaced by (index+1) & 3 for a power-of-two modulo; at worst, it calls a division routine (if the processor has no hardware divider). A very good optimizing compiler might also recognize the comparison option.

Without referring to a specific processor and compiler, it's impossible to determine the most effective coding style for a byte concatenation. Compiling the alternatives and comparing the code size (or possibly the speed) is the way to go.

Some constructs in the original code, however, cause unwanted overhead with almost any compiler, e.g. using an ASCII constant as a direction flag
Code:
direction == 'r'

- - - Updated - - -

I see that std_match also recalls the original question.

"Method #1 is portable"
Only if right shift is the intended concatenation method, which may or may not depend on processor endianness.
 

Code:
index = (index + 1) & 0x03; // avoids modifying index twice in one expression, as index = (++index) & 0x03 would

Compared with the built-in modulo operator, the implementation above should, at least in theory, save code size, since you are telling the compiler to use nothing more than an increment and an AND on the accumulator. Note that the count limit above assumes a power-of-two number of indexes [0-3]. The modulo operator, on the other hand, is general purpose and works for any number. Anyway, looking at the C/ASM listings of today's compilers, we can see that they already have some "intelligence" to detect when an optimization can be done (e.g. a division by specific numbers done with a shift operation), so I would not be surprised if both modulo and the implementation above generated the same set of instructions for the particular case of 4 indexes.

Note:
There is a syntactical error in the original code, at least with the compilers I have used; I'll leave it to you to find it yourself.

Code:
index ++ ;
 

Code is portable if it gives the same result on all machines. Whether right shift is the desired operation is another question, but post #1 says that both methods are working.
 
