Real-time, FAST data storage in tab-delimited format in C++


syedshan

Hey everyone,


Problem: My system consists of reading data from a sensor in real time (using ADC + FPGA), saving it in DDR3, and then, after the algorithm has run, doing a FAST DMA transfer using C++ from the FPGA back to the PC to:

1. display it on screen (I am looking for appropriate libraries, maybe Qt or Boost), and
2. simultaneously store it to file in tab-delimited format.

Currently I am implementing only part 2. I have been able to do it by writing the data from C++ (without the sensor) to FPGA -> DDR3 to test my algorithm, and it was successful. The DMA read of all my files + data from DDR3 -> FPGA to the PC was very fast (< 0.4 seconds), but when storing in tab-delimited format the time increases to 9 seconds (500 files of 40000 data WORDs each).

Threading is not an option, since it is just a waste of resources and immensely increases the time...
What else can I do? I need experienced advice.

Shan
 

As a first step, I would use a profiling tool to find out what the actual bottleneck is. Or design tests that allow you to determine the processing time of each data processing step.
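
For a quick manual test without a profiler, wrapping each stage in std::chrono timers is usually enough. A minimal sketch (the DMA-read and file-save calls are placeholders for your actual steps):

Code:
#include <chrono>
#include <cstdio>

int main()
{
    using clock = std::chrono::steady_clock;

    auto t0 = clock::now();
    // ... DMA read step goes here ...
    auto t1 = clock::now();
    // ... file save step goes here ...
    auto t2 = clock::now();

    // Elapsed milliseconds between two time points
    auto ms = [](clock::time_point a, clock::time_point b) {
        return std::chrono::duration_cast<std::chrono::milliseconds>(b - a).count();
    };
    printf("DMA read : %lld ms\n", (long long)ms(t0, t1));
    printf("File save: %lld ms\n", (long long)ms(t1, t2));
    return 0;
}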
 
Hi,
Thank you for the reply.

I know what the bottleneck is: the method of saving the file in .txt (tab-delimited, in other words) format. I saved all the files in .bin (binary) format and it took only an extra 0.15 seconds; the whole process of DMA read from the FPGA plus binary file saving was done in less than 0.6 seconds.
So I think the bottleneck is that the C++ standard library is not fast enough at saving the INT values to a text file. But note that for the final movie generation I need the data in tab-delimited form, hence with binary I need to convert the .bin files back into the appropriate format (using Matlab) and then generate the movie.
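
For reference, the binary save is essentially one bulk write per file instead of 40000 formatted prints. A minimal sketch of that kind of dump (the function name and path handling are placeholders, not my actual code):

Code:
#include <cstddef>
#include <cstdio>

// Dump one DMA buffer as raw binary with a single bulk fwrite().
bool save_binary(const short *buf, size_t nwords, const char *path)
{
    FILE *f = fopen(path, "wb");
    if (!f)
        return false;
    size_t written = fwrite(buf, sizeof(short), nwords, f);
    fclose(f);
    return written == nwords;
}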

But now that I see what a profiler is, I suppose it will help me do some good stuff. Thanks.

The following is my code snippet for a single DMA transfer + file saving:


Code:
// Read one buffer from the DMA interface (40000 16-bit words = 80000 bytes)
rc = sipif_readdata(pInDMA, 40000 * 2);
if (rc != SIPIF_ERR_OK) {
    printf("Error in reading data\n");
    return -1;
}

sll_pInDMA = (ssint *) pInDMA;

// Save to file as tab-delimited integers, 200 values per row
// (forming an Excel-readable sheet)
for (int x_lp = 0; x_lp < 40000; x_lp++) {
    fprintf(opfile, "%hi\t", sll_pInDMA[x_lp]);
    if (top_n < 199) {
        top_n++;
    } else {
        fprintf(opfile, "\n");
        top_n = 0;
    }
}


Thanks.
 

It should be further distinguished whether the problem is the decimal conversion (processor speed) or the file I/O itself (non-optimal file handling). In the former case you might consider doing the "itoa()"-style binary-to-decimal conversion in the FPGA.
 

Usually we chant "testbench, testbench" around here. But in this case I agree with FvM: profile the site from orbit, it's the only way to be sure. For all you know it could be an excessively crappy file system that has problems handling all of 500 fopen's. Or maybe it's the C++ stream implementation, but that's unlikely given the rather low number of words (20M). Most likely it's just some non-optimal file writing...

The best way is to profile it and see where your CPU time goes. And if all else fails, you might reconsider your storage methods. As in, if you really really need speed you might want to skip tab-separated files and go for a binary format. But as said, if you only have 20M words (which I assume are at most 64-bit words, not 64K-bit words) then on a modern system that should be peanuts. Unless maybe you are reading uncached files from a really crappy USB stick or something. What are those files stored on? It's maybe juuuust big enough that storage would matter, but most likely it's just inefficient file access.
 
Thank you for the replies.

Previously I used to design test methods and write extra code to measure the time consumed by the different processes; now I can use MicroProfiler, which I found to be very handy. Thanks.

OK, so now coming to the issue. ^^

I noticed that most of the time, 9.01 seconds, is consumed in the text-file writing function, while the DMA read took only a few hundred ms (both figures for all 500 files); please see the image.

if you really really need speed you might want to skip tab-separated files and go for a binary format.

I already tested the binary format earlier, and it took only an extra < 0.3 seconds to save all 500 files in binary format. Hence the whole reading process takes around 0.6 seconds in total:
FPGA -> DMA -> malloced buffer -> .bin file save

But the slow case is the tab-delimited path:
FPGA -> DMA -> malloced buffer in PC -> .txt file save

As planned, in our system the binary format is fine when we only need to save the files for future use. But to generate the movie at run time, we must have the data as an Excel-readable (tab-delimited) file. This is the necessary part.

Moreover, I tried ofstream (and although I knew beforehand that ofstream is dead slow, the same process ended up taking 60 seconds... sigh, sigh!!). Hence fprintf() looks like the best available option for me now...

Do you have experience with writing one's own function or method for saving such files? (I have no such experience, but am looking forward to trying if it can be done!!)

Otherwise, please share any relevant idea so that I can bring this time as close to real time as possible (maybe 5 seconds or less... thanks).
 

Hence fprintf() looks like the best available option for me now...
It is if you like pain. You will probably be better off already by doing snprintf() into a reasonably large buffer and then writing that buffer to the file in one go. Otherwise you are doing 10 gazillion tiny writes to a file, which tends to be slow. And as an extra, with fprintf() as well as snprintf() you can write N words per single {f,sn}printf() call if you know that you are always getting multiples of N. Less function call overhead. A sketch of the buffered approach follows below.
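
A minimal sketch of that buffered idea, assuming ssint is a 16-bit short and keeping the 200-values-per-row layout from the earlier snippet (function and variable names are mine, not from this thread):

Code:
#include <cstdio>
#include <vector>

// Format all values into one RAM buffer, then write it to the file
// with a single fwrite(). Worst case per value is 7 chars ("-32768" + tab),
// so nwords * 8 bytes is a safe upper bound.
void save_tab_delimited(FILE *opfile, const short *data, int nwords)
{
    std::vector<char> buf(nwords * 8);
    char *p = buf.data();

    for (int i = 0; i < nwords; i++) {
        // Tab after each value, newline after every 200th
        p += snprintf(p, 8, "%hi%c", data[i],
                      (i % 200 == 199) ? '\n' : '\t');
    }
    fwrite(buf.data(), 1, p - buf.data(), opfile);
}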
 

fprintf() is a stream I/O function, so it involves basic file buffering, e.g. with a 512-byte buffer size. The buffer parameters can be adjusted for better performance.

Controlling the buffering yourself may still be reasonable. Unless you have explicitly disproved it, I'll assume that the decimal conversion performed as part of the %hi formatted print consumes most of the processing time. Some extra time is also wasted on parsing the format strings. Sketches of both points follow below.
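
Two corresponding sketches: enlarging the stdio buffer via setvbuf(), and replacing the formatted print with a hand-rolled short-to-decimal conversion (all names and the 1 MiB buffer size are my assumptions, not tested against this setup):

Code:
#include <cstdio>

// 1) Enlarge the stdio buffer so fprintf() flushes to disk far less often.
static char iobuf[1 << 20];   // 1 MiB, an arbitrary choice

FILE *open_with_big_buffer(const char *path)
{
    FILE *f = fopen(path, "w");
    if (f)
        setvbuf(f, iobuf, _IOFBF, sizeof iobuf);
    return f;
}

// 2) Hand-rolled short -> decimal text; no format-string parsing.
//    Writes the digits plus a separator at *p and advances the pointer.
void put_short(char *&p, short v, char sep)
{
    unsigned u = (unsigned)(v < 0 ? -(int)v : (int)v);
    if (v < 0)
        *p++ = '-';
    char tmp[6];
    int n = 0;
    do { tmp[n++] = (char)('0' + u % 10); u /= 10; } while (u);
    while (n)
        *p++ = tmp[--n];
    *p++ = sep;
}

put_short() could replace the fprintf("%hi\t", ...) call in the inner loop, filling a large buffer that is then flushed with a single fwrite(), as in the previous sketch.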
 
