In an effort to resolve memory and performance problems with generating large CSV and tab-delimited files in an application I wrote at Duke, I started hunting around for solutions.
Initially, I was using the Java StringBuffer approach, but found that it's really hard to be sure that CF doesn't create String objects along the way, especially when doing things like calling out to an external function to perform formatting (i.e., if the field is a string, surround it with double quotes and escape any internal double quotes).
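For reference, the formatting rule described above can be sketched in plain Java. This is only an illustration, not the function I actually used: the class and method names are hypothetical, and it assumes "escaping" means doubling each internal double quote, which is the usual CSV convention.

```java
// Illustrative sketch only; CsvQuote and quoteField are hypothetical names.
final class CsvQuote {

    // Wraps a value in double quotes and doubles any embedded double quotes
    // (assumed escaping convention).
    static String quoteField(String value) {
        StringBuilder out = new StringBuilder(value.length() + 2);
        out.append('"');
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '"') {
                out.append('"'); // escape by emitting the quote twice
            }
            out.append(c);
        }
        out.append('"');
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(quoteField("say \"hi\""));
    }
}
```

Calling something like this once per field from CF is the kind of per-cell overhead that adds up quickly on a drop with around a million cells.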
A simple file drop of 7200 rows and 140 columns took 68 seconds and sucked up a lot of memory. And no, it wasn't the file writing that caused the problem; it was the call out to the formatting function.
If I performed the same drop using the tab-delimited format, I didn't have to call out to that function, but the drop still took 30 seconds. I needed it to be faster because some of the drops my users perform are much, much larger.
So I started hunting around for a Java-based solution and found the JavaCSV library:
- Java CSV Homepage: http://www.csvreader.com/java_csv.php
- JavaCSV Sourceforge Project Page: http://sourceforge.net/projects/javacsv/
After installing this library in my C:\Jrun4\servers\myInstance\cfusion.ear\cfusion.ear\WEB-INF\cfusion\lib directory and restarting ColdFusion, I was able to use the following code to generate my CSV files:
<cfset var fileOutput = createObject("java","com.csvreader.CsvWriter")>
<cfset fileOutput.init("#expandPath("..")#\drops\#filename#")>
<cfif format eq "TAB">
<!--- use a tab character (ASCII 9) as the delimiter --->
<cfset fileOutput.setDelimiter( javacast("char", chr(9)) )>
</cfif>
<!--- write header --->
<cfloop from="1" to="#numFields#" index="i" step="1">
<cfset fileOutput.write( fieldsArray[i] ) >
</cfloop>
<!--- end of header row --->
<cfset fileOutput.endRecord()>
<!--- loop through results --->
<cfloop query="resultSet">
<!--- write record --->
<cfloop from="1" to="#numFields#" index="i" step="1">
<cfset fileOutput.write( resultSet[fieldsArray[i]][resultSet.currentRow].toString() )>
</cfloop>
<!--- write end of record --->
<cfset fileOutput.endRecord()>
</cfloop>
<cfset fileOutput.close()>
The same drop which had previously taken 68 seconds now only took 18 seconds - AND used considerably less memory.
As you can see, the code handles both CSV and tab-delimited formats AND handles the proper escaping of strings containing delimiters as well.
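To make the escaping point concrete, here is a rough Java sketch of the kind of delimiter-aware quoting a CSV writer performs. It is not JavaCSV's actual implementation; the class and method names are hypothetical, and it assumes a field is quoted only when it contains the delimiter, a double quote, or a line break.

```java
// Illustrative sketch only; not JavaCSV's code. Names are hypothetical.
final class DelimitedSketch {

    // Builds one output line, quoting a field only when it contains the
    // delimiter, a double quote, or a line break (assumed rule).
    static String formatRecord(char delimiter, String... fields) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                line.append(delimiter);
            }
            String f = fields[i];
            boolean needsQuotes = f.indexOf(delimiter) >= 0
                    || f.indexOf('"') >= 0
                    || f.indexOf('\n') >= 0
                    || f.indexOf('\r') >= 0;
            if (needsQuotes) {
                // Double any embedded quotes, then wrap the field in quotes.
                line.append('"').append(f.replace("\"", "\"\"")).append('"');
            } else {
                line.append(f);
            }
        }
        return line.toString();
    }

    public static void main(String[] args) {
        System.out.println(formatRecord(',', "Durham, NC", "Duke"));
        System.out.println(formatRecord('\t', "Durham, NC", "Duke"));
    }
}
```

With a comma delimiter the first field is quoted because it contains a comma; with a tab delimiter the same field passes through untouched, which is why one code path can serve both formats.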
