RSS feed
All(28) Eclipse(4) Java(5) JDBC(5) JSP(3) Maven(7) Personal (1) Spring(3) Tomcat(5)
<< How to handle redeploy of java WARs to tomcat5.5 on Windows and avoid jar file locking | Home | JavaRebel-Compiling and Reloading Java Code on the fly. Is much easier in Unexploded Format >>

Comparison of 2 Java Excel APIs on Large Workbooks/Worksheets

Comparison of Apache POI and JExcelApi

Essentially what I did was to compare the Apache POI HSSF Api and the JExcelApi.

My final findings show the most I could write on my machine before getting java.lang.OutOfMemory errors. I have 4GB of ram on my box. Basically I created the ExcelHelper inteface and had a PoiHelper implementation and a JExcelApi implementation. I then could compare apples to apples by making a jexcelapi junit tests which extended the poihelper tests with a different factor method for the ExcelHelper reference. In my junit tests basically I had a loop of paged queries which grabbed 10K Firestorm JDBC Dtos at a time and wrote them to the Excel Workbook.

 

public class PoiRegressionTest extends TestCase {

       private static final Logger log = Logger.getLogger(PoiRegressionTest.class);

 

       public static Test suite() {

              TestSuite suite = new TestSuite();

              suite.addTest(new PoiRegressionTest("testWriteSmallWorkbook"));

              suite.addTest(new PoiRegressionTest("testWriteLargeWorkbook"));

              return new IntegrationTestSetup(suite);

       }

      

       public ExcelHelper getExcelHelper(String fileName) throws Exception

       {

              return new PoiHelper(fileName, "0");

       }

 

 

public class JExcelApiRegressionTest extends PoiRegressionTest {

 

       private static final Logger log = Logger

       .getLogger(JExcelApiRegressionTest.class);

 

       public static Test suite() {

              TestSuite suite = new TestSuite();

              suite.addTest(new PoiRegressionTest("testWriteSmallWorkbook"));

              suite.addTest(new PoiRegressionTest("testWriteLargeWorkbook"));

              return new IntegrationTestSetup(suite);

       }

      

       public ExcelHelper getExcelHelper(String fileName) throws Exception

       {

              return new JExcelApiHelper(fileName, "0");

       }

 What I could write without Out Of Memory Errors:

Apache POI HSSF Api - 35 K rows of all 90+ columns of records.JExcelApi - 50K rows of all 90+ columns of records.
JExcelApi - 150K rows of 25 columns of records. This is spread across mutliple worksheets in a single workbook.

In the end game analysis it seems like POI is a bit more feature rich than JExcelApi, but JExcelApi is slightly more memory efficient. However they both suffer from the same problems in that Microsoft OLE documents have upstream pointers to downstream data which doesn’t allow you to stream them. Thereby ulitmately making the memory of the machine the limiting factor since the whole document must be in memory while its being written.

 

Microsoft Documents are not linear. They have upstream pointers to downstream documents. This shows the excel format.

 

http://www.onjava.com/pub/a/onjava/2003/04/16/poi_excel.html

 

Bla.xls {
    OLE2CDF headers
    "Workbook" stream {
        Workbook {
            Static String Table Record..
            Sheet names... and pointers <- this appears to be the issue
        } 
        Sheet {
            ROW
            ROW
            ...
            NUMBER RECORD (cell)
            LABELSST Record (cell)
            ...
        }
        Sheet
    }
}
... images, macros, etc.
Document Summary
Summary

 

Optimized PDFs are linear and therefore can be flushed like a stream and therefore memory of the box, as long as you page through the data, is not a limiting factor.  I believe the problem for us however is that they want the data in Excel and PDF is not going to meet the requirements of the “accounting” folks.

 


Tags :


Re: Comparison of 2 Java Excel Apis on Large Workbooks/Worksheets

one way to do that is to stream out the xml whihc is the MS excel style and then the client will try to open it in excel. As for the style, save a excel file as an xml will do. hope this helps.

Add a comment Send a TrackBack