Friday, January 18, 2019

Introduction to Apache POI - 1


Introduction to Apache POI - 1

Many times it is required to generate reports for an application or an application can receive data in excel files. In such situations, it required to either read or write to excel files.
There are many software which help you achieve it such as Apache POI, GrapeCity, Aspose Cells, JExcel, JXL …

Apache POI developed by Apache Software Foundation, is an open sourced library used by many Java developers as it provides easy way to design, modify Office files. It helps you almost imitate your original Office files.
Components of Apache POI:
Commonly used and supported file formats:

  • HSSF : excel files (.xls) 
  • HSLF : power point (.ppt)
  • HWPF : word documents (.doc)
  • XSSF : excel files(.xlsx)

Excel files formats:
  • There are two major formats Excel 93-2003 workbook- it follows binary file format of .xls extension
  • Excel 2007+ - it follows XML based file format of .xlsx extension
Apache POI has options to format both file formats. HSSF can be used for excel 93-2003 workbooks. XSSF can be used for excel 2007+ workbooks.
Apache POI provided us with another streaming API SXSSF which helps us work with vert large spreadsheets while using very less memory.

The following table synopsizes the comparative features of POI’s spreadsheet API:


Working with Workbooks:
Maven dependency:
<!-- https://mvnrepository.com/artifact/org.apache.poi/poi -->
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>4.0.1</version>
</dependency>

<!-- https://mvnrepository.com/artifact/org.apache.poi/poi-ooxml -->
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <version>4.0.1</version>
</dependency>

 If you are not using maven, then you need to add the following jars:
  • poi-3.17.jar
  • poi-ooxml-3.17.jar
  • poi-ooxml-schemas-3.17.jar
  • curvesapi-1.04.jar
  • xmlbeans-2.6.0.jar
  • commons-codec-1.10.jar
  • commons-collections4-4.1.jar
When you start using excel, you hear common words like workbook, sheet, row, cell, formula…
The same has been implemented in Apache POI.

Workbook: It is the high level representation of a excel workbook. User create this object to read or write a workbook. HSSFWorkbook and XSSFWorkbook class implement this class to read or write compatible formats.

            Creating a Workbook:
Workbook wb = new HSSFWorkbook ();
Workbook wb = new XSSFWorkbook ();


 Copying one Workbook to another:
             Workbook wb = new HSSFWorkbook ();
              …
        try  (OutputStream fileOut = new FileOutputStream("workbook.xls")) {
                wb.write(fileOut);
                 }


  Workbook wb = new XSSFWorkbook ();
                try  (OutputStream fileOut = new FileOutputStream("workbook.xls")) {
                wb.write(fileOut);
                 }
Sheet: It is the high level representation for worksheet. Sheets are the central structures of a workbook where user does most of his spreadsheet work.
Creating a new Sheet in a workbook:
Workbook wb = new XSSFWorkbook ();
wb.createSheet(String sheetname);

Get existing sheet in a workbook:
            Workbook wb = new XSSFWorkbook (file name);
            wb.getSheet(string sheetname);

Row: High level representation of a row in spreadsheet.
            Creating a new row:
Row row=sheet.createRow(int rownum);

Getting a created row:
Row row=sheet.getRow(int rownum);



Cell:  High level representation of a cell in a row of a spreadsheet.
Cells can be numeric, formula-based or string-based (text). The cell type specifies this. String cells cannot contain numbers and numeric cells cannot contain strings (at least according to our model). Client apps should do the conversions themselves. Formula cells have the formula string, as well as the formula result, which can be numeric or string.
Cells should have their number (0 based) before being added to a row.
Creating a cell:
            Cell cell=row.createCell(int cellnumber);
            cell.setValue(“creating a new cell”);

CellStyle: High level representation for formatting or styling the information for a cell in excel.

References:




No comments:

Post a Comment