Overview of the POI APIs
Jakarta POI
Jakarta provides the Jakarta POI APIs for manipulating various file formats
based upon Microsoft's OLE 2 Compound Document format using pure Java. In short,
we can read and write MS Excel files using Java. Currently we can read and write
Excel and PowerPoint files only. In the future, Jakarta POI (Java API To Access Microsoft Format Files)
will also be able to read and write Word files using Java. However, Jakarta POI
already has a complete API for porting other OLE 2 Compound Document formats.
OLE 2 Compound Document Format based files include most Microsoft Office files such as XLS and DOC as well as MFC serialization API based file formats.
Jakarta also collaborates with other projects: with OpenOffice.org to document the XLS format,
and with Lucene, for which Jakarta will soon have file format
interpreters. Jakarta provides all of this functionality.
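Because every OLE 2 Compound Document starts with the same eight-byte signature, a program can tell such a file apart from an Office 2007 XML document (which is a ZIP package) before handing it to POI. The following is a minimal sketch of that check using only the standard library. The class ContainerSniffer and its method names are hypothetical and are not part of the POI API; it is not a description of how POI itself performs this detection.

```java
// Hypothetical helper, not part of POI: guesses a container format
// from the leading bytes of a file.
class ContainerSniffer {
    // First 8 bytes of every OLE 2 Compound Document (e.g. XLS, DOC, PPT).
    static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // First 4 bytes of a ZIP local file header; Office 2007 XML files are ZIP packages.
    static final byte[] ZIP_MAGIC = { 0x50, 0x4B, 0x03, 0x04 };

    // Returns "OLE2", "ZIP" or "UNKNOWN" for the given leading bytes of a file.
    static String sniff(byte[] head) {
        if (startsWith(head, OLE2_MAGIC)) return "OLE2";
        if (startsWith(head, ZIP_MAGIC)) return "ZIP";
        return "UNKNOWN";
    }

    // True if data begins with every byte of prefix, in order.
    static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(sniff(OLE2_MAGIC));                 // prints OLE2
        System.out.println(sniff(ZIP_MAGIC));                  // prints ZIP
        System.out.println(sniff(new byte[] { 0x25, 0x50 }));  // prints UNKNOWN
    }
}
```

In practice the caller would read the first eight bytes of the file into the array passed to sniff. A "ZIP" result only says the file is some ZIP archive, so it is a hint rather than proof that the file is an Office 2007 document.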
Why is POI used?
We can use POIFS if we have a document written in the OLE 2 Compound Document
Format, probably written using MFC, that we need to read in Java. We
can use HSSF if we need to read or write an Excel file (XLS) using Java.
We can use HWPF for Word documents, HSLF for PowerPoint documents
and HPSF for document properties. We can also read and modify
spreadsheets using the POI API, although right now writing is more mature.
Overview
The following are components of the entire POI project and a brief summary of their purpose.
1.POIFS is used for OLE 2 documents. POIFS is the oldest and most stable part of the project. It is
the port of the OLE 2 Compound Document Format to pure Java. It supports both read and write functionality.
2.HSSF is used for Excel documents. HSSF is the port of the Microsoft Excel 97(-2003) file format (BIFF8) to pure Java. It supports both reading and writing.
3.HWPF is used for Word documents. HWPF is
the port of the Microsoft Word 97 file format to pure Java. It supports reading and limited writing.
4.HSLF is used for PowerPoint documents. HSLF is the port of the Microsoft PowerPoint 97(-2003) file format to pure Java. It
supports reading and writing some, but not yet all, of the core
records.
5.HPSF is used for document properties. HPSF is the port of the OLE 2 property set format to pure Java. Property sets are mostly used to store a document's properties (title, author, date of last modification etc.), but they can also be used for application-specific
purposes. HPSF supports reading and writing of properties. However, the current POI release only
provides the reading feature. In order to write properties, we have to fetch the latest POI version.
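The component list above can be condensed into a small lookup. The sketch below maps a file name to the POI component described for that document type, and falls back to POIFS on the assumption that any other input is still an OLE 2 Compound Document. The class PoiComponentGuide and its method are hypothetical illustrations and do not exist in POI; the extension-to-component mapping simply restates the list.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical lookup, not part of POI: restates the component list as code.
class PoiComponentGuide {
    // File extension -> POI component, following the list above.
    static final Map<String, String> BY_EXTENSION = new LinkedHashMap<>();
    static {
        BY_EXTENSION.put("xls", "HSSF"); // Excel 97(-2003) documents
        BY_EXTENSION.put("doc", "HWPF"); // Word 97 documents
        BY_EXTENSION.put("ppt", "HSLF"); // PowerPoint 97(-2003) documents
    }

    // Returns the component name for a file name. Anything else is assumed
    // to be a generic OLE 2 document and is mapped to POIFS.
    static String componentFor(String fileName) {
        int dot = fileName.lastIndexOf('.');
        String ext = dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return BY_EXTENSION.getOrDefault(ext, "POIFS");
    }

    public static void main(String[] args) {
        System.out.println(componentFor("budget.xls"));  // prints HSSF
        System.out.println(componentFor("Letter.DOC"));  // prints HWPF
        System.out.println(componentFor("archive.bin")); // prints POIFS
    }
}
```

HPSF is left out of the table on purpose: document properties are not tied to one extension, so HPSF can apply to any of these document types.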
Version history
1.Version 3.0-FINAL:
This version was released on 18 May 2007. In this version, one patch
(named POM) was fixed by Apache Jakarta, a new patch was
added for creating pictures in an HSSFShapeGroup, a new feature was added to detect
Office 2007 XML documents and throw a meaningful exception,
additional HSLF features were added for PowerPoint support, and HWPF
gained the ability to extract images.
2.Version 3.0-alpha3:
This version was released on 12 December 2006. In this version, new
HSLF functionality was added for PowerPoint support.
3.Version 3.0-alpha2:
This version was released on 16 June 2006. Its additional
features are formula support, PowerPoint support and extended
ASCII support.
4.Version 3.0-alpha1:
This version was released on 04 June 2005. It added
a patch for the HSSF hyperlink formula size problem, image support and
initial PowerPoint support.
That includes support for extracting text from a file, getting individual slides and their notes, and extracting text from those;
it also includes initial support for changing (but not adding) text (NB).
5.Version 2.5.1-FINAL:
This version was released on 29 February 2004. It added
outlining support and HSSFDateUtil.getExcelDate().
6.Version 2.5-FINAL (2004-02-29)
7.Version 2.0-FINAL (2004-01-26)
8.Version 2.0-RC2 (2004-01-11)
9.Version (2003-11-02)
10.Version 2.0-pre3 (2003-07-29)
11.Version 2.0-pre2 (2003-07-06)
12.Version 2.0-pre1 (2003-05-17)
13.Version 1.10-dev (2003-02-19)
14.Version 1.8-dev (2002-09-20)
15.Version 1.7-dev (Release date not recorded)
16.Version 1.5.1 (2002-06-16)
18.Version 1.5 (2002-05-06)
19.Version 1.1.0 (Release date not recorded)
20.Version 1.0.0 (Release date not recorded)
21.Version 0.14.0 (Release date not recorded)
22.Version 0.13.0 (Release date not recorded)
23.Version 0.12.0 (Release date not recorded)
24.Version 0.11.0 (Release date not recorded)
25.Version 0.10.0 (Release date not recorded)
26.Version 0.7 (and interim releases) (Release date not recorded)
27.Version 0.6 (release) (Release date not recorded)