Spring Framework for Apache Hadoop 2.3.0 GA released

The Spring team has released Spring for Apache Hadoop 2.3.0 GA with new features and many improvements.

Spring for Apache Hadoop 2.3.0 was released on 22nd December 2015 with new features and many improvements. This post examines the features added in this release.

Spring for Apache Hadoop helps developers quickly build and deploy Hadoop-based applications. It provides APIs for using HDFS, MapReduce, Pig, and Hive from within an application.

The following new features and improvements were added in Spring for Apache Hadoop 2.3:

  • Support for Hive 1.x has been added, and HiveTemplate has been updated to work with HiveServer2
  • A new batch tasklet has been added for Apache Spark
  • FlushTimeoutTrigger has been added to StoreObjectSupport; it can be used to flush to disk during writes
  • The internal state machine is now implemented through the new "spring-statemachine" project
  • jobHistoryAddress has been added to SpringHadoopProperties for Boot configuration
  • The build has been updated to use Spring Framework 4.2.4, Spring Batch 3.0.6.RELEASE, Spring Boot 1.3.1.RELEASE and Spring Integration 4.2.4.RELEASE.

This version of Spring for Apache Hadoop can be used to run Apache Spark jobs on Hadoop clusters. See our tutorial 'How to setup Apache Spark Development Environment?' to get started with Apache Spark.

Version-specific artifacts are available for the following distributions:

  • 2.3.0.RELEASE (default - Apache Hadoop stable 2.7.1)
  • 2.3.0.RELEASE-hadoop26 (Apache Hadoop 2.6.0)
  • 2.3.0.RELEASE-phd30 (Pivotal HD 3.0)
  • 2.3.0.RELEASE-phd21 (Pivotal HD 2.1)
  • 2.3.0.RELEASE-cdh5 (Cloudera CDH 5.4)
  • 2.3.0.RELEASE-hdp23 (Hortonworks HDP 2.3)

How to use Spring Framework for Apache Hadoop 2.3.0 GA?

Add the following dependency to your pom.xml file:

<dependencies>
    <dependency>
        <groupId>org.springframework.data</groupId>
        <artifactId>spring-data-hadoop</artifactId>
        <version>2.3.0.RELEASE</version>
    </dependency>
</dependencies>

In a Gradle-based application, the following dependency can be used:

dependencies {
    compile 'org.springframework.data:spring-data-hadoop:2.3.0.RELEASE'
}
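
To target one of the distribution-specific builds listed earlier, only the version string should need to change. As a sketch, assuming the Cloudera CDH 5 variant from that list, the Maven dependency might look like this (the vendor's own Hadoop libraries may require an additional repository entry, which is not shown here):

<dependencies>
    <!-- Assumption: CDH 5 build chosen for illustration; use the version suffix that matches your cluster -->
    <dependency>
        <groupId>org.springframework.data</groupId>
        <artifactId>spring-data-hadoop</artifactId>
        <version>2.3.0.RELEASE-cdh5</version>
    </dependency>
</dependencies>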

Spring for Apache Hadoop allows developers to create applications using Spring, Spring Batch, and Spring Integration that can be deployed on Hadoop clusters.

Check our Spring Framework tutorials section.