Category Archives: Java

Streaming MySQL Results Using Java 8 Streams

The article is inspired by the posts here and here.

Our team uses a RESTful service as its data-access infrastructure. It is based on Jersey/JAX-RS and runs fast. However, it consumes a lot of memory when constructing a large data set as a response, since it builds the entire response in memory before sending it.

As the posts above suggest, streaming is the solution. They integrate with Hibernate or Spring Data for easy adoption. But I need a general-purpose RESTful service that, say, does not know the schema of a table in advance. So I decided to implement it myself using the raw JDBC interface.

My class is called MysqlStreamTemplate:

  • It does not extend JdbcTemplate, since there is only one method for streaming, not a whole family of them. I’m not writing a general-purpose library.
  • It is MySQL only; I have no time to verify it with other relational databases.
  • It accepts a DataSource as the parameter of its constructor.
  • Stuff like the Hibernate session is not a concern, since the class maintains its Statement & Connection by itself.
  • Stuff like @Transactional is not a concern, since we do not care about transactions. Actually, StatementImpl#getResultSetHoldability() in MySQL's JDBC driver returns HOLD_CURSORS_OVER_COMMIT, meaning that our ResultSet survives a commit.

So, here is my class. NOTE: closing our Statement & Connection requires an explicit call to Stream#close():
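As a rough illustration of the idea (not the actual MysqlStreamTemplate), a minimal sketch might look like the following. The class name StreamingQuerySketch, the row representation as a Map, and the quiet-close helper are all assumptions made for this example. The one MySQL-specific detail is the fetch size: Connector/J streams rows one at a time only for a forward-only, read-only statement whose fetch size is Integer.MIN_VALUE.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;
import javax.sql.DataSource;

/** Hypothetical sketch of a streaming query helper; names are assumptions. */
class StreamingQuerySketch {
    private final DataSource dataSource;

    StreamingQuerySketch(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    /** Returns a lazy stream of rows. The caller must close the stream. */
    Stream<Map<String, Object>> stream(String sql) {
        Connection conn = open();
        try {
            Statement stmt = conn.createStatement(
                    ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
            // MySQL Connector/J switches to row-by-row streaming with this value.
            stmt.setFetchSize(Integer.MIN_VALUE);
            ResultSet rs = stmt.executeQuery(sql);
            ResultSetMetaData meta = rs.getMetaData();
            int columns = meta.getColumnCount();

            Spliterator<Map<String, Object>> rows =
                    new Spliterators.AbstractSpliterator<Map<String, Object>>(
                            Long.MAX_VALUE, Spliterator.ORDERED | Spliterator.NONNULL) {
                @Override
                public boolean tryAdvance(Consumer<? super Map<String, Object>> action) {
                    try {
                        if (!rs.next()) {
                            return false;
                        }
                        // No schema knowledge needed: columns come from the metadata.
                        Map<String, Object> row = new LinkedHashMap<>();
                        for (int i = 1; i <= columns; i++) {
                            row.put(meta.getColumnLabel(i), rs.getObject(i));
                        }
                        action.accept(row);
                        return true;
                    } catch (SQLException e) {
                        throw new IllegalStateException(e);
                    }
                }
            };
            // JDBC resources are released only when Stream#close() is called.
            return StreamSupport.stream(rows, false)
                    .onClose(() -> closeQuietly(rs, stmt, conn));
        } catch (SQLException e) {
            closeQuietly(conn);
            throw new IllegalStateException(e);
        }
    }

    private Connection open() {
        try {
            return dataSource.getConnection();
        } catch (SQLException e) {
            throw new IllegalStateException(e);
        }
    }

    private static void closeQuietly(AutoCloseable... resources) {
        for (AutoCloseable resource : resources) {
            try {
                resource.close();
            } catch (Exception ignored) {
                // Best effort: keep closing the remaining resources.
            }
        }
    }
}
```

With a sketch like this, the safest way to consume the rows is a try-with-resources block around the returned Stream, so that the ResultSet, Statement and Connection are closed even if processing fails halfway.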

Read inline comments for additional details. Now the response entry and controller mapping:
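To give an idea of what the response side has to do, here is a minimal sketch that writes rows to an OutputStream one at a time instead of building the whole body in memory. It is plain Java and not the actual controller code: the class name RowWriterSketch and the naive JSON encoding are assumptions for illustration. In a real service this method would be called from a streaming callback such as JAX-RS StreamingOutput or Spring's StreamingResponseBody, and a proper JSON library would do the encoding.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.Map;
import java.util.stream.Stream;

/** Hypothetical sketch: serializes a row stream incrementally as a JSON array. */
class RowWriterSketch {

    static void writeJsonArray(Stream<Map<String, Object>> rows, OutputStream out) {
        // try-with-resources guarantees Stream#close(), which releases JDBC resources.
        try (Stream<Map<String, Object>> closing = rows) {
            Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
            writer.write('[');
            Iterator<Map<String, Object>> it = closing.iterator();
            boolean first = true;
            while (it.hasNext()) {
                if (!first) {
                    writer.write(',');
                }
                first = false;
                writeRow(it.next(), writer); // only one row is held in memory
            }
            writer.write(']');
            writer.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void writeRow(Map<String, Object> row, Writer writer) throws IOException {
        writer.write('{');
        boolean first = true;
        for (Map.Entry<String, Object> entry : row.entrySet()) {
            if (!first) {
                writer.write(',');
            }
            first = false;
            writer.write(quote(entry.getKey()));
            writer.write(':');
            Object value = entry.getValue();
            if (value == null) {
                writer.write("null");
            } else if (value instanceof Number || value instanceof Boolean) {
                writer.write(value.toString());
            } else {
                writer.write(quote(value.toString()));
            }
        }
        writer.write('}');
    }

    // Naive escaping (backslash and double quote only); an assumption for brevity.
    private static String quote(String s) {
        return '"' + s.replace("\\", "\\\\").replace("\"", "\\\"") + '"';
    }
}
```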

The complete code can be found on my GitHub repository.

My simple benchmark script looks like:

jconsole shows dramatic improvements in memory usage, especially in Old Gen:
[Figure: all_memory]
[Figure: old_gen_memory]

Some raw data from jmap:

  • Jersey
  • Spring Boot
  • Spring Boot with Streams

Setting up Hadoop HDFS in Pseudodistributed Mode

Well, I'm new to the big data world.

I followed Appendix A of the book Hadoop: The Definitive Guide, 4th Ed., just to get it to work. I’m running Ubuntu 14.04.

1. Download and unpack the Hadoop package, and set the environment variables in your ~/.bashrc.

Verify with:

The 2.5.2 distribution package is built with 64-bit *.so files; use the 2.4.1 package if you want 32-bit ones.

2. Edit config files in $HADOOP_HOME/etc/hadoop:

3. Configure SSH:
Hadoop needs to start daemons on the hosts of a cluster over SSH. A key pair is generated so that no password has to be typed.

Verify with:

4. Format HDFS filesystem:

5. Start HDFS:

Verify that the daemons are running with the jps command:

6. Some tests:

7. Stop HDFS:

8. If there is an error like:

Just edit $HADOOP_HOME/etc/hadoop/hadoop-env.sh and export JAVA_HOME explicitly there. This does happen under Debian; I do not know why the environment variable is not passed over SSH.

9. You can also set HADOOP_CONF_DIR to use a separate config directory for convenience. But make sure you have the whole directory copied from the Hadoop package. Otherwise, nasty errors may occur.

OO Implementation in Java

Continuing from the last article, we will try to write an equivalent application that uses OO features including encapsulation, inheritance, polymorphism, properties, meta info and an event-driven mechanism. Java supports the first three features at the language level. It uses interfaces to implement the event-driven mechanism. To implement properties and meta info, we have to write our own code. We want to implement an API like someObject.setProperty(prop-name, prop-value). I wrote my own NewObject class:

To use our setProperty()/getProperty() methods, all classes should derive from the NewObject class. To be consistent with the JavaBean convention, we assume that the getter/setter is named “get”/“set” + capitalize_first_letter_of(member-variable-name).
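The naming convention above can be sketched with plain reflection. This is only an illustration under assumptions, not the original NewObject: the names NewObjectSketch and PersonSketch are hypothetical, and the lookup simply takes the first public one-argument method called “set” + capitalized name.

```java
import java.lang.reflect.Method;

/** Hypothetical base class: maps property names to JavaBean-style accessors. */
class NewObjectSketch {

    public void setProperty(String name, Object value) {
        String setter = "set" + capitalize(name);
        for (Method m : getClass().getMethods()) {
            if (m.getName().equals(setter) && m.getParameterCount() == 1) {
                invoke(m, value);
                return;
            }
        }
        throw new IllegalArgumentException("No setter for property: " + name);
    }

    public Object getProperty(String name) {
        String getter = "get" + capitalize(name);
        for (Method m : getClass().getMethods()) {
            if (m.getName().equals(getter) && m.getParameterCount() == 0) {
                return invoke(m);
            }
        }
        throw new IllegalArgumentException("No getter for property: " + name);
    }

    private Object invoke(Method m, Object... args) {
        try {
            m.setAccessible(true); // lets the demo work with non-public classes
            return m.invoke(this, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    private static String capitalize(String s) {
        return Character.toUpperCase(s.charAt(0)) + s.substring(1);
    }
}

/** Hypothetical subclass exposing one property named "name". */
class PersonSketch extends NewObjectSketch {
    private String name;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }
}
```

For example, new PersonSketch().setProperty("name", "Ada") ends up calling setName("Ada"), and getProperty("name") calls getName().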

A Property annotation and a PropertyAccess enum are defined to mark properties:

ClassInfo and ClassInfoList annotations are defined to carry class meta info:

Let’s see how to use them. Our Base class is defined as:

Since our properties are implemented simply as methods, they are inherited by subclasses. But the class meta info cannot be retrieved through subclasses; each class just gets its own.
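The difference between the two can be reproduced with a small sketch. The annotation members, the enum constants and the BaseSketch/DerivedSketch classes below are assumptions made up for this example; only the names Property, PropertyAccess and ClassInfo come from the text. The behavior shown is standard Java: a class-level annotation that is not marked @Inherited is invisible on subclasses, while an annotated public method is still found through the subclass.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

enum PropertyAccess { READ, WRITE, READ_WRITE } // constants are assumptions

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Property {
    PropertyAccess value() default PropertyAccess.READ_WRITE;
}

// Deliberately NOT @Inherited, so subclasses do not see it.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface ClassInfo {
    String name();
    String value();
}

@ClassInfo(name = "author", value = "base-author")
class BaseSketch {
    @Property(PropertyAccess.READ)
    public int getId() {
        return 1;
    }
}

class DerivedSketch extends BaseSketch {
}

class MetaInfoDemo {
    /** Returns "name=value" of the class meta info, or null if the class has none. */
    static String classInfoOf(Class<?> type) {
        ClassInfo info = type.getAnnotation(ClassInfo.class);
        return info == null ? null : info.name() + "=" + info.value();
    }

    /** True if the class exposes a public method with this name marked as a property. */
    static boolean isProperty(Class<?> type, String methodName) {
        for (Method m : type.getMethods()) {
            if (m.getName().equals(methodName) && m.isAnnotationPresent(Property.class)) {
                return true;
            }
        }
        return false;
    }
}
```

Here MetaInfoDemo.classInfoOf(DerivedSketch.class) yields null, while MetaInfoDemo.isProperty(DerivedSketch.class, "getId") is true. Adding @Inherited to ClassInfo would be one way to change the first result, if that were wanted.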

I do not want to demo the events/listeners code here; you can find it in the source code on my SkyDrive: http://cid-481cbe104492a3af.office.live.com/browse.aspx/share/dev/TestOO, in the TestJavaObject-{date}.zip file.

Zip Compression Bug in JDK was Fixed

The long-standing zip file name encoding bug in the JDK has finally been fixed. Since the original ZipEntry class encodes/decodes the file name in the platform’s native encoding, a zip file created under one codepage cannot be decoded correctly under another codepage. The workaround has been to use the zip classes from the Ant project. But in the current JDK7 early access build, the JDK has added the same kind of encoding support that Ant provides.
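In the released JDK 7 API, the encoding is passed as a Charset to the constructors of ZipOutputStream, ZipInputStream and ZipFile. A minimal round-trip sketch follows; the class and method names are made up for the example, and ISO-8859-1 is chosen only because every JVM is guaranteed to support it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical helper showing the Charset-aware zip constructors (JDK 7+). */
class ZipCharsetSketch {

    /** Creates an in-memory zip with one entry whose name is encoded in the given charset. */
    static byte[] zip(String entryName, byte[] content, Charset charset) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(buffer, charset)) {
            out.putNextEntry(new ZipEntry(entryName));
            out.write(content);
            out.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Reads back the first entry name, decoding it with the given charset. */
    static String firstEntryName(byte[] zipBytes, Charset charset) {
        try (ZipInputStream in =
                 new ZipInputStream(new ByteArrayInputStream(zipBytes), charset)) {
            ZipEntry entry = in.getNextEntry();
            return entry == null ? null : entry.getName();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The name survives the round trip only when the reader uses the same charset as the writer, which is exactly the cross-codepage problem described above.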

I came across this feature by chance while checking the JDK7 project’s changeset: http://download.java.net/jdk7/changes/jdk7-b57.html