programming

Postgre OO database example

Traditional NonOO DB:

CREATE TABLE capitals (
name text,
population real,
altitude int, — (in ft)
state char(2)
);
CREATE TABLE non_capitals (
name text,
population real,
altitude int — (in ft)
);
CREATE VIEW cities AS
SELECT name, population, altitude FROM capitals
UNION
SELECT name, population, altitude FROM non_capitals;

 

This works OK as far as querying goes, but it gets ugly when you need to update several rows, to name
one thing.

A better solution is this:
CREATE TABLE cities (
name text,
population real,
altitude int — (in ft)
);
CREATE TABLE capitals (
state char(2)
) INHERITS (cities);

Standard
programming

Merge 2 BST’s

 

  • Flattening a BST into a sorted list is O(N)
    • It’s just “in-order” iteration on the whole tree.
    • Doing it for both is O(n1+n2)
  • Merging two sorted lists is into one sorted list is O(n1+n2).
    • Keep pointers to the heads of both lists
    • Pick the smaller head and advance its pointer
    • This is how the merge of merge-sort works
  • Creating a perfectly balanced BST from a sorted list is O(N)
    • The value at the middle would be the root, and recurse.
    • In our case the sorted list is of size n1+n2. so O(n1+n2)
    • The resulting tree would be the conceptual BST of binary searching the list

 

BinaryTree* sortedArrayToBST(int arr[], int start, int end) {

  if (start > end) return NULL;
  // same as (start+end)/2, avoids overflow.
  int mid = start + (end - start) / 2;
  BinaryTree *node = new BinaryTree(arr[mid]);
  node->left = sortedArrayToBST(arr, start, mid-1);
  node->right = sortedArrayToBST(arr, mid+1, end);
  return node;
}
BinaryTree* sortedArrayToBST(int arr[], int n) {
  return sortedArrayToBST(arr, 0, n-1);
}
Standard
programming

Dijkstra’s Shortest Path Algorithm in Java

http://www.vogella.com/articles/JavaAlgorithmsDijkstra/article.html

Dijkstra’s Shortest Path Algorithm in Java

Dijkstra’s Algorithms describes how to find the shortest path from one node to another node in a directed weighted graph. This article presents a Java implementation of this algorithms.

1. Shortest Path through a network – Dijkstra Algorithm

1.1. Overview

Dijkstra finds the shortest path from a source to all destination in a directed graph (single source shortest path problem). During this process it will also determine a spanning tree for the graph.

1.2. Graph

A graph is made of out nodes and directed edges which defines a connection from one node to another node. A node (or vertex) is a discrete position in a graph. Edges can be directed an undirected. Edges have an associated distance (also called costs or weight). The distance between two nodes a and b is labeled as [a,b].

The mathematical description for graphs is G= {V,E}, meaning that a graph is defined by a set of vertexes (V) and a collection of edges.

The “order” of a graph is the number of nodes.

The “size” of a graph is the number of edges.

Typical graph problems are:

  • finding the shortest path from a specific node to another node
  • finding the maximum possible flow through a network where each edges has a pre-defined maximum capacity (maximum-flow minimum-cut problem)

 

The following will focus on finding the shortest path from one node to another node in a graph.

1.3. Algorithms Description

The idea of Dijkstra is simple.

Dijkstra partitions all nodes into two distinct sets. Unsettled and settled. Initially all nodes are in the unsettled sets, e.g. they must be still evaluated. A node is moved to the settled set if a shortest path from the source to this node has been found.

Initially the distance of each node to the source is set to a very high value.

First only the source is in the set of unsettledNodes.

The algorithms runs until the unsettledNodes are empty. In earch iteration it selects the node with the lowest distance from the source out the unsettled nodes. If reads all edges which are outgoing from the source and evaluates for each destination node in these edges which is not yet settled if the known distance from the source to this node can be reduced if the selected edge is used. If this can be done then the distance is updated and the node is added to the nodes which need evaluation.

In Pseudocode the algorihm can be described as follows. Please note that Dijkstra also determines the presuccessor of each node on its way to the source. I’ll leave that out of the pseudo code to simplify it.

 

				

Foreach node set distance[node] = HIGH
SettledNodes = empty
UnSettledNodes = empty

Add sourceNode to UnSettledNodes
distance[sourceNode]= 0

while (UnSettledNodes is not empty) {
	evaluationNode = getNodeWithLowestDistance(UnSettledNodes)
	remove evaluationNode from UnSettledNodes 
    add evaluationNode to SettledNodes
    evaluatedNeighbors(evaluationNode)
}

getNodeWithLowestDistance(UnSettledNodes){
	find the node with the lowest distance in UnSettledNodes and return it 
}

evaluatedNeighbors(evaluationNode){
	Foreach destinationNode which can be reached via an edge from evaluationNode AND which is not in SettledNodes {
		edgeDistance = getDistance(edge(evaluationNode, destinationNode))
		newDistance = distance[evaluationNode] + edgeDistance
		if (distance[destinationNode]  > newDistance ) {
			distance[destinationNode]  = newDistance 
			add destinationNode to UnSettledNodes
		}
	}
}

 

2. Model

A graph consists out of vertices and edges. These are represented by the following model.

 

			
package de.vogella.algorithms.dijkstra.model;

public class Vertex {
	final private String id;
	final private String name;

	public Vertex(String id, String name) {
		this.id = id;
		this.name = name;
	}
	public String getId() {
		return id;
	}

	public String getName() {
		return name;
	}

	@Override
	public int hashCode() {
		final int prime = 31;
		int result = 1;
		result = prime * result + ((id == null) ? 0 : id.hashCode());
		return result;
	}

	@Override
	public boolean equals(Object obj) {
		if (this == obj)
			return true;
		if (obj == null)
			return false;
		if (getClass() != obj.getClass())
			return false;
		Vertex other = (Vertex) obj;
		if (id == null) {
			if (other.id != null)
				return false;
		} else if (!id.equals(other.id))
			return false;
		return true;
	}

	@Override
	public String toString() {
		return name;
	}

}

 

A edge has a source and a destination.

 

			
package de.vogella.algorithms.dijkstra.model;

public class Edge  {
	private final String id; 
	private final Vertex source;
	private final Vertex destination;
	private final int weight; 

	public Edge(String id, Vertex source, Vertex destination, int weight) {
		this.id = id;
		this.source = source;
		this.destination = destination;
		this.weight = weight;
	}

	public String getId() {
		return id;
	}
	public Vertex getDestination() {
		return destination;
	}

	public Vertex getSource() {
		return source;
	}
	public int getWeight() {
		return weight;
	}

	@Override
	public String toString() {
		return source + " " + destination;
	}

}

 

Both represent a graph.

 

			
package de.vogella.algorithms.dijkstra.model;

import java.util.List;

public class Graph {
	private final List<Vertex> vertexes;
	private final List<Edge> edges;

	public Graph(List<Vertex> vertexes, List<Edge> edges) {
		this.vertexes = vertexes;
		this.edges = edges;
	}

	public List<Vertex> getVertexes() {
		return vertexes;
	}

	public List<Edge> getEdges() {
		return edges;
	}

}

 

3. Algorithmus

3.1. Implementation

The following is a simple implementation of Dijkstra’s algorithm. It does not use any performance optimization (e.g. by using a PriorityQueue for the UnSettledNodes of does not cache the result of the target evaluation of the edges) to make the algorihms as simple as possible.

 

				
package de.vogella.algorithms.dijkstra.engine;

import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Set;

import de.vogella.algorithms.dijkstra.model.Edge;
import de.vogella.algorithms.dijkstra.model.Graph;
import de.vogella.algorithms.dijkstra.model.Vertex;

public class DijkstraAlgorithm {

	private final List<Vertex> nodes;
	private final List<Edge> edges;
	private Set<Vertex> settledNodes;
	private Set<Vertex> unSettledNodes;
	private Map<Vertex, Vertex> predecessors;
	private Map<Vertex, Integer> distance;

	public DijkstraAlgorithm(Graph graph) {
		// Create a copy of the array so that we can operate on this array
		this.nodes = new ArrayList<Vertex>(graph.getVertexes());
		this.edges = new ArrayList<Edge>(graph.getEdges());
	}

	public void execute(Vertex source) {
		settledNodes = new HashSet<Vertex>();
		unSettledNodes = new HashSet<Vertex>();
		distance = new HashMap<Vertex, Integer>();
		predecessors = new HashMap<Vertex, Vertex>();
		distance.put(source, 0);
		unSettledNodes.add(source);
		while (unSettledNodes.size() > 0) {
			Vertex node = getMinimum(unSettledNodes);
			settledNodes.add(node);
			unSettledNodes.remove(node);
			findMinimalDistances(node);
		}
	}

	private void findMinimalDistances(Vertex node) {
		List<Vertex> adjacentNodes = getNeighbors(node);
		for (Vertex target : adjacentNodes) {
			if (getShortestDistance(target) > getShortestDistance(node)
					+ getDistance(node, target)) {
				distance.put(target, getShortestDistance(node)
						+ getDistance(node, target));
				predecessors.put(target, node);
				unSettledNodes.add(target);
			}
		}

	}

	private int getDistance(Vertex node, Vertex target) {
		for (Edge edge : edges) {
			if (edge.getSource().equals(node)
					&& edge.getDestination().equals(target)) {
				return edge.getWeight();
			}
		}
		throw new RuntimeException("Should not happen");
	}

	private List<Vertex> getNeighbors(Vertex node) {
		List<Vertex> neighbors = new ArrayList<Vertex>();
		for (Edge edge : edges) {
			if (edge.getSource().equals(node)
					&& !isSettled(edge.getDestination())) {
				neighbors.add(edge.getDestination());
			}
		}
		return neighbors;
	}

	private Vertex getMinimum(Set<Vertex> vertexes) {
		Vertex minimum = null;
		for (Vertex vertex : vertexes) {
			if (minimum == null) {
				minimum = vertex;
			} else {
				if (getShortestDistance(vertex) < getShortestDistance(minimum)) {
					minimum = vertex;
				}
			}
		}
		return minimum;
	}

	private boolean isSettled(Vertex vertex) {
		return settledNodes.contains(vertex);
	}

	private int getShortestDistance(Vertex destination) {
		Integer d = distance.get(destination);
		if (d == null) {
			return Integer.MAX_VALUE;
		} else {
			return d;
		}
	}

	/* * This method returns the path from the source to the selected target and * NULL if no path exists */
	public LinkedList<Vertex> getPath(Vertex target) {
		LinkedList<Vertex> path = new LinkedList<Vertex>();
		Vertex step = target;
		// Check if a path exists
		if (predecessors.get(step) == null) {
			return null;
		}
		path.add(step);
		while (predecessors.get(step) != null) {
			step = predecessors.get(step);
			path.add(step);
		}
		// Put it into the correct order
		Collections.reverse(path);
		return path;
	}

}

 

3.2. Test

The following is a small JUnit Test to validate the correctness of the algorithm.

 

				
package de.vogella.algorithms.dijkstra.test;

import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

import org.junit.Test;

import de.vogella.algorithms.dijkstra.engine.DijkstraAlgorithm;
import de.vogella.algorithms.dijkstra.model.Edge;
import de.vogella.algorithms.dijkstra.model.Graph;
import de.vogella.algorithms.dijkstra.model.Vertex;

import static org.junit.Assert.assertNotNull;
import static org.junit.Assert.assertTrue;

public class TestDijkstraAlgorithm {

	private List<Vertex> nodes;
	private List<Edge> edges;

	@Test
	public void testExcute() {
		nodes = new ArrayList<Vertex>();
		edges = new ArrayList<Edge>();
		for (int i = 0; i < 11; i++) {
			Vertex location = new Vertex("Node_" + i, "Node_" + i);
			nodes.add(location);
		}

		addLane("Edge_0", 0, 1, 85);
		addLane("Edge_1", 0, 2, 217);
		addLane("Edge_2", 0, 4, 173);
		addLane("Edge_3", 2, 6, 186);
		addLane("Edge_4", 2, 7, 103);
		addLane("Edge_5", 3, 7, 183);
		addLane("Edge_6", 5, 8, 250);
		addLane("Edge_7", 8, 9, 84);
		addLane("Edge_8", 7, 9, 167);
		addLane("Edge_9", 4, 9, 502);
		addLane("Edge_10", 9, 10, 40);
		addLane("Edge_11", 1, 10, 600);

		// Lets check from location Loc_1 to Loc_10
		Graph graph = new Graph(nodes, edges);
		DijkstraAlgorithm dijkstra = new DijkstraAlgorithm(graph);
		dijkstra.execute(nodes.get(0));
		LinkedList<Vertex> path = dijkstra.getPath(nodes.get(10));

		assertNotNull(path);
		assertTrue(path.size() > 0);

		for (Vertex vertex : path) {
			System.out.println(vertex);
		}

	}

	private void addLane(String laneId, int sourceLocNo, int destLocNo,
			int duration) {
		Edge lane = new Edge(laneId,nodes.get(sourceLocNo), nodes.get(destLocNo), duration );
		edges.add(lane);
	}
}

 

Standard
programming, tech

memcached love story chinese

https://docs.google.com/View?id=dcm8p5tk_11djb3mtgp

 

这是一个关于缓存的故事……

 

两位中学时期的好朋友, 李雷和韩梅梅,开始了他们的创业之旅。 他们一起开始做网站。是那种经典的Web服务器外加数据库结构的网站。来自互联网的每个的用户通过对Web服务器请求他们所需要的页面,Web服务器就找数据库要构建这些页面所需要的数据。

 

李雷负责编程,韩梅梅负责维护管理Web服务器和数据库.

 

一天,韩梅梅发现他们的数据库生病了! 由于负荷过重而上吐下泻!韩梅梅发现它的平均负载有20! 李雷就问韩梅梅: “啊,那我们咋办哟!” 韩梅梅说: “我听说memcached这种技术不错,它在livejournal中发挥了很大作用呢!” “好吧,那我们也试试看吧!” 李雷回答到。

 

韩梅梅看了看她的6台Web服务器,分析了一阵,然后她决定用其中的3台来运行这个memcached服务器。她把每台Web服务器都额外添加了1G的内存 – 也就是说,现在他们有3个memcached实例, 每个能存1G的数据。

搞定之后,李雷和韩梅梅静坐在一旁,看着他们刚架设好的memcached 服务器。

 

然后呢? “它一点动静都没有啊,这些memcached没有和其他任何程序通信啊,也没有任何有用的数据, 更惨的是,现在数据库的负载到了25了!”

 

无畏的李雷拿起了memcached客户端使用手册,对着韩梅梅深情配置好的memcached服务器说, “有什么好怕的!”他说. “我想到一个点子了” 他把那3台memcached服务器的ip地址和端口添加到一个php 数组中:

 

$MEMCACHE_SERVERS = array(

“10.1.1.1”, //web1
“10.1.1.2”, //web2
“10.1.1.3”, //web3 );
然后他创建了一个叫’$memcache’的对象
$memcache = new Memcache();
foreach($MEMCACHE_SERVERS as $server){
    $memcache->addServer ( $server );

李雷继续想啊想,想啊想, 突然灵机一动: “对了! 我记得有个页面经常执行 SELECT * FROM hugetable WHERE timestamp > lastweek ORDER BY timestamp ASC LIMIT 50000 ; 这个查询要花费5秒的时间呢! 我来把他放到memcache里面吧”, 结果他就把这些查询放进’$memcache’对象中, 他的代码询问:

 

Select语句所得的结果在我的memcache里面吗? 如果不在, 就执行查询并把结果放到memcache里:

 

$huge_data_for_frong_page = $memcache->get(“huge_data_for_frong_page”); if($huge_data_for_frong_page === false){

     $huge_data_for_frong_page = array();
     $sql = “SELECT * FROM hugetable WHERE timestamp > lastweek
                            ORDER BY timestamp ASC LIMIT 50000″;
    $res = mysql_query($sql, $mysql_connection);
    while($rec = mysql_fetch_assoc($res)){
         $huge_data_for_frong_page[] = $rec;
     }
// cache for 10 minutes
$memcache->set(“huge_data_for_frong_page”, $huge_data_for_frong_page, 600); }
// use $huge_data_for_frong_page how you please

李雷提交了他的代码. 韩梅梅胆战心惊~ 呼~~ 数据库负载下降到10了! 网站速度良好!

不过,韩梅梅就纳闷了,怎么一回事啊? 我用cacti看到所有的数据请求都是去了其中一个memcached服务器,可是我明明建立了3个啊? 韩梅梅很快学会了用ASCII协议telnet一个个登录她设立的memcached 服务器进行查询:

 

喂喂, huge_data_for_front_page, 你在哪里?

第一个memcached 没反应

第二个memcached也没反应

第三个memcached, 吐出来一大堆的Q$%@#$&%*($^@#^#$WY%@^N@$%

数据! 只有第三台memcached缓存了李雷的数据!

 

搞不懂, 韩梅梅只好在邮件列表里面去询问, 她得到的回答惊人的一致, “这是因为memcached是一个分布式的缓存系统!” 但是这又是什么意思呢? 韩梅梅还是不是很懂,她走到李雷面前,让他缓存更多的数据 “让我们静观其变, 我们一定能搞懂它的!”

 

“好吧, 我手头上还有一个很慢的查询语句, 而且每秒还需要执行100次, 或许我们可以把它也缓存起来” 李雷二话不说就去编程了, 就像之前那样, 数据库的负载下降到8了!

 

李雷继续把越来越多的数据缓存起来,他还使用了在FAQ学到的新技术! 与此同时, 数据库的负载继续下降: 7,5,3,2,1 !

 

“很好”韩梅梅说, “让我们继续” 她继续观察系统的图表, 三台memcached都运行起来了! 他们都在处理请求!

韩梅梅继续在三台memcached上查询李雷的输入数据,但每次其结果都只能在一台memcached里找到. 为什么要这样呢?她不解,奇怪了,把数据都在三台memcached上都存着难道不是更好么? “等一下”韩梅梅想,我给每台memcached机器1G的内存, 也就是说这样他们一共可以缓存3G的数据库数据,而不只是1G, 太棒了!

 

“mmm…,不过好像还是有问题, 其中一台memcached的机器已经过时了, 需要更新升级, 这样的话我就必须把这台机器给关了,这样一来我心爱的memcached会咋样啊~?” “那…让我试试看!” 韩梅梅二话不说,关掉了其中一台机器,然后她小心翼翼地观察系统曲线变化… “糟了,数据库的负载又直线上升了! 1…现在是2了,不过还算可以接受吧.

其余的2台memcached都还工作正常. 不算太糟糕,只是多了一些无法被缓存的数据.” 韩梅梅试验完后, 把原先的机器重启, 几分钟后, 数据库的负载又恢复到1了.

 

“缓存又自动恢复了! 我懂了, 如果其中有部分机器下线了,只是说会有部分应该缓存的数据不存在而已, 没什么大不了的, 偶稀饭!”

 

就这样,李雷和韩梅梅继续架设网站, 继续缓存. 当他们遇到了什么问题了,他们就在邮件列表,或者FAQ来寻求答案.他们继续监视着数据库负载的图表,从此永远快乐地生活在一起.

Standard
programming

The Java serialization algorithm revealed

http://www.javaworld.com/community/node/2915

 

Serialization is the process of saving an object’s state to a sequence of bytes; deserialization is the process of rebuilding those bytes into a live object. The Java Serialization API provides a standard mechanism for developers to handle object serialization. In this tip, you will see how to serialize an object, and why serialization is sometimes necessary. You’ll learn about the serialization algorithm used in Java, and see an example that illustrates the serialized format of an object. By the time you’re done, you should have a solid knowledge of how the serialization algorithm works and what entities are serialized as part of the object at a low level.

Why is serialization required?

In today’s world, a typical enterprise application will have multiple components and will be distributed across various systems and networks. In Java, everything is represented as objects; if two Java components want to communicate with each other, there needs be a mechanism to exchange data. One way to achieve this is to define your own protocol and transfer an object. This means that the receiving end must know the protocol used by the sender to re-create the object, which would make it very difficult to talk to third-party components. Hence, there needs to be a generic and efficient protocol to transfer the object between components. Serialization is defined for this purpose, and Java components use this protocol to transfer objects.

Figure 1 shows a high-level view of client/server communication, where an object is transferred from the client to the server through serialization.

A high-level view of serialization in action

Figure 1. A high-level view of serialization in action (click to enlarge)

How to serialize an object

In order to serialize an object, you need to ensure that the class of the object implements the java.io.Serializableinterface, as shown in Listing 1.

Listing 1. Implementing Serializable

import java.io.Serializable;

class TestSerial implements Serializable {
public byte version = 100;
public byte count = 0;
}

In Listing 1, the only thing you had to do differently from creating a normal class is implement the java.io.Serializableinterface. The Serializable interface is a marker interface; it declares no methods at all. It tells the serialization mechanism that the class can be serialized.

Now that you have made the class eligible for serialization, the next step is to actually serialize the object. That is done by calling the writeObject() method of the java.io.ObjectOutputStream class, as shown in Listing 2.

Listing 2. Calling writeObject()

public static void main(String args[]) throws IOException {
FileOutputStream fos = new FileOutputStream("temp.out");
ObjectOutputStream oos = new ObjectOutputStream(fos);
TestSerial ts = new TestSerial();
oos.writeObject(ts);
oos.flush();
oos.close();
}

Listing 2 stores the state of the TestSerial object in a file called temp.outoos.writeObject(ts); actually kicks off the serialization algorithm, which in turn writes the object to temp.out.

To re-create the object from the persistent file, you would employ the code in Listing 3.

Listing 3. Recreating a serialized object

public static void main(String args[]) throws IOException {
FileInputStream fis = new FileInputStream("temp.out");
ObjectInputStream oin = new ObjectInputStream(fis);
TestSerial ts = (TestSerial) oin.readObject();
System.out.println("version="+ts.version);
}

In Listing 3, the object’s restoration occurs with the oin.readObject() method call. This method call reads in the raw bytes that we previously persisted and creates a live object that is an exact replica of the original object graph. BecausereadObject() can read any serializable object, a cast to the correct type is required.

Executing this code will print version=100 on the standard output.

The serialized format of an object

What does the serialized version of the object look like? Remember, the sample code in the previous section saved the serialized version of the TestSerial object into the file temp.out. Listing 4 shows the contents of temp.out, displayed in hexadecimal. (You need a hexadecimal editor to see the output in hexadecimal format.)

Listing 4. Hexadecimal form of TestSerial

AC ED 00 05 73 72 00 0A 53 65 72 69 61 6C 54 65
73 74 A0 0C 34 00 FE B1 DD F9 02 00 02 42 00 05
63 6F 75 6E 74 42 00 07 76 65 72 73 69 6F 6E 78
70 00 64

If you look again at the actual TestSerial object, you’ll see that it has only two byte members, as shown in Listing 5.

Listing 5. TestSerial’s byte members

public byte version = 100;
public byte count = 0;

The size of a byte variable is one byte, and hence the total size of the object (without the header) is two bytes. But if you look at the size of the serialized object in Listing 4, you’ll see 51 bytes. Surprise! Where did the extra bytes come from, and what is their significance? They are introduced by the serialization algorithm, and are required in order to to re-create the object. In the next section, you’ll explore this algorithm in detail.

Java’s serialization algorithm

By now, you should have a pretty good knowledge of how to serialize an object. But how does the process work under the hood? In general the serialization algorithm does the following:

  • It writes out the metadata of the class associated with an instance.
  • It recursively writes out the description of the superclass until it finds java.lang.object.
  • Once it finishes writing the metadata information, it then starts with the actual data associated with the instance. But this time, it starts from the topmost superclass.
  • It recursively writes the data associated with the instance, starting from the least superclass to the most-derived class.

I’ve written a different example object for this section that will cover all possible cases. The new sample object to be serialized is shown in Listing 6.

Listing 6. Sample serialized object

class parent implements Serializable {
int parentVersion = 10;
}

class contain implements Serializable{
int containVersion = 11;
}
public class SerialTest extends parent implements Serializable {
int version = 66;
contain con = new contain();

public int getVersion() {
return version;
}
public static void main(String args[]) throws IOException {
FileOutputStream fos = new FileOutputStream("temp.out");
ObjectOutputStream oos = new ObjectOutputStream(fos);
SerialTest st = new SerialTest();
oos.writeObject(st);
oos.flush();
oos.close();
}
}

This example is a straightforward one. It serializes an object of type SerialTest, which is derived from parent and has a container object, contain. The serialized format of this object is shown in Listing 7.

Listing 7. Serialized form of sample object

AC ED 00 05 73 72 00 0A 53 65 72 69 61 6C 54 65
73 74 05 52 81 5A AC 66 02 F6 02 00 02 49 00 07
76 65 72 73 69 6F 6E 4C 00 03 63 6F 6E 74 00 09
4C 63 6F 6E 74 61 69 6E 3B 78 72 00 06 70 61 72
65 6E 74 0E DB D2 BD 85 EE 63 7A 02 00 01 49 00
0D 70 61 72 65 6E 74 56 65 72 73 69 6F 6E 78 70
00 00 00 0A 00 00 00 42 73 72 00 07 63 6F 6E 74
61 69 6E FC BB E6 0E FB CB 60 C7 02 00 01 49 00
0E 63 6F 6E 74 61 69 6E 56 65 72 73 69 6F 6E 78
70 00 00 00 0B

Figure 2 offers a high-level look at the serialization algorithm for this scenario.

An outline of the serialization algorithm

Figure 2. An outline of the serialization algorithm

Let’s go through the serialized format of the object in detail and see what each byte represents. Begin with the serialization protocol information:

  • AC EDSTREAM_MAGIC. Specifies that this is a serialization protocol.
  • 00 05STREAM_VERSION. The serialization version.
  • 0x73TC_OBJECT. Specifies that this is a new Object.

The first step of the serialization algorithm is to write the description of the class associated with an instance. The example serializes an object of type SerialTest, so the algorithm starts by writing the description of the SerialTest class.

  • 0x72TC_CLASSDESC. Specifies that this is a new class.
  • 00 0A: Length of the class name.
  • 53 65 72 69 61 6c 54 65 73 74SerialTest, the name of the class.
  • 05 52 81 5A AC 66 02 F6SerialVersionUID, the serial version identifier of this class.
  • 0x02: Various flags. This particular flag says that the object supports serialization.
  • 00 02: Number of fields in this class.

Next, the algorithm writes the field int version = 66;.

  • 0x49: Field type code. 49 represents “I”, which stands for Int.
  • 00 07: Length of the field name.
  • 76 65 72 73 69 6F 6Eversion, the name of the field.

And then the algorithm writes the next field, contain con = new contain();. This is an object, so it will write the canonical JVM signature of this field.

  • 0x74TC_STRING. Represents a new string.
  • 00 09: Length of the string.
  • 4C 63 6F 6E 74 61 69 6E 3BLcontain;, the canonical JVM signature.
  • 0x78TC_ENDBLOCKDATA, the end of the optional block data for an object.

The next step of the algorithm is to write the description of the parent class, which is the immediate superclass of SerialTest.

  • 0x72TC_CLASSDESC. Specifies that this is a new class.
  • 00 06: Length of the class name.
  • 70 61 72 65 6E 74SerialTest, the name of the class
  • 0E DB D2 BD 85 EE 63 7ASerialVersionUID, the serial version identifier of this class.
  • 0x02: Various flags. This flag notes that the object supports serialization.
  • 00 01: Number of fields in this class.

Now the algorithm will write the field description for the parent class. parent has one field, int parentVersion = 100;.

  • 0x49: Field type code. 49 represents “I”, which stands for Int.
  • 00 0D: Length of the field name.
  • 70 61 72 65 6E 74 56 65 72 73 69 6F 6EparentVersion, the name of the field.
  • 0x78TC_ENDBLOCKDATA, the end of block data for this object.
  • 0x70TC_NULL, which represents the fact that there are no more superclasses because we have reached the top of the class hierarchy.

So far, the serialization algorithm has written the description of the class associated with the instance and all its superclasses. Next, it will write the actual data associated with the instance. It writes the parent class members first:

  • 00 00 00 0A: 10, the value of parentVersion.

Then it moves on to SerialTest.

  • 00 00 00 42: 66, the value of version.

The next few bytes are interesting. The algorithm needs to write the information about the contain object, shown in Listing 8.

Listing 8. The contain object

contain con = new contain();

Remember, the serialization algorithm hasn’t written the class description for the contain class yet. This is the opportunity to write this description.

  • 0x73TC_OBJECT, designating a new object.
  • 0x72TC_CLASSDESC.
  • 00 07: Length of the class name.
  • 63 6F 6E 74 61 69 6Econtain, the name of the class.
  • FC BB E6 0E FB CB 60 C7SerialVersionUID, the serial version identifier of this class.
  • 0x02: Various flags. This flag indicates that this class supports serialization.
  • 00 01: Number of fields in this class.

Next, the algorithm must write the description for contain‘s only field, int containVersion = 11;.

  • 0x49: Field type code. 49 represents “I”, which stands for Int.
  • 00 0E: Length of the field name.
  • 63 6F 6E 74 61 69 6E 56 65 72 73 69 6F 6EcontainVersion, the name of the field.
  • 0x78TC_ENDBLOCKDATA.

Next, the serialization algorithm checks to see if contain has any parent classes. If it did, the algorithm would start writing that class; but in this case there is no superclass for contain, so the algorithm writes TC_NULL.

  • 0x70TC_NULL.

Finally, the algorithm writes the actual data associated with contain.

  • 00 00 00 0B: 11, the value of containVersion.

Conclusion

In this tip, you have seen how to serialize an object, and learned how the serialization algorithm works in detail. I hope this article gives you more detail on what happens when you actually serialize an object.

About the author

Sathiskumar Palaniappan has more than four years of experience in the IT industry, and has been working with Java-related technologies for more than three years. Currently, he is working as a system software engineer at the Java Technology Center, IBM Labs. He also has experience in the telecom industry.

Resources

Standard