Monday, October 31, 2011

Content Repository API for Java a.k.a. JCR

Acknowledgement and disclaimer
I used Wikipedia to do my research. And then, I added my own interpretation.

What is a Content Repository and why is it needed?
While Content Management Systems (CMS) store users' content and let people access (read) it, there are a few functions/features a CMS should also provide for maintenance, query, version management, import, export, etc. Now that there are so many CMSs around, the people involved in managing them have started asking whether a given CMS provides these features. So, it has become more of a compliance issue.

These "Content Repository" features have been made into separate products (example: Apache Jackrabbit). Some CMSs now delegate the "Content Repository" function to such separate components (examples: Hippo CMS and Magnolia).

Java's implementation of Content Repository
"Content Repository API for Java", or JCR, is Java's standard API for content repositories. It was developed through JSR-170 (JCR 1.0) and JSR-283 (JCR 2.0). The main Java package is javax.jcr.

JCR has certain features:

  • Export and import via XML
  • Query by SQL and XPath (JSR-283 adds a Java query object model, JQOM)
  • Associate types, attributes and values to each document


Implementation of JCR

  • Apache Jackrabbit is the open-source reference implementation of JCR.
  • Content Repository Extreme (CRX) is a commercial JCR implementation.
  • Alfresco has a CMS that offers the JSR-170 API.
  • etc.



CMS using JCR

  • Hippo CMS and Magnolia use Apache Jackrabbit, and can switch to other JSR-170 implementations.
  • Oracle Beehive supports JSR-170.
  • etc.


Reference:


Monday, October 24, 2011

Introduction to Databases by db-class.org

Stanford University is offering a few free courses: Artificial Intelligence, Machine Learning and Introduction to Databases. I signed up for the database course; the link is http://www.db-class.org/course/class/index.

I have taken this course before in college. However, it gives me an opportunity to refresh the ideas; so it feels great.

So far, I have finished Relational Algebra (RA), XML and DTD. I still have to cover XSD, SQL and a few other topics.

For RA, the site has a simple workbench for the exercises. It is backed by a SQLite installation. The expressions have to be written in a certain way and it does not support all aspects of RA, but that is not a problem. I am looking for a full-fledged open source tool.

There were two exercises on DTD and XSD. They forced me to write some DTDs, which helped fix the material in memory.

I will post the relevant links here later on.

References:

  • TBD
  • TBD

Android OS on VirtualBox

Out of curiosity, today I googled about installing Android OS on VirtualBox, and I found something. There is a port of Android OS (v2.3?) called LiveAndroid. It is distributed in LiveUSB and LiveISO formats. I installed it by following a blog post (TBD: Put the link here); it was very easy.

A few notes about the installation:

  • I didn't need any username/password to install, simply because it is an OS targeted at phones, where I usually don't enter any credentials.
  • While running the Android OS, I did not see any mouse pointer. It made sense, simply because it is an OS targeted at phones, where I usually don't have a mouse.
References:
  1. TBD
  2. TBD

Saturday, April 30, 2011

Threads in Java

Book: Java Concurrency in Practice
Author: Brian Goetz, et al.

Chapter 3: Sharing Objects

"Sharing" here means sharing between threads. Before we dive deep, let me reiterate some basic behavior of the JVM.

Basic Behavior
--------------
1. The whole JVM memory is roughly divided between "heap" and "stack".
2. All methods are mere containers of code, static and lifeless.
3. Threads are like living persons that walk through the instructions in methods. I will refer to a thread that is executing a method as the "walking thread".
4. When threads create method-local objects, the objects are created on the heap; the references to those objects are created on the stack.
5. When threads create method-local primitives, they are created on the stack.
6. References and primitives passed as parameters are copies of the source. While in a method, threads use their copies.

If the source is modified by some other thread, there is no guarantee that the changes will be visible immediately to the current thread. We should assume that things will go wrong.

7. When a primitive is written, it is done as an "atomic" operation. This means the value is either written completely or not at all.

Two exceptions: long and double. Both are 64-bit primitives, and the JVM is allowed to write a non-volatile long or double as two separate 32-bit operations.


8. Class-level primitives, references (and objects, of course) are stored in the heap.
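
To make item 6 concrete, here is a small sketch of what "parameters are copies" means in practice. The class and method names are made up for illustration. Reassigning a parameter only changes the method's own copy, but a copied reference still points at the same heap object, so a mutation through it is seen by the caller.

```java
// Hypothetical demo class; the names are illustrative only.
class ParameterCopyDemo {

    // Reassigning the parameters changes only this method's own copies.
    static void reassign(int number, StringBuilder text) {
        number = 99;                          // the caller's primitive is untouched
        text = new StringBuilder("replaced"); // the caller's reference is untouched
    }

    // The copied reference still points at the same heap object,
    // so mutating through it is visible to the caller.
    static void mutate(StringBuilder text) {
        text.append(" world");
    }

    static String run() {
        int number = 1;
        StringBuilder text = new StringBuilder("hello");
        reassign(number, text);
        mutate(text);
        return number + ":" + text;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

This is also why a method-local object stops being thread-confined as soon as its reference is handed to something another thread can reach.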

Problems
========
* Stale Data caused by Visibility

The problem depicted by #6 is this: the walking thread is seeing "stale" data. It happens because of a lack of "visibility" management.

To solve,
Option 1: Use synchronized.
Option 2: Use volatiles.
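
As a rough sketch of Option 2, a stop flag is a simple case where visibility matters. The class and field names below are my own. Without volatile, the worker would not be guaranteed to ever see the write made by the main thread; with volatile, it is.

```java
// Hypothetical demo class; the names are illustrative only.
class StopFlagDemo {

    // volatile makes a write by one thread visible to reads by other threads.
    private static volatile boolean stopRequested = false;

    // Starts a worker that spins until the flag is set, sets the flag,
    // and reports whether the worker terminated.
    static boolean runAndStop() {
        stopRequested = false;
        Thread worker = new Thread(() -> {
            while (!stopRequested) {
                // busy-wait until another thread sets the flag
            }
        });
        worker.setDaemon(true); // do not keep the JVM alive if something goes wrong
        worker.start();

        stopRequested = true;   // written by the calling thread, read by the worker
        try {
            worker.join(5000);  // wait up to five seconds for the worker to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + runAndStop());
    }
}
```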

* 64-bit primitives are non-atomic

Use volatile to solve.

* Unsafe publication
** "this" escapes the constructor
** Reference escapes through different methods
** Mutable objects


Techniques and their usage
==========================
1. synchronized

Solves the visibility problem. It provides mutual-exclusion (mutex) locking around reads and writes of instance primitives and instance references. Method-locals do not need that; thread confinement guarantees single-thread access, UNLESS we allow a reference to escape.

Allows compound operations to be made atomic.
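
A minimal sketch of both points, using a counter class I made up for illustration: the increment is a read-modify-write, and guarding both the write and the read with the same lock makes it atomic and visible.

```java
// Hypothetical demo class; the names are illustrative only.
class SynchronizedCounter {
    private int count = 0;

    // The lock makes the read-modify-write of count++ one atomic step.
    synchronized void increment() {
        count++;
    }

    // Reading under the same lock guarantees the latest value is visible.
    synchronized int get() {
        return count;
    }

    // Two threads each increment the same counter perThread times.
    static int countWithTwoThreads(int perThread) {
        SynchronizedCounter counter = new SynchronizedCounter();
        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) {
                counter.increment();
            }
        };
        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithTwoThreads(10000));
    }
}
```

With the synchronized keyword removed from increment(), the final count could come out lower than expected, because increments from the two threads could overwrite each other.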

2. volatile

Solves visibility problem.
Must be used with 64-bit primitives to overcome partial writing problem.

Catch: each individual read and write is guaranteed to reflect the central data, but compound operations like x++ or x = x + 4 are a read followed by a write and may still corrupt data. For example, after the current thread reads x as 3, another thread may modify it to 10; the current thread's pending write does not take that into account, so it sets x to 7.

To overcome this, use volatile only when one thread writes and the other threads only read.
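
The interleaving in the catch above is hard to reproduce on demand with real threads, so here is a single-threaded replay of it. This is only a simulation with made-up names: the "other thread" is just a line in the middle, but the three steps are exactly the ones that go wrong.

```java
// Hypothetical demo class; it replays the interleaving in one thread.
class LostUpdateReplay {
    static volatile int x = 3;

    static int replay() {
        x = 3;
        int seen = x;  // step 1: the current thread reads x and gets 3
        x = 10;        // step 2: "another thread" writes 10 (simulated here)
        x = seen + 4;  // step 3: the current thread writes 3 + 4, overwriting the 10
        return x;
    }

    public static void main(String[] args) {
        System.out.println(replay()); // the update to 10 has been lost
    }
}
```

volatile did its job at every step (each read and write saw the latest value); the problem is that the read and the write together are not one atomic operation.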

3. Immutable object

TBD

4.

Other issues to highlight
=========================
1. Statement reordering
2. Multiple processors executing the same method for the same thread.


Things a developer should make sure to eliminate thread-related problems
==================================================================
Method level
------------
1. Do not worry about objects and primitives created as method-locals. Unless you explicitly design for it, do not let a reference to any method-local object escape.

Constructor level
-----------------
1. Do not allow "this" to escape during construction. Common ways it can happen: (a) passing "this" to another object, (b) starting a thread.

If you have to do any of these, (a) make your constructor private and (b) provide a factory method, so that you can wire up the necessary things after the object is created.
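
Here is a sketch of the private constructor plus factory method idea, assuming a listener that has to register itself with an event source. Listener, EventSource and SafeListener are names I invented for the example, not classes from any library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types; the names are illustrative only.
interface Listener {
    void onEvent(String event);
}

class EventSource {
    private final List<Listener> listeners = new ArrayList<>();

    synchronized void register(Listener listener) {
        listeners.add(listener);
    }

    synchronized void fire(String event) {
        for (Listener listener : listeners) {
            listener.onEvent(event);
        }
    }
}

class SafeListener implements Listener {
    private final List<String> received = new ArrayList<>();

    // Private constructor: "this" is not handed to anyone from in here.
    private SafeListener() {
    }

    // Factory method: construct fully first, and only then publish the reference.
    static SafeListener newInstance(EventSource source) {
        SafeListener listener = new SafeListener();
        source.register(listener);
        return listener;
    }

    @Override
    public synchronized void onEvent(String event) {
        received.add(event);
    }

    // Returns a copy so the internal list does not escape.
    synchronized List<String> received() {
        return new ArrayList<>(received);
    }
}
```

Usage would be SafeListener.newInstance(source) instead of new SafeListener(...). Had the constructor called source.register(this), another thread could have delivered an event to an object whose constructor had not finished.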

Instance variables
------------------
1. Make all 64-bit primitives volatile to fix partial writing problem, or access them only through synchronized getter/setter.

2. Use volatile only when you can make sure that only one thread can modify volatiles.

3. Atomicity for more than one instance variable can also be achieved by putting those variables in a separate holder object whose reference is volatile. Again, only one thread should assign objects to the reference. ** Check this concept.
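
My understanding of item 3, as a sketch: keep the related values in an immutable holder and swap the whole holder through a volatile reference. Range and RangeHolder are names I made up. A reader gets one holder, so it never sees the lower bound from one update mixed with the upper bound from another.

```java
// Hypothetical immutable holder for two values that must stay consistent.
final class Range {
    final int lower;
    final int upper;

    Range(int lower, int upper) {
        if (lower > upper) {
            throw new IllegalArgumentException("lower > upper");
        }
        this.lower = lower;
        this.upper = upper;
    }
}

// Hypothetical owner class; it is intended to have a single writer thread.
class RangeHolder {
    // volatile publishes each new Range to every reader thread.
    private volatile Range range = new Range(0, 0);

    // Replaces both values at once by swapping in a new immutable holder.
    void set(int lower, int upper) {
        range = new Range(lower, upper);
    }

    // Readers should take one snapshot and use both fields from it.
    Range get() {
        return range;
    }
}
```

A reader should call get() once and read both fields from that snapshot; calling get() twice could return two different holders.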
















Non-related topic
=================
1. How does a hashtable work in Java

Tuesday, February 08, 2011

Some handy Linux links

  • http://distrowatch.com/ - A single place to know about all Linux distributions.
  • http://www.tightvnc.com/ - A free tool that lets you use a remote GUI desktop, including on Unix.
  • http://polishlinux.org/ - Another good site.
  • http://polishlinux.org/choose/comparison/ - Compare different distros of Linux side by side.