Generate Java code using XML Schema

Many tools are available for XMLBeans: XMLBeans Tools 
Here I have only copied the description of scomp, which compiles a schema to Java code. I already described how to get an XSD from an XML file here.

Generate Java code from XSD

If you want to get right to it with your own XML schema and instance, follow these basic steps:

  1. Install XMLBeans.
  2. Compile your schema. Use scomp to compile the schema, generating and jarring Java types. For example, to create an employeeschema.jar from an employeeschema.xsd file:
    scomp -out employeeschema.jar employeeschema.xsd
  3. Write code. With the generated JAR on your classpath, write code to bind an XML instance to the Java types representing your schema. Here's an example that would use types generated from an employees schema:
    File xmlFile = new File("c:\\employees.xml"); 
    
    // Bind the instance to the generated XMLBeans types.
    EmployeesDocument empDoc = 
     EmployeesDocument.Factory.parse(xmlFile); 
    
    // Get and print pieces of the XML instance.
    Employees emps = empDoc.getEmployees(); 
    Employee[] empArray = emps.getEmployeeArray(); 
    for (int i = 0; i < empArray.length; i++) 
    { 
     System.out.println(empArray[i]); 
    }

Read a tutorial.

Read the XMLBeans tutorial to get a sense of XMLBeans basics.

***

Computer Science's Math

In ordinary algebra there is nothing wrong with saying (a+b)/2 = a + (b-a)/2. On a computer, however, (a+b)/2 is NOT always equal to a + (b-a)/2. But why? Think about it for a while.

Here is the answer: To compute (a+b)/2, one has to add the two numbers first and then divide the sum by 2. Let's say we have a computer with 32-bit registers and both a and b are integers. If the sum a+b is larger than the largest value a 32-bit integer can represent, there will be an overflow. Thus, even if the final result of (a+b)/2 fits in 32 bits, the intermediate sum may not.

On the other hand, when you compute a + (b-a)/2, you add a and (b-a)/2. If 0 <= a <= b (as with array indices), b-a cannot exceed b, so (b-a)/2 fits in 32 bits. The final sum lies between a and b, so it fits in 32 bits as well, and a + (b-a)/2 never suffers from the overflow problem. (If a and b have opposite signs, b-a itself can overflow, so this guarantee does not hold for arbitrary integers.) Hence, (a+b)/2 may not always be equal to a + (b-a)/2!
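The difference is easy to see with Java's 32-bit int type. The following is a minimal sketch; the class and method names (MidpointOverflow, naiveMid, safeMid) and the sample values are made up for illustration, and both inputs are assumed to be non-negative with a <= b.

```java
class MidpointOverflow {
    // Adds first, so the intermediate sum a + b can wrap around.
    static int naiveMid(int a, int b) {
        return (a + b) / 2;
    }

    // Adds only half of the gap to a; safe when 0 <= a <= b.
    static int safeMid(int a, int b) {
        return a + (b - a) / 2;
    }

    public static void main(String[] args) {
        int a = 2_000_000_000;
        int b = 2_100_000_000; // both values fit in a 32-bit int

        System.out.println(naiveMid(a, b)); // prints -97483648 (the sum overflowed)
        System.out.println(safeMid(a, b));  // prints 2050000000
    }
}
```

The true sum 4,100,000,000 does not fit in 32 bits, so it wraps to a negative number before the division, while the second form stays between a and b the whole time.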

Expected Value of a Function

Definition (from WolframAlpha):
The expectation value of a function $f(x)$ in a variable $x$ is denoted $\langle f(x) \rangle$ or $E\{f(x)\}$. For a single discrete variable, it is defined by
$$\langle f(x) \rangle = \sum_x f(x) P(x),$$
where $P(x)$ is the probability density function.
Example:
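As an illustration that is not part of the original definition, consider a fair six-sided die, where every face has probability 1/6. The sketch below computes the sum of f(x) times P(x) for two choices of f; the class name ExpectedValueSketch and the method name expectation are hypothetical.

```java
import java.util.Arrays;
import java.util.function.DoubleUnaryOperator;

class ExpectedValueSketch {
    // Sum of f(x) * P(x) over all values x of a discrete variable.
    static double expectation(double[] values, double[] probs, DoubleUnaryOperator f) {
        double sum = 0.0;
        for (int i = 0; i < values.length; i++) {
            sum += f.applyAsDouble(values[i]) * probs[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] faces = {1, 2, 3, 4, 5, 6};
        double[] probs = new double[faces.length];
        Arrays.fill(probs, 1.0 / 6.0); // fair die: every face equally likely

        // f(x) = x: the mean of a die roll, 21/6 = 3.5 up to rounding
        System.out.println(expectation(faces, probs, x -> x));
        // f(x) = x^2: 91/6, roughly 15.17
        System.out.println(expectation(faces, probs, x -> x * x));
    }
}
```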

An algorithm that was explained only after decades!

What happens if you propose an algorithm that nobody understands? Folks might say that your algorithm is silly. Don't be sad. It is possible that they are simply not able to understand what you meant. Your algorithm might be explained only after your death.

The physicist J. W. Gibbs has a similar story. He proposed Gibbs sampling, but nobody could describe the algorithm for more than eight decades. Here is a paragraph from Wikipedia regarding Gibbs sampling:

"Gibbs sampling is an example of a Markov chain Monte Carlo algorithm. The algorithm is named after the physicist J. W. Gibbs, in reference to an analogy between the sampling algorithm and statistical physics. The algorithm was described by brothers Stuart and Donald Geman in 1984, some eight decades after the passing of Gibbs."


So the conclusion here is not to be sad even if people do not understand you and your idea now. They may understand it years and years after your death!

Monte Carlo Calculation of Pi

Note: I did not write this article. I copied it here because I found the tutorial very useful; it made the Monte Carlo method clear to me. Readers are advised to visit the original website (here) to read the original article.
----------------------------------
How can Monte Carlo be used to calculate the value of Pi?
----------------------------------
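We can play a game of darts to calculate the value of Pi. Consider the following dartboard: a square with a shaded quarter circle (circle quadrant) inside it, centered on one corner of the square.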


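If you are a very poor dart player, it is easy to imagine throwing darts randomly at the figure, and it should be apparent that, of the total number of darts that hit within the square, the number of darts that hit the shaded part (the circle quadrant) is proportional to the area of that part. In other words,

$$\frac{\text{number of darts hitting the shaded area}}{\text{number of darts hitting inside the square}} = \frac{\text{area of the shaded area}}{\text{area of the square}}$$

If you remember your geometry, it's easy to show that

$$\frac{\text{area of the shaded area}}{\text{area of the square}} = \frac{\frac{1}{4}\pi r^2}{r^2} = \frac{\pi}{4}$$
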
If each dart thrown lands somewhere inside the square, the ratio of "hits" (in the shaded area) to "throws" will be one-fourth the value of pi. If you actually do this experiment, you'll soon realize that it takes a very large number of throws to get a decent value of pi...well over 1,000. To make things easy on ourselves, we can have computers generate random numbers.

If we say our circle's radius is 1.0, for each throw we can generate two random numbers, an x and a y coordinate, which we can then use to calculate the distance from the origin (0,0) using the Pythagorean theorem. If the distance from the origin is less than or equal to 1.0, it is within the shaded area and counts as a hit. Do this thousands (or millions) of times, and you will wind up with an estimate of the value of pi. How good it is depends on how many iterations (throws) are done, and to a lesser extent on the quality of the random number generator. Simple computer code for a single iteration, or throw, might be:

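double x = Math.random();                // x coordinate in [0, 1)
double y = Math.random();                // y coordinate in [0, 1)
double dist = Math.sqrt(x * x + y * y);  // distance from the origin (0,0)
if (dist <= 1.0) {
    hits = hits + 1;                     // the throw landed in the shaded area
}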
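
Putting the single throw into a loop gives a complete estimator. The following is a minimal Java sketch, not the original article's code; the class name MonteCarloPi, the method estimatePi and the fixed seed are assumptions made so that runs are repeatable. It compares the squared distance with 1.0, which gives the same hits as taking the square root first.

```java
import java.util.Random;

class MonteCarloPi {
    // Throws `throwCount` random darts at the unit square and returns
    // 4 * hits / throws as the estimate of pi.
    static double estimatePi(long throwCount, long seed) {
        Random rng = new Random(seed);
        long hits = 0;
        for (long i = 0; i < throwCount; i++) {
            double x = rng.nextDouble(); // x coordinate in [0, 1)
            double y = rng.nextDouble(); // y coordinate in [0, 1)
            // x^2 + y^2 <= 1 is equivalent to distance <= 1 for a radius of 1.0
            if (x * x + y * y <= 1.0) {
                hits++;
            }
        }
        return 4.0 * hits / throwCount;
    }

    public static void main(String[] args) {
        // More throws should generally bring the estimate closer to pi.
        for (long n : new long[] {1_000, 100_000, 10_000_000}) {
            System.out.printf("%,d throws: pi is about %.5f%n", n, estimatePi(n, 42L));
        }
    }
}
```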

Ideas are simple but powerful!

Ideas are simple, but they are powerful. Here is the story of one such idea: the idea of using a "CAPTCHA". If you don't know what a CAPTCHA is, don't worry. I'll explain it later in this article.
I had heard this story on the radio, which covered WHAT it does but not HOW it does it. Today I read the HOW part and want to share it with everyone.

What is a CAPTCHA?
A CAPTCHA is a program that can tell whether its user is a human or a computer. You've probably seen them — colorful images with distorted text at the bottom of Web registration forms. CAPTCHAs are used by many websites to prevent abuse from "bots," or automated programs usually written to generate spam. No computer program can read distorted text as well as humans can, so bots cannot navigate sites protected by CAPTCHAs.

What is reCAPTCHA?
It is a free CAPTCHA service that is also used to digitize books, newspapers, etc.

How does reCAPTCHA help in digitizing documents?
OCR (optical character recognition) is a technique for digitizing a scanned document. The problem is that OCR cannot recognize every word in the document correctly, but it can flag the words it fails to read. A word that OCR could not read is paired with a known word, and the pair is shown as a new CAPTCHA. The CAPTCHA is then presented to a human. A person who types the known word correctly is assumed to have typed the unknown word correctly too. To improve accuracy, the same unknown word is shown to many people, so confidence in its transcription grows as their answers agree.
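
The voting idea described above can be sketched in a few lines of Java. This is only an illustration under assumptions, not reCAPTCHA's actual implementation: the class CaptchaVoteSketch, its methods and the vote threshold are all hypothetical.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

class CaptchaVoteSketch {
    private final String controlWord;                          // the word whose answer is known
    private final Map<String, Integer> votes = new HashMap<>(); // guesses for the unknown word

    CaptchaVoteSketch(String controlWord) {
        this.controlWord = controlWord;
    }

    // Returns true if the user passed, i.e. typed the known word correctly.
    // Only then is the guess for the unknown word counted as a vote.
    boolean submit(String controlAnswer, String unknownAnswer) {
        if (!controlWord.equalsIgnoreCase(controlAnswer.trim())) {
            return false;
        }
        votes.merge(unknownAnswer.trim().toLowerCase(Locale.ROOT), 1, Integer::sum);
        return true;
    }

    // The most frequent guess, if it has at least minVotes votes.
    Optional<String> consensus(int minVotes) {
        return votes.entrySet().stream()
                .filter(e -> e.getValue() >= minVotes)
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey);
    }

    public static void main(String[] args) {
        CaptchaVoteSketch challenge = new CaptchaVoteSketch("morning");
        challenge.submit("morning", "upon");
        challenge.submit("mornin", "open");  // rejected: known word typed incorrectly
        challenge.submit("morning", "upon");
        challenge.submit("morning", "up0n");
        System.out.println(challenge.consensus(2)); // prints Optional[upon]
    }
}
```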

So the idea is simple but powerful!

Reference:
reCAPTCHA

Writing LaTeX equations online

Recently I needed to write a lot of math equations for homework in an Inference Theory course. I could have used MS Word or OpenOffice to write the equations, but I also know LaTeX. The problem was that I did not want to install it. I found the following website, which can be used to edit equations using LaTeX commands.

LaTeX Online Equation Editor:  codecogs.com
Equation Help: Wikipedia

Below is a sample equation I've written using this website.

Khan Academy: Random Variables and Probability Distribution

People are so nice. They share their knowledge for free. Salman Khan is one such example; he runs an organization called the Khan Academy. The Khan Academy is a not-for-profit educational organization created by Salman Khan. With the stated mission "of providing a high quality education to anyone, anywhere", the Academy supplies a free online collection of over 2,000 videos on mathematics, history, finance, physics, chemistry, astronomy, and economics (source: Wikipedia).

I came to know this website after watching its video on Random Variables and Probability Distribution, a topic I'm learning now. The lectures are pretty good. I love such people and such activities!

Demography of Facebook

Facebook demographics for 2011 have been revealed. The figures are interesting to look at:

Generate XSD from XML

Sometimes we need to generate an XML Schema (XSD) from a given XML document. There are many tools, but the one I use is called trang. Here are the simple steps to generate the XSD for an XML file (input.xml):

1. Download and unzip trang-20030619.zip
2. Go to the trang-20030619 folder and run the following command:
java -jar trang.jar -I xml -O xsd input.xml output.xsd

Searching Assertion using TextRunner

TextRunner searches hundreds of millions of assertions extracted from 500 million high-quality Web pages.
Links:
  1. Paper
  2. Demo

Creating an RSS Feed using ROME

Creating an RSS feed using Java is straightforward. You just need to know how to use the ROME API. The website presents some sample code too. The following code, copied from Codeidol.com, will be useful.