Tremend Tech Blog

"Software is a great combination between artistry and engineering. When you finally get done and get to appreciate what you have done it is like a part of yourself that you've put together." (Bill Gates)

Create a Solr filter that replaces diacritics

August 28th, 2007 by Sebastian Mitroi

Some languages (Romanian, for example) use characters with diacritics (often called accent marks). When you build an index with Solr, it is generally useful to strip the diacritic marks, so that a search finds both “proprietăţi” and “proprietati”. To do this, you have to create a custom Solr filter.
First, register the filter in the schema.xml configuration file:


<fieldtype name="text_st" class="solr.TextField" positionIncrementGap="100">
            <analyzer>
                <tokenizer class="solr.StandardTokenizerFactory"/>
                <!-- ... some other filters, for example a lower-case filter -->
                <filter class="solr.LowerCaseFilterFactory"/>   
                <filter class="ro.tremend.solr.diacritics.DiacriticsFilterFactory"/>

            </analyzer>
</fieldtype>

Then create three small classes and a properties file. The filter factory for Solr, DiacriticsFilterFactory:

package ro.tremend.solr.diacritics;

import org.apache.lucene.analysis.TokenStream;
import org.apache.solr.analysis.BaseTokenFilterFactory;

/**
 * Create a Solr Filter Factory for diacritics
 *
 * @author Sebastian
 *
 */
public class DiacriticsFilterFactory extends BaseTokenFilterFactory {
	public TokenStream create(TokenStream input) {
		return new DiacriticsFilter(input);
	}
}

Now create the filter class, DiacriticsFilter:

package ro.tremend.solr.diacritics;

import org.apache.lucene.analysis.*;
import java.io.IOException;

/**
 * Create the diacritics filter
 *
 * @author Sebastian
 *
 */
public final class DiacriticsFilter extends TokenFilter {
	public DiacriticsFilter(TokenStream in) {
		super(in);
	}

	public final Token next() throws IOException {
		Token t = input.next();

		if (t == null)
			return null;

		t.setTermText(DiacriticsUtils.replaceDiacritics(t.termText()));
		return t;
	}
}

And finally, the class that does the work, DiacriticsUtils:

package ro.tremend.solr.diacritics;

import java.util.HashMap;
import java.util.Map;
import java.util.MissingResourceException;
import java.util.ResourceBundle;
import java.util.Set;

/**
 * Replace romanian characters
 *
 * @author Sebastian
 *
 */
public class DiacriticsUtils {
	private static Map<String, String> diacritics = new HashMap<String, String>();

	static {
		// Get diacritics from diacritics.properties
		try {
			ResourceBundle resource = ResourceBundle.getBundle("diacritics");
			Set<String> keySet = resource.keySet();
			for (String key : keySet) {
				diacritics.put(key, resource.getString(key));
			}
		} catch (MissingResourceException e) {
			e.printStackTrace();
		}
	}

	/**
	 * Replace all diacritics in a string
	 *
	 * @param s the string
	 * @return the string without diacritics
	 */
	public static String replaceDiacritics(String s) {
		for (String key : diacritics.keySet()) {
			s = s.replaceAll(key, diacritics.get(key));
		}
		return s;
	}

	public static Map<String, String> getDiacritics() {
		return diacritics;
	}
}

This class needs a properties file with the diacritics you want to replace:
diacritics.properties

\\u0102=A
\\u0103=a
... define all your language specific characters


The index will now contain no diacritics, but you also have to strip them from the query text before searching. To do that, just write this:

textToFind = DiacriticsUtils.replaceDiacritics(textToFind);
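
As a self-contained illustration of the replacement step, here is a minimal sketch that uses a small hard-coded table instead of diacritics.properties. The class name DiacriticsDemo, the method strip and the handful of mapped characters are assumptions made for this example; they are not part of the classes above, and a real table would list every character of your language.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: a hard-coded stand-in for diacritics.properties. */
final class DiacriticsDemo {
    private static final Map<Character, Character> TABLE = new HashMap<Character, Character>();

    static {
        // A few Romanian letters; extend the table for your own language.
        TABLE.put('\u0102', 'A'); // A with breve
        TABLE.put('\u0103', 'a'); // a with breve
        TABLE.put('\u00C2', 'A'); // A with circumflex
        TABLE.put('\u00E2', 'a'); // a with circumflex
        TABLE.put('\u00CE', 'I'); // I with circumflex
        TABLE.put('\u00EE', 'i'); // i with circumflex
        TABLE.put('\u015E', 'S'); // S with cedilla
        TABLE.put('\u015F', 's'); // s with cedilla
        TABLE.put('\u0162', 'T'); // T with cedilla
        TABLE.put('\u0163', 't'); // t with cedilla
    }

    /** Returns s with every mapped character replaced; other characters are kept. */
    static String strip(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            Character replacement = TABLE.get(c);
            out.append(replacement != null ? replacement.charValue() : c);
        }
        return out.toString();
    }
}
```

Applying the same function to both the indexed text and the query text is what makes the two spellings of a word match.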


I hope this will help.


Posted in Java, General | 7 Comments »

Avoiding SQL joins with java enums

August 28th, 2007 by Sebastian Mitroi

Let's say you have a Coffee object and three coffee sizes (small, medium and large). You can create a Coffee class like this:

public class Coffee {
	private Long id;
	// add other properties
	private CoffeeSize coffeeSize;

	// ... setters and getters
}	

and the CoffeeSize class:

public class CoffeeSize {
	private int id;
	private String name;
	private String i18nKey;

	// ... setters and getters
}

But every time you load a Coffee object, you execute an SQL join to load the CoffeeSize.

Of course, you could keep CoffeeSize as a class and define some public static final CoffeeSize instances, but with Java 1.5 you can do something like this:
create a Java enum named CoffeeSize and persist just the CoffeeSize id.

package coffee;

import java.util.HashMap;
import java.util.Map;

public enum CoffeeSize {
	SMALL(1, "Small", "coffeeSize.small"),
	MEDIUM(2, "Medium", "coffeeSize.medium"),
	LARGE(3, "Large", "coffeeSize.large");

	private int id;
	private String name;
	private String i18nKey;

	// Lookup table from the persisted id to the enum constant
	private static Map<Integer, CoffeeSize> coffeeSizes = new HashMap<Integer, CoffeeSize>();

	static{
		CoffeeSize[] coffeeSizesArray = CoffeeSize.values();
		for (CoffeeSize coffeeSize : coffeeSizesArray) {
			coffeeSizes.put(coffeeSize.getId(), coffeeSize);
		}
	}

	private CoffeeSize(int id, String name, String key) {
		this.id = id;
		this.name = name;
		i18nKey = key;
	}

	public int getId() {
		return id;
	}

	public String getName() {
		return name;
	}

	/**
	 * This is an i18n key defined in message.properties
	 * @return the i18n key
	 */
	public String getI18nKey() {
		return i18nKey;
	}

	/**
	 * For the id stored in database get the CoffeeSize object
	 * @param id the id stored in database
	 * @return the {@link CoffeeSize} object
	 */
	public static CoffeeSize getCoffeeSizeById(Integer id) {
		return CoffeeSize.coffeeSizes.get(id);
	}
}

The Coffee object will look something like this (I used hibernate and ejb3 annotations):

package coffee;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.Table;
import javax.persistence.Transient;

@Entity
@Table(name = "coffee")
public class Coffee {
	private Long id;
	// add other properties
	private Integer coffeeSizeId;

	@Id
	@GeneratedValue(strategy = GenerationType.IDENTITY)
	public Long getId() {
		return id;
	}

	public void setId(Long id) {
		this.id = id;
	}

	public Integer getCoffeeSizeId() {
		return coffeeSizeId;
	}

	public void setCoffeeSizeId(Integer coffeeSizeId) {
		this.coffeeSizeId = coffeeSizeId;
	}

	@Transient
	public CoffeeSize getCoffeeSize(){
		return CoffeeSize.getCoffeeSizeById(coffeeSizeId);
	}
}


Look at the getCoffeeSize method: it is transient (not persisted). The persisted fields are id and coffeeSizeId. You avoid an SQL join by storing private Integer coffeeSizeId instead of private CoffeeSize coffeeSize, and by adding a getCoffeeSize method that returns the CoffeeSize object. All the CoffeeSize objects are held in memory in the static map coffeeSizes.

But remember, this works only if you never need to add a new coffee size without recompiling and redeploying your application.
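
To make that caveat concrete, here is a stripped-down sketch of the same id lookup. The names CupSize and byId are made up for this example so that it does not clash with the CoffeeSize enum above, and the i18n fields are left out. It shows that an id with no matching constant simply comes back as null, which is what would happen to a row written by a newer version of the application.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: a reduced version of the id-to-enum lookup. */
enum CupSize {
    SMALL(1), MEDIUM(2), LARGE(3);

    private static final Map<Integer, CupSize> BY_ID = new HashMap<Integer, CupSize>();

    static {
        // Enum constants are created before this block runs, so values() is complete here.
        for (CupSize size : values()) {
            BY_ID.put(size.id, size);
        }
    }

    private final int id;

    CupSize(int id) {
        this.id = id;
    }

    int getId() {
        return id;
    }

    /** Returns the constant for a stored id, or null if the id is unknown. */
    static CupSize byId(Integer id) {
        return BY_ID.get(id);
    }
}
```

Code that calls the transient getter should therefore be prepared for a null result, or the lookup could throw an exception for unknown ids instead.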


Posted in Java, General | No Comments »

Hibernate annotations - default value

August 27th, 2007 by Marius Hanganu

If you have tried to set a column's default value using Hibernate annotations, you have probably run into some difficulties, as I did. Some posts on the web suggest giving default values to the members of the Java class. That is, if you declare

class Test {
    private Long count = 3L;

    @Column(name = "count", nullable = false)
    public Long getCount() {
        return count;
    }
}

you should have the default value set in the database.

Well, this does not seem to work (at least not for me). So the first solution I found uses “columnDefinition”. It is database dependent, since Hibernate specifies the “columnDefinition” attribute for database-specific declarations. The following works well with MySQL - the database of choice for my current project:


class Test {
    private Long count = 3L;

    @Column(name = "count", nullable = false, columnDefinition = "bigint(20) default 0")
    public Long getCount() {
        return count;
    }
}

Again - this is database dependent, so use it only if your project is already tied to a specific database.


Posted in Java, General | 6 Comments »

Dojo vs Ext.js - How Dojo lost in front of other UI frameworks like Ext js

August 22nd, 2007 by Marius Hanganu

I’ve been meaning to write these thoughts down for a long time. I have been using Dojo since version 0.2.x (almost two years) and I have promoted Dojo in all the projects I’ve been involved in. Long story short - I was and still am - a huge fan of Dojo.

However, in the last few months, I’ve been using Ext js and actually promoted the idea of rewriting an old UI for another project (I hope Martin will share more with us about his Ext js experience).

Both projects are great. Actually dojo is greater than Ext js :-). It has a lot of super-uber-cool stuff in it:

  • superb infrastructure - the dojo.event.connect is the most prominent example; dojo.io.bind is another great functionality
  • i18n and a11y - I’m not even sure if other JS frameworks are implementing this
  • solid, well-written API - one can easily write one’s own custom widgets
  • dojo storage and dojo offline - excellent ideas from Brad Neuberg
  • the most ingenious graphics package - uses Canvas for FF and VML for IE making 3d graphics in browser portable between browsers
  • many, many others - just check Dojo’s website and you can find out more

Ext JS is a relatively young framework (compared to dojo) that is best described by Dojo’s own website: “It features a large number of consistent, good looking widgets with an emphasis on pixel-perfect layout and desktop-like UIs across browsers. Originally developed to run on top of YUI and later JQuery, EXT now has it’s own low-level library, removing the need for 3rd party dependencies. The EXT community is very active and good documentation is available for the library.”

As they put it - Ext js is just a widget library with an eye-catching UI. Much less than dojo, which has dojo, dijit and dojox. But the true facts here are:

  1. People need widgets. The number of developers wishing to adopt a base framework without a widget hierarchy is ridiculously small compared to the number of users searching for the building blocks of a Web UI - the widgets
  2. A good UI is the first selling point, whether your project is open source or not. This is the first and foremost lesson. Alex Russell did not get that. Jack Slocum did. That’s why the Ext js forums now get at least triple the traffic of dojo’s. Dojo’s themes are L.O.U.S.Y. One cannot use them out of the box. On the other hand - Ext JS looks fantastic. The perfect thing for an administrative interface. Perhaps you’ll customize it for a public website, but then again, who’s really using such powerful frameworks for their public website? Try looking for that in their mailing lists. I tried that several times during the last two years and the answers were insignificant. To give an answer to that - almost everyone uses and loves prototype.
  3. The ever lasting table widget. Just look at the number of questions/requests in dojo’s mailing list for a good table widget that supports resizing, pagination, etc. That should’ve been one of the first widgets to develop. Ext js has a marvellous widget here.
  4. Page loading time. Again, this is one of Dojo’s ghosts that haunted them until their 0.9 release. Perhaps they’ve fixed it, but until now, it required a lot of effort if you had your own custom widgets, or if you just needed a custom build.

Of all these reasons, I blame number 2 the most. This is a HUGE blocker in adopting a framework for a project. I adopted Dojo with much enthusiasm and I am still waiting for it to become the ultimate framework, since it has such a great potential.

But as a pragmatic programmer, if you have something that gets your job done, you use it. I don’t want to create an awful looking interface and work on “pixel-perfection” another several days. Clients are not paying for customizing Dojo - they’ll just ask - “if you could’ve built the same UI with another library and looked much cooler, why didn’t you do it?”. (we can argue here also about the usability of Dojo’s widgets vs Ext js’ widgets, for I believe Ext js is simpler to use - but this is a different topic).

Dojo had several attempts in pushing for some good looking themes. Unfortunately they did not get that “good looking widgets” and “pixel-perfect” layout. Widgets are still difficult to integrate (I haven’t switched to dojo 0.9 yet), and as I said, developers need widgets. Widgets that work well with other widgets, that are easy to program. Just compare the mail example from Dojo with the RSS feeder written by Jack Slocum.

So, to the people behind Dojo - excellent developers and architects - please consider some marketing advice. It would raise both Dojo (and Sitepen of course) to the level where it should be - the best JS framework out there. Everyone needs marketing - even open source projects …



Posted in HTML, Javascript | 14 Comments »

Javascript advanced tutorial

August 21st, 2007 by Marius Hanganu

I prepared a quick Javascript tutorial during the weekend and presented it to my colleagues in Tremend. You can view it better here. Although there are entire presentations or documents dedicated to what may be covered by one single line in this presentation, I tried to summarize some of the Javascript highlights to my colleagues.

Some of the subjects covered were:

  • common mistakes
  • common constructions for a web developer
  • arguments, apply, prototype keywords
  • utility functions with prototype, dojo, ext js, DWR
  • AOP in JS - the simplest way to learn AOP is using Javascript
  • usage of Javascript in Java



Posted in HTML, Javascript | No Comments »

How to set the default charset to utf-8 for create table when using hibernate with java persistence annotations

August 14th, 2007 by spostelnicu

Yesterday I encountered a problem when trying to persist a String value into a MySQL column of type ‘text‘ (the problem also occurs for column types ‘tinytext‘, ‘mediumtext‘ etc.)

The first confusing thing was the error message returned by MySQL:
java.sql.BatchUpdateException: Data truncation: Data too long for column 'my_column'

It was confusing because the column was of type ‘text‘ (maximum length 65535), and I then changed it to type ‘mediumtext‘ (maximum length 16777215), while the value that I was trying to persist had only around 6000 characters, so the data length was not the problem.
Instead the one thing that was noticeable about my String value was that it contained non-ASCII characters (particularly Romanian characters with diacritics).

After a little search on the web, I found the following mysql bug description http://bugs.mysql.com/bug.php?id=17872
A comment on that page ([8 Mar 2006 9:53] [ name withheld ]) also contains some more links to similar bug descriptions.

Now it was a little clearer that the problem was caused by the Romanian characters (and the error message was just stupid), so the next thing to do was to figure out why the database table didn’t store UTF-8 characters.

Although I created the database with
create database mydb character set utf8 collate utf8_general_ci;
still the tables created inside it had the collation latin1_swedish_ci.
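
A quick way to see why a latin1 table cannot hold this text is to ask the JVM whether a charset can encode it. The sketch below is only an illustration: the class name CharsetCheck and the method fits are made up, and ISO-8859-1 is used as a stand-in for MySQL's latin1, which is an approximation (MySQL's latin1 is closer to windows-1252, but neither contains the Romanian letters used here).

```java
import java.nio.charset.Charset;

/** Illustrative only: checks whether a charset can represent a piece of text. */
final class CharsetCheck {
    /** Returns true if every character of text can be encoded in the named charset. */
    static boolean fits(String text, String charsetName) {
        return Charset.forName(charsetName).newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String romanian = "propriet\u0103\u0163i"; // "proprietati" written with diacritics
        System.out.println("ISO-8859-1: " + fits(romanian, "ISO-8859-1"));
        System.out.println("UTF-8:      " + fits(romanian, "UTF-8"));
    }
}
```

The first check fails and the second succeeds, which is consistent with the table's charset, not the column length, being the real problem.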

My application uses Hibernate for persistence and uses java persistence annotations to specify the hibernate mappings.

In my case the code is something like:


import java.io.Serializable;
import javax.persistence.*;

@Entity
@Table(name = "my_table")
public class MyClass implements Serializable {
    private String myValue;

    @Column(name = "my_column", columnDefinition = "mediumtext", length = 16777215)
    public String getMyValue() {
        return this.myValue;
    }

    public void setMyValue(String value) {
        this.myValue = value;
    }
}

The database schema is automatically generated based on the classes and hibernate mappings, by using the following ant target:


    <taskdef name="hibernatetool"
             classname="org.hibernate.tool.ant.HibernateToolTask"
             classpathref="project.classpath"/>

    <target name="hbm2ddl-schema"
            description="Generates the database schema from hibernate mappings">
        <hibernatetool destdir="">
            <classpath refid="project.classpath"/>
            <annotationconfiguration configurationfile="${classes.dir}/hibernate.cfg.xml"/>
            <hbm2ddl export="true" drop="false" create="true" haltonerror="true"/>
        </hibernatetool>
    </target>

If I had created the database tables by writing the SQL DDL by hand, I would have used the following script:


  create table `my_table` (
      `Id` int(11) NOT NULL auto_increment,
      `my_column` mediumtext NOT NULL default '',
      ......
  ) ENGINE=InnoDB DEFAULT CHARSET=utf8;

But in my case the database schema is automatically generated by HibernateTools, so I have to specify the default charset for the table somewhere in the @Table annotation, or hibernate configuration, or hibernatetool parameters.

After trying a few options (and I must admit I did not search thoroughly, so if you know of some better way to specify the default charset, please leave a comment), I chose the following solution:
I implemented my own custom org.hibernate.dialect.Dialect, by subclassing org.hibernate.dialect.MySQL5InnoDBDialect:


import org.hibernate.dialect.MySQL5InnoDBDialect;

/**
 * Extends MySQL5InnoDBDialect and sets the default charset to be UTF-8
 * @author Sorin Postelnicu
 * @since Aug 13, 2007
 */
public class CustomMysqlDialect extends MySQL5InnoDBDialect {

    public String getTableTypeString() {
        return " ENGINE=InnoDB DEFAULT CHARSET=utf8";
    }
}

and used it in my hibernate.cfg.xml:


<hibernate-configuration>
    <session-factory>
        <property name="hibernate.dialect">my.package.CustomMysqlDialect</property>
        .....
    </session-factory>
</hibernate-configuration>


Posted in Java, General | 7 Comments »

Change the Bluetooth address of your Linux machine

August 10th, 2007 by Bogdan Nitulescu

Do you have Bluetooth on your computer? Is it a Linux machine? For some weird reason, do you need it to have a different address?

If you answered yes to the above, here’s the magic command:

bccmd -d 0 psset -s 0 bdaddr 0x44 0x00 0x66 0x55 0x33 0x00 0x22 0x11

…and your Bluetooth device address (BDA) becomes 11:22:33:44:55:66. Of course, you will replace the address bytes (every value except the two 0x00 padding bytes) with those of the actual address that you want to write.
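
Since the byte order is easy to get wrong, here is a small helper that builds the argument list from an address. It is a sketch: the class name BdaddrArgs and the method toBccmdBytes are made up, and the byte order is inferred solely from the single example above (11:22:33:44:55:66), so check it against your own device before relying on it.

```java
/** Illustrative only: builds the bdaddr byte list for bccmd from an address string. */
final class BdaddrArgs {
    // Index into the six address bytes for each of the eight output bytes;
    // -1 marks the two 0x00 padding bytes. Order inferred from the example above.
    private static final int[] ORDER = {3, -1, 5, 4, 2, -1, 1, 0};

    /** Converts "11:22:33:44:55:66" into "0x44 0x00 0x66 0x55 0x33 0x00 0x22 0x11". */
    static String toBccmdBytes(String bda) {
        String[] parts = bda.split(":");
        if (parts.length != 6) {
            throw new IllegalArgumentException("expected six colon-separated bytes: " + bda);
        }
        for (String part : parts) {
            if (!part.matches("[0-9A-Fa-f]{2}")) {
                throw new IllegalArgumentException("not a two-digit hex byte: " + part);
            }
        }
        StringBuilder out = new StringBuilder();
        for (int index : ORDER) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append("0x").append(index < 0 ? "00" : parts[index].toLowerCase());
        }
        return out.toString();
    }
}
```

The returned string can be pasted after "bdaddr" in the command above.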

It does not always work. You need bluez-utils 3, and you need a CSR chip in your computer or USB dongle. To find out, type hciconfig hci0 version and check that the manufacturer is Cambridge Silicon Radio. Last time I checked, they had ~70% market share, so you have a good chance of having one.

If you have more than a single device, use “bccmd -d 1 …” for hci1, and so on.

The option -s 0 stores it into the default memory, which is usually RAM - so the new address may be lost after a reboot. Your chip may have various ROM stores - use -s 1 to -s 3. If you specifically want to write your new address to RAM, use -s 4. Note that the store with the highest number has priority (e.g. if an address is stored in both RAM and flash, RAM has priority).

For gory details about programming CSR chips, you can get documents from http://www.csrsupport.com/


Posted in General, bluetooth, linux | 4 Comments »