Versus

Getting Started With Versus

The Versus API is being developed to facilitate the comparison of digital objects. To this end, the library abstracts the task of comparing two digital objects into two stages: (1) extracting feature descriptors and (2) computing a similarity measure between those descriptors. The examples in this short tutorial focus on comparing image data, but the same principles apply to any other digital object.

Stage one involves two class hierarchies, one of Adapters and one of Feature Extractors. Adapters function as bridges between existing data structures (from existing applications) and Versus-aware data structures.

For example, let’s assume we want to extract pixels from an image in RGB space and we are familiar with the Java class BufferedImage. We create a new class BufferedImageAdapter that implements the interface HasRGBPixels:

public class BufferedImageAdapter implements HasRGBPixels, FileLoader {…}

HasRGBPixels is a Versus interface that imposes the ability to extract pixels from an image:

public interface HasRGBPixels extends Adapter {

  double[][][] getRGBPixels();
  double getRGBPixel(int row, int column, int band);
}

Adapter is just a marker interface. We also make BufferedImageAdapter implement FileLoader so that it can load files, in case we want to start from files on disk. Now we can use the BufferedImageAdapter to load a file and extract its pixels in RGB. Because extractors are written against Adapter interfaces, any extractor that knows how to work with HasRGBPixels can extract features from a BufferedImage through the BufferedImageAdapter.
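
As an illustration of the adapter idea, here is a minimal, self-contained sketch. The Adapter and HasRGBPixels interfaces are redeclared locally so that the example compiles on its own. The constructor taking a BufferedImage, the band ordering (0 = red, 1 = green, 2 = blue) and the 0 to 255 value range are assumptions made for this sketch, not the actual Versus implementation. File loading is left out.

```java
import java.awt.image.BufferedImage;

public class AdapterSketch {

    /** Marker interface, as described in the tutorial. */
    interface Adapter {}

    /** Local copy of the HasRGBPixels interface shown above. */
    interface HasRGBPixels extends Adapter {
        double[][][] getRGBPixels();
        double getRGBPixel(int row, int column, int band);
    }

    /** Hypothetical adapter wrapping an in-memory BufferedImage. */
    static class BufferedImageAdapter implements HasRGBPixels {
        private final BufferedImage image;

        BufferedImageAdapter(BufferedImage image) {
            this.image = image;
        }

        @Override
        public double[][][] getRGBPixels() {
            int rows = image.getHeight();
            int columns = image.getWidth();
            double[][][] pixels = new double[rows][columns][3];
            for (int row = 0; row < rows; row++) {
                for (int column = 0; column < columns; column++) {
                    for (int band = 0; band < 3; band++) {
                        pixels[row][column][band] = getRGBPixel(row, column, band);
                    }
                }
            }
            return pixels;
        }

        @Override
        public double getRGBPixel(int row, int column, int band) {
            // BufferedImage.getRGB takes (x, y), i.e. (column, row),
            // and returns a packed ARGB int.
            int argb = image.getRGB(column, row);
            // Assumed band order: 0 = red, 1 = green, 2 = blue.
            int shift = 16 - 8 * band;
            return (argb >> shift) & 0xFF;
        }
    }

    public static void main(String[] args) {
        BufferedImage image = new BufferedImage(2, 1, BufferedImage.TYPE_INT_RGB);
        image.setRGB(1, 0, 0xFF8000); // pixel at column 1, row 0
        HasRGBPixels adapter = new BufferedImageAdapter(image);
        System.out.println(adapter.getRGBPixel(0, 1, 0) + " "
                + adapter.getRGBPixel(0, 1, 1) + " "
                + adapter.getRGBPixel(0, 1, 2));
    }
}
```

An extractor written against HasRGBPixels never sees the BufferedImage itself, so another image library could be supported by writing a second adapter and leaving the extractor untouched.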

Now that we have pixels, let’s assume that we want to use a color histogram to describe a particular image and then use some measure to compare two color histogram descriptors. This means that we need a feature descriptor representing the color histogram and an extractor that knows how to go from HasRGBPixels to the color histogram.

This introduces two class hierarchies: Extractor and Descriptor. The first is in charge of going from an Adapter to a Descriptor. The second is the feature descriptor itself and will be what we compare with each other to establish how similar two digital objects are.
The most important method in Extractor is extract:

public interface Extractor {

  public Descriptor extract(Adapter adapter) throws Exception;
}

The extract method takes in an Adapter and creates a Descriptor. In our particular example it will take in a HasRGBPixels adapter and output a ColorHistogramDescriptor. Let’s call this extractor ColorHistogramExtractor.
For each image, the client code for the classes briefly described so far could look something like this:

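BufferedImageAdapter adapter = new BufferedImageAdapter();
// file is a java.io.File pointing at the image on disk
adapter.load(file);
ColorHistogramExtractor colorHistogramExtractor = new ColorHistogramExtractor();
// The cast is needed if extract is declared to return Descriptor, as in the Extractor interface
ColorHistogramDescriptor colorHistogramDescriptor =
    (ColorHistogramDescriptor) colorHistogramExtractor.extract(adapter);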
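
To make the extraction step more concrete, the following is a rough sketch of what such an extractor might look like. Everything here is an assumption made for illustration: the interfaces are simplified local copies (HasRGBPixels is cut down to one method), the descriptor stores one row of bin counts per color band, the binsPerBand constructor parameter is hypothetical, and channel values are assumed to lie in the range 0 to 255. The actual ColorHistogramExtractor in Versus may be organized quite differently.

```java
import java.util.Arrays;

public class ExtractorSketch {

    interface Adapter {}

    /** Simplified local copy with a single method. */
    interface HasRGBPixels extends Adapter {
        double[][][] getRGBPixels();
    }

    interface Descriptor {}

    interface Extractor {
        Descriptor extract(Adapter adapter) throws Exception;
    }

    /** Hypothetical descriptor: bins[band][bin] holds a pixel count. */
    static class ColorHistogramDescriptor implements Descriptor {
        final int[][] bins;

        ColorHistogramDescriptor(int[][] bins) {
            this.bins = bins;
        }
    }

    static class ColorHistogramExtractor implements Extractor {
        private final int binsPerBand;

        ColorHistogramExtractor(int binsPerBand) {
            this.binsPerBand = binsPerBand;
        }

        // Narrows the return type and drops the throws clause, which Java
        // allows when overriding an interface method.
        @Override
        public ColorHistogramDescriptor extract(Adapter adapter) {
            if (!(adapter instanceof HasRGBPixels)) {
                throw new IllegalArgumentException(
                        "Extractor expects an adapter of type HasRGBPixels");
            }
            double[][][] pixels = ((HasRGBPixels) adapter).getRGBPixels();
            int[][] bins = new int[3][binsPerBand];
            for (double[][] row : pixels) {
                for (double[] pixel : row) {
                    for (int band = 0; band < 3; band++) {
                        // Assumes channel values in [0, 255].
                        int bin = (int) (pixel[band] * binsPerBand / 256.0);
                        bins[band][bin]++;
                    }
                }
            }
            return new ColorHistogramDescriptor(bins);
        }
    }

    public static void main(String[] args) {
        // One row with two pixels: pure red and an orange tone.
        double[][][] pixels = { { {255, 0, 0}, {200, 100, 0} } };
        HasRGBPixels adapter = () -> pixels;
        ColorHistogramDescriptor descriptor =
                new ColorHistogramExtractor(4).extract(adapter);
        System.out.println(Arrays.deepToString(descriptor.bins));
        // prints [[0, 0, 0, 2], [1, 1, 0, 0], [2, 0, 0, 0]]
    }
}
```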

Now that we have a feature descriptor for our digital object, we are ready to discuss the class hierarchy dealing with similarity measures, the algorithms that compare descriptors. Given one Descriptor instance per digital object we want to compare, a Measure implements a method to compare two descriptors:

public interface Measure {

  Similarity compare(Descriptor feature1, Descriptor feature2) throws Exception;
}

When comparing two digital objects, the output of the comparison is defined as a Similarity. A Similarity can be as complex as necessary, but every implementation provides a getValue() method that expresses the level of similarity as a double. This is a convenience method that gives a quick sense of what the output looks like, and it forces the developer of a new Similarity data structure to give some indication of how similar two objects are.

public interface Similarity {

  double getValue();
}
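
As a sketch of how a Similarity can range from trivial to more detailed, the example below shows two implementations. SimilarityNumber here is a local stand-in whose constructor is assumed, and PerBandSimilarity is a purely hypothetical class invented for this illustration. Both reduce their result to a single double through getValue().

```java
public class SimilaritySketch {

    /** Local copy of the Similarity interface shown above. */
    interface Similarity {
        double getValue();
    }

    /** Simplest case: the result is a single number. */
    static class SimilarityNumber implements Similarity {
        private final double value;

        SimilarityNumber(double value) {
            this.value = value;
        }

        @Override
        public double getValue() {
            return value;
        }
    }

    /** Hypothetical richer result: one score per color band. */
    static class PerBandSimilarity implements Similarity {
        private final double[] bandScores;

        PerBandSimilarity(double[] bandScores) {
            this.bandScores = bandScores.clone();
        }

        double getBandScore(int band) {
            return bandScores[band];
        }

        /** Summarizes the per-band scores as their mean. */
        @Override
        public double getValue() {
            double sum = 0;
            for (double score : bandScores) {
                sum += score;
            }
            return sum / bandScores.length;
        }
    }

    public static void main(String[] args) {
        Similarity simple = new SimilarityNumber(0.75);
        Similarity detailed = new PerBandSimilarity(new double[] {1.0, 0.5, 0.0});
        System.out.println(simple.getValue() + " " + detailed.getValue());
    }
}
```

Client code that only needs a rough answer can call getValue() on either object, while code that knows the concrete type can ask for the extra detail.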

In our example, we have a HistogramDistanceMeasure that computes the Euclidean distance between the bins of the color histogram. In HistogramDistanceMeasure.compare the first thing we do is check that the two descriptors passed in are instances of RGBHistogramDescriptor and cast them to this class. If that's not the case, we throw an UnsupportedTypeException.

public class HistogramDistanceMeasure implements Measure {

    @Override
    public Similarity compare(Descriptor feature1, Descriptor feature2)
            throws Exception {

        // Only RGB histogram descriptors are supported by this measure.
        if (feature1 instanceof RGBHistogramDescriptor
                && feature2 instanceof RGBHistogramDescriptor) {
            RGBHistogramDescriptor histogramFeature1 = (RGBHistogramDescriptor) feature1;
            RGBHistogramDescriptor histogramFeature2 = (RGBHistogramDescriptor) feature2;
            // Delegates to the type-specific overload below.
            return compare(histogramFeature1, histogramFeature2);
        } else {
            throw new UnsupportedTypeException(
                    "Similarity measure expects features of type RGBHistogramDescriptor");
        }
    }

    public SimilarityNumber compare(RGBHistogramDescriptor feature1,
            RGBHistogramDescriptor feature2) throws Exception {
        // implementation
    }
}
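
The elided implementation could be filled in along the following lines. This is only a sketch under stated assumptions: the interfaces are redeclared locally, RGBHistogramDescriptor is assumed to expose its bins as a flat double array, SimilarityNumber is assumed to wrap a double, and UnsupportedTypeException is a local stand-in because its real definition is not shown in this tutorial.

```java
public class MeasureSketch {

    interface Descriptor {}

    interface Similarity {
        double getValue();
    }

    interface Measure {
        Similarity compare(Descriptor feature1, Descriptor feature2) throws Exception;
    }

    /** Hypothetical descriptor holding flattened histogram bin counts. */
    static class RGBHistogramDescriptor implements Descriptor {
        final double[] bins;

        RGBHistogramDescriptor(double... bins) {
            this.bins = bins;
        }
    }

    /** Assumed shape: a Similarity wrapping a single double. */
    static class SimilarityNumber implements Similarity {
        private final double value;

        SimilarityNumber(double value) {
            this.value = value;
        }

        @Override
        public double getValue() {
            return value;
        }
    }

    /** Local stand-in for the exception named in the tutorial. */
    static class UnsupportedTypeException extends Exception {
        UnsupportedTypeException(String message) {
            super(message);
        }
    }

    static class HistogramDistanceMeasure implements Measure {

        @Override
        public Similarity compare(Descriptor feature1, Descriptor feature2)
                throws Exception {
            if (feature1 instanceof RGBHistogramDescriptor
                    && feature2 instanceof RGBHistogramDescriptor) {
                return compare((RGBHistogramDescriptor) feature1,
                        (RGBHistogramDescriptor) feature2);
            }
            throw new UnsupportedTypeException(
                    "Similarity measure expects features of type RGBHistogramDescriptor");
        }

        /** Euclidean distance between corresponding bins. */
        public SimilarityNumber compare(RGBHistogramDescriptor feature1,
                RGBHistogramDescriptor feature2) {
            if (feature1.bins.length != feature2.bins.length) {
                throw new IllegalArgumentException("Histograms must have the same number of bins");
            }
            double sumOfSquares = 0;
            for (int i = 0; i < feature1.bins.length; i++) {
                double difference = feature1.bins[i] - feature2.bins[i];
                sumOfSquares += difference * difference;
            }
            return new SimilarityNumber(Math.sqrt(sumOfSquares));
        }
    }

    public static void main(String[] args) throws Exception {
        Measure measure = new HistogramDistanceMeasure();
        Descriptor first = new RGBHistogramDescriptor(0, 0);
        Descriptor second = new RGBHistogramDescriptor(3, 4);
        System.out.println(measure.compare(first, second).getValue()); // prints 5.0
    }
}
```

Note that a Euclidean distance grows as the histograms diverge, so with this sketch a value of 0 means identical histograms and larger values mean less similar objects.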

This concludes our simple tutorial. Please see the source code and Javadocs for more information and examples.