Toolbox OPF - Optimum Path Forest Classifier

The OPF Toolbox is a set of classifier tools based on the Optimum-Path Forest theory and example of how to use its algorithms.

LibOPF is the default implementation of the Optimum-Path Forest classifier, its newest version (LibOPF3) can be found at: https://github.com/LibOPF/LibOPF.

Also, there is the The oficial page of the LibOPF2. Available at http://www.ic.unicamp.br/~afalcao/libopf.

Introduction

OPF is a general classifier which:

  • supervised and unsupervised learning
  • naturally multi-class
  • fast fitting and predicting
  • good accuracy
  • few parameters
  • allows some superposition between clusters

LibOPF3

LibOPF3 is the current implementation of the Optimum-Path Classifier. It can be downloaded from the Git repository at https://github.com/LibOPF/LibOPF, it is under the BSD license.

It presents:

  • C code and Python bindings
  • dynamic library
  • code paralellization using OpenMP
  • SSE use in distance functions through compiler flags
  • Precomputed distance option
  • Supports serialization (pickling)

Improvements to LibOPF2

How to Install

LibOPF3 is a stand-alone library. These are the steps to follow:

Download the code from the Git repository.

Create the directories 'lib' and 'obj':

$ mkdir lib
$ mkdir obj

type:

$ make

The default number of threads is 4. if you have a different number of processors you may want to change the NTHREADS flag in the Makefile

If you want the python bindings, type:

$ make bindings
$ source env.sh

obs: env.sh just change Python enviroment variables to see the libopf_py module

[This will be easier in the future]

LibOPF3 Python API

We`ll show some usage of the Python OPF library with the Python module scikits.learn to solve the Handwritten-Digits classification problem.

Supervised learning

 1 from sklearn import datasets, metrics
 2 from libopf_py import OPF
 3 import numpy
 4 
 5 digits = datasets.load_digits()
 6 
 7 # To apply an classifier on this data, we need to flatten the image, to
 8 # turn the data in a (samples, feature) matrix:
 9 n_samples = len(digits.images)
10 data  = digits.images.reshape((n_samples, -1))
11 label = digits.target.astype(numpy.int32)
12 
13 #Let's split our dataset in the training and testing parts
14 rand = numpy.random.permutation(n_samples)
15 
16 #half data for traning, other half for testing
17 n_split = n_samples/2
18 
19 data_train,  data_test  = data [rand][:n_split], data [rand][n_split:]
20 label_train, label_test = label[rand][:n_split], label[rand][n_split:]
21 
22 O = OPF()
23 O.fit(data_train, label_train)
24 predicted = O.predict(data_test)
25 
26 print "Classification report for OPF:\n%s\n" % (metrics.classification_report(label_test, predicted))
27 print "Confusion matrix:\n%s" % metrics.confusion_matrix(label_test, predicted)
ERROR execute

------------------------------------------------------------
*** Exception while evaluating code:
  File "<string>", line 2, in <module>
ImportError: No module named libopf_py

------------------------------------------------------------

LibOPF3 C API

Although the library is written in C, we recommend using of the Python API whenever possible.

Headers:

Opf Graph Data Structure

This structure contains all what is needed to run an OPF instance.

struct opf_graph * opf_graph_create      (int node_n);
void               opf_graph_destroy     (struct opf_graph ** sg);
int                opf_graph_set_feature (struct opf_graph *sg, double *feat, int *label, int feat_n);
void               opf_graph_set_metric  (struct opf_graph *sg,
                                          double (*arc_weight) (double *f1, double *f2, int n),
                                          enum METRIC m);
int                opf_graph_set_precomputed_distance (struct opf_graph *sg, double *dist, int *label);

Supervised

Supervised Fitting and Predicting.

void opf_supervised_train (struct opf_graph * sg);

void opf_supervised_train_iterative (struct opf_graph *sg, double split);

void opf_supervised_classify (struct opf_graph * sgtrain, double *feat, int sample_n, int *label);

Non-Supervised

Non-Supervised Fitting and Predicting.

void opf_unsupervised_clustering   (struct opf_graph * sg);

void opf_unsupervised_knn_classify (struct opf_graph * sgtrain, double *feat, int sample_n, int *label);

void opf_best_k_min_cut            (struct opf_graph * sg, int kmin, int kmax);

Contact

If you have any questions, feel free to send e-mails to:

* afalcao [at ] ic        [at] unicamp [dot] com
* lotufo                  [at] unicamp [dot] com
* papa    [dot] joaopaulo [at] gmail   [dot] com
* victormatheus           [at] gmail   [dot] com

References

[1] João P. Papa, Alexandre X. Falcão and Celso T. N. Suzuki. Supervised Pattern Classification based on Optimum-Path Forest. Intl. Journal of Imaging Systems and Technology, Wiley, Vol. 19, Issue 2, pp. 120–131, Jun 2009.

[2] L.M. Rocha, F.A.M. Cappabianco, and A.X. Falcão. Data clustering as an optimum-path forest problem with applications in image analysis. International Journal of Imaging Systems and Technology, 19(2):50-68, 2009.

[3] A. X. Falcao, J. Stolfi, and R. A. Lotufo. The image foresting transform: Theory, algorithms, and applications. IEEE Trans. on Patt. Anal. Mach. Intell., 26(1):19-29, 2004.

[4] A.T. da Silva, A.X. Falcão, L.P. Magalhães: A new CBIR approach based on relevance feedback and optimum-path forest classification. Journal of WSCG, Vol.18, No.1-3, pp. 73-80.

[5] André Tavares da Silva, Alexandre Xavier Falcão, Léo Pini Magalhães, Active learning paradigms for CBIR systems based on optimum-path forest classification, Pattern Recognition, Volume 44, Issue 12, December 2011, Pages 2971-2978, ISSN 0031-3203, 10.1016/j.patcog.2011.04.026. (http://www.sciencedirect.com/science/article/pii/S0031320311001853)

[6] Thiago Vallin Spina, Javier A. Montoya-Zegarra, Fabio Andrijauskas, Fábio Augusto Faria, Carlos E. A. Zampieri, Sheila M. Pinto-Caceres, Tiago J. de Carvalho, Alexandre X. Falcão: A Comparative Study among Pattern Classifiers in Interactive Image Segmentation. SIBGRAPI 2009: 268-275

Recently modified pages
More