OPF Optimum Path Forest - Concepts
A set of samples and a distance function between any two samples computed from their atributes forms a complete weighted graph where the nodes are the samples and the arc weights connecting any two samples are given by the distance function. (Fig)
Given a path cost function defined as the maximum arc weight in the path (Fig) and a set of prototypes (root nodes) labeled with their classes (Fig), the OPF is an optimum partitioning of the graph in a forest such that any node is associated to the prototype which gives the minimum path cost connecting the prototype to that node in the graph (fig). All the nodes more strongly connected to a prototype form a tree of the forest and are classified with the same class of the prototype.
For the supervised classifier, the prototypes are key samples of a class, once the optimum forest is computed from the labeled training samples, the classification process assigns a test sample to the label of the prototype that gives the minimum cost path