If rule-rule and rule-data links are enabled, then data is attracted to the rules which apply to it. From this simple behaviour, several features emerge. Figure 1 shows an overview of the collapsed visualisation of the soybean data and a number of classification rules developed by the machine discovery component of Haiku. The rules are visible as large spheres and the data as small spheres, generally clustered together into discs hovering near the classifying rules.
Unclassified data can be seen in the group of variously coloured small spheres in the centre of the figure.
Figure 1: Overview of rules and data from the soybean dataset
Inter-rule relationships can be seen in Figure 2.
As mentioned before, each large sphere represents a classification rule which asserts (perhaps wrongly) that certain data belongs to a particular class. Rule spheres are linked to the data spheres which they claim to classify. Sphere colouring is determined by class: for rules this is the class which they claim to represent, and for (test) data it is the actual class.
In Figure 2, four rules which overlap in their classification of the data can be seen. In this diagram, the links have been made visible to emphasise the relationship between the rules and the data. When Haiku is in use, the links can be made invisible or partially transparent. It is also possible to view the visualisation from any angle or distance.
Two of the rules (pink) represent the same class. The one on the left of the figure (Pink1) can be seen to correctly cover a large number of pink data nodes. It also incorrectly covers a single green node. It can also be seen that all the data nodes covered by Pink1 are also covered by the pink rule node to its right (Pink2). The intersection of these two rules can be seen to be correct: all the data nodes covered by both rules are of the correct class (pink). Node Pink2 seems to be much too general, since it covers many data nodes which are not of the pink class.
Other features visible in Figure 2 include the accuracy and coverage of all the rules. Note also the data which is covered by three rules (floating between Pink2 and the Green and Purple rule nodes).
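The quantities read off the figure by eye can also be stated numerically. The following is a minimal sketch, not Haiku's implementation: it assumes each rule is described by the set of data nodes it is linked to, and the names Rule, coverage and accuracy, as well as the eight toy data nodes, are hypothetical and are not taken from the soybean dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A classification rule and the data nodes it is linked to (hypothetical structure)."""
    name: str
    predicted_class: str
    covered: frozenset  # ids of the data nodes this rule claims to classify


def coverage(rule, n_data):
    """Fraction of all data nodes that the rule is linked to."""
    return len(rule.covered) / n_data


def accuracy(rule, actual):
    """Fraction of the covered nodes whose actual class matches the rule's class."""
    if not rule.covered:
        return 0.0
    correct = sum(1 for i in rule.covered if actual[i] == rule.predicted_class)
    return correct / len(rule.covered)


# Toy data, invented for illustration: node id -> actual class.
actual = {0: "pink", 1: "pink", 2: "pink", 3: "green",
          4: "green", 5: "purple", 6: "purple", 7: "green"}

# Two overlapping rules for the same class; the second is deliberately too general.
pink1 = Rule("Pink1", "pink", frozenset({0, 1, 2, 3}))
pink2 = Rule("Pink2", "pink", frozenset({0, 1, 2, 4, 5, 6}))

# Data nodes linked to both rules, i.e. the intersection of the two rules.
overlap = pink1.covered & pink2.covered

for rule in (pink1, pink2):
    print(rule.name, "coverage", coverage(rule, len(actual)),
          "accuracy", accuracy(rule, actual))
print("overlap", sorted(overlap),
      "all pink:", all(actual[i] == "pink" for i in overlap))
```

In this toy example the intersection of the two rules is purer than either rule alone, which is the kind of relationship the visualisation is intended to make visible without any calculation.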
Classification of new data can take place within the visualisation simply by adding the data to the visualisation and noting which clusters or rules it is attracted to.
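The idea of classifying by attraction can be sketched as follows. This is an illustration under stated assumptions rather than a description of Haiku's force model, which is not specified here: it assumes that a new data node is linked to every rule whose condition it satisfies, that all links pull equally, and that the node therefore comes to rest at the centroid of the linked rules. The rules, their conditions on the attributes "a" and "b", and their positions are all hypothetical.

```python
from collections import Counter

# Hypothetical rules: (name, predicted class, condition on a datum, 2-D position).
RULES = [
    ("R1", "pink", lambda d: d["a"] > 5, (0.0, 0.0)),
    ("R2", "pink", lambda d: d["a"] > 2, (2.0, 0.0)),
    ("R3", "green", lambda d: d["b"] == "x", (2.0, 4.0)),
]


def matching_rules(datum):
    """Rules whose condition the datum satisfies, i.e. the rules it is linked to."""
    return [rule for rule in RULES if rule[2](datum)]


def resting_position(datum):
    """Centroid of the linked rules (assumes equal-strength links); None if unlinked."""
    matches = matching_rules(datum)
    if not matches:
        return None
    xs = [rule[3][0] for rule in matches]
    ys = [rule[3][1] for rule in matches]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def suggested_class(datum):
    """Class predicted by the largest number of linked rules; None if unclassified."""
    counts = Counter(rule[1] for rule in matching_rules(datum))
    return counts.most_common(1)[0][0] if counts else None


new_datum = {"a": 6, "b": "y"}
print(resting_position(new_datum), suggested_class(new_datum))
```

A datum that satisfies no rule returns None from both functions, which corresponds to the unclassified data left in the centre of the overview. A datum pulled between rules of different classes comes to rest between them, and in that case the position is more informative than the simple vote.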