Working with Graph in Oracle Analytics - Intro
You can find the following definition of graphs and graph theory on Wikipedia: In mathematics, graph theory is the study of graphs, which are mathematical structures used to model pairwise relations between objects. A graph in this context is made up of vertices (also called nodes or points) which are connected by edges (also called links or lines). A distinction is made between undirected graphs, where edges link two vertices symmetrically, and directed graphs, where edges link two vertices asymmetrically. Graphs are one of the principal objects of study in discrete mathematics.
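To make the definition concrete, here is a minimal sketch in plain Python of how vertices and edges could be stored, and how an undirected graph differs from a directed one. The function name `build_adjacency` and the sample edges are made up for this illustration and have nothing to do with any Oracle API.

```python
from collections import defaultdict


def build_adjacency(edges, directed=False):
    """Return a dict mapping each vertex to the set of its neighbours."""
    adjacency = defaultdict(set)
    for source, target in edges:
        adjacency[source].add(target)
        adjacency[target]  # touch the key so every vertex appears in the result
        if not directed:
            # An undirected edge links both vertices symmetrically.
            adjacency[target].add(source)
    return dict(adjacency)


# Hypothetical sample data: two edges over three vertices.
edges = [("A", "B"), ("B", "C")]
undirected = build_adjacency(edges)
directed = build_adjacency(edges, directed=True)
```

In the undirected version, B is linked to both A and C. In the directed version, B only points to C, and C points to nothing.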
In essence, graphs are used to model various types of relationships. In business analytics, sales and marketing departments can use graphs to recommend products to a group of customers that is related in some way to another group of customers. In manufacturing, graphs can help manage inventory more effectively through better planning of materials and semi-finished products based on the relations in bills of materials. In transportation, they can be used to find the shortest or most economical path between two places.
Oracle Database has supported working with graphs for several years now. It started as an addition to the Spatial option and continued with dedicated graph server support in the database. With the latest addition, Graph Studio, working with graphs in the database has become extremely easy. The working environment is easy to use, and users, particularly business analysts, can simply log in and start using the tool right away.
But we are not here to talk about Graph Studio (which I will hopefully present on some other occasion). This blog post is about graph support in Oracle Analytics. The first algorithms that enable users to work with graphs have been available since Oracle Analytics 6.0.
In Oracle Analytics, Graph Analytics is available as one of the steps in a data flow. Currently four functions are available:
- Clustering finds connected components or clusters in a graph.
- Node Ranking measures the importance of nodes in a graph.
- Shortest Path calculates the shortest path between two nodes in a graph.
- Sub Graph finds all nodes within a specified number of hops of a given node.
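To get a feel for what these four functions compute, here is a rough sketch in plain Python on a tiny, made-up graph. This is not how Oracle Analytics implements them; the function names, the sample graph, and the choice of simple degree counting for node ranking are all assumptions made for illustration. It also assumes an undirected, unweighted graph, so "shortest" means fewest hops.

```python
from collections import deque

# Hypothetical sample graph: undirected, unweighted, stored as adjacency sets.
GRAPH = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
    "E": {"F"},
    "F": {"E"},
}


def hop_distances(graph, start):
    """Breadth-first search: number of hops from start to each reachable node."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances


def clustering(graph):
    """Clustering: group nodes into connected components."""
    seen, components = set(), []
    for node in graph:
        if node not in seen:
            component = set(hop_distances(graph, node))
            seen |= component
            components.append(component)
    return components


def node_ranking(graph):
    """Node ranking (assumed measure): sort nodes by degree, highest first."""
    return sorted(graph, key=lambda node: (-len(graph[node]), node))


def shortest_path(graph, start, goal):
    """Shortest path: fewest hops between two nodes, or None if unreachable."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk back through the predecessors to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in sorted(graph[node]):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


def sub_graph(graph, start, max_hops):
    """Sub graph: all nodes within max_hops of start (start itself included)."""
    return {n for n, d in hop_distances(graph, start).items() if d <= max_hops}
```

On this sample, `clustering(GRAPH)` finds two clusters (A, B, C, D and E, F), `node_ranking(GRAPH)` puts C first because it has the most connections, `shortest_path(GRAPH, "A", "D")` goes through C, and `sub_graph(GRAPH, "A", 1)` returns A together with its direct neighbours B and C.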
In the next blog posts, we will take a look at each of these in more detail. Stay tuned.