
AI Vision Service in Oracle Data Lakehouse


Oracle AI Services

In this blog post I continue exploring the cloud services that make up Oracle Data Lakehouse. This time I am looking at Oracle AI Services. Oracle describes these AI services as pre-trained models that can be custom trained with an organisation’s own data to improve model quality, making it easier for developers to adopt and use AI technology.

OCI provides the following prebuilt services:

  • Digital Assistant
  • Language
  • Speech
  • Vision
  • Anomaly Detection

In this blog post, I am playing with the Vision AI Service.

Vision AI Service

OCI Vision applies computer vision to analyze image-based content. Developers can easily integrate pretrained models into their applications with APIs or custom train models to meet their specific use cases. These models can be used to detect visual anomalies in manufacturing, extract text from documents to automate business workflows, and tag items in images to count products or shipments.

Basically, what Vision AI Service does is Image Classification, Object Detection and Document AI.

Working with Vision is very straightforward. You just upload an image, for example, and it is automatically analysed and the results are presented.

In the example above, an image of a pair of jeans has been uploaded via Vision’s UI, and the results are displayed immediately.

In the case of object detection the process is the same: a picture is uploaded and analysed. The only difference is that several objects are detected within the same picture.

Similarly, we can analyse documents to capture the information they contain and recognise what they are about.
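The same two analyses can also be requested through the API instead of the UI. The sketch below only builds the JSON request body for an image-analysis call and does not send anything. The field names (`compartmentId`, `image`, `features`, `featureType`, `maxResults`) reflect my understanding of the Vision AnalyzeImage request and should be checked against the API reference; the function name and the compartment OCID are made up for illustration.

```python
import base64
import json


def build_analyze_image_request(image_bytes, compartment_id, feature_type, max_results=5):
    """Build a request body for a Vision image analysis (assumed field names)."""
    return {
        "compartmentId": compartment_id,
        "image": {
            # The image is embedded in the request as base64 text.
            "source": "INLINE",
            "data": base64.b64encode(image_bytes).decode("ascii"),
        },
        "features": [{"featureType": feature_type, "maxResults": max_results}],
    }


# Placeholder values: a real call needs real image bytes and a real compartment OCID.
fake_image = b"not-a-real-jpeg"
compartment = "ocid1.compartment.oc1..example"

classification = build_analyze_image_request(fake_image, compartment, "IMAGE_CLASSIFICATION")
detection = build_analyze_image_request(fake_image, compartment, "OBJECT_DETECTION")

print(json.dumps(classification, indent=2))
```

The only difference between the two requests is the feature type, which mirrors what we see in the UI: same upload, different analysis.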

To summarise, we can see that there are prebuilt models which can be effectively used by end-users who have no particular knowledge of AI/ML. Of course, these are very simple examples where users don’t do anything other than upload the image.

Well, we can go a step further. Users can build their own custom models, and this is where things start to become a bit more useful.

Training Custom models with Vision AI Service

Vision AI Service doesn’t limit users to prebuilt models; it also lets them create their own models for image classification or object detection.

So, let’s step into the shoes of a non-AI developer and create a Vision AI model on our own, assuming we haven’t got a clue how to code in Python or how to train an ML model.

Let’s assume we’ve got a library of images which we will use for our image classification exercise. There are two tasks in this exercise. First, we need to label our images, that is, tell the machine what is in each picture. Then we need to run a classification algorithm to train the model. Once done, we can test how it works. It shouldn’t be too complicated, right?

Indeed, this is really very straightforward. Let’s take a closer look.

Data Labeling

Data Labeling is another OCI-based service that can be used for building labeled data sets.

In the case of images, we need to assign a label to each image, which describes the image and classifies it. Using the same service, we can also annotate parts of an image and tell the system what that particular part is, for example, a wheel as a part of the car in the picture.

In our first example, we will heavily simplify things. We will begin our journey with a rather small dataset which, as we know from the start, probably won’t give us good results. However, it will be quite sufficient to illustrate the use of the services.

We’ll begin with the famous Dogs vs. Cats dataset (source: https://www.kaggle.com/competitions/dogs-vs-cats/data).

All images will be stored in Object Storage, so we need to create a new bucket there. As already mentioned, we will use an extremely simplified dataset of only 30 dogs and 30 cats.
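Picking the 30 + 30 images can be done by hand, but a few lines of Python do it too. This is only a sketch: it assumes the training files follow the Kaggle naming pattern `cat.<number>.jpg` and `dog.<number>.jpg`, and the function name `pick_subset` is my own. It works on a list of file names, so the listing here is simulated rather than read from disk.

```python
def pick_subset(file_names, per_class=30):
    """Return up to `per_class` file names for each of the 'cat' and 'dog' prefixes."""
    subset = {"cat": [], "dog": []}
    for name in sorted(file_names):
        prefix = name.split(".", 1)[0]
        if prefix in subset and len(subset[prefix]) < per_class:
            subset[prefix].append(name)
    return subset["cat"] + subset["dog"]


# Simulated directory listing; with the real download this could come from os.listdir().
listing = [f"cat.{i}.jpg" for i in range(100)] + [f"dog.{i}.jpg" for i in range(100)]

selected = pick_subset(listing)
print(len(selected))  # 60
```

The selected files are the ones that would then be uploaded to the bucket.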

Now that the bucket is created, we can proceed with Dataset creation. 

In the second step we add files and labels. In our case we have already uploaded the files into the Dogs_vs_Cats bucket, so we just need to point to that bucket …

… and define labels that we will use:

Once the labels are added to the list, the creation process begins. Its progress can be tracked in a progress bar:

When the dataset has been created, its Status is set to Active.

We can now start the labeling process.

It is very simple to begin labeling. Just click on a picture to open the Add Labels utility and assign a label to the image.

In the example above, we have a cat (actually two cats) in the image, hence we choose the label Cat. We repeat this for all of our images of cats and dogs.
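For this particular dataset the label is already hidden in the file name, so the decision we make by clicking could in principle be derived automatically. The sketch below shows only that mapping; it assumes Kaggle-style names such as `cat.12.jpg`, the helper `label_for` is hypothetical, and how such a mapping would be loaded into Data Labeling is not covered here.

```python
from collections import Counter

# Label names as defined in the dataset above.
LABELS = {"cat": "Cat", "dog": "Dog"}


def label_for(file_name):
    """Map a file name such as 'cat.12.jpg' to its label, or None if unknown."""
    return LABELS.get(file_name.split(".", 1)[0].lower())


files = ["cat.0.jpg", "cat.1.jpg", "dog.0.jpg", "notes.txt"]

# Keep only files whose prefix maps to a known label.
assignments = {name: label_for(name) for name in files if label_for(name)}

print(assignments)
print(Counter(assignments.values()))
```

Counting the labels at the end is a cheap sanity check that the two classes are balanced before training.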

Custom Model Training

We have our data set ready, so we can return to Vision AI and start creating a new custom Dog/Cat classification model.

To start with, a new project is created …


… and within a new project, a new model needs to be created as well. Creating and training a new model takes three steps.

The first step is to choose the model type to train. In our case we will create an image classification model that will try to classify images into two groups, the dogs and the cats.

In this same step we need to choose a training data set. We will use the dataset we just labeled using the Data Labeling service.

Once the training data set is defined, we need to name the model and select the training duration.

In the last step, review the settings and click Create and Train. Model training might run for up to 24 hours (if that duration is selected).

So, we just need to wait.

Review and Test the Model

When the model is trained, General information and Training metrics are displayed …

… and it can be instantly used.
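When the model is used from code rather than from the UI, the result has to be interpreted by us. The sketch below assumes a classification result shaped like `{"labels": [{"name": ..., "confidence": ...}]}`, which is my understanding of the Vision response and should be verified against the API reference. The sample is hand-written for illustration and is not real service output; `top_label` and its threshold are my own choices.

```python
def top_label(response, min_confidence=0.5):
    """Return (name, confidence) of the best label, or None if nothing clears the threshold."""
    labels = response.get("labels") or []
    if not labels:
        return None
    best = max(labels, key=lambda item: item["confidence"])
    if best["confidence"] < min_confidence:
        return None
    return best["name"], best["confidence"]


# Hand-written sample shaped like the assumed response; not real service output.
sample = {
    "labels": [
        {"name": "Dog", "confidence": 0.12},
        {"name": "Cat", "confidence": 0.88},
    ]
}

print(top_label(sample))  # ('Cat', 0.88)
```

With a model trained on only 60 images, a confidence threshold like this is a simple way to avoid acting on weak predictions.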


Conclusion

With this simple example we have seen that it is relatively easy to create and train a model, in this case a classification model, to classify images. Of course, the data set used in this example was too small to give good results; however, with larger data sets and a longer training time (I chose only 1 hour) I expect better performance.

But what we also learned is that the most time-consuming step of the exercise is data labeling. This process is manual and cannot be done in one bulk operation … unless it is done programmatically, but that requires programming skills, which business users usually do not have.

After all, this is a service that is meant to be used instantly, out of the box, without a major coding exercise, by a business type of user.

Anyway, we will take a closer look at the “programming” data labeling case in one of the next blog posts.