Sample 3 – Export to Fluent Editor

One of the greatest features of the Ontorion Text Mining AddIn is the export of a taxonomy to Fluent Editor! The taxonomy is translated to Controlled Natural Language (CNL) and then opened in Fluent Editor. In this sample I will explain and show how to adjust the export configuration.

NOTE: If you are not familiar with the grammar concepts of Fluent Editor, please refer first to the Fluent Editor documentation (Grammar section), which can be found in the Fluent Editor application.

In this sample I will use the taxonomy from Sample 2.

First we have to open the worksheet that contains the taxonomy generated by the Ontorion Text Mining AddIn. Then open the Ontorion tab and click the Export to Fluent Editor button. This opens a new window, as shown in Figure 1:

Figure 1

First I will describe all the configuration actions that a user can perform in this window:

At the top of the window, in the Base instance name text box, we can specify how instances of our input text will be named. The default value is “Item”. Below it there is a table with four columns:

  1. Taxonomy column – the name of a column with a predefined taxonomy.

  2. Category – specifies how the values of that Taxonomy column will be treated:
    • Attribute – the values of that taxonomy column are literal values whose type is defined in the Data type column.
    • Concept – the values of that taxonomy column are sub-concepts of the concept named after the taxonomy column header.
    • Instances – the values of that taxonomy column are instances of the concept named after the taxonomy column header.

  3. Role – specifies the role/relation between this taxonomy column and an instance.

  4. Data type – only available for Taxonomy columns with Category set to Attribute. Specifies one of the following data types for the attribute:
    • String – regular text, wrapped in single quotes (‘) in CNL.
    • Integer – a 32-bit integer value.
    • Double – a double-precision floating-point number.
    • Boolean – a binary value: true or false.
    • Date Time – for values that represent a date and/or time. Input dates should be formatted the same way as in Fluent Editor.
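
To make the Data type options more concrete, here is a minimal Python sketch of how a cell value could be rendered as a CNL literal for each type. It is not the AddIn's actual code: the function name format_cnl_value is hypothetical, and passing Date Time values through unchanged is an assumption based on the note that input dates should already follow Fluent Editor formatting.

```python
def format_cnl_value(raw: str, data_type: str) -> str:
    """Render a cell value as a CNL literal (illustrative sketch only)."""
    text = raw.strip()
    if data_type == "String":
        # Strings are wrapped in single quotes in CNL.
        # Embedded quotes are not handled in this sketch.
        return "'" + text + "'"
    if data_type == "Integer":
        value = int(text)
        # Integer is described as a 32-bit value, so reject anything larger.
        if not -2**31 <= value < 2**31:
            raise ValueError(f"{text} does not fit in a 32-bit integer")
        return str(value)
    if data_type == "Double":
        return repr(float(text))
    if data_type == "Boolean":
        lowered = text.lower()
        if lowered not in ("true", "false"):
            raise ValueError(f"{text} is not a boolean value")
        return lowered
    if data_type == "Date Time":
        # Assumption: the input is already formatted the way Fluent Editor expects.
        return text
    raise ValueError(f"unknown data type: {data_type}")


print(format_cnl_value("sparkling black", "String"))  # 'sparkling black'
print(format_cnl_value("13.5", "Double"))             # 13.5
```

Validating values before export in this way would surface badly typed cells early, instead of producing CNL that Fluent Editor cannot parse.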

When we open this window for the first time for our taxonomy, it will look similar to the one in Figure 1: the algorithm treats all taxonomy columns as Attributes of the String data type. The values in the Role column are generated automatically by joining the prefix “has-” with the Taxonomy column name, formatted to be compatible with Fluent Editor naming conventions. A configuration in this form is acceptable and can be exported by clicking the Export! button:

Figure 2
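
The automatic Role naming described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the helper names to_cnl_name and default_role are hypothetical, and the formatting rule (lower-case words joined with hyphens) is inferred from names used in this sample such as “computer-type” and “has-diagonal-in-inches”, not taken from the AddIn itself.

```python
import re


def to_cnl_name(column_name: str) -> str:
    """Turn a column header such as COMPUTER_TYPE into a hyphenated lower-case name."""
    words = re.findall(r"[A-Za-z0-9]+", column_name)
    return "-".join(word.lower() for word in words)


def default_role(column_name: str) -> str:
    """Build the default Role by prefixing the formatted column name with 'has-'."""
    return "has-" + to_cnl_name(column_name)


print(default_role("COMPUTER_TYPE"))  # has-computer-type
print(default_role("VENDOR"))         # has-vendor
```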

OK, it works, but with our knowledge of this taxonomy we can significantly improve the quality of the CNL. For instance, we know that we are dealing with computers, so the first step in improving this configuration is to change the base instance name from “Item” to “Computer” in the Export to Fluent Editor window:

Figure 3

Now let’s get down to the taxonomy columns. The first is COMPUTER_TYPE, that is, what kind of computer it is. I have picked Concept as the Category, because the values of COMPUTER_TYPE are very general, and “is a” as the Relation (the Role column) to provide a simple class assertion. Figure 4 shows what our CNL looks like after these changes:

Figure 4

Notice that, in comparison with Figure 2, at the top of the editor we now have 3 additional sentences that define sub-concepts (e.g. “chromebook” is a sub-concept of “computer-type”). At line 4 we can see a simple class assertion, which means that “Computer-1” is an instance of the “laptop” concept. On the right side you can see the extended taxonomy tree, which visualizes the hierarchy of concepts and the instances that belong to them.

The next taxonomy column is VENDOR. For this one I have picked Instance as the Category, because it will let us add more attributes to each instance of VENDOR in the future. In the Relation text box I have written “is-produced-by”.
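
To summarise how the Category choice shapes the exported CNL, here is a rough Python sketch that builds sentences for a single cell. It assumes the usual Fluent Editor sentence patterns (“Every X is a Y.”, “X is a Y.”, “X has-y equal-to 'value'.”); the exact sentences the AddIn emits may differ, and the function cnl_sentences is hypothetical. The Attribute branch assumes the String data type.

```python
def cnl_sentences(instance: str, concept: str, category: str,
                  role: str, value: str) -> list[str]:
    """Build illustrative CNL sentences for one taxonomy cell."""
    if category == "Concept":
        # The value becomes a sub-concept, and the row instance belongs to it.
        return [f"Every {value} is a {concept}.",
                f"{instance} is a {value}."]
    if category == "Instance":
        # The value becomes an instance of the column concept,
        # linked to the row instance through the given role.
        return [f"{value} is a {concept}.",
                f"{instance} {role} {value}."]
    if category == "Attribute":
        # Assumption: String data type, so the value is single-quoted.
        return [f"{instance} {role} equal-to '{value}'."]
    raise ValueError(f"unknown category: {category}")


for sentence in cnl_sentences("Computer-1", "computer-type", "Concept", "is a", "laptop"):
    print(sentence)
for sentence in cnl_sentences("Computer-1", "vendor", "Instance", "is-produced-by", "Intel"):
    print(sentence)
```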

I have repeated this process for each taxonomy column. Figure 5 presents its effect:

Figure 5

After exporting this taxonomy to Fluent Editor using the above configuration, I can easily query for the computer that I’m interested in. Let’s say that I want a PC with a monitor diagonal greater than 13.5 inches, a processor made by Intel and a sparkling black casing. I can ask Fluent Editor for it in this simple way:

Who-Or-What has-diagonal-in-inches greater-than 13.5 and has-cpu-produced-by Intel and has-color equal-to 'sparkling black'?

Figure 6 presents this query along with results:

Figure 6

The results matching my search criteria are “Computer-5” and “Computer-22”!
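
For readers who think in code, the three conditions of the query above combine like a plain conjunctive filter. The Python sketch below shows only that logic: the rows are invented for illustration and are not taken from the Sample 2 workbook, the field names are hypothetical, and Fluent Editor itself answers the query with a reasoner rather than a list filter.

```python
# Hypothetical rows, invented for illustration only (not data from Sample 2).
computers = [
    {"name": "Example-A", "diagonal_in_inches": 15.6,
     "cpu_produced_by": "Intel", "color": "sparkling black"},
    {"name": "Example-B", "diagonal_in_inches": 13.3,
     "cpu_produced_by": "Intel", "color": "sparkling black"},
    {"name": "Example-C", "diagonal_in_inches": 17.3,
     "cpu_produced_by": "AMD", "color": "sparkling black"},
    {"name": "Example-D", "diagonal_in_inches": 14.0,
     "cpu_produced_by": "Intel", "color": "silver"},
]


def matches(computer: dict) -> bool:
    """All three conditions of the query must hold at the same time."""
    return (computer["diagonal_in_inches"] > 13.5
            and computer["cpu_produced_by"] == "Intel"
            and computer["color"] == "sparkling black")


result = [c["name"] for c in computers if matches(c)]
print(result)  # ['Example-A']
```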

You can try it yourself with the Sample 1 or Sample 2 workbooks! TODO <add links>