Understanding Natural Language Queries
Intended audience: USERS
AO Easy Answers: 4.3
Overview
In most business intelligence software tools on the market, asking questions about your data is done through the Structured Query Language (SQL), often “hidden” from the user through purpose-built user interfaces that will construct the SQL and send the request to the underlying data source(s) to retrieve data. Asking questions, processing, and providing results using natural language in plain English, however, requires the system to use advanced techniques and a human-to-machine interface dealing with both the written and spoken form.
Using modern technologies and AI magic, Easy Answers makes use of such techniques in three core areas:
Input - Natural Language Query (NLQ)
Processing - Natural Language Processing (NLP)
Output - Natural Language Generation (NLG)
This topic focuses on NLQ and the options available when asking questions, and/or when the need arises to make quick changes.
Recognized Ontology Objects and Data Values
When a user asks the question, eg, “Show all plastic meters”, the system will “deconstruct” the sentence and identify what is important and known, such as commands, topics (MSOs), operators and filters (value-based synonyms), etc…
An NLQ is interpreted as best as possible by the system. It’s fairly easy to understand the query: “Show all plastic meters”. See Understanding Ontology. However, some words in this statement are actually ignored by the system. The user could have just said, “plastic meters”, and the same data would be shown. The words “Show all” are ignored, whereas the words “plastic” and “meters” are recognized. “meter” is a known word as it’s a business object (an MSO) in our Ontology, and “plastic” is referred to as a value-based synonym, ie a Synonym has been created for the Material field in the Meter object for each value in the data source.
Words that are recognized by the system in the question will be underlined. If a user wants to change a recognized word to something else, click on the word, and a drop-down will allow alternative objects or fields to be selected. A Search box at the top of the list allows the user to search for other values in case of more than 10 alternative search terms.
Red underline - recognized search terms for Objects (MSOs), eg, Customers, Bills, Meters, Meter Issues, Contracts, Installations, Premises, … Only search terms that are within 2 degrees of separation in the ontology graph from the original search term will be shown in the list.
Blue underline - recognized search terms for Fields (MSO Properties), eg, Competitor, Material, Manufacturer
Green underline - recognized search terms for Values (Value Based Synonyms) from a field in an object, eg, Cobber/Brass/Plastic, San Francisco/Los Angeles/Chicago/New York, Pounds/Dollars/Yen/Euro, …
If a word isn’t underlined and a user wants to add the word as a Synonym to be recognized by the system, use the Action menu’s Viewing MSO Query and Adding Synonyms.
Commands and Sentence Keywords
For all the words that are not underlined as per the example above, they will still likely have operational meaning in order for the query to make sense and to produce a result, except in a few cases where “fill words” will simply be ignored. Such other words can be divided into the following groups:
Commands
Aggregations
Filters (Spatial, Numeric, String)
Sorting / Grouping
Below is an insight into which words are considered “keywords”. In most cases, the user can use different words (synonyms) to achieve the same outcome, for example, to find the maximum value of something, the user can state/say any of the following words in a query: max, maximum, highest, maxima, maximal, maximal value, largest, biggest, greatest.
Commands
SELECT - show, display
COMPARISON - compare - (“vs” required between the objects compared)
QUESTION - who, which, how many
RESERVED KEYWORDS - by month, by year, by quarter
Aggregations
MIN - min, minimum, lowest, lowest value, minima, minimal, minimal value, bottom value, smallest, least
COUNT – count, many, how many, how much, no of
MAX - max, maximum, highest, maxima, maximal, maximal value, largest, biggest, greatest
COUNTDISTINCT - count distinct
SUM - sum, total, tally, aggregate, grand total, sum total, summation
AVG - avg, average, average, midpoint
STDDEV - standard deviation
STDDEVPOP - standard deviation pop
VARIANCE - variance
Spatial Filters
STRICT WITHIN – in the path of
WITHIN – within, near
INSIDE - inside
Numeric Filters
GREATERTHAN - greater than, greater, over, more than, over than, bigger, higher, larger, more high, more big, more large, above
LESSTHAN - less than, less, smaller than, more small, lower than, lower, smaller, lesser, more low, more small, below
EQUALS - equals, is, equal, equal to, equals to, are, based on, as, same as, on
NOTEQUALS - not based on, not same as, not as
GREATERTHANEQUALS - greater than and equals, greater than equals, after, greater than and equal, greater than equal, more than and equals, greater than or equals, greater than or equal to
LESSTHANEQUALS - less than equals, less than and equals, before, older than, less than equal, less than and equal, less than or equals, less than or equal to
NOTGREATERTHAN - not greater than, not over, not more than, not over than, not higher, not bigger, not larger
NOTGREATERTHANEQUALS - not greater and equals, not over and equals, not greater than and equals, not more than and equals, not over than and equals, not bigger and equal than, not over and equal than, not greater and equal than, not bigger than or equal to, not higher than or equal to, neither bigger nor equal to, neither over nor equal to, neither greater nor equal to
NOTLESSTHAN - not less than, not less, not lower than
NOTLESSTHANEQUALS - not less and equals, not less than and equals, not less and equal than, not less than equal, not less than or equal to, not smaller than or equal to, not lower than or equal to, neither smaller than nor equal to, neither lower than nor equal to, not smaller nor equal to, not less than nor equal to
String Filters
EQUALS - equals, is, equal, equal to, equals to, are, based on, as, same as, on
NOTEQUALS - not based on, not same as, not as
LIKEIGNORECASE - like, likes, similar to, equivalent to, comparable to, corresponds, corresponding, analogous, identical, containing, contains
NOTLIKEIGNORECASE - not containing, not comparable to, not equivalent to, not corresponding
STARTSWITHIGNORECASE - starting with, starts with
ENDSWITHIGNORECASE - ending with, ends with
NOTSTARTSWITHIGNORECASE - not starting with
NOTENDSWITHIGNORECASE - not ending with
MATCHESIGNORECASEREGEX - matching
NOTMATCHESIGNORECASEREGEX - not matching
BETWEEN - between, per, range, in last, in next, in, from, not just
Date Filters (examples/not full list)
DATE - today
DURATION - current week
DATERANGE DURATION - this week, last week, last 2 weeks, this quarter, previous quarter, Q3 2022, last 2 years
Sorting/Grouping
ASCENDING - ascending, increasing, by least, from small to large, from smallest to largest, augmenting, rising, bottom
DESCENDING - descending, decreasing, inverse, by most, by top, reverse, opposite, from large to small, from big to small, from biggest to smallest, declining, top
GROUPBY - group by, group, grouped, clubbed, by, grouped by, grouped on, clubbed on
ORDERBY - order by, ordered by, sort by, sorted by, sorted out by, ranked by, rank, ranked, sort out, rank by, classify, classify by, classified, classified by, class, classed by, class by, categorize, categorize by, categorized, categorized by
NEGATION - not, no, neither.
NULL - null, empty, not present, not null, without
Additional Functionally
FUNCTION CRITERIA: Functions can be configured from the Ontology composer as Rule Synonym. These get returned from Text2SQL as criteria to execute.
FIELD CRITERIA: Compare fields. Only Equals/Not Equals is supported. Ex: Show all bills where the original balance is equal to the final balance.
HAVING CONDITION: Aggregate criteria with Group By. Ex: Show all meters by manufacturer where the average final balance is more than $100
NEGATION: On Value Synonyms and Rule criteria