It's powerful to use common phrases and natural language to ask questions of your data. It's even more powerful when your data answers. When you ask Power BI Q&A a question, it makes a best effort to answer correctly. You can edit the linguistic schema to improve the Q&A answers for even better interactions.
It all starts with your enterprise data. The better the data model, the easier it will be for users to get quality answers. One way to improve the model is to add a linguistic schema that defines and categorizes terminology and relationships between table and column names in your dataset. Power BI Desktop is where you manage your linguistic schemas.
There are two sides to Q&A. The first side is the preparation, or modeling. The second side is asking questions and exploring the data, or consuming. In some companies, employees known as data modelers or IT admins might be the ones to assemble the datasets, create the data models, and publish the datasets to Power BI. A different set of employees would be the ones to "consume" the data online. In other companies, these roles might be combined.
This article is for the data modelers, the people who optimize datasets to provide the best possible Q&A results.
A linguistic schema describes terms and phrases that Q&A should understand for objects within a dataset, including parts of speech, synonyms, and phrasings. When you import or connect to a dataset, Power BI creates a linguistic schema based on the structure of the dataset. When you ask Q&A a question, it looks for matches and relationships in the data to figure out the intention of your question. For example, it looks for nouns, verbs, adjectives, phrasings, and other elements. And it looks for relationships, like which columns are objects of a verb.
You're probably familiar with parts of speech, but phrasings might be a new term. A phrasing is how you talk about (or phrase) the relationships between things. For example, to describe the relationship between customers and products, you might say “customers buy products”. Or to describe the relationship between customers and ages, you might say “ages indicate how old customers are”. Or to describe the relationship between customers and phone numbers, you might say “customers have phone numbers”.
These phrasings come in many shapes and sizes. Some correspond directly with relationships in the data model. Some relate columns with their containing tables. Others relate multiple tables and columns together in complex relationships. In all cases, they describe how things are related by using everyday terms.
Linguistic schemas are saved in a .yaml format. This format is related to the popular JSON format but provides a more flexible and easier-to-read syntax. Linguistic schemas can be edited, exported, and imported into Power BI Desktop.
We recommend using Visual Studio Code to edit linguistic schema .yaml files. Visual Studio Code includes out-of-the-box support for .yaml files and can be extended to specifically validate the Power BI linguistic schema format.
1. Install Visual Studio Code.
2. Right-click the .yaml file in the sample linguistic schema that you saved earlier: QnALinguisticSchema.zip.
3. Select Open with > Choose another app.
4. Select Visual Studio Code and then choose Always.
5. In Visual Studio Code, install the YAML Support by Red Hat extension.
There are two ways to work with linguistic schemas. One way is to edit, import, and export the .yaml from the ribbon in Power BI Desktop. That way is covered in the Power BI Q&A Tooling experience article. You don't have to open the .yaml file to improve Q&A.
The other way to edit a linguistic schema is to export and edit the .yaml file directly. When you edit a linguistic schema .yaml file, you tag columns in the table as different grammatical elements and define words that a colleague might use to phrase a question. For instance, you state the columns that are the subject and the object of the verb. You add alternative words that colleagues can use to refer to tables, columns, and measures in your model.
Before you can edit a linguistic schema, you must open (export) it from Power BI Desktop. Saving the .yaml file back to the same location is considered importing. But you can also import other .yaml files instead. If, for instance, you have a similar dataset and you've already put in work adding parts of speech, identifying relationships, creating phrasings, and creating synonyms, you can use that .yaml file in a different Power BI Desktop file.
Q&A uses all this information together with any enhancements that you make to provide a better answer, auto completion, and summary of the questions.
When you first export your linguistic schema from Power BI Desktop, most or all of the content in the file is automatically generated by the Q&A engine. These generated entities, words (synonyms), relationships, and phrasings are designated with a State: Generated tag. They're included in the file mostly for informational purposes but can be a useful starting point for your own changes.
Note
The sample .yaml file included with this tutorial doesn't contain State: Generated or State: Deleted tags because it was prepared specifically for this tutorial. To see these tags, open an unedited .pbix file in Model view and export the linguistic schema.
When you import your linguistic schema file back into Power BI Desktop, anything that's marked State: Generated is ignored and later regenerated. Thus, if you’d like to change some generated content, remove the corresponding State: Generated tag. Similarly, if you want to remove some generated content, change the State: Generated tag to State: Deleted so that it isn't regenerated when you import your linguistic schema file.
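As a rough illustration of the tag handling described above, the sketch below shows one generated relationship that has been marked so that it isn't regenerated. The relationship name, the column, and the position of the State key directly under the relationship are assumptions made for illustration; check where the tag appears in your own exported file before editing.

```yaml
# Hypothetical entry from an exported file (names and tag position are assumptions).
# As exported, this entry would carry "State: Generated", which is ignored on import.
customer_has_fax_number:
  Binding: {Table: Customers}
  State: Deleted   # changed from Generated so that Q&A doesn't regenerate this entry
  Phrasings:
  - Attribute: {Subject: customer, Object: customer.fax_number}
```

To keep and customize a generated entry instead, remove its State: Generated line and then edit the entry.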
1. In Power BI Desktop, open the dataset in Model view.
2. On the Modeling tab, select Linguistic Schema > Export linguistic schema.
3. Save it. The file name ends with .lsdl.yaml.
4. Open it in Visual Studio Code or another editor.
1. In Model view in Power BI Desktop, on the Modeling tab, select Linguistic schema > Import.
2. Go to the location where you saved the edited .yaml file and select it. A Success message lets you know that the linguistic schema .yaml file was successfully imported.
A phrasing is how you talk about (or phrase) the relationships between things. For example, to describe the relationship between customers and products, you might say “customers buy products”.
Power BI adds many simple phrasings to the linguistic schema automatically based on the structure of the model and guesses based on the column names. For example:
However, your users sometimes talk about things in ways that Q&A can’t guess. Therefore, you might want to add your own phrasings manually.
The first reason for adding a phrasing is to define a new term. For example, if you want to be able to ask “list the oldest customers”, you must first teach Q&A what you mean by “old”. You would do so by adding a phrasing like “ages indicate how old customers are”.
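A phrasing like “ages indicate how old customers are” could be written as a measurement adjective phrasing, a form described later in this article. The following is a minimal sketch that assumes a Customers table with an age column; the relationship name, the entity names customer and customer.age, and the antonym are hypothetical.

```yaml
# Sketch only: table, entity, and relationship names are assumptions.
customer_has_age:
  Binding: {Table: Customers}
  Phrasings:
  - Adjective:
      Subject: customer
      Adjectives: [old]            # the new term being taught to Q&A
      Antonyms: [young]            # optional opposite, assumed here for illustration
      Measurement: customer.age    # numeric column that indicates how "old" applies
```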
The second reason for adding a phrasing is to resolve ambiguity. Basic keyword search only goes so far when words have more than one meaning. For example, “flights to Chicago” isn't the same as “flights from Chicago”. But Q&A won’t know which one you mean unless you add the phrasings “flights are from departure cities” and “flights are to arrival cities”. Similarly, Q&A will only understand the distinction between “cars that John sold to Mary” and “cars that John bought from Mary” after you add the phrasings “customers buy cars from employees” and “employees sell customers cars.”
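The two flight phrasings mentioned above might look roughly like the following preposition phrasings. This is a sketch under assumptions: the Flights table, the flight entity, and the departure_city and arrival_city columns are hypothetical names, not part of any sample model.

```yaml
# Sketch only: table, entity, and relationship names are assumptions.
flights_are_from_departure_cities:
  Binding: {Table: Flights}
  Phrasings:
  - Preposition:
      Subject: flight
      Prepositions: [from]
      Object: flight.departure_city
flights_are_to_arrival_cities:
  Binding: {Table: Flights}
  Phrasings:
  - Preposition:
      Subject: flight
      Prepositions: [to]
      Object: flight.arrival_city
```

Keeping the two prepositions in separate relationships, each bound to its own column, is what lets “to Chicago” and “from Chicago” resolve to different columns.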
The final reason for adding a phrasing is to improve restatements. Rather than Q&A echoing back to you “Show the customers and their products”, it would be clearer if it said “Show the customers and the products they bought” or “Show the customers and the products they reviewed”, depending on how it understood the question. Adding custom phrasings allows restatements to be more explicit and unambiguous.
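To make restatements such as “the products they bought” and “the products they reviewed” possible, each relationship needs its own verb phrasing. The sketch below assumes hypothetical Orders and Reviews tables that each link customers to products; all names are illustrative.

```yaml
# Sketch only: table, entity, and relationship names are assumptions.
customers_buy_products:
  Binding: {Table: Orders}
  Phrasings:
  - Verb:
      Subject: customer
      Verbs: [buy]
      Object: product
customers_review_products:
  Binding: {Table: Reviews}
  Phrasings:
  - Verb:
      Subject: customer
      Verbs: [review]
      Object: product
```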
To understand the different types of phrasings, you’re first going to need to remember a couple of basic grammar terms:
Attribute phrasings are the workhorse of Q&A. They're used when one thing is acting as an attribute of another thing. They’re simple, straightforward, and perform most of the heavy lifting when you haven't defined a subtler, more detailed phrasing. Attribute phrasings are described using the basic verb “have” (“products have categories” and "host countries/regions have host cities"). They also automatically allow questions with the prepositions “of” and “for” (“categories of products” or “orders for products”) and possessive (“John’s orders”). Attribute phrasings are used in these kinds of questions:
Power BI generates most of the attribute phrasings needed in your model based on table or column containment and model relationships. Typically, you don’t need to create them yourself. Here's an example of how an attribute phrasing looks inside the linguistic schema:
product_has_category:
  Binding: {Table: Products}
  Phrasings:
  - Attribute: {Subject: product, Object: product.category}
Name phrasings are helpful if your data model has a table that contains named objects, such as athlete names or customer names. For example, a “product names are names of products” phrasing is essential for being able to use product names in questions. Name phrasing also enables “named” as a verb (for example, “List customers named John Smith”). However, it's most important when used in combination with other phrasings. It allows a name value to be used to refer to a particular table row. For example, in “Customers that bought chai”, Q&A can tell the value “chai” refers to the whole row of the product table rather than just a value in the product name column. Name phrasings are used in these kinds of questions:
Assuming you used a sensible naming convention for name columns in your model (for example, “Name” or “ProductName” rather than “PrdNm”), Power BI generates most of the name phrasings needed in your model automatically. You usually don’t need to create them yourself.
Here's an example of how a name phrasing looks inside of the linguistic schema:
employee_has_name:
  Binding: {Table: Employees}
  Phrasings:
  - Name:
      Subject: employee
      Name: employee.name
Adjective phrasings define new adjectives used to describe things in your model. For example, “happy customers are customers where rating > 6” phrasing is needed to ask questions like “list the happy customers in Des Moines.” There are several forms of adjective phrasings to use in different situations.
Simple adjective phrasings define a new adjective based on a condition, such as “discontinued products are products where status = D.” Simple adjective phrasings are used in these kinds of questions:
Here's an example of how a simple adjective phrasing looks inside of the linguistic schema:
product_is_discontinued:
  Binding: {Table: Products}
  Conditions:
  - Target: product.discontinued
    Operator: Equals
    Value: true
  Phrasings:
  - Adjective:
      Subject: product
      Adjectives: [discontinued]
Measurement adjective phrasings define a new adjective based on a numeric value that indicates the extent to which the adjective applies, such as “lengths indicate how long rivers are” and "small country/regions have small land areas." Measurement adjective phrasings are used in these kinds of questions:
Here's an example of how a measurement adjective phrasing looks inside of the linguistic schema:
river_has_length:
  Binding: {Table: Rivers}
  Phrasings:
  - Adjective:
      Subject: river
      Adjectives: [long]
      Antonyms: [short]
      Measurement: river.length
Dynamic adjective phrasings define a set of new adjectives based on values in a column in the model, such as “colors describe products” and "events have event genders." Dynamic adjective phrasings are used in these kinds of questions:
Here's an example of how a dynamic adjective phrasing looks inside the linguistic schema:
product_has_color:
  Binding: {Table: Products}
  Phrasings:
  - DynamicAdjective:
      Subject: product
      Adjective: product.color
Noun phrasings define new nouns that describe subsets of things in your model. They often include some type of model-specific measurement or condition. For example, for our model we might want to add phrasings that distinguish champions from medalists, land sports from water sports, teams versus individuals, or age categories of athletes (teens, adults, seniors). For our movie database, we might want to add noun phrasings for “flops are movies where net profit < 0” so that we can ask questions like “count the flops by year.” There are two forms of noun phrasings to use in different situations.
Simple noun phrasings define a new noun based on a condition, such as “contractors are employees where full time = false” and “champions are athletes where count of medals > 5.” Simple noun phrasings are used in these kinds of questions:
Here's an example of how a simple noun phrasing looks inside of the linguistic schema:
employee_is_contractor:
  Binding: {Table: Employees}
  Conditions:
  - Target: employee.full_time
    Operator: Equals
    Value: false
  Phrasings:
  - Noun:
      Subject: employee
      Nouns: [contractor]
Dynamic noun phrasings define a set of new nouns based on values in a column in the model, such as “jobs define subsets of employees.” Dynamic noun phrasings are used in these kinds of questions:
Here's an example of how a dynamic noun phrasing looks inside of the linguistic schema:
employee_has_job:
  Binding: {Table: Employees}
  Phrasings:
  - DynamicNoun:
      Subject: employee
      Noun: employee.job
Preposition phrasings are used to describe how things in your model are related via prepositions. For example, a “cities are in countries/regions” phrasing improves understanding of questions like “count the cities in Washington.” Some preposition phrasings are created automatically when a column is recognized as a geographical entity. Preposition phrasings are used in these kinds of questions:
Here's an example of how a preposition phrasing looks inside of the linguistic schema:
customers_are_in_cities:
  Binding: {Table: Customers}
  Phrasings:
  - Preposition:
      Subject: customer
      Prepositions: [in]
      Object: customer.city
Verb phrasings are used to describe how things in your model are related via verbs. For example, a “customers buy products” phrasing improves understanding of questions like “who bought cheese?” and “what did John buy?” Verb phrasings are the most flexible of all of the types of phrasings, often relating more than two things to each other, such as “employees sell customers products.” Verb phrasings are used in these kinds of questions:
Verb phrasings can also contain prepositional phrases, thereby adding to their flexibility, such as “athletes win medals at competitions” or “customers are given refunds for products.” Verb phrasings with prepositional phrases are used in these kinds of questions:
Some verb phrasings are created automatically when a column is recognized as containing both a verb and a preposition.
Here's an example of how a verb phrasing looks inside of the linguistic schema:
customers_buy_products_from_salespeople:
  Binding: {Table: Orders}
  Phrasings:
  - Verb:
      Subject: customer
      Verbs: [buy, purchase]
      Object: product
      PrepositionalPhrases:
      - Prepositions: [from]
        Object: salesperson
Frequently, a single relationship can be described in more than one way. In this case, a single relationship can have more than one phrasing. It's common for the relationship between a table entity and a column entity to have both an attribute phrasing and another phrasing. For example, in the relationship between customer and customer name, you'll want both an attribute phrasing (for example, “customers have names”) and a name phrasing (for example, “customer names are the names of customers”), so you can ask both types of questions.
Here's an example of how a relationship with two phrasings looks inside of the linguistic schema:
customer_has_name:
  Binding: {Table: Customers}
  Phrasings:
  - Attribute: {Subject: customer, Object: customer.name}
  - Name:
      Subject: customer
      Name: customer.name
Another example would be adding the alternate phrasing “employees sell customers products” to the “customers buy products from employees” relationship. You don't need to add variations like “employees sell products to customers” or “products are sold to customers by employees” because the “by” and “to” variations of the subject and indirect object are inferred automatically by Q&A.
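One way this alternate phrasing might be added is as a second entry in the same relationship's Phrasings list. The sketch below is hedged: the relationship, table, and entity names are hypothetical, and the IndirectObject key is an assumption that doesn't appear elsewhere in this article, so confirm the key name against the schema validation in your editor before relying on it.

```yaml
# Sketch only: names are assumptions, and IndirectObject is an assumed key.
customers_buy_products_from_employees:
  Binding: {Table: Orders}
  Phrasings:
  - Verb:                          # "customers buy products from employees"
      Subject: customer
      Verbs: [buy]
      Object: product
      PrepositionalPhrases:
      - Prepositions: [from]
        Object: employee
  - Verb:                          # "employees sell customers products"
      Subject: employee
      Verbs: [sell]
      IndirectObject: customer     # assumed key name for the indirect object
      Object: product
```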
If you make a change to a .lsdl.yaml file that doesn't conform to the linguistic schema format, validation squiggles in the editor indicate the issue.
More questions? Ask the Power BI Community