Case-sensitive models in Power BI: consequences & considerations
HOW CASE SENSITIVITY AFFECTS YOUR MODEL
…and why you should care because of the implications for your DAX and query results.
Goblin warning:
In the current version of Microsoft Fabric, (October 2023; service version 13.0.21778.91), Direct Lake models created in the Fabric Portal can have a case-sensitive collation.
- This is different behavior compared to models you create in Power BI Desktop.
- You can't opt out of this collation.
- You aren't notified that this collation has been applied.
- You can't see this property unless you use advanced methods like XMLA tools.
- You can't change this property once it's set.
As described in the following article, a case-sensitive model can produce unexpected consequences that you should be aware of. Thanks to Akshai Mirchandani and Daniel Otykier for explaining collation to me.
CASE-SENSITIVE DATA MODELS
Most Power BI models are case-insensitive, meaning that “Bonk” is the same as “BONK”. However, Power BI data models can also be created as case-sensitive if you create a Direct Lake model in Fabric, or create a new model with external tools and enter a case-sensitive collation property. Two otherwise identical models which differ only in this case-sensitivity may produce different results, even though they’re using the same data, DAX, relationships, and tables.
It’s useful to know how case-sensitivity affects your model and its query results. You should also be able to identify and validate whether your model is case-sensitive. This is particularly important in the following scenarios:
The model was automatically generated: For example in the Fabric portal or by another application.
You have a requirement for case-sensitivity: For example if that’s expected in your data structure.
You observe strange query results: Case-sensitivity might explain unexpected query results in rare scenarios, for example when counting distinct values or observing peculiar column groupings that can’t be explained by other means.
The purpose of this article is twofold:
1. To demonstrate how a case-sensitive model can produce different results than a case-insensitive one.
2. To describe how you can change case-sensitivty by altering the model collation property.
Goblin note:
This article focuses primarily on the case-sensitivity of the model, not the DAX or Power Query languages, which can't be changed.
- DAX is not case-sensitive. "VAR" is the same as "var".
- Power Query (the M language) is case-sensitive. "let" is not the same as "Let".
HOW DOES CASE SENSITIVITY AFFECT YOUR MODEL
The following sections provide several examples of how a case-sensitive model might produce unexpected outcomes.
DISTINCT COUNTS
A common use-case is when you want to count the number of unique elements within a single column. For example, you might want to know how many customers there are, referring to the customer name as the most relevant business attribute.
The case-sensitivity doesn’t only affect the model metadata, but also queries of the model data, too. Reports connected to a case-sensitive model may have different results than a case-insensitive model that’s otherwise completely identical. A column containing values with different cases may have a different distinct count when considering the letter casing.
In the below example, users want to see the total number of customers in an area. The data source is customer master data unioned from two different source systems. The users expect to see three customers: eMerge, Join Solutions NV, and microTech. However, in a case-sensitive model, they’d see five, because the casing differs in two customer names.
Normally, a typical Power BI model that’s case-insensitive would only show 3 customers, ignoring differences in letter-casing.
Generally, it’s considered a good practice to have consistent casing within a single column.
Goblin note:
There are cases where you might want your model to be case-sensitive. However, often, these cases can be addressed with proper data modeling practices.
MODEL RELATIONSHIPS
In Power BI, you typically set up your relationships with a One-to-Many (1:*) cardinality, from the dimension table to the fact table. Continuing the example from the previous section, you’d link the ‘Customers’ table to the ‘Sales’ table with a One-to-Many relationship by using ‘Customers’[Customer_Key]. Typically, you’d expect each Customer Key to be unique in the ‘Customers’ table.
Often, developers might use the Remove Duplicates step in Power Query to do this. However, this can produce unexpected results if the columns participating in relationships have inconsistent casing.
Goblin note:
To clarify again, a case-sensitive collation of the model has no effect on Power Query. Collation only affects the model.
The following example shows how case-sensitivity in Power Query can affect a case-insensitive model... but not a case-sensitive one.
The above figure illustrates an example of how case-specificity of Power Query can affect a data model:
A typical 1:* relationship exists between a ‘Customer’ and ‘Sales’ table.
A new record is added to the ‘Customer’ table that contains a key with different casing.
Upon refresh, a Power Query Remove Duplicates step doesn’t remove the new record. Both 101263B and 101263b are considered unique.
An error is thrown by Power BI, as the 1:* relationship isn’t valid. In the model that’s case-insensitive, 101263B and 101263b aren’t unique. The ‘Customer’[Customer Key] column no longer contains unique values and the refresh can’t complete.
However, if the model was case-sensitive, this would be a valid relationship; there would be no error. In this case, the refresh wouldn’t fail, and the 1:* relationship would work as expected.
Generally, it’s a good practice to ensure that keys either use integers or have consistent casing in your model.
CONTROL LETTER CASE-SENSITIVITY: THE COLLATION PROPERTY
To specify whether a model is case-specific or not, you have to set the Collation property for the model object. Some considerations about this property:
This property is only visible for Power BI semantic models from XMLA tools like Tabular Editor or by inspecting the .pbip model metadata (TMSL; the model.bim file).
There’s different collations, some are case-specific and some aren’t.
You can’t change this property in a deployed model containing objects. You have to re-deploy it.
The default for Power BI models created in Power BI Desktop is no collation (no case-specificity)
CHANGE CASE-SPECIFIC COLLATION
To get rid of the collation, you need to change the property in the metadata by using an external tool like Tabular Editor, then re-deploy the case-insensitive model.
Connect to the case-sensitive model in Tabular Editor.
Save the model.bim (or another metadata format).
Open the model metadata (no workspace database) and delete the collation property value.
From the Model menu, deploy the model to a Fabric workspace.
Refresh the model (conduct a full refresh / process).
Goblin note:
If you're using a process or tool that automatically creates a Power BI semantic model with a collation, it might be better to consider creating this model yourself by using Power BI Desktop or external XMLA tools. That way, you can control this property for yourself.
TO CONCLUDE
Power BI datasets are usually case-insensitive, but modifying the collation property can change that. Some experiences automatically creating datasets can apply a collation property, resulting in unexpected query results. It’s important to know if your model is case-sensitive or not. However, note that this property can only be viewed and changed by using external tools.