Export all formulas from Power BI desktop

There are three kinds of formulas in Power BI Desktop: measures, calculated columns and calculated tables. Let's say we want to find all places where one of our columns is mentioned in a formula. To do that, we can export all formulas and then use Search to find every formula that mentions our column.

We'll use the Tabular Editor program to extract all formulas from a Power BI file. Tabular Editor can be downloaded from here:
https://github.com/TabularEditor/TabularEditor/releases/tag/2.16.1
After installing Tabular Editor, we should open it.

In order to get all formulas, we need to check option (1). This option is located in File > Preferences > Features in Tabular Editor. The next step is to attach to Power BI Desktop (PBID) from Tabular Editor. Go to File > Open > From DB (2). In the "Local instance" list we select our PBID file (3). The PBID file has to be open already.

Tabular Editor will now look like this. In the left pane we see the tables and columns from our Power BI file (1). Advanced Scripting (2) is the area where we enter the code. After entering the code in (3), we start it by clicking on the green arrow (4).

These are the building blocks of the code we are going to use. First we need to declare the file where our formulas will be saved. The part "new System.Text.UTF8Encoding (true)" is important: it adds a BOM to our file. This mark helps programs recognize the file's encoding. We will later open this file in Excel, and if it contains unusual characters, like Č, Ž, Á, Ж, Excel will read and display them correctly.
using System.IO;
var file = @"C:\Users\Sima\Desktop\A.csv";      //file where to export column names

using(var fileWriter = new StreamWriter(file,false, new System.Text.UTF8Encoding (true))  )

We will loop through all columns, and if a column is calculated, we will print to our file its type, its name and the formula used to create it. We will remove all line breaks from the formula. The semicolon ";" is used as the delimiter in the final CSV file.

foreach(var Col in Model.AllColumns)
    if ( Convert.ToString( Col.Type ) == "Calculated" )
        {
            fileWriter.Write( "CalculatedColumn" + ";" + Col.Name + ";" + (Col as CalculatedColumn).Expression.Replace("\n", "") + "\n" );
        }

Next, we will loop through all tables, and through all of their partitions. In this context, a partition is the part of a calculated table that is created by the table-creating formula. Again, we will write to our file the type of the table, its name and its expression.

foreach(var Tab in Model.Tables)
foreach(var Part in Tab.Partitions )
        if ( Convert.ToString( Part.SourceType ) == "Calculated" )
             {
                     fileWriter.Write( "CalculatedTable" + ";" + Tab.Name + ";" + Part.Expression.Replace("\n", "") +"\n" );
            }

The last piece of the puzzle is similar; it extracts the formulas of measures. When we combine all the building blocks we get our final, complete code:

using System.IO;
var file = @"C:\Users\Sima\Desktop\A.csv";      //file where to export column names

using(var fileWriter = new StreamWriter(file,false,new System.Text.UTF8Encoding(true)) )
{
    foreach(var Col in Model.AllColumns)
        if ( Convert.ToString( Col.Type ) == "Calculated" )
            {
                fileWriter.Write( "CalculatedColumn" + ";" + Col.Name + ";" + (Col as CalculatedColumn).Expression.Replace("\n", "")  +"\n" );
            }

    foreach(var Tab in Model.Tables)
    foreach(var Part in Tab.Partitions )
        if ( Convert.ToString( Part.SourceType ) == "Calculated" )
            {
                fileWriter.Write( "CalculatedTable" + ";" + Tab.Name + ";" + Part.Expression.Replace("\n", "")  +"\n" );
            }


    foreach(var M in Model.AllMeasures)
            {
                fileWriter.Write( "Measure" + ";" + M.Name + ";" + M.Expression.Replace("\n", "")  +"\n" );
            }
}

After we run our code, we can open our CSV file in Excel. This is how it will look:
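Once the CSV exists, the search from the introduction can be scripted as well. Here is a minimal Python sketch (the path and the column name passed in are just placeholders) that prints every exported formula mentioning a given column:

```python
import csv

def formulas_mentioning(csv_path, column_name):
    """Return (kind, name, formula) rows whose formula mentions column_name."""
    hits = []
    # utf-8-sig transparently skips the BOM that the export script writes
    with open(csv_path, encoding="utf-8-sig") as f:
        for row in csv.reader(f, delimiter=";"):
            if len(row) == 3 and column_name in row[2]:
                hits.append(tuple(row))
    return hits

# Hypothetical usage:
# for kind, name, formula in formulas_mentioning(r"C:\Users\Sima\Desktop\A.csv", "Units"):
#     print(kind, name, formula)
```

Searching inside Excel works just as well; this is only an alternative when the same lookup has to be repeated often.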

Here you can download a sample PBIX file to test the code.

SPSS data entry

We will see here how to enter data into SPSS manually, or automatically from Excel or from SQL Server. When we open SPSS, we can see Data View (1) and Variable View (2). Data View shows the data and looks like an Excel spreadsheet. Variable View is used to declare that "Data View" table: there we declare the columns, their content, formatting and possible values.

Manual entry

To enter data manually, it is enough to start typing in the cells in Data View. As we see, column names are created automatically, and we have to change them, together with the other column attributes.

Before explaining column attributes, let's recall the different measurement scales used in SPSS:
– Nominal scale is used for categorical data ( "man/woman/child" or "India/Japan/China" ).
– Ordinal scale is used for ordered data ( "good/neutral/bad" or "before/during/after" ).
– "Scale" is used in SPSS to label data that can be measured with some measuring unit ( height, weight, temperature ).

Data measured on a Nominal or Ordinal scale has to be entered into SPSS as codes. This is necessary so that we can use all the statistical tools available in SPSS. Coded means that each category is represented by a number. For example, "small, medium, big" can be represented with the codes "1, 2, 3". Those codes are the values we enter into the program (1). Then, in the program itself, we assign a category label to each code. If we want, we can show those category labels to the user instead of the codes (2).
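The same coding idea can be sketched outside SPSS. In this hypothetical Python example, a mapping translates the entered codes back into their labels, mirroring the toolbar toggle described below:

```python
# Hypothetical value labels for a "Size" variable coded as 1, 2, 3
size_labels = {1: "small", 2: "medium", 3: "big"}

# Data is entered as codes ...
entered = [1, 3, 2, 1]

# ... but can be displayed as labels instead
displayed = [size_labels[code] for code in entered]
```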

By clicking this button in the main toolbar, the user can switch between the two views from the image above.

Codes are declared in "Variable View". Let's see which options are available there.

Variable View

"Variable View" is the place where we enter column attributes.
– In "Name" (1) we type the column's name. A name can contain letters, numbers and underscores.
– "Type" (2) opens a new dialog (5) for choosing between different data types. As we saw, because all categorical data should be coded, almost all of our columns should be declared as "Numeric".
– "Width" (3) limits how many characters textual data can have. Textual data longer than this will be truncated.
– "Decimals" (4) limits the number of decimal places shown in "Data View". This is only a visual representation; real calculations are conducted with all available decimal digits.

– In "Label" (1) we place a short description of the column.
– In "Values" (2) we can set labels for data that is categorical in nature. This opens a new dialog (5). So, if the possible codes in a column are "1, 2, 3", we assign a label to each code. Our codes "1, 2, 3" can represent "Man, Woman, Child". By clicking the toolbar button, as explained earlier, the user will see those labels instead of incomprehensible codes.
– In "Missing" (3) we can list values that are impossible or unacceptable. After we enter data, every value matching one registered here will be excluded from calculations as incorrect; SPSS will simply ignore it. We can give three such discrete values (6), or one interval and one discrete value (7).
– "Columns" (4) is the visual width of the column, measured in number of characters. We can also change a column's width with the mouse, in the same way as in Excel (8).

The last three column attributes are Align, Measure and Role. In Align we can choose between Left, Right and Center alignment (1). Measure (2) declares the scale of the data. This does not influence SPSS calculations, but it is important information for other users of the data. In Role (3) we can just leave the default value ("Input").

This process of declaring our columns should also be done for data loaded from Excel or a database.

Loading from Excel

File > Open > Data (1) in the main menu opens dialog (2). In dialog (2) we choose the Excel file type, the folder where the Excel file is, and the concrete Excel file. After clicking "Open" in dialog (2), we get dialog (3). There we choose one of the sheets in the workbook and the range of our data. If we don't supply a range, an automatically determined range will be used ("A1:G44" in the image). After this our data is loaded and we can see it in "Data View" (4).

Loading from Database

To get data out of a database, the "IBM Data Access Pack" can be installed. This is IBM's collection of drivers for different databases. We don't have to use IBM drivers, but they will probably work best for transferring data to SPSS. We start the loading process by clicking File > Open Database > New Query (1). Then we click the "Add ODBC Data Source" button (2). In the new dialog, on the "User DSN" tab, we click the "Add" button (3).

"IBM Data Access Pack" will add many ODBC drivers whose names start with "IBM SPSS OEM" (1). We will choose "SQL Server Native Wire Protocol". On the next screen we add the credentials for our database (2). Now we close everything until we get back to our start screen (3), so we can click the "Next" button.

Now we can select a table and its columns (1 => 2) and click Finish. Those columns will now be presented in Data View (3).

Instead of Finish, we can also use the Next buttons to follow the whole graphical wizard. The wizard gives us the opportunity to define relations between tables, to filter data and to rename columns. This is all great, but the last screen is where we can see and directly change the SQL statement. I find it easiest to make changes there. After this step we click the Finish button, the wizard exits, and we see our data loaded into SPSS.

Recursive functions in Power Query

In Power Query there are no loops: there are no "For Each", "For Next" or "Do Until" blocks. Instead, we can accomplish the same thing with recursive functions. Let's say we want to sum the elements of a list, from the start, until the sum reaches 10 or more. The question is after how many elements our goal is reached. We can see on the chart that the goal is reached after the 4th element: 2 + 0 + 3 + 5 = 10.

List = { 2,0,3,5,0 }

First we have to establish our initial state. We know that our list is { 2, 0, 3, 5, 0 } and that the initial sum is zero. We also know that the first element in the list has index zero. We can write that into a query. This query will call the recursive function, which will supply the final result.

let 
	List = { 2,0,3,5,0 }
	, Sum = 0
	, ElementIndex = 0
	, NoOfElements = RecursiveFunction ( List, Sum, ElementIndex )
in
	NoOfElements

Second, what is our logic? Our logic will go like this:
0) Does our initial state (Sum = 0) meet the condition? It doesn't, so we go to the next step.
1) We add one element to the sum. We get 2; this doesn't meet the condition, so we go to the next step.
2) We add another element. We get 2 (2 + 0); this doesn't meet the condition, so we go to the next step.
3) We add another element. We get 5 (2 + 0 + 3); this doesn't meet the condition, so we go to the next step.
4) We add another element. We get 10 (2 + 0 + 3 + 5); this does meet the condition, so our answer is 4 elements.

Step 0) is already described in our query. Steps 1-4 are pretty similar: they have the same logic, but the starting state of each step is different. All we have to do is wrap this logic into a function and then call that function with different arguments each time. Our recursive function is below. If the condition is satisfied, we return "Sum"; otherwise we call the function again, this time with different arguments. That repeats until the condition is met.

( List, Sum, ElementIndex ) => 
let
  SubTotal = if Sum >= 10 then Sum else
    RecursiveFunction( List, Sum + List{ ElementIndex }, ElementIndex + 1)
in
    SubTotal

Every recursive function has the same logic. First we establish the initial state, and then we repeat these two steps until the condition is met:
– Does the current state satisfy the condition? If it does, we have the final result.
– If it doesn't, we change the state, and the new state becomes the arguments for another call of the function itself.
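For comparison, the same pattern can be sketched in Python; this is only an illustration of the logic, not something Power Query runs. Here the function returns the number of elements needed, matching the question posed above:

```python
def elements_until(values, target, total=0, index=0):
    """Recursively count how many leading elements are needed for total >= target."""
    if total >= target:
        # Condition met: the current index is the number of elements consumed
        return index
    # Condition not met: change the state and call the function again
    return elements_until(values, target, total + values[index], index + 1)

elements_until([2, 0, 3, 5, 0], 10)
```

Just like the M version, the sketch assumes the target is actually reachable; it would fail with an index error if the list were exhausted first.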

Let's do a more complex example. Now we have a bunch of nested lists. Our goal is to sum all scalar values in those lists. Our condition is that we use all scalar values, so we are looking for the sum 2 + 7 + 8 + 9 + 3 + 5 + 5 + 4 + 4 + 11 = 58.


NestedLists = {   { { 2, 7 }, { 8, 9 } }
                , { 3 }
                , { { 5, 5 }, { 4, 4 } } 
                , 11     
}

First, we establish the initial state. We know the initial sum is zero, and we have our list. This query will call the recursive function, which will supply the final result.

let 
          NestedLists = {  { { 2, 7 }, { 8, 9 }  }
                         , { 3 }
                         , { { 5, 5 }, { 4, 4 } }   	
                         , 11
          }
        , Sum = 0
        , ElementIndex = 0
	, Total = NestedListsRecursiveFunction ( NestedLists, Sum, ElementIndex )
in
	Total

All we still need is the recursive function. This function is complex, so let's try to make it easier to understand by thinking about a simpler list, NestedLists = { 1, { 2 }, 2, { 3 } }. We can present this list with a picture:

First we count how many toothbrushes are in each package. After that, it is easy to sum the whole list.

Here is the trick: adding up all toothbrushes in one package is similar to adding up all toothbrushes in the whole list. This means we can apply the same logic to an individual package and to the whole list; in other words, we can use recursion. Here is the recursive function.

(NestedLists, Sum, ElementIndex) =>
  let
    NewSum = 
      if ElementIndex < List.Count(NestedLists) then
        if Value.Is(NestedLists{ElementIndex}, List.Type) then
          NestedListsRecursiveFunction(
              NestedLists 
            , Sum + NestedListsRecursiveFunction(NestedLists{ElementIndex}, 0, 0) 
            , ElementIndex + 1
          )
        else
          NestedListsRecursiveFunction(
              NestedLists 
            , Sum + NestedLists{ElementIndex} 
            , ElementIndex + 1
          )
      else
        Sum
  in
    NewSum

The outer condition, ElementIndex < List.Count(NestedLists), walks through all elements of the list. Once we pass the final element, the function returns the sum of that list as the final result. For each element, Value.Is(NestedLists{ElementIndex}, List.Type) determines whether the element is a list or not. If the element is a scalar, the else branch adds it directly to the sum. The problem is that, besides scalar elements, we can also have sub-list elements.

The first branch handles those. It is similar to the scalar branch, but instead of adding a scalar value it adds the sum of a sub-list. That inner call, NestedListsRecursiveFunction(NestedLists{ElementIndex}, 0, 0), computes the sum of each sub-list; this is where the recursion happens.

So this is our goal: we want to replace every sub-list with its sum. When we reach a sub-list element, we dive into it with a recursive call to the function. If that sub-list contains only scalars, the scalar branch gives us its sum. If not, we dive deeper until we find a sub-list that contains only scalars. Once we have the sums of the lowest sub-lists, we can use them to calculate the sums of sub-lists higher in the hierarchy.
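The same nested-sum logic can be sketched in Python, where the structure of the recursion is perhaps easier to see (again, just an illustration alongside the M version):

```python
def nested_sum(items):
    """Recursively sum all scalar values in arbitrarily nested lists."""
    total = 0
    for item in items:
        if isinstance(item, list):
            total += nested_sum(item)   # a sub-list is replaced by its own sum
        else:
            total += item               # a scalar is added directly
    return total

nested_sum([[[2, 7], [8, 9]], [3], [[5, 5], [4, 4]], 11])
```

Python allows an explicit loop over the elements, so only the "dive into a sub-list" part needs recursion; in Power Query the walk over the elements is itself recursive.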

Excel file with sample in Power Query:

Parts of Pivot Style

This is the standard dialog for changing the pivot table style. Below we will list the elements of this style that can be changed, with the corresponding images.

Some of these changes require that the appropriate Pivot Style Options are enabled.

"Whole Table" will affect all parts of the pivot table, including filters.

"Report Filter Labels" and "Report Filter Values" will only affect the filter section.

"First and Second Column Stripe" and "First and Second Row Stripe" will alternately color columns and rows. If we specify both rows and columns, they will overlap, and the rows will take priority. The same color can also span multiple consecutive columns/rows; for example, we can have two green and then two orange columns, alternating.

The "First Column" and "Header Row" can also overlap, in the corner cell. Here too, the color given to the row will take precedence.

"The First Header Cell" will color the top left cell in the pivot body. The name of the measure used is usually written there.

"Subtotal Column" and "Subtotal Row" will color the subtotals by columns and rows. There are "Subtotal Column 1,2,3" and "Subtotal Row 1,2,3", so the first three levels of subtotals can have their own colors.

If we turn on the "Blank Row" option then we will have an empty line after each subtotal. The "Blank Row" part of pivot style will color all rows below the first blank row.

"Column Subheading" and "Row Subheading" will color the row and column headers. There are "Column Subheading 1,2,3" as well as "Row Subheading 1,2,3" so we can have up to three colors for different levels of headers.

For totals we have the opportunity to design "Grand Total Column" and "Grand Total Row".

Sample Excel file can be downloaded here:

Custom format all columns in Power BI desktop

We can do this manually, but the problem is that the number of columns can be too large. We need an automatic solution. This automated solution should allow us to specify custom formatting for each column. We will use the Tabular Editor for this.

Tabular Editor can be downloaded from this page:
https://github.com/TabularEditor/TabularEditor/releases/tag/2.16.1
After installing Tabular Editor, open it.

In order to be able to change the formatting of the columns, we need to check option (1). This option is located in File > Preferences > Features in Tabular Editor. The next step is to attach to Power BI Desktop (PBID) from Tabular Editor. Go to File > Open > From DB (2). In the "Local instance" list we select our PBID file (3). The PBID file has to be open already.

Tabular Editor will now look like this. In the left pane we see the tables and columns from our Power BI file (1). Advanced Scripting (2) is the area where we enter the code. After entering the code in (3), we start it by clicking on the green arrow (4).

This is the code by which we will extract the names of all the columns. We will skip columns of type String, because we do not need to adjust the format for them.

using System.IO;
var file = @"C:\Users\Sima\Desktop\Columns.csv";      //file where to export column names

using(var fileWriter = new StreamWriter(file))  
foreach(var Tbl in Model.Tables )                  
foreach(var Col in Tbl.Columns)
if ( Convert.ToString(Col.DataType) != "String" )     //exclude String columns
{
    fileWriter.Write( Tbl.Name + ";" + Col.Name + ";" + Col.DataType +  "\n" );
}

The result will be a three-column CSV file. In each row we will see the name of the table, the name of the column and the type of the column (1). We will use this information to decide the desired format for each column (2). Using the Excel formula (3), we will create a code by which we will assign the desired format to each column in the PBID file (4).

After running this code, you still need to save the changes to the PBID file itself.

Model.Tables["SampleTable"].Columns["OrderDate"].FormatString = "dd/mm/yyyy";
Model.Tables["SampleTable"].Columns["Units"].FormatString = "#,##0";
Model.Tables["SampleTable"].Columns["Unit Cost"].FormatString = "#,##0.00";
Model.Tables["SampleTable"].Columns["Total"].FormatString = "#,##0.0";
Model.Tables["SampleTable"].Columns["% of Total"].FormatString = "0.00%";

This is done by clicking on Save icon or by typing Ctrl+S.

Now all columns will have the format we specified.
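The Excel formula in step (3) only concatenates text, so the same generation step could also be scripted. Here is a hedged Python sketch that reads the exported Columns.csv and builds the Tabular Editor assignment lines; the format mapping passed in is just an example, not part of the original workflow:

```python
import csv

def format_script(csv_path, formats):
    """Build Tabular Editor script lines assigning a FormatString per column."""
    lines = []
    with open(csv_path, encoding="utf-8") as f:
        for row in csv.reader(f, delimiter=";"):
            if len(row) != 3:
                continue                      # skip malformed rows
            table, column, _dtype = row
            fmt = formats.get(column)         # desired format for this column, if any
            if fmt:
                lines.append(
                    f'Model.Tables["{table}"].Columns["{column}"].FormatString = "{fmt}";'
                )
    return "\n".join(lines)

# Hypothetical usage:
# print(format_script(r"C:\Users\Sima\Desktop\Columns.csv", {"Units": "#,##0"}))
```

The resulting text is then pasted back into the Advanced Scripting pane, exactly like the Excel-generated version.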

PBID sample file and Tabular Editor scripts: