Power Query

Combiner functions in Power Query

In Power Query "Table.CombineColumns" function is used to combine values from several columns (1) into one column (2).  This function takes 4 arguments. First argument is source table. Second argument is list of columns to combine. Last argument is name of newly created column.

Table.CombineColumns( Source, { "C1","C2","C3","C4" }, Combiner.CombineTextByDelimiter("<>"), "CombineByDelimiter" )

Third argument is what interests us. That argument explains how to combine column values. That argument is given by using one of this five functions:

  • Combiner.CombineTextByDelimiter
  • Combiner.CombineTextByEachDelimiter
  • Combiner.CombineTextByLengths
  • Combiner.CombineTextByPositions
  • Combiner.CombineTextByRanges

Some of these functions are simple, but others are not so obvious.

Combiner.CombineTextByDelimiter

This version is simple. It concatenate all values by placing delimiter between them. Combiner. CombineTextByDelimiter function is used just to give that delimiter. In thise example, delimiter is "–".

Table.CombineColumns( Source, { "C1","C2","C3","C4" }, Combiner.CombineTextByDelimiter("-"), " CombinedByDelimiter" )
C1C2C3C4CombinedByDelimiter
ABBBBBCDDDD=>ABBBBCDDDD

This is similar to TextJoin function in Excel:

TextJoin
=TEXTJOIN(" – ";TRUE;A2:D2)=>ABBBBCDDDD

Combiner.CombineTextByDelimiter can have one more argument. That argument could take value QuoteStyle.None or QuoteStyle.Csv. Quote.Csv will change every column which has quotes, so that column content will be delimited with quotes and all original quotes will be doubled.

ABB"BBCDDDD=> ABB"BBCDDDD QuoteStyle.None
ABB"BBCDDDD=> A"BB""BB"CDDDD QuoteStyle.Csv

Combiner.CombineTextByDelimiter( "-", QuoteStyle.Csv or QuoteStyle.None)

Combiner.CombineTextByEachDelimiter

This version is similar to previous. This time we give a list of delimiters. Those delimiters are used sequentially to separate column values.

Table.CombineColumns( GiveType, { "C1","C2","C3","C4" }, Combiner.CombineTextByEachDelimiter( { "_1_", "_2_", "_3_" }, QuoteStyle.Csv), "CombineByDelimiter" )
C1C2C3C4CombinedByEachDelimiter
ABBBBCDDDD=>A_1_BBBB_2_C_3_DDDD

The question is what if have too many or too less of delimiters. If we have five delimiters, but four columns, the result will be unchanged "A_1_BBBB 2_C_3_DDDD". Surplus delimiters will not be used. If we have only two delimiters, but four columns, the result be "A_1_BBBB_2_CDDDD". There will be no delimiters between "C" and "DDDD".

This function also accepts QuoteStyle second argument.

Combiner.CombineTextByLengths

We define list of numbers. From each column only so many characters will be presented.

Table.CombineColumns( GiveType, { "C1","C2","C3","C4" }, Combiner.CombineTextByLengths( { 2, 3, 3, 3 } ), "CombinedByLenghts" )

From first column we will take only two characters, from second column only three characters and so on. If we don't have enough characters in our column, column content will be padded with spaces to reach reserved space. If we have too many characters in column, rest of characters will be trimed. If our argument is { 2, 3, 3 } then the last column will not be presented.

C1C2C3C4 CombinedByLenghts
ABBBBCDDDD=> A␣BBBC␣␣DDD

This function has another argument called "template". Template is "background" text that we can use to create mask for our text.

Table.CombineColumns( GiveType, { "C1","C2","C3","C4" }, Combiner.CombineTextByLengths( { 2, 3, 3, 6 }, "*******" ), "CombineByDelimiter" )

If our "background" mask is "*******" then this will combine with previous result "A␣BBBC␣␣DDD" to create "A*BBBC*DDD".

Combiner.CombineTextByPositions

In this case, start of every column content is determined by positions list.

Table.CombineColumns( GiveType, { "C1","C2","C3","C4" }, Combiner.CombineTextByPositions( { 2, 4, 6, 9, 12 } ), "CombineByPositions" )

So, content of first column will start on position (2), of second column on position (4), third column on position (6), last column on position (9). If our list contains only numbers { 2, 4, 6 } then there will be no last column values in final result.

0123456789101112
A…B…C…D…

Then, we'll add rest of letters to see what is happening. Columns will fill as much space as possible. If column content is not enough, reserved space will be filled with spaces. If our list only has values { 2, 4, 6, 9 }, last column would not be trimed to three "DDD".

0123456789101112
ABBCDDD

This is final result.

C1C2C3C4 CombinedByPositions
ABBBBCDDDD=>␣A␣BBC␣␣DDD

This function also accepts Template second argument.

Combiner.CombineTextByRanges

This version is combination of Positions and Lengths versions. Our argument is sequence of pairs. First value in pair is position. The second is length.

Table.CombineColumns( GiveType, { "C1","C2","C3","C4" }, Combiner.CombineTextByRanges( { { 2,2 },{ 4, 4},{ 5,2 },{ 7,3 } } ), "CombineByPositions" )

We can see bellow that content of every column starts on positions 2, 4, 5, 7 respectively. Each columns content has maximal length 2, 4, 2, 3 characters respectively. Latter columns will overwrite former columns. For example, column 3 will overwrite column 2 on position 5 ("C" instead of "B").

0123456789
C1A
C2BBBB
C3C
C4DDD
resultABCBDDD

If we only use pairs { { 2,2 },{ 4, 4},{ 5,2 } }, then last column will not be presented. Length of range can not be zero.

This function also accepts Template second argument.

Excel sample file:

PQ execute queries asynchronously

Power Query is great tool for preparing data to be written into CSV file. CSV files are usually used to import data into databases but the requirement is that they must be formatted in specific way. We can export formatted data to a spreadsheet, and then we can use VBA to write already formatted data to a CSV file.

Data connections in Power Query are by default set to be asynchronous. If we start a query with "Workbook.RefreshAll" from VBA procedure, the rest of VBA procedure won't wait for the query to complete. Instead of writing new data from the spreadsheet to a CSV file, VBA procedure will write the old ones, because there will be no time for the query to complete.

The solution is to change the execution of the query to synchronous. This is done in Excel Connection Properties. We can find our connection properties in Data tab (1) > Workbook Connections Properties (2). There is a checkbox "Enable background refresh". This checkbox should be disabled.

The checkbox can be disabled directly from the VBA. That possibility allow us to fully control the execution of a query in procedure. This procedure will temporarily change BackgroundQuery properties to False so that our Procedure will wait for the query to complete.

Sub Refresh_All_Data_Connections()
    For Each objConnection In ThisWorkbook.Connections
        'Get current background-refresh value
        bBackground = objConnection.OLEDBConnection.BackgroundQuery
        'Temporarily disable background-refresh
        objConnection.OLEDBConnection.BackgroundQuery = False
        'Refresh this connection
        objConnection.Refresh
        'Set background-refresh value back to original value
        objConnection.OLEDBConnection.BackgroundQuery = bBackground
    Next
End Sub

Excel sample table: