Power BI: extract keyword or everything after, before or between a key character

When I have a column with sentences like a description column, I may want to extract everything before or after a keyword or simply to extract the keyword. Let´s take an example, I have this column and I want to extract everything before or after this delimiter “:”:

power bi
  • Left: LEFT('table'[argument],SEARCH("value",'table'[argument])-1)
    power bi
  • Right: RIGHT('table'[argument],LEN('table'[argument])-(SEARCH("value",'table'[argument])+1))
    power bi

NOTE:

  • Replace “table”, “argument” and “value” by yours
  • For the right, after “:”, there is a space, if I want to include it, for the “right” formula, just remove “+1”
  • If searching for “?”, put “~?”

If my column has some empty cells and/or some cells don’t have the value I am looking for, I will get an error:

power bi

To resolve it, read Power BI: error.

Let´s take another example, I have this column and I want to extract this word “new” (I can also do it in Power Query Editor, information more below):

power bi
  • Left: same formula as above
    power bi
  • Right: same formula as above except to have “+1”, I have “+3”
    NOTE: if I want to include the space, change “+3” by “+2”
    power bi
  • Word: MID('table'[argument],SEARCH("value",'table'[argument],,0),3)
    NOTE: if I want to include the space, change “3” by “4”
    power bi

Now, I have this column and I want to extract “new” (after, before and word) but only if “extract” is found:

power bi

  • IF(CONTAINSSTRING('table'[argument],"value1"),second example formula,"value2")
    power bi
    power bi
    power bi
    NOTE: if the word is not found, to put a blank result, change "value2" by ""
    power bi

If I want to extract words between 2 delimiters, for instance “ - “:

PATHITEM(SUBSTITUTE('table'[argument],"delimiter","|"),2)

power bi

I want to extract only x characters after finding a word. For this, I have to go to the Power Query Editor then “add column -> custom column”:

power bi

Put this formula:

Text.Start(Text.AfterDelimiter([argument],"value_find"),number_extract)

power bi

In my example, I am asking to extract the 4 characters after finding the word “new”

NOTE:

  • A space is considered as a character
  • To extract the 4 characters before, change Text.Start(Text.AfterDelimiter by Text.End(Text.BeforeDelimiter
  • “value_find” can be a sentence and not only a single word

As said more above, I can extract the word “new” with this formula:

Text.Combine(List.Select(Text.Split([argument], " "), each Text.StartsWith(Text.Lower(_), "value_find")), " ")

power bi

NOTE: for case sensitive, change “Text.Lower(_)” by “_”

The formula doesn’t work when the word is inside a string like in row 3:

power bi

To remediate, there are 2 options:

  • Using this formula: if Text.Contains([argument], "value_find") then "value_find" else "value_no_find"
    power bi
  • Or using the “conditional column” option:
    power bi
    power bi
power bi

To remove the word “new” and to keep the rest, use this formula:

Text.Combine(List.Select(Text.Split([argument], " "), each not Text.StartsWith(Text.Lower(_), "value_find")), " ")

power bi

Again the formula doesn’t work when the word is inside a string like in row 3:

power bi

To remediate, I will use the “replace values” option:

power bi
power bi
power bi

I can use those formula to keep or to remove duplicate key characters, for instance:

  • “keep duplicate characters” is the same formula as “keep”
  • “remove duplicate characters” is the same formula as “remove”
power bi

To get the result in small characters, I will adapt those formula like that:

  • For “keep”: Text.Combine(List.Select(Text.Split(Text.Lower([argument]), " "), each Text.StartsWith(_, "value_find")), " ")
  • For “remove”: Text.Combine(List.Select(Text.Split(Text.Lower([argument]), " "), each not Text.StartsWith(_, "value_find")), " ")

NOTE: “value_find” should be in small characters

power bi

Interesting Topics