TL;DR – Testing the new ingestion time transformation features in Microsoft Sentinel.
When I read the word “transform“, I immediately think of Optimus Prime and Bumblebee, you know “robots in disguise” (I can hear the song too!), doesn’t everyone? But that’s not the transformation I am blogging about today. This week I’ve been testing a new feature in Microsoft Sentinel that allows you to configure rules to transform data upon ingestion. It’s a feature many of my partners have requested previously, so I gave it a shot and I was really amazed how easy it was to configure.
Transforming AWSCloudTrail
AWSCloudTrail is the largest data source I have in my tenant, so I figured I would try to test the scenario where I filter out some data that may not be as useful for security purposes. However, while I was at it, I might as well test adding of some custom fields, because that may come in handy as well. If you are looking for the current list of tables that supports this feature, it’s here.
You can find this new features by navigating to the shiny new “Tables” blade within the Log Analytics Workspaces menu, as shown below:

In my case I am testing with the AWSCloudTrail table, so I chose to “Create transformation” within the menu of the table, and then I gave my new rule a very creative name “TransformationDCR“.

I then specified the columns I don’t want to ingest and the ones I want to add within the “Transformation editor“. In my case, I don’t want to see “OperationName” because all the rows have the same value, “CloudTrail“, so I use project-away to filter that out. But just for testing purposes, I am adding two new columns, “MFAUsed_CF” and “User_CF“. The ‘_CF‘, which I *think* stands for custom field, is only needed for non-custom tables.

The KQL query I am using above is:
source | project-away Type | extend MFAUsed_CF = tostring(parse_json(AdditionalEventData).MFAUsed) | extend User_CF = iif(UserIdentityUserName == "", UserIdentityType, UserIdentityUserName)
By the way, if you don’t add the “_CF”, you will see the following error: “New columns added to the table must end with ‘_CF’ (for example, ‘Category_CF’),” as shown below:

And that’s it!

Once the change takes effect, I can start to see where the data in “OperationName” is no longer showing and I can start seeing values in my new columns, “MFAUsed_CF” and “User_CF“, as shown below:

Masking data during ingestion
You can also configure Data Collection Rules for custom tables, by using the “New custom log” option under “Create“, as shown below:

I am choosing the ingenious name of “SampleData” for my custom table and the DCR will be have the innovative name of “SampleDataMaskDCR“, as shown below:

Please note, the data I am using here is just sample data I use for DLP tests, it’s not valid data.
When I upload the json file, it automatically warns me about the missing timestamp.

It then automatically adds that first line (TimeGenerated), which I will leave as is, since I just need it for testing purposes. Then I added one of the masking samples found in the library of transformations to mask the SSN values, as shown below:

And that’s it!

I can see my new table is created, as shown below:

Also, if I want to check the DCRs created, I can see that information from the Monitor portal, as shown below:

More info
If you want to give this a shot, there are a couple tutorials in the Microsoft documentation that I found very useful. There is also a Tech Community blog and a library of transformations that gives you a head start with some of the most common scenarios, not just for filtering, but also for masking data, which some of my partners have previously requested. Happy testing!
One thought on “Disguising data”