Welcome to our third and final article in our series on improving Azure security. Today’s article will explore the log activity architecture of log analytics/Azure Data Explorer, which are some of the products that rely on Kusto Query Language (KQL) to retrieve data quickly and efficiently from the logs/telemetry collected. At this stage, we know how to configure our log analytics, create our first query using the query editor, send diagnostic settings to our log analytics. Time to go deeper and understand the log analytics architecture and two key operators that will follow us through the entire series.
An overview of log analytics architecture
In the first and second parts of this series, we covered the basics to create our first query, but there is more on the architecture side of log analytics and Kusto Query Language. We will explore the basic concepts to get our queries off the ground, but before getting there, we need to explore the entities that comprise the KQL universe and some tools and Azure services that may be used as part of our process.
For starters, Kusto Query Language is not just for log analytics. It is the language used by most Microsoft Azure products to query data (Microsoft has a global vision — PowerShell was never just for Windows, get the idea?).
One of these services is Azure Data Explorer, a fast and highly scalable data exploration service for log and telemetry. It allows several sources, such as IoT, websites, applications, and so forth. We also have other products like Azure Data Studio, Power BI, and so much more coming down the pipe. The goal is to analyze the data using Kusto Query Language.
Azure Data Explorer is important because it implements the concept of entities, which is native for Kusto Query Languages. When creating using cmdlets, we will see New-AzKustoCluster, and we will see Kusto string in the cmdlet to build the Azure service.
When using Azure Data Explorer, we have the concept of a cluster where the databases are stored. The cluster can have more than one database.
A database is where all Kusto Query Languages are executed, and a database can contain one or more tables. The tables can be seen using log analytics, and they are highlighted in the image depicted below.
A table can contain one or more columns. Each column has a data type associated with it, which we will use to perform searches and use operators that match the data type. The actual data is stored in the rows, and they are laid out based on the columns defined in any given table. The icon in front of each column is a graphical representation of the data type.
There are many details around the entities, but you must know that the entities are case-sensitive, including tables and columns. When you start writing your first queries, you will realize that looking for data in the “azureActivity” table will not work, but the same query when using “AzureActivity” will work like a charm.
House rules to get on the right track with log analytics and query editor
This section is not a complete list of all the gotchas, but the idea is to help you get up to speed. Here are some rules:
- Be specific as much as you can, and broad queries will take longer. Clean-up is required on the results.
- Comments in KQL is “//“ in front of the string. Add that, and that line will not be executed.
- When troubleshooting your queries, you will find the need to run a subset of your query. There are a couple of ways to tackle that. You can select the desired query and hit Shift + enter. A second option is to create a new tab, place your code there, and use the Run button.
- Use indentation — it does help a lot!
- KQL is not SQL. You can pile up several lines of “where” statements, and the following statement will be based on the entries provided by the previous statement (similar to PowerShell using pipeline). For example, if in your first where statement, you narrow to just the last five minutes, the second where statement will work with those values. So play smart with the where statements to go from the more extensive data to very specific information that you are looking for.
- In theory, when using several where statements, try to narrow down the results on the first one using the TimeGenerated column.
Getting started with queries: Finding stuff
When we run KQL, we are in the database context, so we can use a straightforward statement to search the entire database to find any piece of information we need, meaning searching on all tables.
Note: Of course, the performance here is not going to be good, and unless you are receiving it by the hour, you always will try to be more specific when looking for information on your data. The following command will look for my resource group name on the current log analytics workspace.
Let’s go to our playground to start our queries. You will first see that the time range is defined outside of the query (Item 1). We have two separate queries in the editor (Item 2). The first one is the most straightforward operator and string. In the second query, we are changing the behavior to be case-sensitive.
In the same screen, we have the time and how many records were retrieved Item 3) from our query (we are executing line 2). The results of the query can be seen at the bottom of the blade. Since we have the exact string in several tables (Item 5), we can see which specific tables they are. If we expand any given record, we will have a column with the table name (Item 4), which helps when looking for something.
If we want to find something within a specific database, we can narrow it down by the table name. There are a couple of ways to do that. We are going to show three formats that will provide similar results.
|search “ResourceGroupName”||Terrible. It will look on the entire database for the string.|
| search “ResourceGroupName”
|Bad. It will look at the entire table for the string.|
|search in (AzureActivity) “ResourceGroupName”||Bad. It will look at the entire table for the string.|
| search ResourceGroup==”ResourceGroupName”
|Good. It will look in a specific column for the string.|
Although the “search” statement does a good job, my tool of choice is the where operator, which gives more flexibility and proper structure to start more advanced queries.
AzureActivity | where ResourceGroup== "RG-CAC-AP6"
You can combine conditions within the same query to narrow down your results.
AzureActivity | where ( ResourceGroup== "RG-CAC-AP6" and ActivityStatusValue=="Start" )
Your log analytics architecture is up and running
In our process of exploring Kusto Query Language and log analytics workspace to increase our security posture, we reviewed the entities available when creating our queries, and it helps define boundaries. We also added some essential tools in our toolbox: the “search” and “where” operators, how we can improve performance, and some best practices when creating the habit of writing our queries.
Featured image: Shutterstock