KQL arg_max and arg_min

The arg_max and arg_min functions in Kusto Query Language (KQL) are used to find rows with the maximum or minimum values of a specific column, while also retrieving other columns in those rows. These functions are helpful when you want to find rows with the highest or lowest values in one column and need details from other columns in those rows.


What are arg_max and arg_min?

  • arg_max: This function returns the row(s) with the maximum value of a specified column, along with the values from other columns in that row.
  • arg_min: Similarly, arg_min returns the row(s) with the minimum value of a specified column, also including values from other columns in that row.

Use Cases

  • Finding the latest (or earliest) event in a log by a specific category.
  • Retrieving the row with the highest (or lowest) score, revenue, or metric in a dataset.

Syntax

The syntax for both arg_max and arg_min is similar:

summarize arg_max(MaxColumn, Column1, Column2, ...) by GroupColumn
  • MaxColumn: The column you want to find the maximum (or minimum) value of.
  • Column1, Column2, …: Other columns to include in the result.
  • GroupColumn: (Optional) A column to group the data by, returning the maximum (or minimum) value for each group.

Usage and Examples

Let’s explore some examples to see how arg_max and arg_min work.

Example 1: Find the Row with the Latest Timestamp

Imagine you have a Logs table with event data, including EventID, Timestamp, and EventType. If you want to find the latest event (the row with the highest Timestamp), you can use arg_max.

Logs 
| summarize arg_max(Timestamp, EventID, EventType)
  • Explanation: This query finds the row with the maximum value in the Timestamp column and returns the EventID and EventType columns for that row.

Example 2: Find the Latest Event for Each EventType

Using arg_max with a grouping column allows you to find the latest event for each type.

Logs
| summarize arg_max(Timestamp, EventID) by EventType
  • Explanation: Here, arg_max finds the latest Timestamp for each EventType. For each event type, it returns the row with the most recent Timestamp along with the EventID column.

Example 3: Find the Row with the Minimum Value in a Numeric Column

If you have a table called Sales with ProductID, SalesAmount, and Date, you can use arg_min to find the row with the minimum SalesAmount.

Sales
| summarize arg_min(SalesAmount, ProductID, Date)
  • Explanation: This query returns the row with the minimum SalesAmount, including the ProductID and Date of that sale.

Example 4: Find the Earliest Sale for Each Product

To get the first sale record for each product, use arg_min with a grouping column.

Sales
| summarize arg_min(Date, SalesAmount) by ProductID
  • Explanation: This query groups the data by ProductID and returns the earliest Date for each product, along with the SalesAmount for that first sale.

Example 5: Find the Highest Score Per Student

Imagine you have a StudentScores table with columns StudentID, Score, and Subject. To find each student’s highest score, use arg_max.

StudentScores
| summarize arg_max(Score, Subject) by StudentID
  • Explanation: This query groups by StudentID and finds the highest Score for each student, returning the associated Subject for that top score.

  • arg_max and arg_min are powerful functions for finding rows with maximum or minimum values while retrieving related data.
  • Use these functions when you want to find records with extreme values (highest or lowest) and need details from those records.
  • They work well with grouping, allowing you to find max/min values for each group in a dataset.

With arg_max and arg_min, you can quickly retrieve specific, meaningful data points from large datasets, making them essential tools in KQL for log and data analysis.

Author: tonyhughes