Get top row of each category in MySQL (2024)

The row_number() function returns the position of a record within an ordered group. To obtain the first record in each group, use a WITH clause to create a common table expression that calls row_number() to compute the position of each row in its group. An outer query can then filter the rows by position. Window functions and common table expressions require MySQL 8.0 or later.

with rows_and_position as
  (
    select emp_id,
    last_name,
    salary,
    dep_id,
    row_number() over (partition by dep_id order by salary desc) as position
    from employee
  )
select dep_id, last_name, salary
from rows_and_position
where position = 1;

The approach presented here can be used to obtain records in any position in a group.
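
As a sketch of that generalization, the same common table expression can be filtered on a different position or on a range of positions. The example below assumes the same employee table as above and MySQL 8.0 or later; the choice of position 2 is only illustrative.

-- Second-highest-paid employee in each department (illustrative choice of position).
with rows_and_position as
  (
    select emp_id,
    last_name,
    salary,
    dep_id,
    row_number() over (partition by dep_id order by salary desc) as position
    from employee
  )
select dep_id, last_name, salary
from rows_and_position
where position = 2;   -- use "position <= 3" instead for the top three per department

If two employees in a department have the same salary, their relative order is not determined by this ORDER BY alone. Adding a tie-breaker column such as emp_id to the ORDER BY would likely make the result repeatable.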

Selecting the top N rows from each group is a very common task in MySQL data analysis and reporting. You might use it to get the top 3 selling products in each product category, the highest paid employee in each department, the top 10 most popular posts on your blog, and so on. We will go over modern SQL techniques for answering top-N-rows-per-group questions using the Row_Number, Rank, and Dense_Rank window functions.

Example Data

A school has collected student score data for Math and English classes. We would like to perform analysis of top student scores in each class.

create table student_score (
  class_name VARCHAR(9),
  student_login VARCHAR(50),
  score INT
);

insert into student_score (class_name, student_login, score) values ('math', 'gacreman0', 100);
insert into student_score (class_name, student_login, score) values ('english', 'ewoodington1', 87);
insert into student_score (class_name, student_login, score) values ('math', 'ctilliard2', 72);
insert into student_score (class_name, student_login, score) values ('english', 'gzorer3', 81);
insert into student_score (class_name, student_login, score) values ('english', 'swyre4', 75);
insert into student_score (class_name, student_login, score) values ('math', 'ppadgett8', 82);
insert into student_score (class_name, student_login, score) values ('math', 'mreynalds9', 100);
insert into student_score (class_name, student_login, score) values ('math', 'dlettena', 100);
insert into student_score (class_name, student_login, score) values ('math', 'dbartellib', 86);
insert into student_score (class_name, student_login, score) values ('english', 'tlorinezc', 87);
insert into student_score (class_name, student_login, score) values ('english', 'ddeftied', 81);

Use DB-Fiddle to execute SQL scripts on sample data.

Select Top N Rows Per Group with Row_Number

If the business requirement is to always return exactly 3 top students in each class, the Row_Number window function should be used. When scores are tied, sort by student_login in ascending order to determine which student to pick. Because the PARTITION BY clause is present, the numbering restarts at 1 for the rows of each class.

SELECT *
  ,row_number() OVER (
    PARTITION BY class_name ORDER BY score DESC
      ,student_login ASC
    ) AS row_num
FROM student_score;
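
To see what PARTITION BY contributes, it can help to compare the query above with a version that omits it. The following sketch, which assumes the student_score table defined earlier, numbers all rows in a single sequence instead of restarting at 1 for each class. The alias overall_row_num is a name chosen here for illustration.

-- Without PARTITION BY the whole result set is one group,
-- so the numbering runs from 1 to 11 across both classes.
SELECT *
  ,row_number() OVER (
    ORDER BY score DESC
      ,student_login ASC
    ) AS overall_row_num
FROM student_score;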

Here is the complete script to get exactly 3 top students in each class with Row_Number.

WITH score_analysis
AS (
  SELECT *,
    row_number() OVER (
      PARTITION BY class_name ORDER BY score DESC,
        student_login ASC
      ) AS row_num
  FROM student_score
  )
SELECT class_name,
  student_login,
  score
FROM score_analysis
WHERE row_num <= 3;
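
The introduction also mentions Rank and Dense_Rank, which treat ties differently from Row_Number. As a rough sketch, assuming the student_score table above, the three functions can be computed side by side to compare their behavior. The aliases rank_num and dense_rank_num are names chosen here for illustration (RANK and DENSE_RANK themselves are reserved words in MySQL 8.0).

-- Compare how the three window functions number tied scores.
SELECT class_name,
  student_login,
  score,
  row_number() OVER (
    PARTITION BY class_name ORDER BY score DESC, student_login ASC
    ) AS row_num,          -- always unique within a class
  rank() OVER (
    PARTITION BY class_name ORDER BY score DESC
    ) AS rank_num,         -- ties share a rank, then the sequence skips
  dense_rank() OVER (
    PARTITION BY class_name ORDER BY score DESC
    ) AS dense_rank_num    -- ties share a rank, no gaps afterwards
FROM student_score
ORDER BY class_name, row_num;

In the math class, the three students who scored 100 get row_num 1, 2, and 3, but rank_num 1 and dense_rank_num 1 each. The next score, 86, gets rank_num 4 and dense_rank_num 2. Filtering on rank_num or dense_rank_num can therefore return more than three rows per class when scores are tied.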
