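PostgreSQL – Deleting Duplicate Rows Using a Subquery
PostgreSQL offers several techniques for deleting duplicate rows. This article shows how to do it with a subquery.
For demonstration purposes, let's set up a sample table named basket that stores fruits, as follows: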
CREATE TABLE basket(
id SERIAL PRIMARY KEY,
fruit VARCHAR(50) NOT NULL
);
Now let's add some data to the newly created basket table.
INSERT INTO basket(fruit) values('apple');
INSERT INTO basket(fruit) values('apple');
INSERT INTO basket(fruit) values('orange');
INSERT INTO basket(fruit) values('orange');
INSERT INTO basket(fruit) values('orange');
INSERT INTO basket(fruit) values('banana');
Now let's check the contents of the basket table with the following statement:
SELECT * FROM basket;
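This should return the six rows just inserted, with id values 1 through 6: two apple rows, three orange rows, and one banana row.
Now that the sample table is set up, we can find the duplicates with the following query: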
SELECT
fruit,
COUNT( fruit )
FROM
basket
GROUP BY
fruit
HAVING
COUNT( fruit )> 1
ORDER BY
fruit;
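This should return two rows: apple with a count of 2 and orange with a count of 3. Banana appears only once, so the HAVING clause filters it out.
The following statement uses a subquery to delete the duplicate rows and keep the row with the lowest id in each group.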
DELETE FROM basket
WHERE id IN
(SELECT id
FROM
(SELECT id,
ROW_NUMBER() OVER( PARTITION BY fruit
ORDER BY id ) AS row_num
FROM basket ) t
WHERE t.row_num > 1 );
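In this example, the inner query numbers the rows within each fruit group in order of id, and the subquery returns every row except the first one in each group. The outer DELETE statement then deletes the rows returned by the subquery.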
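Before deleting for real, it can help to see exactly which rows the statement would remove. The sketch below is one way to do that, assuming the basket table defined earlier: it runs the same DELETE inside a transaction, uses PostgreSQL's RETURNING clause to list the affected rows, and then rolls back so nothing is actually changed.

```sql
-- Dry run: show the rows the DELETE would remove, then undo it.
BEGIN;

DELETE FROM basket
WHERE id IN
    (SELECT id
     FROM
        (SELECT id,
                ROW_NUMBER() OVER( PARTITION BY fruit
                                   ORDER BY id ) AS row_num
         FROM basket ) t
     WHERE t.row_num > 1 )
RETURNING id, fruit;   -- lists each deleted row

ROLLBACK;              -- replace with COMMIT to keep the deletion
```

Run against the freshly populated sample table, before any rows have been deleted, this should list the rows with ids 2, 4, and 5.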
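If you want to keep the duplicate row with the highest id instead, sort each group in descending order of id in the subquery:
DELETE FROM basket
WHERE id IN
(SELECT id
FROM
(SELECT id,
ROW_NUMBER() OVER( PARTITION BY fruit
ORDER BY id DESC ) AS row_num
FROM basket ) t
WHERE t.row_num > 1 );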
If you want to delete duplicates based on the values of multiple columns, here is a query template:
DELETE FROM table_name
WHERE id IN
(SELECT id
FROM
(SELECT id,
ROW_NUMBER() OVER( PARTITION BY column_1,
column_2
ORDER BY id ) AS row_num
FROM table_name ) t
WHERE t.row_num > 1 );
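As a concrete instance of the template, here is a minimal sketch on a hypothetical table. The basket_items table and its color column are assumptions made up for illustration and are not part of the example above.

```sql
-- Hypothetical table for illustration only.
CREATE TABLE basket_items(
    id SERIAL PRIMARY KEY,
    fruit VARCHAR(50) NOT NULL,
    color VARCHAR(50) NOT NULL
);

INSERT INTO basket_items(fruit, color) VALUES
    ('apple', 'red'),
    ('apple', 'red'),
    ('apple', 'green'),
    ('orange', 'orange'),
    ('orange', 'orange');

-- Rows count as duplicates only when both fruit and color match.
DELETE FROM basket_items
WHERE id IN
    (SELECT id
     FROM
        (SELECT id,
                ROW_NUMBER() OVER( PARTITION BY fruit, color
                                   ORDER BY id ) AS row_num
         FROM basket_items ) t
     WHERE t.row_num > 1 );

SELECT * FROM basket_items ORDER BY id;
```

On a freshly created table, the final SELECT should show ids 1, 3, and 4. The green apple survives because it differs from the red apples in the color column.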
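With this template, the statement deletes every row whose combination of column_1 and column_2 values repeats that of another row, keeping only the row with the lowest id in each group. To verify that no duplicates remain in the basket table, rerun the duplicate check: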
SELECT
fruit,
COUNT( fruit )
FROM
basket
GROUP BY
fruit
HAVING
COUNT( fruit )> 1
ORDER BY
fruit;
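Output: the query should return no rows, which confirms that each fruit now appears only once in the basket table.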