PostgreSQL – Deleting Duplicate Rows Using a Subquery

Last modified: 2022-05-13 01:57:16.015000             Author: Mango

PostgreSQL offers several techniques for deleting duplicate rows. Using a subquery is one of them.

For demonstration purposes, let's set up a sample table named basket that stores fruits, as follows:

CREATE TABLE basket(
    id SERIAL PRIMARY KEY,
    fruit VARCHAR(50) NOT NULL
);

Now let's add some data to the newly created basket table.

INSERT INTO basket(fruit) values('apple');
INSERT INTO basket(fruit) values('apple');

INSERT INTO basket(fruit) values('orange');
INSERT INTO basket(fruit) values('orange');
INSERT INTO basket(fruit) values('orange');

INSERT INTO basket(fruit) values('banana');

Now let's check the contents of the basket table with the following statement:

SELECT * FROM basket;

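On a freshly created table this should return the six inserted rows (row order is not guaranteed without an ORDER BY clause):

 id | fruit
----+--------
  1 | apple
  2 | apple
  3 | orange
  4 | orange
  5 | orange
  6 | banana
(6 rows)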

Now that the sample table is set up, we can find the duplicated values with the following query:

SELECT
    fruit,
    COUNT( fruit )
FROM
    basket
GROUP BY
    fruit
HAVING
    COUNT( fruit )> 1
ORDER BY
    fruit;

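With the sample data, this should return the two fruits that occur more than once:

 fruit  | count
--------+-------
 apple  |     2
 orange |     3
(2 rows)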
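
The GROUP BY query reports which values are duplicated, but not which individual rows carry them. As a minimal sketch, a window count over the same basket table can list the affected rows together with their ids. The alias names t and copies are chosen here for illustration and are not part of the original article.

```sql
-- List every row whose fruit value occurs more than once.
SELECT id, fruit
FROM (
    SELECT id,
           fruit,
           COUNT(*) OVER (PARTITION BY fruit) AS copies  -- rows sharing this fruit
    FROM basket
) t
WHERE t.copies > 1
ORDER BY fruit, id;
```

Assuming the ids were assigned 1 through 6 in insertion order, this would list the two apple rows and the three orange rows and leave out banana.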

The following statement uses a subquery to delete the duplicate rows, keeping the row with the lowest id in each group.

DELETE FROM basket
WHERE id IN
    (SELECT id
    FROM
        (SELECT id,
                ROW_NUMBER() OVER (PARTITION BY fruit
                                   ORDER BY id) AS row_num
         FROM basket) t
    WHERE t.row_num > 1);

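In this example, the subquery returns every duplicate row except the first row of each duplicate group, and the outer DELETE statement removes the rows that the subquery returns.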
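
Before running a DELETE of this kind, it can help to preview which rows the subquery selects. The sketch below runs the same ROW_NUMBER() subquery as a plain SELECT against the basket table. The extra fruit and row_num output columns are added here only to make the preview easier to read.

```sql
-- Preview the rows that the keep-lowest-id DELETE would remove.
SELECT id, fruit, row_num
FROM (
    SELECT id,
           fruit,
           ROW_NUMBER() OVER (PARTITION BY fruit ORDER BY id) AS row_num
    FROM basket
) t
WHERE t.row_num > 1   -- row_num = 1 is the row that is kept
ORDER BY id;
```

Against the original six sample rows, assuming ids 1 through 6 in insertion order, this would list ids 2, 4 and 5.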

To keep the duplicate row with the highest id instead, change the sort order in the subquery to descending:

DELETE FROM basket
WHERE id IN
    (SELECT id
    FROM
        (SELECT id,
                -- DESC ranks the highest id first, so that row is kept
                ROW_NUMBER() OVER (PARTITION BY fruit
                                   ORDER BY id DESC) AS row_num
         FROM basket) t
    WHERE t.row_num > 1);
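
Because a DELETE cannot be undone once committed, one cautious option is to run it inside a transaction and inspect the outcome first. The sketch below is one way to do that for the keep-highest-id variant, using PostgreSQL's RETURNING clause to report the deleted rows. Wrapping the statement in a transaction is a suggestion added here, not a step from the original article.

```sql
BEGIN;

-- Delete duplicates, keeping the highest id per fruit, and report what was removed.
DELETE FROM basket
WHERE id IN (
    SELECT id
    FROM (
        SELECT id,
               ROW_NUMBER() OVER (PARTITION BY fruit ORDER BY id DESC) AS row_num
        FROM basket
    ) t
    WHERE t.row_num > 1
)
RETURNING id, fruit;

-- Inspect what is left before deciding.
SELECT * FROM basket ORDER BY id;

ROLLBACK;  -- replace with COMMIT once the result looks right
```

If this is run against the original six sample rows (ids 1 through 6), the DELETE should report ids 1, 3 and 4 as removed. If duplicates were already deleted by an earlier statement, it removes nothing.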

If you want to delete duplicates based on the values of multiple columns, here is a query template:

DELETE FROM table_name
WHERE id IN
    (SELECT id
    FROM
        (SELECT id,
                ROW_NUMBER() OVER (PARTITION BY column_1,
                                                column_2
                                   ORDER BY id) AS row_num
         FROM table_name) t
    WHERE t.row_num > 1);

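In this case, the statement deletes every row whose combination of column_1 and column_2 values duplicates that of another row, keeping only the row with the lowest id in each group. To verify that the basket table no longer contains duplicates, use the following query: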

SELECT
    fruit,
    COUNT( fruit )
FROM
    basket
GROUP BY
    fruit
HAVING
    COUNT( fruit )> 1
ORDER BY
    fruit;

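Output: once the duplicates have been deleted, no fruit occurs more than once, so the query should return an empty result:

 fruit | count
-------+-------
(0 rows)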
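
As a concrete instance of the multi-column template, the sketch below uses a hypothetical temporary table basket_items with fruit and color columns. The table, its columns, and its data are invented here for illustration and do not appear in the original article.

```sql
-- Hypothetical table: a row counts as a duplicate only if both fruit and color match.
CREATE TEMP TABLE basket_items (
    id SERIAL PRIMARY KEY,
    fruit VARCHAR(50) NOT NULL,
    color VARCHAR(20) NOT NULL
);

INSERT INTO basket_items (fruit, color) VALUES
    ('apple',  'red'),
    ('apple',  'red'),      -- duplicate of the first row
    ('apple',  'green'),    -- same fruit, different color: not a duplicate
    ('orange', 'orange'),
    ('orange', 'orange');   -- duplicate of the fourth row

-- Keep the lowest id for each (fruit, color) pair.
DELETE FROM basket_items
WHERE id IN (
    SELECT id
    FROM (
        SELECT id,
               ROW_NUMBER() OVER (PARTITION BY fruit, color ORDER BY id) AS row_num
        FROM basket_items
    ) t
    WHERE t.row_num > 1
);

SELECT id, fruit, color FROM basket_items ORDER BY id;
```

With ids assigned 1 through 5 in insertion order, the final SELECT should show three remaining rows: the red apple (id 1), the green apple (id 3), and the orange (id 4).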