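What is the simplest SQL statement that returns the duplicate values for a given column, together with the count of their occurrences, in an Oracle database table?
For example: I have a JOBS table with the column JOB_NUMBER. How can I find out whether I have any duplicate JOB_NUMBERs, and how many times each one is duplicated?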
```sql
SELECT column_name, COUNT(column_name)
FROM table_name
GROUP BY column_name
HAVING COUNT(column_name) > 1;
```
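Another way:
```sql
SELECT *
FROM table_name a
WHERE EXISTS (
    -- another row with the same value and a smaller ROWID
    SELECT 1 FROM table_name b
    WHERE b.column_name = a.column_name
    AND b.ROWID < a.ROWID
);
```
This works fine (fast enough) when there is an index on column_name, and it is a better approach when you want to delete or update the duplicate rows.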
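To illustrate the delete case, here is a minimal sketch that applies the same EXISTS-on-rowid pattern as a DELETE. It runs against an in-memory SQLite database through Python's `sqlite3` module purely as a stand-in for Oracle; the `jobs` table and its rows are made-up sample data. One assumption to keep in mind: SQLite's `rowid` follows insertion order in this example, whereas Oracle's ROWID is a physical address, so which copy survives on Oracle should be treated as arbitrary.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Oracle (sample data is made up).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (job_number INTEGER)")
conn.executemany(
    "INSERT INTO jobs VALUES (?)",
    [(100,), (101,), (100,), (100,), (102,)],
)

# Delete every row for which another row with the same job_number
# and a smaller rowid exists, leaving one row per job_number.
conn.execute(
    """
    DELETE FROM jobs
    WHERE EXISTS (
        SELECT 1 FROM jobs b
        WHERE b.job_number = jobs.job_number
          AND b.rowid < jobs.rowid
    )
    """
)

remaining = conn.execute(
    "SELECT job_number FROM jobs ORDER BY job_number"
).fetchall()
print(remaining)  # [(100,), (101,), (102,)]
```

On Oracle the same idea would be written with an alias on the deleted table (for example `DELETE FROM jobs a WHERE EXISTS (... b.ROWID < a.ROWID)`); try it inside a transaction first so it can be rolled back.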
The simplest way I can think of is:
```sql
SELECT job_number, COUNT(*)
FROM jobs
GROUP BY job_number
HAVING COUNT(*) > 1;
```
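As a quick way to see what this query returns, the sketch below runs it on a handful of made-up rows. It uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for Oracle; the `id` column and the sample values are assumptions added for the demonstration, and an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, job_number INTEGER)")
conn.executemany(
    "INSERT INTO jobs (job_number) VALUES (?)",
    [(100,), (101,), (100,), (102,), (101,), (100,)],
)

# Group by the column and keep only the groups that occur more than once.
duplicates = conn.execute(
    """
    SELECT job_number, COUNT(*)
    FROM jobs
    GROUP BY job_number
    HAVING COUNT(*) > 1
    ORDER BY job_number
    """
).fetchall()

print(duplicates)  # [(100, 3), (101, 2)]
```

Job number 102 appears only once, so it is filtered out by the HAVING clause.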
If you don't need to know the actual number of duplicates, you don't even need the count in the returned columns. For example:
```sql
SELECT column_name
FROM table_name
GROUP BY column_name
HAVING COUNT(*) > 1;
```
How about:
```sql
SELECT <column>, COUNT(*)
FROM <table>
GROUP BY <column> HAVING COUNT(*) > 1;
```
To answer the example above, it would look like:
```sql
SELECT job_number, COUNT(*)
FROM jobs
GROUP BY job_number HAVING COUNT(*) > 1;
```
If several columns together identify a unique row (for example, in a relationship table), you can use the following, based on ROWID.
Take emp_dept(empid, deptid, startdate, enddate) and suppose that empid and deptid together are meant to be unique and identify the row:
```sql
SELECT oed.empid, COUNT(oed.empid)
FROM emp_dept oed
WHERE EXISTS ( SELECT *
               FROM emp_dept ied
               WHERE oed.rowid <> ied.rowid AND
                     ied.empid = oed.empid AND
                     ied.deptid = oed.deptid )
GROUP BY oed.empid HAVING COUNT(oed.empid) > 1 ORDER BY COUNT(oed.empid);
```
And if the table has a primary key, use the primary key instead of rowid. For example, if id is the primary key:
```sql
SELECT oed.empid, COUNT(oed.empid)
FROM emp_dept oed
WHERE EXISTS ( SELECT *
               FROM emp_dept ied
               WHERE oed.id <> ied.id AND
                     ied.empid = oed.empid AND
                     ied.deptid = oed.deptid )
GROUP BY oed.empid HAVING COUNT(oed.empid) > 1 ORDER BY COUNT(oed.empid);
```
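If the goal is simply to list each duplicated (empid, deptid) pair with its count, grouping by both columns is a shorter alternative that needs neither rowid nor a primary key. The sketch below is one possible way to write it, not part of the answer above; it runs on an in-memory SQLite database via Python's `sqlite3` as a stand-in for Oracle, leaves out the startdate and enddate columns for brevity, and uses invented rows.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp_dept (id INTEGER PRIMARY KEY, empid INTEGER, deptid INTEGER)"
)
conn.executemany(
    "INSERT INTO emp_dept (empid, deptid) VALUES (?, ?)",
    [(1, 10), (1, 20), (1, 10), (2, 10), (2, 10), (2, 10), (3, 30)],
)

# Group by the full composite key so each duplicated pair is reported separately.
pair_duplicates = conn.execute(
    """
    SELECT empid, deptid, COUNT(*)
    FROM emp_dept
    GROUP BY empid, deptid
    HAVING COUNT(*) > 1
    ORDER BY COUNT(*), empid
    """
).fetchall()

print(pair_duplicates)  # [(1, 10, 2), (2, 10, 3)]
```

Employee 1 in department 20 and employee 3 in department 30 each occur once, so they do not appear in the result.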
```sql
SELECT SocialSecurity_Number, COUNT(*) no_of_rows
FROM SocialSecurity
GROUP BY SocialSecurity_Number
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC;
```
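Running
```sql
-- Self-join on job_number: each result row pairs a job with
-- another row that has the same job_number but a different id.
SELECT j1.job_number, j1.id, j2.id AS duplicate_id
FROM jobs j1 JOIN jobs j2 ON (j1.job_number = j2.job_number)
WHERE j1.id != j2.id
ORDER BY j1.job_number, j1.id;
```
will give you the IDs of the duplicated rows.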
I usually use the Oracle analytic function ROW_NUMBER().
Say you want to check for duplicates with respect to a unique index or primary key built on the columns (c1, c2, c3).
You would then go this way, fetching by ROWID the rows for which ROW_NUMBER() returns a line number greater than 1:
```sql
SELECT * FROM Table_With_Duplicates
WHERE ROWID IN
    (SELECT rid
     FROM (SELECT ROWID AS rid,
                  ROW_NUMBER() OVER (
                      PARTITION BY c1 || c2 || c3
                      ORDER BY c1 || c2 || c3
                  ) nbLines
           FROM Table_With_Duplicates) t2
     WHERE nbLines > 1);
```
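One caveat worth checking with a concatenated partition key such as `c1 || c2 || c3`: different column combinations can concatenate to the same string (for example 'a' || 'bc' and 'ab' || 'c'), so rows that are not true duplicates may be flagged. The sketch below compares that form with partitioning by the columns separately. It is only an illustration on an in-memory SQLite database (window functions need SQLite 3.25 or later) used as a stand-in for Oracle; the table `t`, the helper `flagged`, and the rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (c1 TEXT, c2 TEXT, c3 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("a", "bc", "x"), ("ab", "c", "x"), ("k", "k", "k"), ("k", "k", "k")],
)


def flagged(partition_expr):
    """Return the rows numbered > 1 within each partition (hypothetical helper)."""
    sql = f"""
        SELECT c1, c2, c3 FROM (
            SELECT c1, c2, c3,
                   ROW_NUMBER() OVER (
                       PARTITION BY {partition_expr} ORDER BY rowid
                   ) AS nb_lines
            FROM t
        )
        WHERE nb_lines > 1
        ORDER BY c1
    """
    return conn.execute(sql).fetchall()


by_concat = flagged("c1 || c2 || c3")   # key built by string concatenation
by_columns = flagged("c1, c2, c3")      # key built from the separate columns

print(by_concat)   # [('ab', 'c', 'x'), ('k', 'k', 'k')]
print(by_columns)  # [('k', 'k', 'k')]
```

With concatenation, ('ab', 'c', 'x') is reported even though no other row has the same three values; partitioning by `c1, c2, c3` reports only the genuine duplicate.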
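I know this is an old thread, but this may help someone.
If you need to print the table's other columns while checking for duplicates, use the following:
```sql
SELECT * FROM table_name WHERE column_name IN
    (SELECT ing.column_name FROM table_name ing GROUP BY ing.column_name HAVING COUNT(*) > 1)
ORDER BY column_name DESC;
```
You can also add further filters in the WHERE clause if needed.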
Here is an SQL query that does this:
```sql
SELECT column_name, COUNT(1)
FROM table_name
GROUP BY column_name
HAVING COUNT(column_name) > 1;
```
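The answers in this thread use both `COUNT(column_name)` and `COUNT(*)` in the HAVING clause. The two behave differently when the column can be NULL: GROUP BY puts all NULLs into one group, but `COUNT(column_name)` ignores NULL values, so repeated NULLs are only reported with `COUNT(*)`. The sketch below shows the difference on made-up data, using an in-memory SQLite database through Python's `sqlite3` as a stand-in for Oracle; the variable names are illustrative only.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (job_number INTEGER)")
conn.executemany(
    "INSERT INTO jobs VALUES (?)",
    [(100,), (100,), (None,), (None,), (101,)],
)

# Same query twice; only the HAVING condition changes.
query = """
    SELECT job_number, COUNT(*)
    FROM jobs
    GROUP BY job_number
    HAVING {condition}
    ORDER BY job_number
"""
with_count_column = conn.execute(
    query.format(condition="COUNT(job_number) > 1")
).fetchall()
with_count_star = conn.execute(
    query.format(condition="COUNT(*) > 1")
).fetchall()

print(with_count_column)  # [(100, 2)]
print(with_count_star)    # [(None, 2), (100, 2)]
```

Which form is appropriate depends on whether repeated NULLs should count as duplicates for the column in question. Note that SQLite sorts NULLs first here, while Oracle places them last in ascending order by default.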
1. Solution
```sql
-- Every row except the one with the highest ROWID for each empno,
-- i.e. the extra copies.
SELECT * FROM emp
WHERE rowid NOT IN
    (SELECT MAX(rowid) FROM emp GROUP BY empno);
```
You could also try something like this to list all the duplicate values in a table, say reqitem:
```sql
SELECT COUNT(poid)
FROM poitem
WHERE poid = 50
AND rownum < ANY (SELECT COUNT(*) FROM poitem WHERE poid = 50)
GROUP BY poid
MINUS
SELECT COUNT(poid)
FROM poitem
WHERE poid IN (50)
GROUP BY poid
HAVING COUNT(poid) > 1;
```