首先,我们将获得重复的列名列表。然后,我们将其重新连接到ColumnsV上,并在这些列上获取所需的任何信息。
with cols as (
select
columnname,count (*) as cnt
from
dbc.columnsv
group by columnname
having count (*) > 1)
select
columnsv.*
from
dbc.columnsv
inner join cols
on columnsv.columnname = cols.columnname
,
Andrews查询简化:
SELECT DatabaseName,TableName,ColumnName,Count(*) Over (PARTITION BY ColumnName) AS Cnt
FROM dbc.ColumnsV
QUALIFY Cnt > 1 -- only repeated columns
我认为这足以处理数据,但是如果您真的想要成对的表,则需要进行自我联接:
WITH cte AS
(
SELECT DatabaseName,Count(*) Over (PARTITION BY ColumnName) AS Cnt
FROM dbc.ColumnsV
WHERE databasename = 'open_data'
QUALIFY Cnt > 1 -- only repeated columns
)
SELECT *
FROM cte AS t1
JOIN cte AS t2
ON t1.ColumnName = t2.ColumnName -- same column
WHERE t1.DatabaseName || '.' || t1.TableName < t2.DatabaseName || '.' || t2.TableName
当然,这将大大增加行数,它返回每个表名一次,因此n*(n-1)/2
行的n
行具有相同的列名。
如果将条件更改为<>
而不是<
,则会得到所有组合和两倍的行数,即table1,table2
和table2,table1
。
本文链接:https://www.f2er.com/3149459.html