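Find duplicates in a DataTable, then compare the duplicates

I have a DataTable that contains duplicate rows. I need to find the duplicates and compare the duplicate rows to get the best value for certain columns.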

DataTable dt = new DataTable();

dt.Rows.Add(1,"Test1","584",12);
dt.Rows.Add(2,"Test2","32",123);
dt.Rows.Add(3,"Test3","425",54);
dt.Rows.Add(4,"Test1","4",755);
dt.Rows.Add(5,"Test5","854",879);
dt.Rows.Add(6,"Test2","1",null);
dt.Rows.Add(7,"Test2","999",3);

Notice that Test1 and Test2 have duplicates.

(1,"Test1","584",12)
(4,"Test1","4",755)

(2,"Test2","32",123)
(6,"Test2","1",null)
(7,"Test2","999",3)

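Now that I have the duplicates, I need to merge each group into a single row that holds the best values. The new DataTable should show:

Test1 = "Test1","584",755
Test2 = "Test2","999",123
Test3 = "Test3","425",54
Test5 = "Test5","854",879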
nothingzz answered: Find duplicates in a DataTable, then compare the duplicates

// GroupBy(x => x[1]): group by the second column
// Where(x => x.Count() > 1): keep only groups with more than one row, i.e. the duplicates
var duplicates = dt.Rows.OfType<DataRow>()
    .GroupBy(x => x[1])
    .Where(x => x.Count() > 1)
    .ToList();

// enumerate all duplicate groups
foreach (var duplicate in duplicates)
{
    // enumerate each row of the duplicate group
    foreach (var dataRow in duplicate)
    {
        // do something…
        // I don't know your rules for why one row is better than another, so you have to figure that part out yourself or extend your question
    }
}

Maybe this is what you are looking for:

DataTable dt = new DataTable();
dt.Columns.Add("Id",typeof(int));
dt.Columns.Add("Text",typeof(string));
dt.Columns.Add("Value1",typeof(string));
dt.Columns.Add("Value2",typeof(int));

dt.Rows.Add(1,"Test1","584",12);
dt.Rows.Add(2,"Test2","32",123);
dt.Rows.Add(3,"Test3","425",54);
dt.Rows.Add(4,"Test1","4",755);
dt.Rows.Add(5,"Test5","854",879);
dt.Rows.Add(6,"Test2","1",null);
dt.Rows.Add(7,"Test2","999",3);

var duplicates = dt.Rows.OfType<DataRow>().GroupBy(x => x[1]).Where(x => x.Count() > 1).ToList();

// get the current highest id (first column) so that, when we remove the duplicates and add a new row, the new row gets the next available id
var highestId = dt.Rows.OfType<DataRow>().Max(x => (int)x[0]);

//enumerate all duplicates
foreach (var duplicate in duplicates)
{
    // get the highest value of each column
    var newId = ++highestId;
    var newText = duplicate.Key;
    var newValue1 = duplicate.Max(x => x[2]); // this is a string comparison, not a numeric one, so for example "2" is bigger than "10"

    // use this instead if you need a numeric comparison (it is not used by the Rows.Add call below)
    var newValue1AsNumeric = duplicate.Select(x =>
    {
        if (int.TryParse(Convert.ToString(x[2]), out var value))
            return value;

        return (int?)null;
    }).OfType<int>().Max();

    // DBNull entries are filtered out by OfType<int>()
    var newValue2 = duplicate.Select(x => x[3]).OfType<int>().Max();

    // remove every row of the duplicate group
    foreach (var dataRow in duplicate)
        dt.Rows.Remove(dataRow);

    // add the single merged row
    dt.Rows.Add(newId, newText, newValue1, newValue2);
}

You can see the code running here: https://dotnetfiddle.net/rp1DUc
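The comment on the newValue1 line above warns that taking the maximum of string values compares them as text. A minimal sketch of that pitfall, written as C# top-level statements (C# 9 or later); the array and variable names here are made up for illustration and do not come from the answer:

```csharp
using System;
using System.Linq;

// Hypothetical sample values stored as strings, like the Value1 column.
var values = new[] { "2", "10" };

// String comparison: "2" sorts after "10" because '2' comes after '1'.
var maxAsString = values.Max();

// Numeric comparison: parse each value first, then take the maximum.
var maxAsNumber = values.Max(v => int.Parse(v));

Console.WriteLine($"as string: {maxAsString}, as number: {maxAsNumber}");
```

For the sample data in the question both comparisons happen to pick the same rows, so the difference only shows up with values such as these.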


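1. Use DataTable.AsEnumerable() (LINQ).
2. Then use GroupBy() (LINQ).
3. Filter the records.
4. Process them.
5. Create a new DataTable, or remove the unneeded records from the same DataTable.
6. Done.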
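The steps above can be sketched as follows. This is only one possible reading of them, not the answerer's code. It assumes the columns are named Id, Text, Value1 and Value2 (borrowed from the earlier answer), that rows 4, 6 and 7 belong to Test1, Test2 and Test2 as the question's duplicate groups imply, that "best" means the numeric maximum, and that null or unparsable values are ignored. It is written as C# top-level statements (C# 9 or later) and builds a new table instead of modifying the original.

```csharp
using System;
using System.Data;
using System.Linq;

// Sample table; the column names are assumptions for illustration.
var dt = new DataTable();
dt.Columns.Add("Id", typeof(int));
dt.Columns.Add("Text", typeof(string));
dt.Columns.Add("Value1", typeof(string));
dt.Columns.Add("Value2", typeof(int));

dt.Rows.Add(1, "Test1", "584", 12);
dt.Rows.Add(2, "Test2", "32", 123);
dt.Rows.Add(3, "Test3", "425", 54);
dt.Rows.Add(4, "Test1", "4", 755);
dt.Rows.Add(5, "Test5", "854", 879);
dt.Rows.Add(6, "Test2", "1", null);
dt.Rows.Add(7, "Test2", "999", 3);

// Clone() copies the schema (columns) but none of the rows.
var result = dt.Clone();
var nextId = 1;

// Steps 1-2: enumerate the rows and group them by the Text column.
foreach (var g in dt.AsEnumerable().GroupBy(r => r.Field<string>("Text")))
{
    // Steps 3-4: per group, keep the numeric maximum of each value column.
    // Strings that do not parse as integers are skipped (an assumption).
    int? best1 = g
        .Select(r => int.TryParse(r.Field<string>("Value1"), out var n) ? n : (int?)null)
        .Max();

    // Field<int?> maps DBNull to null, and Max ignores nulls.
    int? best2 = g.Max(r => r.Field<int?>("Value2"));

    // Step 5: write one merged row per group into the new table.
    result.Rows.Add(nextId++, g.Key, best1?.ToString(), (object)best2 ?? DBNull.Value);
}

foreach (DataRow row in result.Rows)
    Console.WriteLine(string.Join(", ", row.ItemArray));
```

With the sample rows this prints four lines: `1, Test1, 584, 755`, `2, Test2, 999, 123`, `3, Test3, 425, 54` and `4, Test5, 854, 879`. Groups with a single row pass through unchanged, so no separate filter for "count greater than 1" is needed when the goal is a fresh table.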


I named your columns to make things simpler:

DataTable dt = new DataTable();
dt.Columns.Add("id",Type.GetType("System.Int32"));
dt.Columns.Add("group",Type.GetType("System.String"));
dt.Columns.Add("first",Type.GetType("System.String"));
dt.Columns.Add("second",Type.GetType("System.Int32"));

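dt.Rows.Add(1,"Test1","584",12);
dt.Rows.Add(2,"Test2","32",123);
dt.Rows.Add(3,"Test3","425",54);
dt.Rows.Add(4,"Test1","4",755);
dt.Rows.Add(5,"Test5","854",879);
dt.Rows.Add(6,"Test2","1",null);
dt.Rows.Add(7,"Test2","999",3);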

Then you can use LINQ to group the rows and find the maximum values (assuming that is what "best" means):

var group = dt.AsEnumerable()
    .GroupBy(row => row.Field<string>("group"))
    .Select(g => new
    {
        group = g.Key,
        first = g.Max(row => int.Parse(row.Field<string>("first"))).ToString(),
        second = g.Max(row => row.Field<int?>("second") ?? 0)
    })
    .ToList();

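This gives you a list that matches the desired output. I assumed that null values should be treated as 0. You can then put these values back into the original DataTable: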

dt.Clear();
var rowCount = 1;
foreach (var x in group)
    dt.Rows.Add(rowCount++,x.group,x.first,x.second);

Article link: https://www.f2er.com/3110857.html