有没有办法使用R中的ODBC包来构建不使用NEWID()的分层示例?

我有一个客户数据库,我想编写一个查询,根据我的客户的特征向我提供客户的分层样本。我正在使用一个调查猴子使用的公式(代码中的URL),该公式根据z得分,误差幅度和百分比值给我样本数量。我还使用R中的odbc包将此数据直接发送到R。

stratified_sample <- function() {
  library(DBI)
  library(odbc)
  con <- dbConnect(odbc(),Driver = "ODBC_DRIVER",Server = "SERVER",Database = "DATABASE",Authentication = "Authentication")

dbGetQuery(con,"--Parameters for sample size calc
DeclARE @ZScore DECIMAL(4,3) = 1.96,@MOE DECIMAL(4,3) = 0.05,@PctVal DECIMAL(4,3) = 0.5;

--All customers meeting criteria
IF OBJECT_ID('tempdb..#CustPop') IS NOT NULL
    DROP TABLE #CustPop;
SELECT DISTINCT
    c.CustomerIDNumber,c.Gender,c.Generation,c.CharactertisticA,c.CharactertisticB,c.CharactertisticC
INTO #CustPop
FROM [DATABASE].TableC C")

dbGetQuery(con,"--Find sample sizes by demographic combination
IF OBJECT_ID('tempdb..#SampleSizes') IS NOT NULL
    DROP TABLE #SampleSizes;
SELECT c.Gender,c.CharactertisticC,COUNT(DISTINCT c.CustomerIDNumber) AS DistinctCust,--Formula courtesy of SurveyMonkey: https://www.surveymonkey.com/mp/sample-size-calculator/
       ROUND(
                ((SQUARE(@ZScore) * (@PctVal * (1 - @PctVal))) / SQUARE(@MOE))
                / (1
                   + ((SQUARE(@ZScore) * (@PctVal * (1 - @PctVal)))
                      / (SQUARE(@MOE) * COUNT(DISTINCT c.CustomerIDNumber))
                     )
                  ),0
            ) AS SampleSize
INTO #SampleSizes
FROM #CustPop c
GROUP BY c.Gender,c.CharactertisticC;")

dbGetQuery(con,"--Assign row numbers randomly for each customer         
IF OBJECT_ID('tempdb..#CustRowNum') IS NOT NULL
    DROP TABLE #CustRowNum;
SELECT c.CustomerIDNumber,ROW_NUMber() OVER (PARTITION BY c.Gender,c.CharactertisticC
                          ORDER BY NEWID()
                         ) AS RowNum,c.CharactertisticC
INTO #CustRowNum
FROM #CustPop c;")

dbGetQuery(con,"--Find only customers with RowNum <= SampleSize
IF OBJECT_ID('tempdb..#CustSample') IS NOT NULL
    DROP TABLE #CustSample;
SELECT DISTINCT
    c.CustomerIDNumber
INTO #CustSample
FROM #CustRowNum c
    JOIN #SampleSizes s
        ON s.CharactertisticA= c.CharactertisticA
           AND s.Gender = c.Gender
           AND s.Generation = c.Generation
           AND s.CharactertisticB= c.CharactertisticB
           AND s.CharactertisticC = c.CharactertisticC
WHERE c.RowNum <= s.SampleSize;")

get_query_data <- dbGetQuery(con,"select * from #CustSample")

return(get_query_data)
}

问题是,当我尝试ORDER BY NEWID()时,我的函数在倒数第二个子句上失败,因为它不是公认的内置函数名。也许我需要避免在order by子句中使用NEWID。我以前使用过RODBC包,但从未遇到过此问题。还有其他人遇到过这个问题吗?

谢谢

cuiqiongling8888 回答:有没有办法使用R中的ODBC包来构建不使用NEWID()的分层示例?

暂时没有好的解决方案,如果你有好的解决方案,请发邮件至:iooj@foxmail.com
本文链接:https://www.f2er.com/3107238.html

大家都在问