Structural topic modeling (stm package) fails on a Linux cluster

I am trying to fit a topic model to a corpus of more than 130,000 documents using the stm package on a Linux cluster.

Running the code locally (on macOS and Windows) works fine and produces the desired results.

stm_model <- stm(documents = out$documents, vocab = out$vocab, K = 75, prevalence = ~ s(year), max.em.its = 75, data = out$meta, init.type = "Spectral")

On the cluster, however, I keep getting the following error message:

stm v1.3.4 successfully loaded. See ?stm for help. 
 Papers, resources, and other materials at structuraltopicmodel.com
Beginning Spectral Initialization 
     Calculating the gram matrix...
     Using only 10000 most frequent terms during initialization...
     Finding anchor words...
    ...........................................................................
     Recovering initialization...
    ....................................................................................................
Initialization complete.

 *** caught illegal operation ***
address 0x7f2cce823e07, cause 'illegal operand'

Traceback:
 1: fn(par, ...)
 2: (function (par) fn(par, ...))(c(0, 0))
 3: optim(par = eta, fn = lhoodcpp, gr = gradcpp, method = method, control = control, doc_ct = doc.ct, mu = mu, siginv = siginv, beta = beta)
 4: logisticnormalcpp(eta = init, mu = mu.i, beta = beta.i, doc = doc, sigmaentropy = sigmaentropy)
 5: estep(documents = documents, beta.index = betaindex, update.mu = (!is.null(mu$gamma)), beta$beta, lambda, mu$mu, sigma, verbose)
 6: stm.control(documents, vocab, settings, model)
 7: stm(documents = out$documents, prevalence = ~s(year), init.type = "Spectral")
An irrecoverable exception occurred. R is aborting now ...

Session info on the cluster:

R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /apps/sandybridge/software/OpenBLAS/0.2.20-GCC-6.4.0-2.28/lib/libopenblas_sandybridgep-r0.2.20.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] stm_1.3.4       tm_0.7-6        NLP_0.2-0       qdapRegex_0.7.2
 [5] textclean_0.9.3 tidytext_0.2.2  janitor_1.2.0   forcats_0.4.0  
 [9] stringr_1.4.0   dplyr_0.8.3     purrr_0.3.2     readr_1.3.1    
[13] tidyr_0.8.3     tibble_2.1.3    ggplot2_3.2.0   tidyverse_1.2.1

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.1        pillar_1.4.2      compiler_3.6.1    cellranger_1.1.0 
 [5] tokenizers_0.2.1  tools_3.6.1       jsonlite_1.6      lubridate_1.7.4  
 [9] gtable_0.3.0      nlme_3.1-140      lattice_0.20-38   pkgconfig_2.0.2  
[13] rlang_0.4.0       Matrix_1.2-17     cli_1.1.0         rstudioapi_0.10  
[17] parallel_3.6.1    haven_2.1.1       janeaustenr_0.1.5 withr_2.1.2      
[21] xml2_1.2.0        httr_1.4.0        generics_0.0.2    hms_0.4.2        
[25] grid_3.6.1        tidyselect_0.2.5  data.table_1.12.6 glue_1.3.1       
[29] R6_2.4.0          readxl_1.3.1      modelr_0.1.4      magrittr_1.5     
[33] SnowballC_0.6.0   backports_1.1.4   scales_1.0.0      assertthat_0.2.1 
[37] rvest_0.3.4       colorspace_1.4-1  stringi_1.4.3     lazyeval_0.2.2   
[41] munsell_0.5.0     slam_0.1-45       broom_0.5.2       crayon_1.3.4   

Do you have any idea what might be causing this?
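
The post does not identify a cause. An "illegal operation" signal generally means the CPU hit an instruction it does not support, and on a cluster with mixed hardware that can happen when compiled code (a package's C++ code or the BLAS library) was built on or for a different CPU generation than the node running the job. Whether that applies here is an assumption, not something the output above confirms. As a minimal sketch for checking it, the R snippet below lists which SIMD extensions a Linux node advertises in /proc/cpuinfo. The helper names cpu_flags and check_simd are hypothetical, and the list of extensions is only an illustrative selection.

```r
# Hypothetical helper: pull the CPU feature flags out of the lines of
# /proc/cpuinfo (Linux only). Returns a character vector of flag names.
cpu_flags <- function(cpuinfo_lines) {
  flag_lines <- grep("^flags\\s*:", cpuinfo_lines, value = TRUE)
  if (length(flag_lines) == 0) return(character(0))
  # All cores normally report the same flags, so the first entry is enough.
  flags <- sub("^flags\\s*:\\s*", "", flag_lines[1])
  strsplit(trimws(flags), "\\s+")[[1]]
}

# Hypothetical helper: report which of a few SIMD extensions are present.
# The default set is an illustrative choice, not an exhaustive list.
check_simd <- function(flags,
                       wanted = c("sse4_2", "avx", "avx2", "fma", "avx512f")) {
  setNames(wanted %in% flags, wanted)
}

# Run on the current node if the file exists (it will not on macOS/Windows).
if (file.exists("/proc/cpuinfo")) {
  flags <- cpu_flags(readLines("/proc/cpuinfo"))
  print(check_simd(flags))
}
```

Running this both on the node where the packages were installed and on the node where the job crashes, then comparing the two results, would show whether the nodes differ in the instruction sets they support.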
