I am training my NER model with the following code.

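Code start:

import random

import spacy
from pynvml import (
    nvmlDeviceGetCount,
    nvmlDeviceGetHandleByIndex,
    nvmlDeviceGetUtilizationRates,
    nvmlInit,
)
from spacy.training import Example
from spacy.util import compounding, minibatch


def train_spacy(nlp, training_data, iterations):
    # Reuse the NER component if the pipeline already has one; otherwise add it.
    if "ner" not in nlp.pipe_names:
        ner = nlp.add_pipe("ner", last=True)
    else:
        ner = nlp.get_pipe("ner")

    training_examples = []
    faulty_dataset = []

    for text, annotations in training_data:
        doc = nlp.make_doc(text)
        try:
            # Create Example objects for training, as spaCy v3 requires.
            training_examples.append(Example.from_dict(doc, annotations))
        except Exception:
            # Keep records that could not be converted for later inspection.
            faulty_dataset.append([doc, annotations])
        for ent in annotations["entities"]:
            ner.add_label(ent[2])

    other_pipes = [pipe for pipe in nlp.pipe_names if pipe != "ner"]

    nvmlInit()  # NVML must be initialised before any device query
    device_count = nvmlDeviceGetCount()

    with nlp.select_pipes(disable=other_pipes):
        # In spaCy v3, initialize() replaces the deprecated begin_training().
        optimizer = nlp.initialize(lambda: training_examples)

        for iteration in range(iterations):
            print("Starting iteration: " + str(iteration))
            random.shuffle(training_examples)
            losses = {}
            batches = minibatch(training_examples, size=compounding(4.0, 32.0, 1.001))
            for batch in batches:
                nlp.update(batch, drop=0.2, sgd=optimizer, losses=losses)
            print(losses)

            # Print the utilization (%) of each GPU to see how many are in use.
            for i in range(device_count):
                handle = nvmlDeviceGetHandleByIndex(i)
                util = nvmlDeviceGetUtilizationRates(handle)
                print(util.gpu)

    return nlp, faulty_dataset, training_examples


spacy.require_gpu()  # this returns True

nlp = spacy.blank("en")
word_vectors = "w2v_model.txt"
model_name = "nlp"
# I have some trained word vectors that I try to load here
# (load_word_vectors is a helper that is not shown in this post).
load_word_vectors(model_name, word_vectors)

# Train for 30 iterations; training_data is my list of (text, annotations) pairs.
test = train_spacy(nlp, training_data, 30)

Code end.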

Problem:

Each iteration takes about 30 minutes. I have 8,000 training records, which include very long texts, and 6 labels.
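
To put the per-iteration time in context, it may help to estimate how many nlp.update calls a single pass makes. The sketch below is a pure-Python approximation of the batch schedule used in the training loop above (sizes start at 4, are multiplied by 1.001 after each batch, and are capped at 32). The names compounding_sizes and batch_sizes are hypothetical, this is not spaCy's own implementation, and it assumes the schedule restarts on every pass and that all 8,000 records convert to examples.

```python
from typing import Iterator, List


def compounding_sizes(start: float, stop: float, compound: float) -> Iterator[float]:
    """Yield start, start*compound, start*compound**2, ... capped at stop."""
    value = start
    while True:
        yield min(value, stop)
        value *= compound


def batch_sizes(n_items: int, schedule: Iterator[float]) -> List[int]:
    """Return the size of each batch when n_items are consumed by the schedule."""
    sizes = []
    remaining = n_items
    for size in schedule:
        if remaining <= 0:
            break
        # Truncate the float size to an int; never take fewer than one item.
        take = min(max(1, int(size)), remaining)
        sizes.append(take)
        remaining -= take
    return sizes


if __name__ == "__main__":
    sizes = batch_sizes(8000, compounding_sizes(4.0, 32.0, 1.001))
    # Number of updates per pass, first batch size, largest batch size.
    print(len(sizes), sizes[0], max(sizes))
```

Because the growth factor is so close to 1, most batches in a pass stay small, so a pass consists of many small updates rather than a few large ones.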

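So I was hoping to cut that down by using more GPUs, but it seems only one is being used: when I run print(util.gpu) in the code above, only the first device returns a non-zero value.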
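
The utilization check described above can be pulled out into a small standalone helper, sketched here under some assumptions: the pynvml package is installed, and the names read_gpu_utilization and busy_devices are hypothetical ones chosen for illustration. NVML reports one utilization figure per GPU device, so the list has one entry per device.

```python
from typing import List


def read_gpu_utilization() -> List[int]:
    """Return the utilization (%) of every NVIDIA device visible to NVML."""
    # Imported lazily so that busy_devices can be used without pynvml installed.
    import pynvml

    pynvml.nvmlInit()
    try:
        readings = []
        for index in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(index)
            readings.append(pynvml.nvmlDeviceGetUtilizationRates(handle).gpu)
        return readings
    finally:
        pynvml.nvmlShutdown()


def busy_devices(utilization: List[int], threshold: int = 0) -> List[int]:
    """Return the indices of devices whose utilization exceeds threshold."""
    return [index for index, value in enumerate(utilization) if value > threshold]
```

Calling busy_devices(read_gpu_utilization()) between batches would list the devices doing work at that moment. NVML samples utilization over a short window, so a single reading of zero does not by itself prove a device is unused.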

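Question 1: Is there any way to use more GPUs during training to make it faster? I would appreciate any leads.

After more research, it seems that spacy-ray is intended to enable parallel training. However, I cannot find any documentation on using Ray with nlp.update; everything I found is about running "python -m spacy ray train config.cfg --n-workers 2".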

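Question 2: Does Ray support parallel processing on GPUs, or does it only work with CPU cores?
Question 3: How can I integrate Ray into my Python code that uses nlp.update, instead of running "python -m spacy ray train config.cfg --n-workers 2"?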
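
For reference, the command quoted above can at least be launched from a notebook cell with the standard library. This is only a sketch of a possible workaround, not an integration of Ray with nlp.update: it assumes spacy-ray is installed and that a config.cfg describing the pipeline and data already exists, and the function names build_ray_train_command and run_ray_train are hypothetical.

```python
import subprocess
import sys
from typing import List


def build_ray_train_command(config_path: str, n_workers: int) -> List[str]:
    """Build the argument list for the spacy-ray training command."""
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    # sys.executable makes the command use the notebook's own Python environment.
    return [
        sys.executable, "-m", "spacy", "ray", "train",
        config_path, "--n-workers", str(n_workers),
    ]


def run_ray_train(config_path: str, n_workers: int) -> int:
    """Run the training command and return its exit code."""
    completed = subprocess.run(build_ray_train_command(config_path, n_workers))
    return completed.returncode
```

A call such as run_ray_train("config.cfg", 2) would then start the same job as the command line, with training driven by the config file rather than by a hand-written nlp.update loop.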

Thanks!

Environment:

All of the code above runs in a conda_python3 notebook on AWS SageMaker, on an ml.p3.2xlarge EC2 instance.
Python version: 3
spaCy version: 3.0.6
