Training spaCy NER with the en_trf_bertbaseuncased_lg model

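I am currently working on an NER project and want to improve NER performance by trying the new spaCy model en_trf_bertbaseuncased_lg, but it gives me the error KeyError: "[E001] No component 'trf_tok2vec' found in pipeline. Available names: ['ner']". Does spaCy currently not support NER with this language model? Thanks!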

    import random
    from pathlib import Path

    from spacy.util import minibatch, compounding
    from tqdm import tqdm

    # nlp, optimizer, n_iter, train_data_list and output_dir are defined earlier (not shown)
    # get names of other pipes to disable them during training
    other_pipes = [pipe for pipe in nlp.pipe_names if pipe != 'ner']
    with nlp.disable_pipes(*other_pipes):  # only train NER
        for itn in tqdm(range(n_iter)):
            random.shuffle(train_data_list)
            losses = {}
            # batch up the examples using spaCy's minibatch
            batches = minibatch(train_data_list, size=compounding(8., 64., 1.001))
            for batch in batches:
                texts, annotations = zip(*batch)
                nlp.update(texts, annotations, sgd=optimizer, drop=0.35, losses=losses)
            tqdm.write('Iter: ' + str(itn + 1) + ' Losses: ' + str(losses['ner']))
            # save a checkpoint at two fixed iterations
            if itn in (30, 40):
                output_dir = Path(output_dir)
                output_dir.mkdir(parents=True, exist_ok=True)
                nlp.to_disk(output_dir)

The error is raised on this line:

    nlp.update(texts, annotations, sgd=optimizer, drop=0.35, losses=losses)

mengfanxiang123 answered: Training spaCy NER with the en_trf_bertbaseuncased_lg model

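According to the spaCy documentation for this model, it does not yet support named entity recognition. It only ships with the following pipeline components: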

  • sentencizer
  • trf_wordpiecer
  • trf_tok2vec

You can list the pipeline components available in a given model like this:

>>> import spacy

>>> nlp = spacy.load("en_trf_bertbaseuncased_lg")
>>> nlp.pipe_names
['sentencizer', 'trf_wordpiecer', 'trf_tok2vec']
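
The same check can be done before training starts, so that a missing component shows up as a clear message rather than as a KeyError in the middle of the loop. The following is only a minimal sketch in plain Python, so it runs without the model installed; `missing_components` is a hypothetical helper written for illustration and is not part of spaCy, and the component names are simply the ones listed above.

```python
from typing import Iterable, List


def missing_components(pipe_names: Iterable[str], required: Iterable[str]) -> List[str]:
    """Return the names in `required` that are absent from `pipe_names`, in order."""
    available = set(pipe_names)
    return [name for name in required if name not in available]


# Component names as listed above for en_trf_bertbaseuncased_lg.
model_pipes = ["sentencizer", "trf_wordpiecer", "trf_tok2vec"]

# The training loop in the question expects an 'ner' component,
# which is not among the components the model ships with.
print(missing_components(model_pipes, ["ner"]))          # ['ner']
print(missing_components(model_pipes, ["trf_tok2vec"]))  # []
```

With the model installed, the list returned by `nlp.pipe_names` can be passed in place of `model_pipes`.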