I'm trying to use spaCy and instantiate a Doc object with its constructor:
words = ["hello", "world", "!"]
spaces = [True, False, False]
doc = Doc(nlp.vocab, words=words, spaces=spaces)
But when I do this and then try to use the dependency parser:
for chunk in doc.noun_chunks:
    print(chunk.text, chunk.root.text, chunk.root.dep_, chunk.root.head.text)
I get this error:

ValueError: [E029] noun_chunks requires the dependency parse, which requires a statistical model to be installed and loaded. For more info, see the documentation:
This does not happen when I use nlp("Hello world!") instead.
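My understanding is that nlp("Hello world!") tokenizes and then runs every pipeline component (tagger, parser, etc.), while the Doc constructor only builds tokens, so the parse is missing. A minimal sketch of applying the loaded pipeline components by hand to a manually built Doc (shown with a blank pipeline here as a stand-in; with a trained model such as en_core_web_sm the loop would add the tag and parse annotations that noun_chunks needs):

```python
import spacy
from spacy.tokens import Doc

# Assumption: a blank pipeline stands in for a trained model; with
# e.g. en_core_web_sm loaded, nlp.pipeline would contain the tagger
# and parser components that noun_chunks requires.
nlp = spacy.blank("en")

words = ["hello", "world", "!"]
spaces = [True, False, False]
doc = Doc(nlp.vocab, words=words, spaces=spaces)

# Apply every pipeline component (everything except the tokenizer)
# to the hand-built Doc; each component returns the annotated Doc.
for name, proc in nlp.pipeline:
    doc = proc(doc)

print(doc.text)  # hello world!
```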
The reason I'm doing it this way is that I'm using tokens and entities extracted by a third-party application, and I want to feed that tokenization and those entities into spaCy. Something like:
## Convert tokens
words, spaces = convert_to_spacy2(tokens_)

## Creating a new document with the text
doc = Doc(nlp.vocab, words=words, spaces=spaces)

## Loading entities into the spaCy document
entities = []
for s in myEntities:
    entities.append(Span(doc=doc, start=s['tokenStart'], end=s['tokenEnd'], label=s['type']))
doc.ents = entities
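To make the intent concrete, here is a self-contained version of that step, with convert_to_spacy2 and myEntities replaced by hypothetical inline sample data (in reality the tokens and entities come from the third-party application):

```python
import spacy
from spacy.tokens import Doc, Span

nlp = spacy.blank("en")

# Hypothetical stand-ins for the third-party application's output.
words = ["Apple", "is", "a", "company", "."]
spaces = [True, True, True, False, False]
myEntities = [{"tokenStart": 0, "tokenEnd": 1, "type": "ORG"}]

# Creating a new document from the pre-tokenized text
doc = Doc(nlp.vocab, words=words, spaces=spaces)

# Loading the externally extracted entities into the spaCy document
entities = []
for s in myEntities:
    entities.append(Span(doc=doc, start=s['tokenStart'], end=s['tokenEnd'], label=s['type']))
doc.ents = entities

print([(ent.text, ent.label_) for ent in doc.ents])  # [('Apple', 'ORG')]
```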
What should I do? Should I load the pipeline components onto the document myself, e.g. excluding the tokenizer?
Thanks in advance.