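I have an Azure LUIS instance for NLP and am trying to extract alphanumeric values with a RegEx entity. Extraction works, but the extracted value comes back in lowercase.
For example:
Case 1
My input: "run job for AE0002" RegExCode = [a-zA-Z]{2}\d+
Output: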
{
  "query": " run job for AE0002",
  "topScoringIntent": {
    "intent": "Run Job",
    "score": 0.7897274
  },
  "intents": [
    {
      "intent": "Run Job",
      "score": 0.7897274
    },
    {
      "intent": "None",
      "score": 0.00434472738
    }
  ],
  "entities": [
    {
      "entity": "ae0002",
      "type": "Alpha Number",
      "startIndex": 15,
      "endIndex": 20
    }
  ]
}
I need the extracted entity to keep the casing of the input.
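One possible workaround, sketched in Python: slice the string the client originally sent instead of trusting the lowercased `entity` field. This assumes the client still holds that exact string and that `startIndex`/`endIndex` are offsets into it, with `endIndex` inclusive (the 15 to 20 span for a six-character entity suggests so). The helper name `restore_entity_case` and the sample data are hypothetical, not part of any LUIS SDK.

```python
def restore_entity_case(original_query, entities):
    """Return copies of the entities with text re-sliced from the original query."""
    restored = []
    for ent in entities:
        start, end = ent["startIndex"], ent["endIndex"]
        # Assumption: endIndex is inclusive, hence the +1.
        text = original_query[start:end + 1]
        # Only trust the slice if it agrees with the lowercased value returned.
        if text.lower() == ent["entity"].lower():
            restored.append({**ent, "entity": text})
        else:
            restored.append(dict(ent))  # offsets do not line up; keep as-is
    return restored


# Hypothetical data, shaped like the response above.
sent_query = "please run for AE0002"
sample_entities = [
    {"entity": "ae0002", "type": "Alpha Number", "startIndex": 15, "endIndex": 20}
]
restored_entities = restore_entity_case(sent_query, sample_entities)
print(restored_entities[0]["entity"])
```

The equality check is there because the offsets only help when they refer to the same string the client kept; if they do not line up, the entity is left untouched.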
Case 2
My input: "extract only abreaviations like HP and IBM" RegExCode = [A-Z]{2,}
Output:
{
  "query": "extract only abreaviations like hp and ibm", // Query accepted by LUIS test window
  "query": "extract only abreaviations like HP and IBM", // Query accepted as an endpoint url
  "prediction": {
    "normalizedQuery": "extract only abreaviations like hp and ibm",
    "topIntent": "None",
    "intents": {
      "None": {
        "score": 0.09844558
      }
    },
    "entities": {
      "Abbre": [
        "extract", "only", "abreaviations", "like", "hp", "and", "ibm"
      ],
      "$instance": {
        "Abbre": [
          {
            "type": "Abbre",
            "text": "extract",
            "startIndex": 0,
            "length": 7,
            "modelTypeId": 8,
            "modelType": "Regex Entity Extractor",
            "recognitionSources": [
              "model"
            ]
          },
          {
            "type": "Abbre",
            "text": "only",
            "startIndex": 8,
            "length": 4,....
          {
            "type": "Abbre",
            "text": "ibm",
            "startIndex": 39,
            "length": 3,
            "recognitionSources": [
              "model"
            ]
          }
        ]
      }
    }
  }
}
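This makes me wonder whether the entire training is done in lowercase. What shocked me is that all the words that were originally trained for their respective entities have been re-trained as Abbre.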
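The symptoms are consistent with the pattern being applied case-insensitively to a lowercased query, although that is an inference from the output and not something confirmed here. A small local sketch using Python's `re` module (not LUIS itself) shows how `[A-Z]{2,}` behaves under each assumption:

```python
import re

pattern = r"[A-Z]{2,}"
query = "extract only abreaviations like HP and IBM"

# Case-sensitive match on the original text: only the abbreviations.
case_sensitive = re.findall(pattern, query)

# Case-sensitive match on lowercased text: nothing matches.
lowered_sensitive = re.findall(pattern, query.lower())

# Case-insensitive match on lowercased text: every run of two or more letters.
lowered_insensitive = re.findall(pattern, query.lower(), flags=re.IGNORECASE)

print(case_sensitive)
print(lowered_sensitive)
print(lowered_insensitive)
```

The third result is the same word list that appears under Abbre in the response, which is why every word ends up tagged as Abbre.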
Any input would be a great help :)
Thanks