Google Vision | Vietnamese: poor OCR result quality

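Background

I am using the Google Vision API (with Node) to recognize Vietnamese text, and the results are poor. Some (not all, but some) tone marks and vowel diacritics are missing.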
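To pin down which words lost their marks, it can help to compare the OCR output against a hand-typed reference. The following is a minimal sketch, assuming both strings contain the same words in the same order; `stripDiacritics` and `findLostDiacritics` are hypothetical helper names, not part of the Vision client.

```javascript
// Remove combining marks (tone marks, circumflex, breve, horn) after
// canonical decomposition. "đ" and "Đ" have no canonical decomposition,
// so they pass through unchanged.
function stripDiacritics(text) {
  return text.normalize("NFD").replace(/\p{M}/gu, "");
}

// List the words whose base letters match the reference at the same
// position but whose diacritics differ.
function findLostDiacritics(expected, actual) {
  const expectedWords = expected.normalize("NFC").split(/\s+/).filter(Boolean);
  const actualWords = actual.normalize("NFC").split(/\s+/).filter(Boolean);
  const lost = [];
  expectedWords.forEach((word, i) => {
    const got = actualWords[i];
    if (
      got !== undefined &&
      got !== word &&
      stripDiacritics(got) === stripDiacritics(word)
    ) {
      lost.push({ expected: word, actual: got });
    }
  });
  return lost;
}
```

Calling `findLostDiacritics(referenceText, ocrText)` returns one entry per affected word, which makes it easier to see whether a change to the request actually helps.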

Their online demo, by comparison, returns good results for the same image (scroll down for the live demo):

https://cloud.google.com/vision/

(I don't have a company account with them, so I can't ask Google directly.)

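Question

Can I adjust my request to get better results?

I have already set the language hint to "vi" and tried combining it with "en". I also tried the more specific "vi-VN".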
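The hint combinations above can be generated in one place so that each variant is sent with the same image. This is only a sketch: `buildRequest`, `buildRequestVariants` and `hintVariants` are hypothetical names. The empty list is included because, as far as I recall, the API reference says that leaving `languageHints` empty usually gives the best results since it enables automatic language detection; I have not verified that for this image.

```javascript
// Build a Vision API request body from an image buffer and a list of
// language hints. With no hints, imageContext is left out entirely.
function buildRequest(imageBuffer, languageHints) {
  const request = { image: { content: imageBuffer.toString("base64") } };
  if (languageHints.length > 0) {
    request.imageContext = { languageHints };
  }
  return request;
}

// The combinations tried in the question, plus no hints at all.
const hintVariants = [[], ["vi"], ["vi", "en"], ["vi-VN"]];

// One request per hint combination, all for the same image.
function buildRequestVariants(imageBuffer) {
  return hintVariants.map(hints => ({
    hints,
    request: buildRequest(imageBuffer, hints)
  }));
}
```

Each `request` can then be passed to the client in turn and the outputs compared side by side.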

Sample image

https://www.tecc.org/Slatwall/custom/assets/images/product/default/cache/j056vt-_800w_800h_sb.jpg

Sample code

const fs = require("fs");
const path = require("path");
const vision = require("@google-cloud/vision");

async function quickstart() {
  let text = ""; // must start as an empty string, otherwise the output begins with "undefined"
  const fileName = "j056vt-_800w_800h_sb.jpg";
  const imageFile = fs.readFileSync(fileName);
  const image = Buffer.from(imageFile).toString("base64");
  const client = new vision.ImageAnnotatorClient();

  const request = {
    image: {
      content: image
    },
    imageContext: {
      languageHints: ["vi", "en"]
    }
  };

  const [result] = await client.textDetection(request);

  for (const tmp of result.textAnnotations) {
    text += tmp.description + '\n';
  }

  const out = path.basename(fileName,path.extname(fileName)) + ".txt";
  fs.writeFileSync(out,text);
}

quickstart().catch(console.error);

Solution

// Set the credentials first (PowerShell): $env:GOOGLE_APPLICATION_CREDENTIALS="[PATH]"

const fs = require("fs");
const path = require("path");
const vision = require("@google-cloud/vision");

async function quickstart() {
  let text = '';
  const fileName = "j056vt-_800w_800h_sb.jpg";
  const imageFile = fs.readFileSync(fileName);
  const image = Buffer.from(imageFile).toString("base64");
  const client = new vision.ImageAnnotatorClient();

  const request = {
    image: {
      content: image
    },
    imageContext: {
      languageHints: ["vi-VN"]
    }
  };

  const [result] = await client.documentTextDetection(request);

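  // OUTPUT METHOD A: concatenate every annotation, print it and save it to a text file

  for (const tmp of result.textAnnotations) {
    text += tmp.description + "\n";
  }

  console.log(text);

  const out = path.basename(fileName, path.extname(fileName)) + ".txt";
  fs.writeFileSync(out, text);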

  // OUTPUT METHOD B

  const fullTextAnnotation = result.fullTextAnnotation;
  console.log(`Full text: ${fullTextAnnotation.text}`);
  fullTextAnnotation.pages.forEach(page => {
    page.blocks.forEach(block => {
      console.log(`Block confidence: ${block.confidence}`);
      block.paragraphs.forEach(paragraph => {
        console.log(`Paragraph confidence: ${paragraph.confidence}`);
        paragraph.words.forEach(word => {
          const wordText = word.symbols.map(s => s.text).join("");
          console.log(`Word text: ${wordText}`);
          console.log(`Word confidence: ${word.confidence}`);
          word.symbols.forEach(symbol => {
            console.log(`Symbol text: ${symbol.text}`);
            console.log(`Symbol confidence: ${symbol.confidence}`);
          });
        });
      });
    });
  });

}

quickstart().catch(console.error);
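One note on output method A: as far as I know, the first entry of `textAnnotations` already holds the whole detected text and the remaining entries are the individual words, so looping over all of them writes the text once and then every word again. A minimal sketch of pulling out just the full text follows; `extractFullText` is a hypothetical helper, and the fallback order is my own choice.

```javascript
// Return the complete recognized text from a Vision API result.
// Prefers fullTextAnnotation.text and falls back to the first
// textAnnotations entry, which is assumed to contain the whole text.
function extractFullText(result) {
  if (result.fullTextAnnotation && result.fullTextAnnotation.text) {
    return result.fullTextAnnotation.text;
  }
  const annotations = result.textAnnotations || [];
  return annotations.length > 0 ? annotations[0].description : "";
}
```

In the function above this would replace the loop, for example `const text = extractFullText(result);`.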
Answer

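This has already been asked in another question.

In summary, the demo probably uses DOCUMENT_TEXT_DETECTION, which sometimes performs a more thorough string extraction, whereas you are using TEXT_DETECTION.

You can try making a client.documentTextDetection request (document_text_detection in the Python client) instead of client.textDetection; that may get you results closer to the demo.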
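To check whether the switch makes a difference for a given image, both requests can be sent and their outputs compared. This is a rough sketch: `compareDetections` is a hypothetical helper, and `client` is assumed to be a `vision.ImageAnnotatorClient` created as in the question.

```javascript
// Full text of a single result, falling back to the first annotation.
function fullTextOf(result) {
  if (result.fullTextAnnotation && result.fullTextAnnotation.text) {
    return result.fullTextAnnotation.text;
  }
  const annotations = result.textAnnotations || [];
  return annotations.length > 0 ? annotations[0].description : "";
}

// Run TEXT_DETECTION and DOCUMENT_TEXT_DETECTION on the same request
// and report whether the extracted text differs.
async function compareDetections(client, request) {
  const [textResult] = await client.textDetection(request);
  const [documentResult] = await client.documentTextDetection(request);
  const textDetection = fullTextOf(textResult);
  const documentTextDetection = fullTextOf(documentResult);
  return {
    textDetection,
    documentTextDetection,
    identical: textDetection === documentTextDetection
  };
}
```

If `identical` comes back `false`, printing both strings next to each other should show where the tone marks differ.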

The relevant documentation covers this in more detail, if you want to read it.

I hope this resolves your issue!
