Entire file is read when passing a file read stream to readline.createInterface in Node.js

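I am creating a file read stream on a large file and passing it to readline.createInterface. The goal is to then use for await ... of to get lines from the file without reading all of it into memory.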

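However, the whole stream seems to be processed even if I never read anything from it. I know this because I tried listening for events (they are emitted), and because my script takes a while to finish when I use a very large file.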

Is there a way to avoid this behavior? I want the stream to be consumed on demand.

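I can get this behavior if, instead of using readline, I read chunks from the stream and search for line breaks myself, but I would like to avoid that if possible.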
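For reference, the manual approach mentioned above could look roughly like the sketch below. The helper name readLines and the chunkSize parameter are assumptions made for illustration and are not part of the original question. It relies on the stream's async iterator, which only pulls the next chunk when the consumer asks for more lines.

```javascript
const fs = require('fs');

// Hypothetical helper (not from the original post): yields one line at a
// time and only pulls the next chunk from disk when the consumer asks for it.
async function* readLines(path, chunkSize = 64 * 1024) {
  const stream = fs.createReadStream(path, {
    encoding: 'utf8',
    highWaterMark: chunkSize,
  });
  let leftover = ''; // trailing partial line carried over between chunks
  try {
    for await (const chunk of stream) {
      const parts = (leftover + chunk).split(/\r?\n/);
      leftover = parts.pop(); // last piece may be an incomplete line
      yield* parts;
    }
    if (leftover.length > 0) {
      yield leftover; // final line without a trailing newline
    }
  } finally {
    // Runs on normal completion and when the consumer stops early.
    stream.destroy();
  }
}
```

A consumer can then write for await (const line of readLines(file)) and break out of the loop whenever it has seen enough; the rest of the file is not read.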

MWE:

var readline = require('readline');
var fs = require('fs');
var file = 'really_big_file.txt'; // 2GB is what I used

readline.createInterface({input: fs.createReadStream(file)});

// this takes a while to finish because the file is being read,
// even if I'm not doing anything with the stream
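One way to observe the behavior described above is to wait for the stream's 'end' event and check how many bytes were read, without ever attaching a 'line' listener or iterating. This is only a sketch; the helper name bytesReadWithoutConsuming is made up for illustration.

```javascript
const fs = require('fs');
const readline = require('readline');

// Hypothetical helper (not from the original post): resolves with the number
// of bytes read from `path` once the stream ends, without consuming any line.
function bytesReadWithoutConsuming(path) {
  return new Promise((resolve, reject) => {
    const input = fs.createReadStream(path);
    // Same call as in the MWE: the interface is created but never used.
    readline.createInterface({ input, crlfDelay: Infinity });
    input.on('error', reject);
    // An 'end' listener does not itself start the flow of data;
    // it only fires once the whole file has been read.
    input.on('end', () => resolve(input.bytesRead));
  });
}
```

Called on a large file, this should report the full file size, which matches the observation that the interface drains the stream on its own.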
c274667915 answered:

Based on the example in the official documentation of the readline module:

const fs = require('fs');
const readline = require('readline');

async function processLineByLine() {
  const fileStream = fs.createReadStream('/tmp/input.txt');
  // If you want to start consuming the stream only after some point
  // in time, try adding an 'open' event handler to the read stream
  // and explicitly pausing it.
  fileStream.on('open', () => {
    fileStream.pause();

    // Then resume the stream when you want the data to start flowing.
    // setTimeout fires once; setInterval would keep calling resume()
    // and keep the process alive after reading has finished.
    setTimeout(() => {
      fileStream.resume();
    }, 1000);
  });

  const rl = readline.createInterface({
    input: fileStream,
    crlfDelay: Infinity
  });
  // Note: we use the crlfDelay option to recognize all instances of CR LF
  // ('\r\n') in input.txt as a single line break.

  for await (const line of rl) {
    // Each line in input.txt will be successively available here as `line`.
    console.log(`Line from file: ${line}`);
  }

  console.log("reading finished");
}

processLineByLine();

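Try storing the interface in a variable and using it for iteration later on. What happens here is that readline starts reading right away and keeps going as long as nobody asks it to wait.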
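As a rough sketch of that idea, the interface can be created immediately before the loop that consumes it, and the loop can stop as soon as enough lines have been seen. The helper name firstLines and its count parameter (assumed to be at least 1) are invented for this example.

```javascript
const fs = require('fs');
const readline = require('readline');

// Hypothetical helper (not from the original answer): returns up to `count`
// lines from the start of the file. `count` is assumed to be >= 1.
async function firstLines(path, count) {
  const input = fs.createReadStream(path);
  // Create the interface right before iterating, so nothing else runs
  // between creation and consumption.
  const rl = readline.createInterface({ input, crlfDelay: Infinity });
  const lines = [];
  try {
    for await (const line of rl) {
      lines.push(line);
      if (lines.length >= count) break; // stop asking for more lines
    }
  } finally {
    // Release the interface and the file handle, including on early exit.
    rl.close();
    input.destroy();
  }
  return lines;
}
```

For example, firstLines('/tmp/input.txt', 10) resolves with at most the first ten lines and then closes the underlying stream instead of reading the rest of the file.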

Further reading: https://wanago.io/2019/03/04/node-js-typescript-4-paused-and-flowing-modes-of-a-readable-stream/

