I have a complex RDD variable called receipts, of type RDD[Array[Test]]:
import java.time.LocalDate

case class Test(
  header: TestHeader,
  number: String,
  amount: Double,
  description: String
)

case class TestHeader(
  id: Long,
  description: String,
  barcode: Option[String],
  date: LocalDate,
  currency: String,
  vendorId: Long,
  vendorSiteId: Long,
  source: String,
  payGroup: String,
  creationDate: LocalDate
)
Test and TestHeader each define a toJSON function. The one for Test looks like this:
def toJSON: JSONObject = {
  new JSONObject()
    .put("header", header.toJSON)
    .put("number", number)
    .put("amount", amount)
    .put("description", description)
}
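I am able to output JSON, but right now all the receipts are dumped into one flat output, so I cannot tell which receipt a given record came from. I need to attach an index number to each receipt.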
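One way to attach such an index is Spark's `RDD.zipWithIndex`, which pairs every element with its position as a `Long`. The following is only a minimal sketch under assumptions: the object `ReceiptIndexing`, the helper `indexReceipts`, and the output field names `receipt` and `items` are hypothetical and not part of the original code, and `itemToJson` stands in for whatever serializes one item (for example `_.toJSON.toString`).

```scala
import org.apache.spark.rdd.RDD

object ReceiptIndexing {
  // Hypothetical helper (not from the original post): emits one JSON line per
  // receipt, tagged with a 1-based index, instead of one line per item.
  // The field names "receipt" and "items" are illustrative assumptions.
  def indexReceipts[T](receipts: RDD[Array[T]])(itemToJson: T => String): RDD[String] =
    receipts
      .zipWithIndex() // RDD[(Array[T], Long)]; indices start at 0
      .map { case (items, idx) =>
        // Serialize each item and join the results into a JSON array.
        val body = items.map(itemToJson).mkString("[", ",", "]")
        s"""{"receipt":${idx + 1},"items":$body}"""
      }
}
```

With the types above, this could be called as `ReceiptIndexing.indexReceipts(receipts)(_.toJSON.toString).saveAsTextFile("s3://test/")`. Note that the index follows partition order, and that `zipWithIndex` runs an extra Spark job when the RDD has more than one partition.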
Code used to output the JSON:
receipts
  .flatMap(receipt => receipt.map(test => test.toJSON.toString))
  .saveAsTextFile("s3://test/")
Current output:
{"number":"2","amount":100,"header":{"date":"2019-09-30","vendorSiteId":12345,"description":"Some text","vendorId":123,"source":"Manual Entry","creationDate":"2019-10-15","payGroup":"ABCD","number":"B201909","currency":"JPY","id":999999,"barcode":"1111111"},"description":"some text"}
{"number":"1","amount":200,"vendorSiteId":768,"description":"some text","vendorId":345,"id":99999,"barcode":"11111"},"amount":300,"header":{"date":"2019-10-12","vendorSiteId":567,"vendorId":987,"source":"test","creationDate":"2019-10-12","payGroup":"KDP","number":"b1935b6859a196d6b5e7d68b95c209d4649d645f","currency":"USD","id":951574663,"barcode":"None"},"description":"some text"}
Expected output:
{"blocks":{"1":[{"number":"2","description":"some text"},{"number":"1","description":"some text"}],"2":{"number":"1","description":"some text"}}}
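To make the target shape concrete, here is a small sketch of how a single `{"blocks": ...}` object could be assembled once each receipt's items are already serialized and paired with an index. Everything named here (`BlocksJson`, `assemble`) is hypothetical. Unlike the expected output above, the sketch always emits an array per index, even for a receipt with a single item, and it builds the whole document in memory.

```scala
object BlocksJson {
  // Hypothetical helper (not from the original post). Each pair holds
  // (receipt index, already-serialized JSON objects belonging to that receipt).
  def assemble(indexed: Seq[(Long, Seq[String])]): String = {
    val entries = indexed
      .sortBy(_._1) // keep the blocks in index order
      .map { case (idx, items) =>
        val arr = items.mkString("[", ",", "]")
        s""""$idx":$arr""" // e.g. "1":[{...},{...}]
      }
    entries.mkString("""{"blocks":{""", ",", "}}")
  }
}
```

In a Spark job the pairs could come from `RDD.zipWithIndex` followed by `collect()`, which is only reasonable when all receipts fit in driver memory. Otherwise, writing one indexed line per receipt is likely the safer choice.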
Any pointers would be greatly appreciated. Thanks in advance!