python - Is it possible to split a gzip API response file into smaller chunks (fewer GB each)?

2024-05-21 • Q&A

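Hi all, I am able to pull the data from the API response, but the file is very large (more than 4 GB), so I would like to ask whether there is a way to split the data in the gzip file into smaller chunks.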

I tried the curl command and found that the data downloads and works fine, so I then tried to use the same curl logic in Python.

curl -H "X-Risk-Token: $token" "https://api.nyc3.us.thisismyurl.com/vulnerabilities/download_data_zip" -o file.gz -vv

Here is my Python code:

import requests
import gzip
import json
import csv


# url ='https://api.thisismyurl.com/vulnerabilities/download_data_zip'
token = 'blahblahblah'
# 'Content-Type': 'application/json'
headers = {'X-Risk-Token': token,'accept': 'application/json'}
response = requests.get(url,headers=headers)
print(response.status_code)
json_format = json.loads(response.text)
print(json_format)

This is my output: (image not available). Could you give me an example?

Thanks

To do exactly the same thing as curl, you would have to do the following:

import requests
import shutil

# curl -H "X-Risk-Token: $token" "https://api.nyc3.us.thisismyurl.com/vulnerabilities/download_data_zip" -o file.gz -vv

url ='https://api.thisismyurl.com/vulnerabilities/download_data_zip'
token = 'blahblahblah'
fname = "file.gz"
# 'Content-Type': 'application/json'
headers = {'X-Risk-Token': token } # exactly same header as curl.

# memory friendly way to download big files
with requests.get(url,headers=headers,stream=True) as resp:
    print(resp.status_code)
    with open(fname,'wb') as fout:
        shutil.copyfileobj(resp.raw,fout)

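The json.loads() call in your original code will consume a lot of RAM, because response.text first reads the entire body into memory, and it may crash your process if there is not enough RAM available.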
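For illustration only, here is a minimal sketch of reading the downloaded file record by record instead of calling json.loads() on the whole body. It assumes the decompressed data is newline-delimited JSON (one object per line), which is not confirmed anywhere in this thread; iter_records is a hypothetical helper name. If the file is a single large JSON document, this approach will not work as is.

```python
import gzip
import json


def iter_records(path):
    """Yield one parsed JSON object per non-empty line of a gzip file.

    Assumption: the decompressed content is newline-delimited JSON.
    Only one line is held in memory at a time.
    """
    with gzip.open(path, "rt", encoding="utf-8") as fin:
        for line in fin:
            line = line.strip()
            if line:
                yield json.loads(line)
```

Used as `for record in iter_records("file.gz"): ...`, memory use stays roughly proportional to the largest single line rather than to the whole file.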

What exactly do you want to do with the data?

Viewing (printing) the decompressed data can be done with:

import gzip
import shutil
import sys
with gzip.open(fname) as fin:
    # gzip.open() yields bytes by default, so write to the binary stdout buffer
    shutil.copyfileobj(fin,sys.stdout.buffer)
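If the goal really is to end up with several smaller gzip files, one possible approach is to stream-decompress the download and re-compress it into parts. This is only a sketch: split_gzip, the part naming scheme, and the size limits are my own illustrative choices, and the limit applies to uncompressed bytes per part. Note that it cuts on byte boundaries, so a record may be split across two parts.

```python
import gzip


def split_gzip(src_path, part_prefix, max_part_bytes=1024 ** 3, chunk_size=1024 * 1024):
    """Split a gzip file into smaller gzip parts.

    Each part holds at most max_part_bytes of *uncompressed* data.
    Returns the list of part file names, in order.
    """
    parts = []
    with gzip.open(src_path, "rb") as fin:
        while True:
            # Read the first chunk of the next part; stop when the input is exhausted.
            chunk = fin.read(min(chunk_size, max_part_bytes))
            if not chunk:
                break
            part_name = "%s.part%04d.gz" % (part_prefix, len(parts))
            parts.append(part_name)
            remaining = max_part_bytes
            with gzip.open(part_name, "wb") as fout:
                while chunk:
                    fout.write(chunk)
                    remaining -= len(chunk)
                    if remaining <= 0:
                        break  # this part is full, start the next one
                    chunk = fin.read(min(chunk_size, remaining))
    return parts
```

For example, `split_gzip("file.gz", "file", max_part_bytes=500 * 1024 ** 2)` would write file.part0000.gz, file.part0001.gz and so on, each containing up to 500 MiB of decompressed data. Only one chunk is kept in memory at a time.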

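Since my suggestion still did not work, and I have not received any information that explains why a "quota" would cause problems for a Python script but not for curl, I suggest some more tests.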

1.) Download but do not store the result (is the network connection being blocked or aborted?)

url ='https://api.thisismyurl.com/vulnerabilities/download_data_zip'
token = 'blahblahblah'
headers = {'X-Risk-Token': token } # exactly same header as curl.

# memory friendly way to download big files
with requests.get(url,headers=headers,stream=True) as resp:
    print(resp.status_code)
    downloaded = 0
    try:
        for chunk in resp.iter_content(chunk_size=1024): 
            downloaded += len(chunk)
    except Exception:
        print("Downloaded %d bytes and got an exception" % downloaded)
        raise
print("Downloaded %d bytes" % downloaded)

Please check whether the result is always the same or whether you get a different number each time.
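To make that comparison less manual, the test could be repeated a few times and the byte counts compared. The following is a rough sketch under my own assumptions: count_bytes, summarize_runs and run_download_test are hypothetical helper names, and three runs is an arbitrary choice.

```python
def count_bytes(chunks):
    """Sum the sizes of an iterable of byte chunks, e.g. resp.iter_content()."""
    total = 0
    for chunk in chunks:
        total += len(chunk)
    return total


def summarize_runs(counts):
    """Return a short verdict on whether repeated downloads agree in size."""
    if not counts:
        return "no runs"
    if len(set(counts)) == 1:
        return "consistent: %d bytes in each of %d runs" % (counts[0], len(counts))
    return "inconsistent: sizes ranged from %d to %d bytes" % (min(counts), max(counts))


def run_download_test(url, headers, runs=3):
    """Download the URL several times without storing it; return the byte counts."""
    import requests  # imported here so the helpers above work without it

    counts = []
    for _ in range(runs):
        with requests.get(url, headers=headers, stream=True) as resp:
            resp.raise_for_status()
            counts.append(count_bytes(resp.iter_content(chunk_size=64 * 1024)))
    return counts
```

Calling `print(summarize_runs(run_download_test(url, headers)))` with the url and headers from above should show quickly whether the download is cut off at the same point every time or at a varying one.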


Answered by y48108320
