Handling large file uploads with Flask

Answer

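I think the simplest way to solve this is to send the file in many small parts (chunks). Making that work takes two pieces: the front end (the web page) and the back end (the server). For the front end, you can use something like Dropzone.js, which has no additional dependencies and ships with decent CSS. All you have to do is add the class dropzone to a form, and it automatically turns the form into one of its special drag-and-drop fields (you can also click and select).

By default, however, Dropzone does not chunk files. Luckily, it is really easy to enable. Here is a sample file upload form with Dropzone.js and chunking enabled: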

    <!DOCTYPE html>
    <html lang="en">
    <head>
        <meta charset="UTF-8">
        <link rel="stylesheet"
              href="https://cdnjs.cloudflare.com/ajax/libs/dropzone/5.4.0/min/dropzone.min.css"/>
        <link rel="stylesheet"
              href="https://cdnjs.cloudflare.com/ajax/libs/dropzone/5.4.0/min/basic.min.css"/>
        <script type="application/javascript"
                src="https://cdnjs.cloudflare.com/ajax/libs/dropzone/5.4.0/min/dropzone.min.js">
        </script>
        <title>File Dropper</title>
    </head>
    <body>
    <form method="POST" action='/upload' class="dropzone dz-clickable"
          id="dropper" enctype="multipart/form-data">
    </form>
    <script type="application/javascript">
        // "dropper" must match the id of the form above
        Dropzone.options.dropper = {
            paramName: 'file',
            chunking: true,
            forceChunking: true,
            url: '/upload',
            maxFilesize: 1025, // megabytes
            chunkSize: 1000000 // bytes
        }
    </script>
    </body>
    </html>
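To make the chunking settings concrete, the sketch below computes the metadata that accompanies each chunk for a given file size. The field names dzchunkindex, dzchunkbyteoffset, dztotalchunkcount and dztotalfilesize are the ones read by the Flask handler in this answer. The function name plan_chunks and the extra "length" key are hypothetical and exist only for illustration. The sketch assumes fixed-size chunks where only the last one may be shorter.

```python
from __future__ import annotations

import math


def plan_chunks(total_size: int, chunk_size: int = 1_000_000) -> list[dict]:
    """Return the per-request metadata for a file split into fixed-size chunks."""
    # Even an empty file is assumed to be sent as a single (empty) chunk.
    total_chunks = max(1, math.ceil(total_size / chunk_size))
    plan = []
    for index in range(total_chunks):
        offset = index * chunk_size
        plan.append({
            "dzchunkindex": index,            # zero-based chunk number
            "dzchunkbyteoffset": offset,      # where this chunk starts in the file
            "dztotalchunkcount": total_chunks,
            "dztotalfilesize": total_size,
            # Hypothetical helper field, not something Dropzone sends:
            "length": min(chunk_size, total_size - offset),
        })
    return plan


# A 2.5 MB file with the 1,000,000-byte chunkSize from the form above
plan = plan_chunks(2_500_000)
for entry in plan:
    print(entry)
```

With these numbers the file is sent in three requests, and the last one carries the remaining 500,000 bytes.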

Here is the back-end part using Flask:

    import logging
    import os

    from flask import render_template, Blueprint, request, make_response
    from werkzeug.utils import secure_filename

    from pydrop.config import config

    blueprint = Blueprint('templated', __name__, template_folder='templates')

    log = logging.getLogger('pydrop')


    @blueprint.route('/')
    @blueprint.route('/index')
    def index():
        # Route to serve the upload form
        return render_template('index.html',
                               page_name='Main',
                               project_name="pydrop")


    @blueprint.route('/upload', methods=['POST'])
    def upload():
        file = request.files['file']

        save_path = os.path.join(config.data_dir, secure_filename(file.filename))
        current_chunk = int(request.form['dzchunkindex'])

        # If the file already exists it's ok if we are adding to it,
        # but not if it's a new file that would overwrite the existing one
        if os.path.exists(save_path) and current_chunk == 0:
            # 400 and 500s will tell dropzone that an error occurred and show an error
            return make_response(('File already exists', 400))

        try:
            # Append mode ('ab') always writes at the end of the file and
            # ignores seek(), so open for in-place update instead.
            mode = 'r+b' if os.path.exists(save_path) else 'wb'
            with open(save_path, mode) as f:
                f.seek(int(request.form['dzchunkbyteoffset']))
                f.write(file.stream.read())
        except OSError:
            # log.exception will include the traceback so we can see what's wrong
            log.exception('Could not write to file')
            return make_response(("Not sure why,"
                                  " but we couldn't write the file to disk", 500))

        total_chunks = int(request.form['dztotalchunkcount'])

        if current_chunk + 1 == total_chunks:
            # This was the last chunk, the file should be complete and the size we expect
            if os.path.getsize(save_path) != int(request.form['dztotalfilesize']):
                log.error(f"File {file.filename} was completed, "
                          f"but has a size mismatch. "
                          f"Was {os.path.getsize(save_path)} but we"
                          f" expected {request.form['dztotalfilesize']} ")
                return make_response(('Size mismatch', 500))
            else:
                log.info(f'File {file.filename} has been uploaded successfully')
        else:
            log.debug(f'Chunk {current_chunk + 1} of {total_chunks} '
                      f'for file {file.filename} complete')

        return make_response(("Chunk upload successful", 200))
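The offset-based write used by a handler like this one can be tried in isolation with plain files. The following is a minimal sketch, assuming chunks may arrive in any order. The helper names write_chunk and is_complete are hypothetical, and no Flask is involved. Note that a file opened in append mode writes at the end of the file regardless of seek(), so the sketch opens the file with 'r+b' (or 'wb' when it does not exist yet). It is not safe against two requests creating the same file at the same moment.

```python
import os
import tempfile


def write_chunk(path: str, offset: int, data: bytes) -> None:
    """Write one chunk at its byte offset (hypothetical helper)."""
    # 'r+b' keeps existing bytes and honours seek(); 'wb' creates the file.
    mode = "r+b" if os.path.exists(path) else "wb"
    with open(path, mode) as f:
        f.seek(offset)
        f.write(data)


def is_complete(path: str, expected_size: int) -> bool:
    """Mirror the final size check done after the last chunk."""
    return os.path.getsize(path) == expected_size


payload = bytes(range(256)) * 10  # 2560 bytes of sample data
chunk_size = 1000
chunks = [(off, payload[off:off + chunk_size])
          for off in range(0, len(payload), chunk_size)]

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "upload.bin")
    # Deliver the chunks in reverse order to show that the offsets,
    # not the arrival order, decide where the bytes land.
    for offset, data in reversed(chunks):
        write_chunk(target, offset, data)
    with open(target, "rb") as f:
        restored = f.read()
    complete = is_complete(target, len(payload))

print(restored == payload, complete)
```

Because each chunk is written at its own offset, resending a chunk overwrites the same bytes rather than appending a duplicate copy.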

Question

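What is the best way to handle very large file uploads (1 GB and up) with Flask?

My application essentially takes multiple files, assigns them one unique file number, and then saves them on the server in a location chosen by the user.

How can we run file uploads as a background task, so that the user does not watch the browser spin for an hour and can instead move on to the next page right away?

  • The Flask development server can take massive files (50 GB took 1.5 hours; the upload itself was quick, but writing the data into a blank file was painfully slow)
  • If I wrap the app with Twisted, it breaks on large files
  • I have tried using Celery with Redis, but that does not seem to be an option for posted uploads
  • I am on Windows, which leaves me fewer choices of web server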
