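So, to download this CSV file, I first need to log in to the website. I managed to find the link (URL) to the CSV file, and I also wrote code that logs me in to the site with Python, and that part works. But when I try to download the file (running the script in PyCharm), it doesn't work.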
from urllib import request

import requests
from bs4 import BeautifulSoup


def download_file(x):
    # Fetch the URL and write the response to trial.txt, one line per row
    fileopen = request.urlopen(x)
    file_info = fileopen.read()
    file_info_str = str(file_info)
    file_lines = file_info_str.split("\\n")
    newfile = open("trial.txt", "w")
    for info in file_lines:
        newfile.write(info + "\n")
    newfile.close()


file_url = "(csv file link)"
login_data = { .....(i wrote them correctly)..}
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML,like Gecko) Chrome/78.0.3904.97 Safari/537.36"
}

with requests.session() as s:
    url = "example/login.com"
    # Load the login page first to read the hidden ASP.NET form fields
    r = s.get(url, headers=headers)
    soup = BeautifulSoup(r.content, "html.parser")
    login_data["__EVENTTARGET"] = soup.find("input", attrs={"name": "__EVENTTARGET"})["value"]
    login_data["__EVENTARGUMENT"] = soup.find("input", attrs={"name": "__EVENTARGUMENT"})["value"]
    login_data["__LASTFOCUS"] = soup.find("input", attrs={"name": "__LASTFOCUS"})["value"]
    login_data["__VIEWSTATE"] = soup.find("input", attrs={"name": "__VIEWSTATE"})["value"]
    login_data["__VIEWSTATEGENERATOR"] = soup.find("input", attrs={"name": "__VIEWSTATEGENERATOR"})["value"]
    login_data["__EVENTVALIDATION"] = soup.find("input", attrs={"name": "__EVENTVALIDATION"})["value"]
    # Submit the login form with the collected fields
    r = s.post(url, data=login_data, headers=headers)
    download_file(file_url)
When I run the code, I get this (which is not the content of the CSV file):
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\r
<html id="ctl00_htmlDocument" xmlns="http://www.w3.org/1999/xhtml" lang="de">\r
<head>
(I copied only the first few lines.)
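
For reference, here is a minimal sketch of the variant I would expect to need, assuming the site tracks the login through session cookies. In the code above, the file is fetched with `request.urlopen`, which knows nothing about the cookies held by the `requests` session, so the server may simply be returning an HTML page instead of the CSV. The helper name `download_with_session` and its parameters are hypothetical and not part of my original code; it only relies on the session object offering `get()` the way a `requests.Session` does.

```python
def download_with_session(session, url, path, headers=None, chunk_size=8192):
    """Fetch `url` through an already logged-in session and save the raw bytes to `path`.

    `session` is assumed to behave like requests.Session, so the cookies set
    during login are sent along with this request.
    """
    resp = session.get(url, headers=headers, stream=True)
    resp.raise_for_status()  # fail loudly on 4xx/5xx responses

    # An HTML response most likely means a login or error page, not the CSV.
    content_type = resp.headers.get("Content-Type", "")
    if "text/html" in content_type.lower():
        raise ValueError("Got an HTML page instead of a file: " + content_type)

    # Write bytes unchanged; no str() conversion or manual line splitting.
    with open(path, "wb") as fh:
        for chunk in resp.iter_content(chunk_size=chunk_size):
            if chunk:
                fh.write(chunk)
    return path
```

Inside the `with requests.session() as s:` block, after the `s.post(...)` login call, this would be called as `download_with_session(s, file_url, "trial.csv", headers=headers)`. Whether the site really serves the file on a plain GET to that URL is an assumption I have not verified.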