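I am loading a list of text files into a dask dataframe. Each text file contains multiple dictionaries, one per line (newline-separated). For each line of a text file, I do some minor processing defined in "remove_escapes" and return a list. I call flatten to make sure I end up with a single list (rather than a list of lists).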
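import json
import dask.bag as db  # "db" below refers to dask.bag

input_file_list = self.get_file_list()
posts_db = db.from_sequence(input_file_list)
# Each file yields a list of dicts; flatten merges them into one bag of dicts
posts_db = posts_db.map(self.remove_escapes).flatten()
posts_df = posts_db.to_dataframe()
posts_df = posts_df.compute()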
def remove_escapes(self, chunk_file):
    json_list = []
    with open(chunk_file, 'r') as fp:
        for line in fp:
            line = line.strip()
            if line:
                line = line.replace("\\\\", "\\")
                json_data = json.loads(line)
                json_list.append(json_data)
    return json_list
This is what I get:
Body Comments Id Title
0 <p>It depends on the context:</p>

<ol... side note: Hash#fetch is not exactly Hash#[]. ... 13935 None
1 <p>It depends on the context:</p>

<ol... @tokland `:c` not found 13935 None
2 <p>It depends on the context:</p>

<ol... "There is also a convention that it is used as... 13935 None
3 <p>I'd like to have a python program alert me ... `import os; os.system('say "Beer time."'); pri... 13941 Python Sound ("Bell")
4 <p>I'd like to have a python program alert me ... the question is answered but... you do need qu... 13941 Python Sound ("Bell")
5 <p>I'd like to have a python program alert me ... Does not seem to be working for me on Mojave 13941 Python Sound ("Bell")
6 <p>Have you tried :</p>

<pre><code>im... I'm on ubuntu,it doesn't work for me. Any idea? 13949 None
7 <p>Have you tried :</p>

<pre><code>im... @kecske it's common [to disable the audible-be... 13949 None
8 <p>Have you tried :</p>

<pre><code>im... Works on Windows XP as well (in a console app). 13949 None
9 <p>I had to turn off the "Silence terminal bel... Seems to work with python 2 only.... 13959 None
0 <p>I want to use a track-bar to change a form'... Also,Decimal can't represent as wide a value ... 4 Convert Decimal to Double?
1 <p>Given a <code>DateTime</code> representing ... what all of the answers so far have missed is ... 9 How do I calculate someone's age in C#?
2 <p>Given a <code>DateTime</code> representing ... No one has considered leap years? or checking ... 9 How do I calculate someone's age in C#?
3 <p>Given a <code>DateTime</code> representing ... Note that for someone less than one year old,... 9 How do I calculate someone's age in C#?
4 <p>Given a <code>DateTime</code> representing ... why nobody is using TimeSpan? 9 How do I calculate someone's age in C#?
5 <p>Given a specific <code>DateTime</code> valu... What if you want to calculate a relative time ... 11 Calculate relative time in C#
6 <p>Given a specific <code>DateTime</code> valu... moment.js is a very nice date parsing library.... 11 Calculate relative time in C#
7 <p>Given a specific <code>DateTime</code> valu... There is the .net package https://github.com/N... 11 Calculate relative time in C#
8 <p>Here's how I do it</p>

<pre class=... "< 48*60*60s" is a rather unconventional defin... 12 None
9 <p>Here's how I do it</p>

<pre class=... Since all those If..else are just timeslabs,y... 12 None
0 <p>Best solution is to let IIS do it.</p>
... Jeff Atwood List some problems he’s run into... 17068 None
1 <p>use <code>system.xml.Linq.XElement</code> a... I'm working with NET 2.0 17093 None
2 <p>We are developing an application that invol... I fail to see answers for this questions which... 17106 How to generate sample XML documents from thei...
3 <p><a href="http://netbeans.org" rel="nofollow... That era is now over... 17110 None
4 <p><a href="http://www.altova.com/xmlspy.html"... XMLSpy looked good but generated xml that then... 17114 None
5 <p>How do you run an external program and pass... I think you need to rewrite your question - op... 17140 How do you spawn another process in C?
6 <pre><code>#include <stdlib.h>

... Never use system. It is far from multithreadin... 17148 None
7 <p>I know that IList is the interface and List... If anyone is still wondering,I find the best ... 17170 When to use IList and when to use List
8 <p>I don't think there are hard and fast rules... why not make it a just a List in the first pla... 17177 None
9 <p>Here's how I do it</p>

<pre class=... But currently SO only show the "Time ago" form... 12 None
.. ... ... ... ...
0 <p>I'm going to continue my habit of going aga... No,I'm not talking about apps that are that s... 10448 None
1 <p>I'm going to continue my habit of going aga... I don't see how moving business logic into sto... 10448 None
2 <p>If you were on Windows,I'd tell you to use... +1 I've used this named pipe methodology seve... 10450 None
3 <p>The 'click sound' in question is actually a... I had a problem with this line: isEnabled = v... 10456 HowTo Disable WebBrowser 'Click Sound' in your...
4 <p>Ideally,I'm looking for a templated logica... @d03boy: Well it has HashSet<T> now,but after... 10458 Is there a "Set" data structure in .Net?
5 <p>Ideally,I'm looking for a templated logica... See [this question](https://stackoverflow.com/... 10458 Is there a "Set" data structure in .Net?
6 <p>Ideally,I'm looking for a templated logica... Possible duplicate of [C# Set collection?](htt... 10458 Is there a "Set" data structure in .Net?
7 <p><a href="http://msdn.microsoft.com/en-us/li... Matt,+1. That sounds like exactly what he ask... 10459 None
8 <p>I've noticed that if you use WebBrowser.Doc... your suggested solution prevents the control f... 10463 None
As shown above, the index values repeat. Is there a way to make sure the index is ordered correctly and keeps increasing?
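
One possible approach, sketched under assumptions rather than taken from the code above: the output suggests that each partition numbers its own rows starting from 0, and since compute() returns an ordinary pandas DataFrame, the index could be renumbered afterwards with pandas' reset_index(drop=True). The two small frames below are hypothetical stand-ins for two partitions, with values borrowed from the output for illustration only.

```python
import pandas as pd

# Hypothetical stand-ins for two partitions; each numbers its rows from 0,
# which would produce the repeated labels seen after compute().
part_a = pd.DataFrame({"Id": [13935, 13941],
                       "Title": [None, 'Python Sound ("Bell")']})
part_b = pd.DataFrame({"Id": [4, 9],
                       "Title": ["Convert Decimal to Double?",
                                 "How do I calculate someone's age in C#?"]})

# Concatenating keeps each frame's own labels: 0, 1, 0, 1.
combined = pd.concat([part_a, part_b])

# Discard the old labels and renumber the rows 0..n-1 in their current order.
renumbered = combined.reset_index(drop=True)

print(combined.index.tolist())    # [0, 1, 0, 1]
print(renumbered.index.tolist())  # [0, 1, 2, 3]
```

Applied to the code in the question, this would amount to something like posts_df = posts_df.compute().reset_index(drop=True). It only renumbers the rows in the order they already have; it does not sort them by Id or any other column.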