How to repeat the process and store the results in a new pandas DataFrame

I have two datasets, df and border.

PART 1:

df = 

     id_easy    ordinal latitude longitude      epoch   day_of_week
0   e35f652a         68  22.1111    7.2222 1465084811   Sunday
1   e35f652a         69  22.1111    7.2222 1465084870   Sunday
2   e35f652a         70  22.1111    7.2222 1465084930   Sunday
3   e35f652a         71  22.1111    7.2222 1465084990   Sunday
4   e35f652a         72  22.1111    7.2222 1465085050   Sunday

turin = df.loc[df['ordinal'] == 1]

crs = {'init':'epsg:4326'}
geometry = [Point(xy) for xy in zip(turin.longitude,turin.latitude)]
turin_point = gpd.GeoDataFrame(turin,crs=crs,geometry=geometry) #to get geometry

PART 2:

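border.shape = (931,674). The first number in each column name is the zone. For example, in 12_longitude_1, 12 is the zone and the column holds the first longitude. As you can see, the zone numbers are arbitrary (12, 14, 23, and so on).

Here is a sample dataframe: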

border = 

12_longitude_1  12_latitude_1   14_longitude_2  14_latitude_2   23_longitude_3  23_latitude_3
            11             12               13             14               15            16
            11             12               13             14               15            16
            11             12               13             14               15            16
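The naming scheme described above can be sketched in plain Python. This is only an illustration, assuming the columns always come in consecutive (longitude, latitude) pairs as in the sample; the helper name zone_pairs is hypothetical and not part of the original code.

```python
# Column names copied from the sample border dataframe.
columns = [
    "12_longitude_1", "12_latitude_1",
    "14_longitude_2", "14_latitude_2",
    "23_longitude_3", "23_latitude_3",
]


def zone_pairs(columns):
    """Yield (zone, longitude_column, latitude_column) for each consecutive pair.

    Hypothetical helper: assumes columns are ordered lon, lat, lon, lat, ...
    """
    for i in range(0, len(columns) - 1, 2):
        lon_name, lat_name = columns[i], columns[i + 1]
        # The zone is the leading number, i.e. the first "_"-separated token.
        zone = lon_name.split("_")[0]
        yield zone, lon_name, lat_name


pairs = list(zone_pairs(columns))
```

With a real dataframe, list(border.columns) could be passed in place of the hard-coded list. Note that split("_")[-1] would return the trailing counter (1, 2, 3) rather than the zone.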

FINAL PART:

I want to check which turin_point points lie within zone 12. For the first two columns I do the following:

Code for 12_longitude_1 and 12_latitude_1:

border = border[['longitude_1','latitude_1']].dropna()
border.longitude_1 = border.longitude_1.replace(r'[()]','',regex=True)
border.latitude_1 = border.latitude_1.replace(r'[()]','',regex=True)
border.longitude_1 = pd.to_numeric(border.longitude_1,errors='coerce')
border.latitude_1 = pd.to_numeric(border.latitude_1,errors='coerce')
geometry2 = [Point(xy) for xy in zip(border.longitude_1,border.latitude_1)]
border_point = gpd.GeoDataFrame(border,geometry=geometry2)
turin_final = Polygon([[p.x,p.y] for p in border_point.geometry])
within_turin = turin_point[turin_point.geometry.within(turin_final)]
long_lat_1 = len(within_turin)

In the end, long_lat_1 gives me 1697.


I want to automate this process for the whole dataset (for all column pairs).
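The cleaning step in the code above (strip parentheses, then convert to numbers) can be tried in isolation. This is a minimal sketch with made-up string values, assuming the border coordinates are stored as strings with stray parentheses; the variable names are illustrative only.

```python
import pandas as pd

# Hypothetical raw values: coordinates stored as strings with stray parentheses.
raw = pd.Series(["(7.61", "45.03)", "n/a"])

# Remove every "(" and ")" character. The second argument ('') is the
# replacement string and has to be passed explicitly.
cleaned = raw.replace(r"[()]", "", regex=True)

# Convert to floats; values that cannot be parsed become NaN instead of raising.
numeric = pd.to_numeric(cleaned, errors="coerce")
```

Because errors='coerce' can introduce new NaN values, it may be worth calling dropna() again after the conversion, before building the polygon.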


Desired output:

[image of the desired output table, not preserved]

Libraries used:

import numpy as np
import pandas as pd

import geopandas as gpd
from shapely.geometry import Point,Polygon

Attempt:

pd_out = pd.DataFrame({'zone': [],'number': []})

for col_num in range(0,len(border.columns)-1,2):
    curr_lon_name = border.columns[col_num]
    curr_lat_name = border.columns[col_num + 1]
    num = curr_lon_name.split("_")[-1]
    border = border[[curr_lon_name,curr_lat_name]].dropna()
    border[curr_lon_name] = border[curr_lon_name].replace(r'[()]','',regex=True)
    border[curr_lat_name] = border[curr_lat_name].replace(r'[()]','',regex=True)
    border[curr_lon_name] = pd.to_numeric(border[curr_lon_name],errors='coerce')
    border[curr_lat_name] = pd.to_numeric(border[curr_lat_name],errors='coerce')
    geometry2 = [Point(xy) for xy in zip(border[curr_lon_name],border[curr_lat_name])]
    border_point = gpd.GeoDataFrame(border,geometry=geometry2)
    turin_final = Polygon([[p.x,p.y] for p in border_point.geometry])
    within_turin = turin_point[turin_point.geometry.within(turin_final)]
    curr_len = len(within_turin)
    pd_out = pd_out.append({'zone': "long_lat_{}".format(num),'number': curr_len},ignore_index=True)

This gives me only 1 row:

    zone         number
0   long_lat_1  1697.0

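I want all the rows and names shown in the picture.

P.S. The dataset values have been changed.

red0123450 answered: How to repeat the process and store the results in a new pandas DataFrame

You are overwriting the border dataframe inside the for loop, so after the first iteration it contains only the first pair of columns and the other zones are gone. Assign each column pair to a separate variable instead of overwriting border: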

pd_out = pd.DataFrame({'zone': [],'number': []})

for col_num in range(0,len(border.columns)-1,2):
    curr_lon_name = border.columns[col_num]
    curr_lat_name = border.columns[col_num + 1]
    num = curr_lon_name.split("_")[0]
    zone_border = border[[curr_lon_name,curr_lat_name]].dropna()
    zone_border[curr_lon_name] = zone_border[curr_lon_name].replace(r'[()]','',regex=True)
    zone_border[curr_lat_name] = zone_border[curr_lat_name].replace(r'[()]','',regex=True)
    zone_border[curr_lon_name] = pd.to_numeric(zone_border[curr_lon_name],errors='coerce')
    zone_border[curr_lat_name] = pd.to_numeric(zone_border[curr_lat_name],errors='coerce')
    geometry2 = [Point(xy) for xy in zip(zone_border[curr_lon_name],zone_border[curr_lat_name])]
    border_point = gpd.GeoDataFrame(zone_border,crs=crs,geometry=geometry2)
    turin_final = Polygon([[p.x,p.y] for p in border_point.geometry])
    within_turin = turin_point[turin_point.geometry.within(turin_final)]
    curr_len = len(within_turin)
    pd_out = pd_out.append({'zone': "{}".format(num),'number': curr_len},ignore_index=True)
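
DataFrame.append was removed in pandas 2.0, so on recent versions the last line of the loop fails. One possible variant, sketched below, collects one dict per zone in a list and builds the result dataframe once at the end. The function name count_points_per_zone is hypothetical, and the sketch makes two assumptions that differ from the code above: it tests the points with plain shapely instead of a GeoDataFrame, and it drops NaN rows after the numeric conversion rather than before.

```python
import pandas as pd
from shapely.geometry import Point, Polygon


def count_points_per_zone(border, points):
    """Count how many points fall inside each zone polygon of `border`.

    Hypothetical helper. `border` holds consecutive (longitude, latitude)
    column pairs; `points` is any iterable of shapely Points.
    """
    points = list(points)
    rows = []
    for col_num in range(0, len(border.columns) - 1, 2):
        lon_name = border.columns[col_num]
        lat_name = border.columns[col_num + 1]
        zone = lon_name.split("_")[0]  # leading number = zone

        # Work on a per-zone copy so `border` itself is never overwritten.
        zone_border = border[[lon_name, lat_name]].copy()
        for name in (lon_name, lat_name):
            stripped = zone_border[name].replace(r"[()]", "", regex=True)
            zone_border[name] = pd.to_numeric(stripped, errors="coerce")
        # Assumption: drop incomplete vertices after the conversion.
        zone_border = zone_border.dropna()

        # Polygon needs at least three vertices; fewer raises an error.
        polygon = Polygon(list(zip(zone_border[lon_name], zone_border[lat_name])))
        count = sum(bool(p.within(polygon)) for p in points)
        rows.append({"zone": zone, "number": count})

    # Build the result once instead of appending row by row.
    return pd.DataFrame(rows, columns=["zone", "number"])
```

With the data from the question this could be called as pd_out = count_points_per_zone(border, turin_point.geometry), since iterating over a GeoSeries yields shapely geometries.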