How to separate rows in Python
We can try different approaches to splitting the data to get the desired result. Let's take a diamonds dataset as an example.
Method 1: Splitting a pandas DataFrame by row index

In the code below, the DataFrame is split into two parts: the first 1000 rows and the remaining rows. The shapes of the newly formed DataFrames are printed as the output of the code.
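A minimal sketch of the row-index split described above. The original listing is not available here, so the details are assumptions: a small synthetic DataFrame (hypothetical `carat`, `color` and `price` columns, 1500 rows) stands in for the diamonds dataset, and `iloc` is used for the positional slicing.

```python
import pandas as pd

# Synthetic stand-in for the diamonds dataset (hypothetical values).
n_rows = 1500
colors = ["D", "E", "F", "G", "H"]
df = pd.DataFrame({
    "carat": [0.2 + 0.01 * (i % 50) for i in range(n_rows)],
    "color": [colors[i % len(colors)] for i in range(n_rows)],
    "price": [300 + 2 * i for i in range(n_rows)],
})

# Positional slicing: the first 1000 rows, then everything after them.
df_1 = df.iloc[:1000, :]
df_2 = df.iloc[1000:, :]

print("Shape of new DataFrames:", df_1.shape, df_2.shape)
```

Because `iloc` slices by position, the end position 1000 is excluded from the first part, so no row appears in both parts.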
Method 2: Splitting a pandas DataFrame by groups formed from unique column values

Here, we first group the data by the values of a column. The newly formed DataFrame consists of the grouped data with color = "E".
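A minimal sketch of the group-based split, under the same assumption as before: the ten-row DataFrame below is a hypothetical stand-in for the diamonds dataset, not the original data. It groups by the `color` column and pulls out the group whose value is "E".

```python
import pandas as pd

# Synthetic stand-in for the diamonds dataset (hypothetical values).
colors = ["D", "E", "F", "G", "H"]
df = pd.DataFrame({
    "carat": [0.2 + 0.05 * i for i in range(10)],
    "color": [colors[i % len(colors)] for i in range(10)],
    "price": [300 + 25 * i for i in range(10)],
})

# Group the rows by the unique values of the "color" column.
grouped = df.groupby("color")

# Extract the rows belonging to the group color == "E".
df_e = grouped.get_group("E")
print(df_e)

# A boolean mask selects the same rows without building the groups.
df_e_mask = df[df["color"] == "E"]
```

`groupby` is convenient when every group is needed in turn (for example, iterating over `grouped`); for a single value, the boolean mask is usually the shorter route.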
In [3]:
# assumes pandas was imported as pd in an earlier cell
url = 'http://bit.ly/uforeports'
ufo = pd.read_csv(url)

In [5]:
# show first 3 rows
ufo.head(3)

Out[5]: (table output not preserved)
In [6]:
# .loc DataFrame method
# filtering rows and selecting columns by label

# format
# ufo.loc[rows, columns]

# row 0, all columns
ufo.loc[0, :]

Out[6]:
City                       Ithaca
Colors Reported               NaN
Shape Reported           TRIANGLE
State                          NY
Time               6/1/1930 22:00
Name: 0, dtype: object

In [10]:
# rows 0, 1, 2
# all columns
ufo.loc[[0, 1, 2], :]

# more concise: a label slice (with .loc the end label 2 is included)
ufo.loc[0:2, :]

Out[10]: (table output not preserved)
In [12]:
# if you leave off ", :" pandas assumes it is there
# but you should keep it to improve code readability
ufo.loc[0:2]

Out[12]: (table output not preserved)
In [13]:
# all rows
# column: City
ufo.loc[:, 'City']

Out[13]:
0        Ithaca
1        Willingboro
2        Holyoke
3        Abilene
4        New York Worlds Fair
5        Valley City
6        Crater Lake
7        Alma
8        Eklutna
9        Hubbard
10       Fontana
11       Waterloo
12       Belton
13       Keokuk
14       Ludington
15       Forest Home
16       Los Angeles
17       Hapeville
18       Oneida
19       Bering Sea
20       Nebraska
21       NaN
22       NaN
23       Owensboro
24       Wilderness
25       San Diego
26       Wilderness
27       Clovis
28       Los Alamos
29       Ft. Duschene
...
18211    Holyoke
18212    Carson
18213    Pasadena
18214    Austin
18215    El Campo
18216    Garden Grove
18217    Berthoud Pass
18218    Sisterdale
18219    Garden Grove
18220    Shasta Lake
18221    Franklin
18222    Albrightsville
18223    Greenville
18224    Eufaula
18225    Simi Valley
18226    San Francisco
18227    San Francisco
18228    Kingsville
18229    Chicago
18230    Pismo Beach
18231    Pismo Beach
18232    Lodi
18233    Anchorage
18234    Capitola
18235    Fountain Hills
18236    Grant Park
18237    Spirit Lake
18238    Eagle River
18239    Eagle River
18240    Ybor
Name: City, dtype: object

In [15]:
# all rows
# columns: City and State only (a list of labels)
ufo.loc[:, ['City', 'State']]

# label slice: every column from City through State, inclusive
ufo.loc[:, 'City':'State']

Out[15]: (table output not preserved)
In [17]:
# multiple rows and multiple columns
ufo.loc[0:2, 'City':'State']

Out[17]: (table output not preserved)
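The cells above depend on downloading the UFO reports file. As a self-contained sketch of the same `.loc` patterns, the snippet below builds a tiny hypothetical frame named `demo` with the same column names. Only row 0 is taken from the output shown earlier; the city names come from the City column output, and every other value is an invented placeholder.

```python
import numpy as np
import pandas as pd

# Hypothetical four-row frame; rows 1-3 hold placeholder values.
demo = pd.DataFrame({
    "City": ["Ithaca", "Willingboro", "Holyoke", "Abilene"],
    "Colors Reported": [np.nan, np.nan, "RED", np.nan],
    "Shape Reported": ["TRIANGLE", "DISK", "OVAL", "DISK"],
    "State": ["NY", "AA", "BB", "CC"],
    "Time": ["6/1/1930 22:00", "t1", "t2", "t3"],
})

# Label slice on rows: with .loc both endpoints are included.
rows = demo.loc[0:2, :]

# Label slice on columns: City through State, inclusive.
cols = demo.loc[:, "City":"State"]

# Rows and columns together.
both = demo.loc[0:2, "City":"State"]

# Contrast: .iloc slices by position and excludes the end position.
pos = demo.iloc[0:2, :]

print(rows.shape, cols.shape, both.shape, pos.shape)
```

The inclusive endpoint is the detail most likely to surprise: `demo.loc[0:2]` returns three rows, while `demo.iloc[0:2]` returns two.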