How to randomize a DataFrame in Python

TL;DR: np.random.shuffle(ndarray) can do the job. So, in your case:

np.random.shuffle(DataFrame.values)

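Under the hood, a DataFrame uses a NumPy ndarray as its data holder. (You can check this in the DataFrame source code.)

So np.random.shuffle() shuffles that array in place along the first axis of a multi-dimensional array, that is, row by row, while the index of the DataFrame remains unshuffled. This relies on DataFrame.values being a writable view of the underlying array, which generally holds only when every column has the same dtype. With mixed dtypes, .values returns a copy and the DataFrame is left unchanged, and pandas versions that use Copy-on-Write may hand back a read-only array.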
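A minimal sketch of that behavior, using a small made-up array (the names arr, before, and df are illustrative and not from the original answer). It shuffles the ndarray directly rather than going through DataFrame.values, so it does not depend on whether .values is a view.

```python
import numpy as np
import pandas as pd

# Illustrative data: 5 rows, 3 columns, each row easy to recognize.
arr = np.arange(15).reshape(5, 3)
before = arr.copy()

np.random.seed(0)                  # seed only so the run is repeatable
result = np.random.shuffle(arr)    # shuffles in place along axis 0

print(result)                      # None: nothing is returned
print(arr)                         # same rows, new order

# Each row is moved as a whole; values inside a row keep their positions.
same_rows = sorted(map(tuple, arr)) == sorted(map(tuple, before))
print(same_rows)

# A DataFrame built from the shuffled array still gets a fresh 0..n-1 index.
df = pd.DataFrame(arr, columns=["a", "b", "c"])
print(list(df.index))
```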

There are, however, some points to consider.

The function returns None. If you want to keep a copy of the original object, you have to make it before passing the object to the function.

sklearn.utils.shuffle(), as user TJ89 suggested, can take random_state along with another option to control the output. You may want that for development purposes.

sklearn.utils.shuffle() is faster, but it WILL SHUFFLE the axis info (index, column) of the DataFrame along with the ndarray it contains.
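The first two points can be sketched with NumPy alone. This is a stand-in under stated assumptions: it uses a seeded np.random.default_rng generator for repeatability instead of the random_state option of sklearn.utils.shuffle() mentioned above, and the array nd, the helper shuffled_copy, and the seed 42 are made up for illustration.

```python
import numpy as np

nd = np.arange(12).reshape(6, 2)   # illustrative data

# The shuffle happens in place, so copy first if the original order matters.
original = nd.copy()

def shuffled_copy(data, seed):
    """Return a row-shuffled copy of data; the same seed gives the same order."""
    rng = np.random.default_rng(seed)
    out = data.copy()
    rng.shuffle(out)               # in place, along the first axis
    return out

run_1 = shuffled_copy(nd, seed=42)
run_2 = shuffled_copy(nd, seed=42)

print(np.array_equal(run_1, run_2))    # True: seeded runs agree
print(np.array_equal(nd, original))    # True: nd itself was never touched
```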

Benchmark results

Between sklearn.utils.shuffle() and np.random.shuffle(), each timed over 1000 calls.

ndarray

nd = sklearn.utils.shuffle(nd)

0.10793248389381915 sec. 8x faster

np.random.shuffle(nd)

0.8897626010002568 sec

DataFrame

df = sklearn.utils.shuffle(df)

0.3183923360193148 sec. 3x faster

np.random.shuffle(df.values)

0.9357550159329548 sec

Conclusion: if it is okay for the axis info (index, column) to be shuffled along with the ndarray, use sklearn.utils.shuffle(). Otherwise, use np.random.shuffle().
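To make the trade-off in the conclusion concrete, here is a rough pandas-only sketch. It does not call sklearn.utils.shuffle(); it only imitates the two outcomes with a shared random permutation, and the frame df and its labels are invented for the example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": [10, 20, 30, 40]}, index=["a", "b", "c", "d"])

rng = np.random.default_rng(1)
perm = rng.permutation(len(df))    # a random ordering of row positions

# Outcome 1: index labels travel with their rows (axis info is shuffled too).
labels_move = df.iloc[perm]

# Outcome 2: only the values are reordered; the index stays a, b, c, d.
labels_stay = pd.DataFrame(df.to_numpy()[perm], index=df.index, columns=df.columns)

print(labels_move)
print(labels_stay)
```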

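Code used

import timeit

# Shared setup for every measurement: a 1000 x 100 array and a DataFrame built from it.
setup = '''
import numpy as np
import pandas as pd
import sklearn.utils
nd = np.random.random((1000, 100))
df = pd.DataFrame(nd)
'''

# Each statement runs 1000 times; timeit returns the total time in seconds.
print(timeit.timeit('nd = sklearn.utils.shuffle(nd)', setup=setup, number=1000))
print(timeit.timeit('np.random.shuffle(nd)', setup=setup, number=1000))
print(timeit.timeit('df = sklearn.utils.shuffle(df)', setup=setup, number=1000))
# Needs a writable df.values; this may fail on pandas versions that use Copy-on-Write.
print(timeit.timeit('np.random.shuffle(df.values)', setup=setup, number=1000))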

Let us see how to shuffle the rows of a DataFrame. We will use the sample() method of the pandas module to randomly shuffle DataFrame rows in pandas.

Algorithm:

Import the pandas and numpy modules.

Create a DataFrame.

Shuffle the rows of the DataFrame using the sample() method with the parameter frac set to 1; frac determines what fraction of the total rows should be returned.

Print the original and the shuffled DataFrames.
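The steps above could look roughly like this. The sample data (names and scores) is invented for illustration; only sample() with frac=1 comes from the text.

```python
import numpy as np
import pandas as pd

# Step 2: create a DataFrame (hypothetical data).
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Chi", "Dev", "Eli"],
    "score": np.array([88, 92, 79, 95, 70]),
})

# Step 3: frac=1 returns 100% of the rows, in random order.
shuffled = df.sample(frac=1)

# Step 4: print the original and the shuffled DataFrames.
print("Original DataFrame:")
print(df)
print("\nShuffled DataFrame:")
print(shuffled)
```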

How do I shuffle a DataFrame in Python?

One of the easiest ways to shuffle a pandas DataFrame is to use the pandas sample method. The df.sample method lets you sample a number of rows from a pandas DataFrame in random order. Because of this, we can simply specify that we want to return the entire DataFrame, in random order.
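As a small illustration of returning the whole frame in random order, the sketch below adds two optional touches that the paragraph does not mention: a random_state seed for repeatable output, and reset_index(drop=True) to renumber the rows. The column name and values are made up.

```python
import pandas as pd

df = pd.DataFrame({"value": list(range(10))})

# frac=1 keeps every row; random_state makes the order repeatable.
first = df.sample(frac=1, random_state=7)
second = df.sample(frac=1, random_state=7)
print(first.equals(second))

# The old index labels come along with the rows...
print(first.index.tolist())

# ...unless the index is rebuilt afterwards.
renumbered = first.reset_index(drop=True)
print(renumbered.index.tolist())
```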

How do you shuffle a pandas DataFrame column?

To shuffle a DataFrame randomly by both rows and columns, you can use df.sample(frac=1, axis=1).sample(frac=1).reset_index(drop=True).
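A short sketch of that one-liner on invented data, followed by a variant that shuffles the values of a single column. The single-column variant is an assumption about what the question is after; it is not part of the answer above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [5, 6, 7, 8], "c": [9, 10, 11, 12]})

# Shuffle column order (axis=1), then row order, then renumber the rows.
both = df.sample(frac=1, axis=1).sample(frac=1).reset_index(drop=True)
print(both)

# Shuffle the values of one column only; the other columns stay put.
rng = np.random.default_rng(3)
one_col = df.copy()
one_col["b"] = rng.permutation(one_col["b"].to_numpy())
print(one_col)
```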

How do I shuffle two DataFrames in pandas?

Algorithm:

Import the pandas and numpy modules.

Create a DataFrame.

Shuffle the rows of the DataFrame using the sample() method with the parameter frac set to 1; frac determines what fraction of the total rows should be returned.

Print the original and the shuffled DataFrames.
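The steps above cover a single DataFrame. For the two-DataFrame case in the question, one possible approach, which the steps do not describe, is to draw a single random permutation and apply it to both frames so that matching rows stay aligned. The frames features and labels are hypothetical.

```python
import numpy as np
import pandas as pd

features = pd.DataFrame({"x1": [0.1, 0.2, 0.3, 0.4], "x2": [1, 2, 3, 4]})
labels = pd.DataFrame({"y": ["a", "b", "c", "d"]})

# One permutation, used for both frames, keeps row i of each frame together.
rng = np.random.default_rng(0)
perm = rng.permutation(len(features))

features_shuffled = features.iloc[perm]
labels_shuffled = labels.iloc[perm]

print(features_shuffled)
print(labels_shuffled)
```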

How do you randomize a dataset?

Randomize.

Select the group of dataset columns you want to shuffle.

Select the proportion of the dataset you want to shuffle.

Produce replicable output.

If Apply automatically is ticked, changes are applied automatically. Otherwise, you must press Apply after each change.

Create a report.
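These steps read like the options of a point-and-click tool, which the text does not name. In pandas, a loosely equivalent sketch might look like the following. The function name shuffle_part and its parameters are hypothetical, chosen to mirror the steps: the columns to shuffle, the proportion of rows, and a seed for replicable output.

```python
import numpy as np
import pandas as pd

def shuffle_part(df, columns, proportion, seed=None):
    """Shuffle the chosen columns among a random proportion of the rows."""
    rng = np.random.default_rng(seed)        # a fixed seed gives replicable output
    n_rows = int(round(len(df) * proportion))
    # Pick which row positions take part, then decide their new order.
    chosen = rng.choice(len(df), size=n_rows, replace=False)
    permuted = rng.permutation(chosen)
    out = df.copy()
    for col in columns:
        # The same permutation is used for every column, so the chosen
        # columns move together and stay consistent within a row.
        out.iloc[chosen, out.columns.get_loc(col)] = df[col].to_numpy()[permuted]
    return out

data = pd.DataFrame({"id": range(6), "a": list("uvwxyz"), "b": range(10, 16)})
result = shuffle_part(data, columns=["a", "b"], proportion=0.5, seed=123)
print(result)
```

Columns that are not listed (here id) keep their original order, and rows outside the chosen proportion are left untouched.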
