
Multiple arrays to dataframe python

So, I am iterating through a dictionary and taking a bunch of values out as an array. I am trying to make a DataFrame with each observation as a separate row.

X1 = []
for k, v in DF_grp:
    date = v['Date'].astype(datetime)
    usage = v['Usage'].astype(float)
    comm = v['comm'].astype(float)
    mdf = pd.DataFrame({'Id': k[0], 'date': date, 'usage': usage, 'comm': comm})
    # ratio of usage to comm, rounded to 2 places, as a percentage
    mdf['used_ratio'] = ((mdf['usage'] / mdf['comm']).round(2)) * 100
    ts = pd.Series(mdf['usage'].values, index=mdf['date']).sort_index(ascending=True)
    ts2 = pd.Series(mdf['used_ratio'].values, index=mdf['date']).sort_index(ascending=True)
    ts2 = ts2.dropna()
    data = ts2.values.copy()
    if len(data) == 10:
        X1 = np.append(X1, data, axis=0)
        print(X1)

[0,0,0,0,1,0,0,0,1]
[1,2,3,4,5,6,7,8,9]
[0,5,6,7,8,9,1,2,3]
....

And so on. So the question is: how do I capture all these arrays in a single DataFrame so that it looks like the below?

[[0,0,0,0,1,0,0,0,1]] --- #row 1 in dataframe 
[[1,2,3,4,5,6,7,8,9]] --- #row 2 in dataframe

Can the same task be divided further? There are more than 500K arrays in the dataset. Thank you.
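One possible approach, offered as a sketch rather than a definitive answer: np.append on 1-D inputs concatenates everything into one flat array, so the separate observations run together. Collecting each array in a plain Python list and building the DataFrame once at the end keeps one array per row, and avoids copying the growing array on every iteration. The sample arrays are the ones shown in the question; the names arrays, rows, expected_len and result are chosen here for illustration.

```python
import numpy as np
import pandas as pd

# Stand-ins for the arrays produced inside the loop (values from the question).
arrays = [
    np.array([0, 0, 0, 0, 1, 0, 0, 0, 1]),
    np.array([1, 2, 3, 4, 5, 6, 7, 8, 9]),
    np.array([0, 5, 6, 7, 8, 9, 1, 2, 3]),
]

# The question's code checks for length 10; the sample arrays shown have 9 values.
expected_len = 9

rows = []
for data in arrays:
    if len(data) == expected_len:
        rows.append(data)  # list.append does not copy the earlier rows

# Stack the collected 1-D arrays into one 2-D array: one array per row.
result = pd.DataFrame(np.vstack(rows))
print(result)
```

For a very large number of arrays, the same pattern could be applied to chunks of the groups, with the per-chunk DataFrames combined afterwards using pd.concat.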

Pandas Numpy

We will create a DataFrame from 1-D and 2-D NumPy arrays (numpy ndarray).

A DataFrame can be created from NumPy arrays. A NumPy array holds only one type of data, so we will create separate arrays for the different data types and then combine them into one DataFrame holding the names of the students (strings) and their marks (numbers).

Our final DataFrame will have NAME (string) and the marks in two subjects, MATH and ENGLISH (integers).

Let us create a 1-D array to store the marks of the students. While creating the DataFrame we will add the column name MATH. This DataFrame holds the marks in MATH only, for four students.

import pandas as pd
import numpy as np
my_np=np.array([30,40,50,45]) # Numpy array
# print(my_np) # display the array
my_pd=pd.DataFrame(data=my_np,columns=['MATH'])
print(my_pd)

Output

   MATH
0    30
1    40
2    50
3    45

Using 2-D array to create the DataFrame

We will use one 2-D array to create the DataFrame. Here we will not add the column names.

import pandas as pd
import numpy as np
my_np1=np.array([[30,40,50,45],
                 [50,60,50,55]])
my_pd=pd.DataFrame(data=[my_np1[0],my_np1[1]])
print(my_pd)

Output

    0   1   2   3
0  30  40  50  45
1  50  60  50  55

Adding columns

Before adding the column names we will transpose the DataFrame so that it has two columns, one per subject.

import pandas as pd
import numpy as np
my_np1=np.array([[30,40,50,45],
                 [50,60,50,55]])
# transpose the DataFrame
my_pd=pd.DataFrame(data=[my_np1[0],my_np1[1]]).T
my_pd.columns=['MATH','ENGLISH']
print(my_pd)

Output

   MATH  ENGLISH
0    30       50
1    40       60
2    50       50
3    45       55
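As an alternative sketch, the 2-D array itself can be transposed before the DataFrame is built, and the column names passed straight to the constructor. This assumes the same my_np1 array as above and is only one of several equivalent ways to get the same table.

```python
import numpy as np
import pandas as pd

my_np1 = np.array([[30, 40, 50, 45],
                   [50, 60, 50, 55]])

# my_np1.T has shape (4, 2): one row per student, one column per subject.
my_pd = pd.DataFrame(my_np1.T, columns=['MATH', 'ENGLISH'])
print(my_pd)
```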

Here we got the marks of two subjects in our DataFrame. Let us add one string column to this to include the student Names.

import pandas as pd
import numpy as np
my_np1=np.array([[30,40,50,45],
                 [50,60,50,55]])
my_names=np.array(['Alex','Ron','Jack','King'])
my_pd=pd.DataFrame(data=[my_names,my_np1[0],my_np1[1]]).T
my_pd.columns=['NAMES','MATH','ENGLISH']
print(my_pd)

Output

  NAMES MATH ENGLISH
0  Alex   30      50
1   Ron   40      60
2  Jack   50      50
3  King   45      55

Adding new column to DataFrame

In the above code we have two integer columns showing the marks in two subjects. We can add one more column to show the sum of the marks, or total marks. The code below adds the two columns with the + operator; sum() can be used for the same purpose.

import pandas as pd
import numpy as np
my_np1=np.array([[30,40,50,45],
                 [50,60,50,55]])
my_names=np.array(['Alex','Ron','Jack','King'])
my_pd=pd.DataFrame(data=[my_names,my_np1[0],my_np1[1]]).T
my_pd.columns=['NAMES','MATH','ENGLISH']
my_pd['Total']=my_pd['MATH'] + my_pd['ENGLISH']
print(my_pd)

Output

  NAMES MATH ENGLISH Total
0  Alex   30      50    80
1   Ron   40      60   100
2  Jack   50      50   100
3  King   45      55   100
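A minimal sketch of the sum() variant mentioned above, using DataFrame.sum with axis=1 to total across the subject columns of each row. It assumes the frame is built from a dict of arrays so that the mark columns are numeric; the subjects list is a name introduced here for illustration.

```python
import numpy as np
import pandas as pd

my_pd = pd.DataFrame({
    'NAMES': np.array(['Alex', 'Ron', 'Jack', 'King']),
    'MATH': np.array([30, 40, 50, 45]),
    'ENGLISH': np.array([50, 60, 50, 55]),
})

subjects = ['MATH', 'ENGLISH']
# axis=1 sums across the selected columns, giving one total per student.
my_pd['Total'] = my_pd[subjects].sum(axis=1)
print(my_pd)
```

Adding a third subject then only requires extending the subjects list, rather than editing the addition expression.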

We have used one 2-D array for two subjects. However, it is better to use multiple 1-D arrays, one for each subject, so that the code can be scaled up to include more subjects.

import pandas as pd
import numpy as np
my_math=np.array([30,40,50,45])
my_english=np.array([50,60,50,55])
my_names=np.array(['Alex','Ron','Jack','King'])

my_pd=pd.DataFrame(data=[my_names,my_math,my_english]).T
my_pd.columns=['NAMES','MATH','ENGLISH']
print(my_pd)

Output

  NAMES MATH ENGLISH
0  Alex   30      50
1   Ron   40      60
2  Jack   50      50
3  King   45      55
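One design point worth noting: building the frame from a list of mixed arrays and transposing it typically leaves every column with the generic object dtype. A sketch of an alternative, assuming the same three arrays, passes a dict to the constructor so that each column keeps the dtype of its own array.

```python
import numpy as np
import pandas as pd

my_math = np.array([30, 40, 50, 45])
my_english = np.array([50, 60, 50, 55])
my_names = np.array(['Alex', 'Ron', 'Jack', 'King'])

# Each dict key becomes a column name; each array becomes one column.
my_pd = pd.DataFrame({'NAMES': my_names, 'MATH': my_math, 'ENGLISH': my_english})
print(my_pd)
print(my_pd.dtypes)  # the mark columns stay integer
```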

Removing index

my_pd=pd.DataFrame(data=[my_names,my_math,my_english]).T
my_pd.columns=['NAMES','MATH','ENGLISH']
print(my_pd)
# remove index
print(my_pd.to_string(index=False))

Output

  NAMES MATH ENGLISH
0  Alex   30      50
1   Ron   40      60
2  Jack   50      50
3  King   45      55
NAMES MATH ENGLISH
 Alex   30      50
  Ron   40      60
 Jack   50      50
 King   45      55
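Note that to_string(index=False) only changes the printed text; the DataFrame itself still has its default index. As a sketch of a related option, set_index can make the NAMES column the row labels. The variable names text and by_name are introduced here for illustration.

```python
import numpy as np
import pandas as pd

my_pd = pd.DataFrame({
    'NAMES': np.array(['Alex', 'Ron', 'Jack', 'King']),
    'MATH': np.array([30, 40, 50, 45]),
    'ENGLISH': np.array([50, 60, 50, 55]),
})

# Printing without the index leaves my_pd unchanged.
text = my_pd.to_string(index=False)
print(text)

# set_index returns a new DataFrame labelled by student name.
by_name = my_pd.set_index('NAMES')
print(by_name)
```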

Using random integers

Create a DataFrame from NumPy arrays of random integers. Here we create a student marks DataFrame with 5 students (rows) and two subjects (columns); you can increase both numbers to include more subjects (columns) and more students (rows).

import numpy as np
import pandas as pd
n=5 # Number of students
my_math=np.random.randint(40,100,size=n)
my_english=np.random.randint(40,100,size=n)

my_pd=pd.DataFrame(data=[my_math,my_english]).T

my_pd.columns=['MATH','ENG']
print(my_pd)

Output

   MATH  ENG
0    76   91
1    53   40
2    69   60
3    47   67
4    73   91

We can add one more column for the student ID.

import numpy as np
import pandas as pd
n=5 # Number of students
my_id=np.arange(1,n+1)

my_math=np.random.randint(40,100,size=n)
my_english=np.random.randint(40,100,size=n)

my_pd=pd.DataFrame(data=[my_id,my_math,my_english]).T

my_pd.columns=['ID','MATH','ENG']
print(my_pd.to_string(index=False))

Output

 ID  MATH  ENG
  1    65   58
  2    58   97
  3    75   90
  4    42   69
  5    55   51
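The random marks above change on every run. A sketch of a reproducible variant, assuming NumPy's Generator interface (np.random.default_rng) with a fixed seed, draws one 2-D array of shape (students, subjects) in a single call. The seed value and the subjects list are arbitrary choices made for this example.

```python
import numpy as np
import pandas as pd

n = 5                                  # number of students
subjects = ['MATH', 'ENG']
rng = np.random.default_rng(seed=42)   # fixed seed: reruns give the same marks

# One 2-D array: n rows (students) by len(subjects) columns, values in [40, 100).
marks = rng.integers(40, 100, size=(n, len(subjects)))
my_pd = pd.DataFrame(marks, columns=subjects)

# Insert the student IDs 1..n as the first column.
my_pd.insert(0, 'ID', np.arange(1, n + 1))
print(my_pd.to_string(index=False))
```

Adding more subjects only needs more names in the subjects list; the array shape follows from its length.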


How do I create a DataFrame from two arrays?

Creating a DataFrame From Arrays and Lists.

import pandas as pd
import numpy as np
d = np.random.normal(size=(2,3))
print("The original Numpy array")
print(d)
print("---------------------")
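The snippet above stops before a DataFrame is built. One way to finish the idea for two 1-D arrays, as a sketch with made-up sample values, is to stack them as columns with np.column_stack and pass the result to the DataFrame constructor. The array and column names a and b are placeholders.

```python
import numpy as np
import pandas as pd

a = np.array([1, 2, 3])
b = np.array([10, 20, 30])

# column_stack places each 1-D array in its own column of a 2-D array.
df = pd.DataFrame(np.column_stack((a, b)), columns=['a', 'b'])
print(df)
```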

Can DataFrame contain multiple series?

You can create a DataFrame from multiple Series objects by adding each Series as a column. The concat() method merges multiple Series into one DataFrame.
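A minimal sketch of this with pd.concat, reusing the marks from the examples above as sample data. With axis=1 each Series becomes a column, and the Series names become the column headers.

```python
import pandas as pd

math = pd.Series([30, 40, 50, 45], name='MATH')
english = pd.Series([50, 60, 50, 55], name='ENGLISH')

# axis=1 places the Series side by side instead of end to end.
df = pd.concat([math, english], axis=1)
print(df)
```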

How do you create a DataFrame from an array?

Let us see how to create a DataFrame from a NumPy array. We will also learn how to specify the index and the column headers of the DataFrame.

Import the Pandas and NumPy modules.

Create a NumPy array.

Create lists of index values and column values for the DataFrame.

Create the DataFrame.

Display the DataFrame.
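The steps above can be sketched as follows. The sample marks and names are borrowed from the earlier examples; index_values and column_values are names chosen here for illustration.

```python
# Import the Pandas and NumPy modules.
import numpy as np
import pandas as pd

# Create a NumPy array: one row per student, one column per subject.
marks = np.array([[30, 50],
                  [40, 60],
                  [50, 50],
                  [45, 55]])

# Create lists of index values and column values.
index_values = ['Alex', 'Ron', 'Jack', 'King']
column_values = ['MATH', 'ENGLISH']

# Create the DataFrame.
df = pd.DataFrame(data=marks, index=index_values, columns=column_values)

# Display the DataFrame.
print(df)
```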
