I am reading from an Excel sheet and I want to read certain columns: column 0 because it is the row-index, and columns 22:37. Now here is what I do:
import pandas as pd
import numpy as np
file_loc = "path.xlsx"
df = pd.read_excel(file_loc, index_col=None, na_values=['NA'], parse_cols=37)
df = pd.concat([df[df.columns[0]], df[df.columns[22:]]], axis=1)
But I would hope there is a better way to do that! I know that parse_cols=[0, 22,..,37] would work,
but writing out every index doesn't make sense for large datasets.
I also did this:
s = pd.Series(0)
s[1] = 22
for i in range(2, 14):
    s[i] = s[i-1] + 1
df = pd.read_excel(file_loc, index_col=None, na_values=['NA'], parse_cols=s)
But it reads the first 15 columns, which is the length of s.
asked Nov 11, 2015 at 16:28
You can use column indices (letters) like this:
import pandas as pd
import numpy as np
file_loc = "path.xlsx"
df = pd.read_excel(file_loc, index_col=None, na_values=['NA'], usecols="A,C:AA")
print(df)
Corresponding documentation:
usecols : int, str, list-like, or callable default None
If None, then parse all columns.
If str, then indicates comma separated list of Excel column letters and column ranges (e.g. “A:E” or “A,C,E:F”). Ranges are inclusive of both sides.
If list of int, then indicates list of column numbers to be parsed.
If list of string, then indicates list of column names to be parsed.
New in version 0.24.0.
If callable, then evaluate each column name against it and parse the column if the callable returns True.
Returns a subset of the columns according to behavior above.
New in version 0.24.0.
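Applied to the question, the list-of-int form can select column 0 plus columns 22 through 37 in a single call. The following is a minimal sketch, assuming a pandas version that accepts a list of integers for usecols as documented above. The names wanted_columns and read_selected are hypothetical helpers, and "path.xlsx" is the placeholder path from the question.

```python
import pandas as pd


def wanted_columns(first=0, start=22, stop=37):
    """Zero-based positions: `first`, then `start` through `stop` inclusive."""
    return [first, *range(start, stop + 1)]


def read_selected(file_loc):
    """Read only the wanted columns from the first sheet of an Excel file."""
    # usecols takes zero-based positions when given a list of ints.
    return pd.read_excel(file_loc, na_values=["NA"], usecols=wanted_columns())


# Usage (requires a real workbook at this path):
# df = read_selected("path.xlsx")
```

Building the list with range avoids typing every index by hand, which was the concern with large datasets.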
answered Nov 14, 2015 at 14:40
MartyIX
parse_cols is deprecated; use usecols instead, that is:
df = pd.read_excel(file_loc, index_col=None, na_values=['NA'], usecols="A,C:AA")
answered Mar 23, 2018 at 4:57
Leoli
"usecols" should help. Give it the columns as they are lettered in the Excel worksheet (A, B, etc.), either individually or as ranges. Examples:
1. Selected columns
df = pd.read_excel(file_location, sheet_name='Sheet1', usecols="A,C,F")
2. Range of columns and a selected column
df = pd.read_excel(file_location, sheet_name='Sheet1', usecols="A:F,H")
3. Multiple ranges
df = pd.read_excel(file_location, sheet_name='Sheet1', usecols="A:F,H,J:N")
4. Range of columns
df = pd.read_excel(file_location, sheet_name='Sheet1', usecols="A:N")
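When the wanted columns are known by zero-based position rather than by letter (0 and 22 through 37 in the question), the letter string can be generated instead of worked out by hand. This is a small sketch in plain Python; excel_letter and letter_range are hypothetical helper names, not part of pandas.

```python
def excel_letter(index):
    """Convert a zero-based column index to Excel letters (0 -> 'A', 26 -> 'AA')."""
    if index < 0:
        raise ValueError("index must be non-negative")
    letters = ""
    n = index + 1  # Excel letters behave like a one-based, bijective base-26 number
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def letter_range(start, stop):
    """Build an inclusive Excel range such as 'W:AL' from zero-based indices."""
    return f"{excel_letter(start)}:{excel_letter(stop)}"


# Column 0 plus columns 22 through 37, in the form usecols accepts as a string.
usecols = f"{excel_letter(0)},{letter_range(22, 37)}"
print(usecols)  # A,W:AL
```

The resulting string can be passed as usecols in any of the calls above.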
answered Apr 5, 2020 at 9:46
Uday Kiran
If you know the names of the columns and do not want to use letters (A, B, D) or positions (0, 4, 7), this works:
df = pd.read_excel(url)[['name of column','name of column','name of column','name of column','name of column']]
where each 'name of column' is the header of a column you want. The names are case- and whitespace-sensitive.
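Because the match on names is case- and whitespace-sensitive, one option is to pass a callable as usecols so that headers are compared in normalized form while the file is parsed. This is only a sketch, assuming pandas 0.24 or later (where callable usecols is documented); make_column_filter and read_named_columns are hypothetical names.

```python
import pandas as pd


def make_column_filter(wanted):
    """Return a predicate matching headers regardless of case and outer whitespace."""
    normalized = {str(name).strip().lower() for name in wanted}
    # str() guards against non-string headers such as numbers.
    return lambda header: str(header).strip().lower() in normalized


def read_named_columns(path, wanted):
    """Read only the columns whose headers match `wanted` after normalization."""
    # pandas calls the predicate once per column name and keeps the True ones.
    return pd.read_excel(path, usecols=make_column_filter(wanted))


# Usage (requires a real workbook):
# df = read_named_columns("test.xlsx", ["Post test Number"])
```

Unlike indexing after the read, this skips the unwanted columns during parsing, and the returned frame keeps the headers exactly as they appear in the sheet.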
answered Jun 23 at 20:28
Read any column's data in Excel:
import pandas as pd
name_of_file = "test.xlsx"
data = pd.read_excel(name_of_file)
required_column_name = "Post test Number"
print(data[required_column_name])
answered Sep 11 at 12:57
Mounesh