This is my file.txt:
Egg and Bacon;
Egg, sausage and Bacon
Egg and Spam;
Spam Egg Sausage and Spam;
Egg, Bacon and Spam;
I wanna convert the newLine '\n' to ' $ '. I just used:
f = open(fileName)
text = f.read()
text = text.replace('\n',' $ ')
print(text)
This is my output:
$ Spam Egg Sausage and Spam;
and my output must be like:
Egg and Bacon; $ Egg, sausage and Bacon $ Egg ...
What am I doing wrong? I'm using #-*- encoding: utf-8 -*-
Thank you.
asked Mar 25, 2015 at 9:04
It is possible that your newlines are represented as \r\n. In order to replace them you should do:
text.replace('\r\n', ' $ ')
For a portable solution that works on both UNIX-like systems (which use \n) and Windows (which uses \r\n), you can substitute the text using a regex:
>>> import re
>>> re.sub('\r?\n', ' $ ', 'a\r\nb\r\nc')
'a $ b $ c'
>>> re.sub('\r?\n', ' $ ', 'a\nb\nc')
'a $ b $ c'
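To see why replacing only '\n' can produce output like the one in the question, it may help to look at the repr of the result. The sketch below assumes the file content contains Windows line endings; the sample string is made up for illustration. The leftover '\r' is a carriage return, which many terminals render by moving the cursor back to the start of the line, so later text overwrites earlier text.

```python
# Hypothetical sample text with Windows-style line endings.
windows_text = "Egg and Bacon;\r\nEgg and Spam;"

# Replacing only '\n' leaves the carriage return behind.
naive = windows_text.replace("\n", " $ ")
print(repr(naive))  # 'Egg and Bacon;\r $ Egg and Spam;'

# Replacing the full '\r\n' pair removes both characters.
fixed = windows_text.replace("\r\n", " $ ")
print(repr(fixed))  # 'Egg and Bacon; $ Egg and Spam;'
```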
answered Mar 25, 2015 at 9:14
enrico.bacis
You can use splitlines.
lines = """Egg and Bacon;
Egg, sausage and Bacon
Egg and Spam;
Spam Egg Sausage and Spam;
Egg, Bacon and Spam;"""
print(" $ ".join(lines.splitlines()))
Egg and Bacon; $ Egg, sausage and Bacon $ Egg and Spam; $ Spam Egg Sausage and Spam; $ Egg, Bacon and Spam;
Or simply use rstrip and join on the file object without reading all into memory:
with open("in.txt") as f:
    print(" $ ".join(line.rstrip() for line in f))
Egg and Bacon; $ Egg, sausage and Bacon $ Egg and Spam; $ Spam Egg Sausage and Spam; $ Egg, Bacon and Spam;
This is more efficient than reading the whole file into memory and then using a regex. You should also always use with to open your files, as it closes them automatically.
rstrip() with no arguments removes all trailing whitespace, including \n and \r\n:
In [41]: s = "foo\r\n"
In [42]: s.rstrip()
Out[42]: 'foo'
In [43]: s = "foo\n"
In [44]: s.rstrip()
Out[44]: 'foo'
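One caveat, sketched here under the assumption that trailing spaces or tabs in a line matter to you: rstrip() with no argument strips them too, while rstrip("\r\n") strips only the line-ending characters. The sample text is made up, and splitlines(keepends=True) stands in for iterating over a file.

```python
# Hypothetical text: the first line ends with a space, the second with a tab.
text = "Egg and Bacon; \r\nEgg and Spam;\t\nSpam"
lines = text.splitlines(keepends=True)

# Strips every kind of trailing whitespace.
stripped_all = " $ ".join(line.rstrip() for line in lines)

# Strips only carriage returns and newlines.
stripped_newlines = " $ ".join(line.rstrip("\r\n") for line in lines)

print(repr(stripped_all))       # 'Egg and Bacon; $ Egg and Spam; $ Spam'
print(repr(stripped_newlines))  # 'Egg and Bacon;  $ Egg and Spam;\t $ Spam'
```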
answered Mar 25, 2015 at 9:48
text = text.replace('\\n', '')
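Note that '\\n' is a backslash followed by the letter n, not a newline character, so this call only helps when the text contains that literal two-character sequence. A small sketch with made-up strings shows the difference:

```python
literal = "Egg\\nSpam"  # backslash + 'n' (two characters), no real newline
real = "Egg\nSpam"      # one actual newline character

# Removing the literal sequence changes only the first string.
literal_cleaned = literal.replace("\\n", "")
real_cleaned = real.replace("\\n", "")

print(repr(literal_cleaned))  # 'EggSpam'
print(repr(real_cleaned))     # 'Egg\nSpam'
```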
answered Oct 9, 2020 at 15:19
By using the replace() or fillna() methods you can replace NaN values with a blank/empty string in a Pandas DataFrame. NaN stands for Not a Number and is one of the common ways to represent a missing value in a Pandas DataFrame. Sometimes you need to replace missing values with values that make sense, such as zeros for numeric columns and blank or empty strings for string-type columns.
In this Pandas DataFrame article, I will explain how to convert NaN values in a single column or in multiple columns (all columns from a list) to blank/empty strings in several ways, with examples.
If you are in a hurry, below are some of the quick examples of how to replace NaN with a blank/empty string in Pandas DataFrame.
# Below are quick examples
# Replace all NaN values with an empty string
df2 = df.replace(np.nan, '', regex=True)
print(df2)
# Using multiple columns
df2 = df[['Courses','Fee']] = df[['Courses','Fee']].fillna('')
print(df2)
# Using pandas.DataFrame.fillna() to replace NaN values
df2 = df.fillna("")
print(df2)
# With inplace=True, df is updated and fillna() returns None
df2 = df.fillna('', inplace=True)
print(df2)
# Replace NaN with an empty string in a single column
df2 = df.Courses.fillna('')
print(df2)
# Replace NaN with zeros in the Courses column
df2 = df['Courses'] = df['Courses'].fillna(0)
print(df2)
# Replace NaN with zeros in the Discount column
df2 = df['Discount'] = df['Discount'].fillna(0)
print(df2)
# Replace NaN with an empty string
df2 = df.Courses.replace(np.nan, '', regex=True)
print(df2)
# Replace NaN with some other value
df2 = df.Courses.replace(np.nan, 'value', regex=True)
print(df2)
Now, let’s create a DataFrame with a few rows and columns, execute some examples, and validate the results. Our DataFrame contains the columns Courses, Fee, Duration, and Discount.
import pandas as pd
import numpy as np
technologies = {
    'Courses':["Spark",np.nan,"Hadoop","Python","pandas",np.nan,"Java"],
    'Fee' :[20000,25000, np.nan,22000,24000,np.nan,22000],
    'Duration':[np.nan,'40days','35days', np.nan,'60days','50days','55days'],
    'Discount':[1000,np.nan,1500,np.nan,2500,2100,np.nan]
}
df = pd.DataFrame(technologies)
print(df)
Yields below output.
  Courses      Fee Duration  Discount
0   Spark  20000.0      NaN    1000.0
1     NaN  25000.0   40days       NaN
2  Hadoop      NaN   35days    1500.0
3  Python  22000.0      NaN       NaN
4  pandas  24000.0   60days    2500.0
5     NaN      NaN   50days    2100.0
6    Java  22000.0   55days       NaN
2. Convert NaN to Empty String in Pandas
Use df.replace(np.nan, '', regex=True) to replace all NaN values in the Pandas DataFrame with an empty string.
# All DataFrame replace empty string
df2 = df.replace(np.nan, '', regex=True)
print(df2)
Yields below output.
  Courses      Fee Duration  Discount
0   Spark  20000.0             1000.0
1          25000.0   40days
2  Hadoop            35days    1500.0
3  Python  22000.0
4  pandas  24000.0   60days    2500.0
5                    50days    2100.0
6    Java  22000.0   55days
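One side effect worth knowing about: putting an empty string into a numeric column means the column can no longer be a plain float column. The sketch below uses a small hypothetical two-row frame (not the article's DataFrame) and fillna(""); the exact resulting dtype may depend on your pandas version, so the code only checks that the column is no longer numeric.

```python
import numpy as np
import pandas as pd

# Hypothetical two-row frame, smaller than the article's DataFrame.
df = pd.DataFrame({"Courses": ["Spark", np.nan], "Fee": [20000.0, np.nan]})

filled = df.fillna("")

# The Fee column now mixes a float and a string.
fee_values = filled["Fee"].tolist()
fee_is_numeric = pd.api.types.is_numeric_dtype(filled["Fee"])
print(fee_values, fee_is_numeric)
```

If you still need arithmetic on such a column, filling it with 0 or leaving NaN in place is usually the safer choice.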
3. Replace NaN with Empty String on Multiple Columns
To replace NaN values with blank strings on multiple columns, or on all columns from a list, use df[['Courses','Fee']] = df[['Courses','Fee']].fillna(''). This replaces NaN values in the Courses and Fee columns.
# Using multiple columns
df2 = df[['Courses','Fee']] = df[['Courses','Fee']].fillna('')
print(df2)
Yields below output.
  Courses      Fee
0   Spark  20000.0
1          25000.0
2  Hadoop
3  Python  22000.0
4  pandas  24000.0
5
6    Java  22000.0
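As an alternative sketch, fillna() also accepts a dict that maps column names to fill values, which lets you pick a different replacement per column and leave other columns untouched. The small frame below is hypothetical and only mirrors the article's column names.

```python
import numpy as np
import pandas as pd

# Hypothetical frame using the article's column names.
df = pd.DataFrame({
    "Courses": ["Spark", np.nan],
    "Fee": [20000.0, np.nan],
    "Duration": [np.nan, "40days"],
})

# Empty string for Courses, zero for Fee; Duration is not listed, so it keeps its NaN.
filled = df.fillna({"Courses": "", "Fee": 0})
print(filled)
```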
4. Using fillna() to Replace NaN/Null Values With Empty String
Use pandas.DataFrame.fillna() to replace NaN/Null values with an empty string. This replaces each NaN in the pandas DataFrame with an empty string.
# Using pandas.DataFrame.fillna[] to nan values
df2 = df.fillna("")
print(df2)
Yields below output.
  Courses      Fee Duration  Discount
0   Spark  20000.0             1000.0
1          25000.0   40days
2  Hadoop            35days    1500.0
3  Python  22000.0
4  pandas  24000.0   60days    2500.0
5                    50days    2100.0
6    Java  22000.0   55days
5. fillna() with inplace=True
As the output above shows, fillna() returns a new DataFrame. To update the existing DataFrame in place instead, use df.fillna('', inplace=True). When called this way, fillna() returns None.
# Fill NaN in place; df2 is None because fillna() returns None here
df2 = df.fillna('', inplace=True)
print(df2)
Yields below output.
None
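If you would rather not depend on the in-place call's return value, a common alternative is to keep the default behavior and bind the result to a name yourself. This is a minimal sketch with a hypothetical two-row frame; it shows that the original frame is left unchanged and the returned frame has no missing values.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one NaN in each column.
df = pd.DataFrame({"Courses": ["Spark", np.nan], "Fee": [np.nan, 25000.0]})

# Without inplace, fillna() returns a new DataFrame and leaves df as it was.
filled = df.fillna("")

missing_before = int(df.isna().sum().sum())
missing_after = int(filled.isna().sum().sum())
print(missing_before, missing_after)  # 2 0
```

Writing df = df.fillna("") gives the same end result as the in-place form while keeping the call chainable.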
6. Replacing NaN with Empty String on a Specific Column
If you want to fill a single column, you can use df.Courses.fillna('').
# Pandas single column using replace nan empty string
df2 = df.Courses.fillna('')
print(df2)
Yields below output.
0 Spark
1
2 Hadoop
3 Python
4 pandas
5
6 Java
Name: Courses, dtype: object
7. Replace NaN with Zeros
These examples replace NaN values with zeroes in a column.
# Using Courses column replace nan with Zeros
df2 = df['Courses'] = df['Courses'].fillna(0)
print(df2)
# Using Discount column to replace nan with Zeros
df2 = df['Discount'] = df['Discount'].fillna(0)
print(df2)
Yields below output for the Courses column.
0 Spark
1 0
2 Hadoop
3 Python
4 pandas
5 0
6 Java
Name: Courses, dtype: object
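Filling with zeros fits numeric columns best, because the column stays numeric. The sketch below uses a hypothetical three-value Discount Series; the conversion to int is an optional extra step that only becomes possible once no NaN remains.

```python
import numpy as np
import pandas as pd

# Hypothetical numeric column with one missing value.
discount = pd.Series([1000.0, np.nan, 1500.0], name="Discount")

# The column stays float64 after filling with 0.
filled = discount.fillna(0)

# With no NaN left, the values can be converted to integers.
as_int = filled.astype(int)
print(filled.tolist(), as_int.tolist())  # [1000.0, 0.0, 1500.0] [1000, 0, 1500]
```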
8. Remove the NaN and Fill the Empty String
Use df.Courses.replace(np.nan,'',regex=True) to remove the NaN and fill the empty string on the Courses column.
# Remove the nan and fill the empty string
df2 = df.Courses.replace(np.nan,'',regex = True)
print(df2)
Yields below output.
0 Spark
1
2 Hadoop
3 Python
4 pandas
5
6 Java
Name: Courses, dtype: object
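For a plain NaN-to-string substitution like this one, the regex flag should not be needed, and replace() and fillna() are expected to give the same values. The sketch below checks that on a hypothetical three-element Series; treat it as an illustration rather than a statement about every pandas version.

```python
import numpy as np
import pandas as pd

# Hypothetical Courses column with one missing value.
courses = pd.Series(["Spark", np.nan, "Hadoop"], name="Courses")

via_replace = courses.replace(np.nan, "")
via_fillna = courses.fillna("")

print(via_replace.tolist())
print(via_fillna.tolist())
```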
9. Remove the NaN and Fill some Values
Use df.Courses.replace(np.nan,'value',regex=True) to replace NaN with the string 'value'.
# Remove the nan and fill some values
df2 = df.Courses.replace(np.nan,'value',regex = True)
print(df2)
Yields below output.
0 Spark
1 value
2 Hadoop
3 Python
4 pandas
5 value
6 Java
Name: Courses, dtype: object
Conclusion
In this article, you have learned how to replace NaN with blank/empty strings in Pandas using the DataFrame.fillna() and DataFrame.replace() functions. You have also learned how to do this for single and multiple columns.
Happy Learning !!