Guide to removing duplicates from a Python list

A duplicate is an item that is exactly like another one, usually because it was copied. If you want to remove duplicates from a Python list, there are five common methods for producing a duplicate-free list.

The five techniques are:

  1. Using the set() function
  2. Using a temporary list
  3. Using enumerate()
  4. Using itertools groupby
  5. Maintaining the order using OrderedDict

Using set() function

The easiest way is to convert the list to a set because sets, by definition, don't contain duplicates.

A set is an unordered collection of unique elements, and Python uses hashing internally to process sets. This makes it a simple and fast method to remove duplicate items from a Python list.

The original order of the list may not be preserved with this approach. This method also fails if the list contains a dict or a similar object, because a dict is not hashable in Python and cannot be stored in a set.
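When the list does contain dicts, one possible workaround is to deduplicate on a hashable stand-in for each dict. The sketch below is only an illustration: dedupe_dicts is a hypothetical helper, and it assumes every dict has hashable values and keys that can be sorted against each other.

```python
def dedupe_dicts(records):
    """Return records without duplicates, keeping the first occurrence.

    Hypothetical helper: assumes hashable values and mutually sortable keys.
    """
    seen = set()
    unique = []
    for record in records:
        # A dict is unhashable, so build a hashable stand-in for it.
        key = tuple(sorted(record.items()))
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


records = [{"id": 1}, {"id": 2}, {"id": 1}]

try:
    set(records)
except TypeError as exc:
    # Converting directly to a set fails because dicts are not hashable.
    print(f"set() failed: {exc}")

print(dedupe_dicts(records))  # [{'id': 1}, {'id': 2}]
```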

A set is written by placing comma-separated items inside curly braces {}, or built with the built-in function set(). Note that an empty pair of braces creates a dict, so an empty set must be written as set().

A set can hold any number of items of different types (integer, float, tuple, string, etc.), but it cannot contain a mutable element such as a list or a dictionary.
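A short sketch of these rules, using made-up values purely for illustration:

```python
# Two equivalent ways to build a set; duplicates collapse automatically.
literal = {1, 2, 2, 3}
from_call = set([1, 2, 2, 3])
print(literal == from_call)  # True

# Elements of different hashable types can be mixed.
mixed = {1, 2.5, "three", (4, 5)}
print(len(mixed))  # 4

# Empty braces create a dict, not a set.
print(type({}).__name__)     # dict
print(type(set()).__name__)  # set

# A mutable element such as a list is rejected.
try:
    bad = {1, [2, 3]}
except TypeError as exc:
    print(f"TypeError: {exc}")
```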

    numbers = [1, 2, 1, 2, 3, 5, 3]
    numbers = list(set(numbers))
    print('The list after removing duplicate elements', numbers)

Output: The list after removing duplicate elements [1, 2, 3, 5]

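Using a temporary list

Here we use a brute-force approach to remove duplicate elements from a list: we create a temporary list and append each element to it only if it is not already present. This keeps the original order, but every membership check scans the temporary list, so the approach becomes slow for long lists.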

    ints_list = [1, 2, 1, 2, 3, 5, 3]
    temp = []
    for x in ints_list:
        if x not in temp:
            temp.append(x)
    ints_list = temp
    print(f'The list after removing duplicate elements = {temp}')

  • Create a basic list with the elements separated by commas and enclosed in square brackets.
  • Create an empty temporary list.
  • Use a for loop to visit each element of the basic list.
  • Append the element to the temporary list if it is not already there.
  • Finally, assign the temporary list back to the basic list.

Output: The list after removing duplicate elements = [1, 2, 3, 5]
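A common variant of the temporary-list idea tracks the elements already seen in a set, so each membership check no longer scans a list. The sketch below is not part of the original walkthrough; dedupe_keep_order is a hypothetical name, and it assumes all elements are hashable.

```python
def dedupe_keep_order(items):
    """Return a new list with duplicates removed, keeping first occurrences.

    Hypothetical helper: assumes every element is hashable.
    """
    seen = set()   # elements encountered so far
    result = []    # unique elements in their original order
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


print(dedupe_keep_order([1, 2, 1, 2, 3, 5, 3]))      # [1, 2, 3, 5]
print(dedupe_keep_order(["b", "a", "b", "c", "a"]))  # ['b', 'a', 'c']
```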

Using enumerate

When looping over an iterable, we often also need to keep a count of the iterations. Python makes this easy by providing the built-in function enumerate().

In this example, we will use enumerate() to remove duplicates from the list.
The enumerate() function adds a counter to an iterable and returns it as an enumerate object. This object can be used directly in for loops or converted into a list of tuples using list().
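To make the counter concrete, here is a small illustration with a made-up list of letters:

```python
letters = ["a", "b", "c"]

# Each item is paired with its index, starting at 0 by default.
print(list(enumerate(letters)))  # [(0, 'a'), (1, 'b'), (2, 'c')]

# The optional start argument changes the first counter value.
for position, letter in enumerate(letters, start=1):
    print(position, letter)
```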

    x = [1, 2, 1, 2, 3, 5, 3]
    # Keep a value only if it does not appear earlier in the list.
    # Comparing with x[i-1] alone would catch only adjacent repeats.
    print([v for i, v in enumerate(x) if v not in x[:i]])

Output: [1, 2, 3, 5]

Using itertools groupby

The fourth way to remove duplicates from a Python list is to use the itertools library.
The itertools module implements a number of iterator building blocks, each recast in a form suitable for Python.

The module standardizes a core set of fast, memory-efficient tools that are useful by themselves or in combination. Together, they form an iterator algebra that makes it possible to construct specialized tools succinctly and efficiently in pure Python.

For instance, SML provides a tabulation tool, tabulate(f), which produces a sequence f(0), f(1), .... The same effect can be achieved in Python by combining map() and count() to form map(f, count()).

These tools also work well with the high-speed functions in the operator module. For example, the multiplication operator can be mapped across two vectors to form an efficient dot product.
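Both ideas can be sketched in a few lines. The function square and the two vectors are made up for illustration; islice is used only to take a finite prefix of the endless stream.

```python
import operator
from itertools import count, islice, starmap


def square(n):
    return n * n


# map(f, count()) is an endless stream f(0), f(1), ...; islice takes a prefix.
first_five = list(islice(map(square, count()), 5))
print(first_five)  # [0, 1, 4, 9, 16]

# Map the multiplication operator across two vectors, then sum the products.
vec1 = [1, 2, 3]
vec2 = [4, 5, 6]
dot = sum(starmap(operator.mul, zip(vec1, vec2)))
print(dot)  # 32
```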

groupby() is another way to remove duplicates from a Python list, but it only collapses runs of consecutive equal elements, so the list must be sorted first. As a result the original order is lost and the elements must be comparable with one another. Although groupby() is a built-in implemented in C, the extra sort means it is not necessarily faster than the methods above.

    from itertools import groupby

    x = [1, 2, 1, 2, 3, 5, 3]
    # Sort first so that equal values sit next to each other.
    print([key for key, _ in groupby(sorted(x))])

Output: [1, 2, 3, 5]
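The effect of sorting is easiest to see side by side. The following sketch runs groupby() on the same list with and without sorting, plus a made-up string where collapsing consecutive runs is the desired result:

```python
from itertools import groupby

x = [1, 2, 1, 2, 3, 5, 3]

# On unsorted input only adjacent repeats collapse, so nothing changes here.
print([key for key, _ in groupby(x)])          # [1, 2, 1, 2, 3, 5, 3]

# Sorting first puts equal values next to each other.
print([key for key, _ in groupby(sorted(x))])  # [1, 2, 3, 5]

# Collapsing consecutive runs can also be useful on its own.
print([key for key, _ in groupby("aaabbbcca")])  # ['a', 'b', 'c', 'a']
```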

By Maintaining the order using OrderedDict

In Python 3.5 and above, OrderedDict from the collections library has a C implementation.

An OrderedDict is a dictionary subclass that remembers the order in which keys were first inserted. Before Python 3.7 this was the main difference from a regular dict, which did not guarantee insertion order. Since Python 3.7 the built-in dict guarantees insertion order as well.

OrderedDict.fromkeys() keeps only the first occurrence of each element, in its original position. This makes it a fast technique for removing duplicate elements from a list while preserving order, as long as the elements are hashable. It is worth a try:

    from collections import OrderedDict

    # initializing list
    test_list = [1, 2, 1, 2, 3, 5, 3]
    res = list(OrderedDict.fromkeys(test_list))

    # printing list after removal
    print("The list after removing duplicates : " + str(res))

Output: The list after removing duplicates : [1, 2, 3, 5]
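On Python 3.7 and later, where the built-in dict also preserves insertion order, the same idea can be sketched without any import. The list of letters is made up to show that the first occurrence of each element keeps its position.

```python
test_list = ["b", "a", "b", "c", "a"]

# dict.fromkeys() builds a dict whose keys are the unique elements,
# in the order they were first seen (guaranteed from Python 3.7 on).
res = list(dict.fromkeys(test_list))
print(res)  # ['b', 'a', 'c']
```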

Conclusion

In conclusion, we hope you enjoyed this article.

We have tried to cover the main ways to remove duplicates from a Python list. If you know a better way of doing this, let us know in the comments.