Data Structures in Python

Website Developer
Jan 21, 2023

1. Understanding Data Structures

Data structures are an integral part of programming, and their fundamentals apply to every programming language, regardless of which one the user chooses. The Merriam-Webster dictionary defines a data structure as “any of various methods or formats (such as an array, file, or record) for organizing data in a computer.”

https://www.geeksforgeeks.org/data-structures/

However, that description is rather vague and leaves much open to interpretation. In programming, a data structure may be defined as a way of organizing and storing data on a computer so that it can be accessed and updated efficiently.

Still, that raises the question: how and why are these data structures used? This tutorial breaks down the four major built-in data structures in Python (lists, sets, tuples, and dictionaries), explains how each one is used and when it may be chosen, and provides code examples to illustrate each point.

2. Lists

Lists are among the most commonly used of the four data structures in Python. A list is an ordered collection of items, each identifiable by its position. Lists can also be nested and are mutable. Nesting means that a list can contain objects of any type, including other lists. Mutable means that a list can be altered after its initial creation: elements may be added, removed, reordered, or reassigned in any way the user likes. A list keeps its elements in the same order until the user changes it. To put this into perspective, a simple list is shown below.

INPUT:
safari_animals = ['elephant', 'lion', 'rhino', 'hippo', 'zebra']
print(safari_animals)

OUTPUT:
['elephant', 'lion', 'rhino', 'hippo', 'zebra']
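
To make nesting and mutability more concrete, here is a minimal sketch. The names `mixed` and `animals` are illustrative assumptions and not part of the example above.

```python
# A list can hold objects of different types, including other lists (nesting).
mixed = ['elephant', 42, 3.5, ['lion', 'rhino']]
inner_first = mixed[3][0]      # first element of the nested list

# Lists are mutable: elements can be added or removed after creation.
animals = ['elephant', 'lion', 'rhino']
animals.append('zebra')        # add to the end
animals.insert(1, 'hippo')     # insert at index 1
animals.remove('lion')         # remove the first matching value

print(inner_first)  # lion
print(animals)      # ['elephant', 'hippo', 'rhino', 'zebra']
```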

Indexing Lists

The elements in the list ‘safari_animals’ are indexed in the order in which they were entered, reading from the left-most to the right-most entry. Indexing always starts at 0. The entries in the list above are therefore indexed from 0 to 4, with 0 assigned to the first entry, ‘elephant’, and 4 assigned to the last entry, ‘zebra’.

Any element in a list may be retrieved by its index. The example below retrieves and prints the element at index 2.

INPUT:
print(safari_animals[2])

OUTPUT:
rhino
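
As a small supplementary sketch, indexes can also count from the end of the list using negative numbers, and an index past the end raises an error. The variable names other than `safari_animals` are assumptions made for illustration.

```python
safari_animals = ['elephant', 'lion', 'rhino', 'hippo', 'zebra']

first = safari_animals[0]          # index 0 is the first element
last = safari_animals[-1]          # index -1 is the last element
second_last = safari_animals[-2]   # index -2 is the one before it

# Asking for an index that does not exist raises IndexError.
try:
    safari_animals[5]
except IndexError as error:
    out_of_range_message = str(error)

print(first, last, second_last)  # elephant zebra hippo
print(out_of_range_message)      # list index out of range
```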

Modifying Lists

As the example below shows, a new value can be assigned to any element of the list by its index. Both positive and negative integers may be used as indexes; a negative index counts from the end of the list. Here, every element except the last is reassigned.

INPUT:
safari_animals[0] = 'lion'
safari_animals[1] = 'elephant'
safari_animals[2] = 'hippo'
safari_animals[3] = 'rhino'
print(safari_animals)

OUTPUT:
['lion', 'elephant', 'hippo', 'rhino', 'zebra']
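
Since negative indexes are allowed in assignments as well, the following sketch shows one in use, along with a one-line swap of two elements. The value 'giraffe' is a hypothetical addition for illustration only.

```python
safari_animals = ['elephant', 'lion', 'rhino', 'hippo', 'zebra']

# A negative index counts from the end: -1 is the last element.
safari_animals[-1] = 'giraffe'

# Two elements can be swapped in a single statement.
safari_animals[0], safari_animals[1] = safari_animals[1], safari_animals[0]

print(safari_animals)  # ['lion', 'elephant', 'rhino', 'hippo', 'giraffe']
```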

Slicing Lists

The term “slicing” refers to selecting a range of elements from a list. The user chooses where the range starts, where it stops, and at what interval elements are taken; everything else is omitted.

The first form of slice is written with a single colon between two index values, enclosed in square brackets. The first index marks where the slice starts and the second marks where it stops; the element at the stop index is not included. This format is shown in the following example.

LIST VALUES: ['lion', 'elephant', 'hippo', 'rhino', 'zebra']
INPUT:
print(safari_animals[0:2])

OUTPUT:
['lion', 'elephant']

The second form of slice is written with two colons separating three values, enclosed in square brackets. The value before the first colon is the index of the first element in the range; the value after the first colon is the index at which the range stops, and the element at that index is excluded; the value after the second colon is the step, the interval at which elements are taken. An example of this format is shown below, using the elements of the list “my_list”.

LIST VALUES: [0,1,2,3,4,5,6,7,8,9,10]
INPUT:
print(my_list[0:10:2])

OUTPUT:
[0, 2, 4, 6, 8]
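
Any of the three slice values may also be left out, in which case Python falls back to a default (the start of the list, the end of the list, and a step of 1). The sketch below illustrates this; the variable names other than `my_list` are assumptions.

```python
my_list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

first_three = my_list[:3]     # start omitted: begins at index 0
from_eight = my_list[8:]      # stop omitted: runs to the end
every_third = my_list[::3]    # start and stop omitted: whole list, step 3
reversed_copy = my_list[::-1] # negative step: walks the list backwards

print(first_three)    # [0, 1, 2]
print(from_eight)     # [8, 9, 10]
print(every_third)    # [0, 3, 6, 9]
print(reversed_copy)  # [10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
```

A slice always produces a new list, so `my_list` itself is left unchanged by these operations.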

3. Sets

Unlike lists, sets do not keep their elements in any particular order, and each element can appear only once. A set is the better choice when whether an object is present in a collection matters more than its position or the number of times it appears. The example below shows the structure of a set.

INPUT:
safari_animals = {'elephant', 'lion', 'rhino', 'hippo', 'crocodile', 'zebra'}
print(safari_animals)

OUTPUT (element order may vary):
{'elephant', 'lion', 'rhino', 'hippo', 'crocodile', 'zebra'}
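
The uniqueness property and the membership test can be sketched as follows. The list `sightings` is a hypothetical input invented for this illustration.

```python
# A hypothetical log of sightings, with repeats.
sightings = ['lion', 'zebra', 'lion', 'elephant', 'zebra', 'lion']

# Converting to a set keeps one copy of each distinct value.
unique_animals = set(sightings)

print(len(sightings))             # 6
print(len(unique_animals))        # 3
print('lion' in unique_animals)   # True
print('hippo' in unique_animals)  # False

# Sets are unordered, so sort when a predictable order is needed.
print(sorted(unique_animals))     # ['elephant', 'lion', 'zebra']
```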

Modifying a Set

Sets are mutable, so elements can be added or removed after the set is created. An element cannot be changed in place; it is removed and a new one is added instead. The following example uses the set ‘safari_animals’, to which an element is added and from which another is removed.

INPUT:
safari_animals.add('hyena')
safari_animals.remove('crocodile')
print(safari_animals)

OUTPUT (element order may vary):
{'elephant', 'lion', 'rhino', 'hippo', 'zebra', 'hyena'}
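
Two details are worth a short sketch: adding an element that is already present changes nothing, and `remove` raises an error for a missing element whereas `discard` does not. The helper variable names here are assumptions.

```python
safari_animals = {'elephant', 'lion', 'rhino'}

# Adding an element that already exists has no effect.
safari_animals.add('lion')
size_after_duplicate_add = len(safari_animals)

# discard() silently ignores a missing element.
safari_animals.discard('crocodile')

# remove() raises KeyError when the element is missing.
try:
    safari_animals.remove('crocodile')
    remove_raised = False
except KeyError:
    remove_raised = True

print(size_after_duplicate_add)  # 3
print(remove_raised)             # True
```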

Set Operations

Sets are often used to test whether an element is contained in a set, and to test relationships between two or more sets, such as which elements they share or whether one set is contained in another. Testing these relationships resembles working with a Venn diagram: sets may be combined, intersected, or subtracted from one another, giving the user kinds of control that lists do not offer directly.

A few operations are especially notable. The intersection operation returns the elements shared by two or more sets. The union operation returns all elements of two or more sets, combined. The difference operation returns the elements of one set that are not found in the others. The following example shows how these set operations may be used.

INPUT:
Set_A = {1, 3, 5, 7}
Set_B = {5, 6, 7, 8}
intersection_op = Set_A.intersection(Set_B)
print(intersection_op)
union_op = Set_A.union(Set_B)
print(union_op)
difference_op = Set_A.difference(Set_B)
print(difference_op)

OUTPUT:
{5, 7}
{1, 3, 5, 6, 7, 8}
{1, 3}

In the example above, the first operation, the intersection, extracts the elements that Set_A and Set_B have in common. The second, the union, combines Set_A and Set_B into a single set containing every element found in either one. The third, the difference, removes from Set_A every element that also appears in Set_B and returns what remains as a new set.
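
The same three operations are also available as operators, which some readers find easier to scan. The sketch below reuses Set_A and Set_B and adds a containment test; the result variable names are assumptions.

```python
Set_A = {1, 3, 5, 7}
Set_B = {5, 6, 7, 8}

shared = Set_A & Set_B      # same as Set_A.intersection(Set_B)
combined = Set_A | Set_B    # same as Set_A.union(Set_B)
only_in_a = Set_A - Set_B   # same as Set_A.difference(Set_B)
only_in_b = Set_B - Set_A   # difference depends on which set comes first

# Containment: is every element of one set also in another?
shared_within_a = shared.issubset(Set_A)

print(shared == {5, 7})                   # True
print(combined == {1, 3, 5, 6, 7, 8})     # True
print(only_in_a == {1, 3})                # True
print(only_in_b == {6, 8})                # True
print(shared_within_a)                    # True
```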

4. Tuples

Tuples are very similar to lists but have their own uses and advantages. Like lists, they are ordered collections of stored objects, but they are immutable: once a tuple is created, its elements cannot be added, removed, or reassigned. This makes tuples preferable for sensitive data, where neither the order of the data nor the data itself is meant to be modified, since immutability guards against accidental alteration or loss. Because their size is fixed, tuples also generally take up somewhat less memory than lists holding the same elements, which can make code slightly more efficient. To put the general format of a tuple into perspective, an example is shown below.

INPUT:
safari_animals = ('elephant', 'lion', 'rhino', 'hippo', 'zebra')
print(safari_animals)

OUTPUT:
('elephant', 'lion', 'rhino', 'hippo', 'zebra')
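
One rough way to check the memory point is to compare container sizes with `sys.getsizeof`. This is only a sketch: the exact byte counts depend on the Python implementation and version, so no specific numbers are claimed here, and the variable names are assumptions.

```python
import sys

as_tuple = ('elephant', 'lion', 'rhino', 'hippo', 'zebra')
as_list = ['elephant', 'lion', 'rhino', 'hippo', 'zebra']

# getsizeof reports the size of the container itself, not of the strings inside.
tuple_size = sys.getsizeof(as_tuple)
list_size = sys.getsizeof(as_list)

print(tuple_size, list_size)
print(tuple_size < list_size)  # expected to be True on CPython
```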

Indexing Tuples

The elements in the tuple ‘safari_animals’ are indexed in the order in which they were entered, reading from the left-most to the right-most entry. Indexing always starts at 0. The entries in the tuple above are therefore indexed from 0 to 4, with 0 assigned to the first entry, ‘elephant’, and 4 assigned to the last entry, ‘zebra’.

Any element in a tuple may be retrieved by its index. The example below retrieves and prints the element at index 4.

INPUT:
print(safari_animals[4])

OUTPUT:
zebra

The values in a tuple can be structured similarly to those of a list; however, one must keep in mind that once created, a tuple cannot be modified. Its contents stay fixed for as long as the tuple exists; the only way to get different contents is to create a new tuple.
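
Immutability can be seen directly in a short sketch: assigning to an element raises a TypeError, while building a new tuple from the old one works. The value 'giraffe' and the helper names are hypothetical.

```python
safari_animals = ('elephant', 'lion', 'rhino', 'hippo', 'zebra')

# Item assignment is not supported on tuples.
try:
    safari_animals[0] = 'giraffe'
    assignment_failed = False
except TypeError:
    assignment_failed = True

# A new tuple can be created from the old one; the original is unchanged.
extended = safari_animals + ('giraffe',)

print(assignment_failed)    # True
print(len(safari_animals))  # 5
print(len(extended))        # 6
```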

Slicing Tuples

The user may select a range of elements from a tuple, choosing where the range starts, where it stops, and at what interval elements are taken. The first form of slice is written with a single colon between two index values, enclosed in square brackets. The first index marks where the slice starts and the second marks where it stops; the element at the stop index is not included. Slicing a tuple returns a new tuple. This format is shown in the following example.

TUPLE VALUES: ('elephant', 'lion', 'rhino', 'hippo', 'zebra')
INPUT:
print(safari_animals[1:3])

OUTPUT:
('lion', 'rhino')

The second form of slice is written with two colons separating three values, enclosed in square brackets. The value before the first colon is the index of the first element in the range; the value after the first colon is the index at which the range stops, and the element at that index is excluded; the value after the second colon is the step, the interval at which elements are taken. An example of this format is shown below, using the elements of the tuple “my_tuple”.

TUPLE VALUES: (10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110)
INPUT:
print(my_tuple[0:10:2])

OUTPUT:
(10, 30, 50, 70, 90)

Tuple Functions

A few notable built-in functions can be applied to tuples. The first is the len function, which returns the number of elements in a given tuple. The max and min functions also deserve mention: they return the maximum and minimum values present in a given tuple. These functions are most frequently used in numerical data analysis and computation. The following example uses the elements of the tuple “my_tuple”.

TUPLE VALUES: (10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110)
INPUT:
print(len(my_tuple))
print(max(my_tuple))
print(min(my_tuple))

OUTPUT:
11
110
10
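
Beyond len, max, and min, tuples have two methods of their own, count and index, and work with other built-ins such as sum and sorted. The sketch below uses a hypothetical tuple named `readings` rather than the one above.

```python
# Hypothetical data with repeated values.
readings = (10, 20, 30, 20, 10, 20)

times_20_appears = readings.count(20)  # how many elements equal 20
position_of_30 = readings.index(30)    # index of the first 30
total = sum(readings)                  # sum of all elements
ordered = sorted(readings)             # sorted() returns a new list

print(times_20_appears)  # 3
print(position_of_30)    # 2
print(total)             # 110
print(ordered)           # [10, 10, 20, 20, 20, 30]
```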

5. Dictionaries

The dictionary is the last of the four main data structures in Python. A dictionary is indexed by its keys, which are most often strings or numbers; to put it into a visual perspective, its format resembles a phonebook directory, in which a name is used to look up a number. This format allows for simple and efficient lookup of stored data. It should be noted that dictionaries are mutable data structures.

Dictionary Structure

In each entry, the item to the left of the colon is known as the “key”, and the item to the right of the colon is known as the “value”. For this demonstration, the dictionary below contains information on a user’s current car.

INPUT:
current_car = {'make' : 'ford', 'model' : 'focus', 'year' : 2003, 'horespower' : 165, 'torque' : 145}
print(current_car)

OUTPUT:
{'make': 'ford', 'model': 'focus', 'year': 2003, 'horespower': 165, 'torque': 145}

Retrieving Information

The items, keys, and values contained in a dictionary may each be retrieved as a group: items() returns the key-value pairs, keys() returns the keys, and values() returns the values.

INPUT:
print(current_car.items())
print(current_car.keys())
print(current_car.values())

OUTPUT:
dict_items([('make', 'ford'), ('model', 'focus'), ('year', 2003), ('horespower', 165), ('torque', 145)])
dict_keys(['make', 'model', 'year', 'horespower', 'torque'])
dict_values(['ford', 'focus', 2003, 165, 145])
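
A single value can also be looked up by its key. The sketch below uses a shortened version of the dictionary; the key 'color' and the default 'unknown' are hypothetical and only serve to show what happens when a key is missing.

```python
current_car = {'make': 'ford', 'model': 'focus', 'year': 2003}

# Square brackets look up the value stored under a key.
make = current_car['make']

# get() returns a default instead of raising KeyError for a missing key.
color = current_car.get('color', 'unknown')

# items() can be looped over to visit each key-value pair.
summary = [f"{key}: {value}" for key, value in current_car.items()]

print(make)     # ford
print(color)    # unknown
print(summary)  # ['make: ford', 'model: focus', 'year: 2003']
```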

Modifying Dictionaries

Since dictionaries are a mutable data type, their contents can be modified by adding new items, changing existing items, or removing entries. The example below takes the dictionary ‘current_car’ and adds a new item to it; the value for the key ‘horespower’ is also changed to an updated value.

INPUT:
current_car['drivetrain'] = 'fwd'
current_car['horespower'] = 170
print(current_car)

OUTPUT:
{'make': 'ford', 'model': 'focus', 'year': 2003, 'horespower': 170, 'torque': 145, 'drivetrain': 'fwd'}

To delete an entry from the dictionary, the del statement is used, followed by the name of the dictionary and, inside square brackets, the key to be deleted. https://www.dataquest.io/blog/python-dictionaries/

INPUT:
del current_car['drivetrain']
print(current_car)

OUTPUT:
{'make': 'ford', 'model': 'focus', 'year': 2003, 'horespower': 170, 'torque': 145}
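
As an alternative to del, the pop method removes a key and hands back its value, and the `in` operator checks whether a key exists before it is removed. This is a minimal sketch on a shortened dictionary; the key 'color' is hypothetical.

```python
current_car = {'make': 'ford', 'model': 'focus', 'drivetrain': 'fwd'}

# Check for a key before removing it.
has_drivetrain = 'drivetrain' in current_car

# pop() removes the key and returns the value that was stored under it.
removed_value = current_car.pop('drivetrain')

# With a default, pop() does not raise KeyError for a missing key.
missing_value = current_car.pop('color', None)

print(has_drivetrain)  # True
print(removed_value)   # fwd
print(missing_value)   # None
print(current_car)     # {'make': 'ford', 'model': 'focus'}
```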

6. Wrap-up / Summary

Data structures are an important concept common to all programming languages, so it is imperative that programmers learn the basic data structures in order to advance in their chosen language. It should also be noted that Python is not limited to these four data types (lists, sets, tuples, and dictionaries); rather, these are the most frequently used within the language. Readers are encouraged to familiarize themselves with these concepts and to keep exploring more efficient approaches as they advance through their programming journey.
