When reading data from a file (like a CSV), it's common to want to keep an intact copy of the original data, even after processing it.
If we accidentally modify a variable containing important data, we risk losing information and making the program difficult to fix.
In Python, it's customary to write the names of variables we want to consider as constants entirely in uppercase.
Example:
ORIGINAL_DATA = ["Name", "Firstname", "Age"]
This indicates to code readers (and ourselves!) that this variable should not be modified.
A tuple is an immutable structure in Python: we cannot modify its content after creation.
Thus, to protect a set of fixed values, it's preferable to use a tuple:
EXPECTED_COLUMNS = ("Name", "Firstname", "Age")
When processing data (cleaning, sorting, typing...), it's useful to create new variables at each step, with precise names.
For example:
csv_data = load_csv("file.csv")
cleaned_data = clean(csv_data)
typed_data = type(cleaned_data)
This allows:
| Tip | Advantage |
|---|---|
| Uppercase for constants | Visually identifiable in code |
| Use a tuple | Protects against accidental modifications |
| Name steps clearly | Promotes clarity and security |