Tutorial: CSV > Markdown (1)
As a reminder, the tutorial showed how to manipulate a multi-line string containing CSV-formatted data. The example we worked from is the code below:
csvData = '''
activity, duration, importance
homework,4h,very important
Valorant,4h, important
Reading,4h, important
Gardening,2h,important
'''
The objective is to turn the data above into the equivalent Markdown table, like the one the site Table Convert would generate:
| activity | duration | importance |
|-----------|----------|----------------|
| homework | 4h | very important |
| Valorant | 4h | important |
| Reading | 4h | important |
| Gardening | 2h | important |
First part: split()
To reconstruct the Markdown table, we need to extract the useful information (activity, duration, importance, Gardening, 2h, etc.) and ignore the structural elements of CSV: the comma (,) and the line break (\n).
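To make these structural characters visible, here is a small sketch that was not part of the tutorial. It uses Python's built-in repr() to display the string with its line breaks written as \n, and count() to count the commas and line breaks. The variable names line_breaks and commas are chosen only for this illustration.

```python
# Same data as in the tutorial
csvData = '''
activity, duration, importance
homework,4h,very important
Valorant,4h, important
Reading,4h, important
Gardening,2h,important
'''

# repr() shows the string as Python stores it: each line break appears as \n
print(repr(csvData))

# Count the structural characters (names below are illustrative only)
line_breaks = csvData.count('\n')
commas = csvData.count(',')
print(line_breaks, commas)
# displays: 6 10
```

There are 6 line breaks for 5 lines of text because the string also starts with a line break, right after the opening quotes.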
First split: line breaks
The csvData string is a multi-line string, which means it contains several line breaks. We will therefore start by separating these lines so that each one can be processed independently of the others.
The split() method allows this: lines = csvData.split('\n')
We then obtain, instead of a single multi-line variable, a list where each element corresponds to a line:
# csvData is a single multi-line string
csvData = '''
activity, duration, importance
homework,4h,very important
Valorant,4h, important
Reading,4h, important
Gardening,2h,important
'''
lines = csvData.split('\n')
# To create the lines variable, we split csvData to get a list
# The list contains 7 elements: the 5 lines of text, plus an empty string
# at each end, because csvData starts and ends with a line break
# content of the lines variable:
[
'',
'activity, duration, importance',
'homework,4h,very important',
'Valorant,4h, important',
'Reading,4h, important',
'Gardening,2h,important',
''
]
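The list shown above can be checked directly. The sketch below is an addition to the tutorial: it uses len() to count the elements and indexes to look at the first, second and last ones.

```python
csvData = '''
activity, duration, importance
homework,4h,very important
Valorant,4h, important
Reading,4h, important
Gardening,2h,important
'''

lines = csvData.split('\n')

# Number of elements in the list
print(len(lines))
# displays: 7

# The first and last elements are empty strings
print(lines[0] == '', lines[-1] == '')
# displays: True True

# The first line of real data is at index 1
print(lines[1])
# displays: activity, duration, importance
```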
Second split: commas
To understand our objective, we will for now process only one line. For example, let's consider the string 'activity, duration, importance'.
This string contains three words, separated by a comma then a space. We want to keep the words and ignore the combination ', ' (comma then space).
The solution: use the split() method again, this time giving it "comma then space" as the separator: line.split(', ').
The string is then divided into three parts, and we get the list ['activity', 'duration', 'importance'], which is a list of three items.
line = 'activity, duration, importance'
items = line.split(', ')
# This time we split a single line and obtained a list
# This list, called here items, contains our words
# content of the items variable:
[
'activity',
'duration',
'importance'
]
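One detail worth keeping in mind: split() only cuts where the separator matches exactly. The sketch below, which goes slightly beyond the tutorial, compares two lines from csvData. The variable names header and row are chosen only for this illustration.

```python
# Two lines taken from csvData (variable names are illustrative only)
header = 'activity, duration, importance'
row = 'homework,4h,very important'

# The header uses "comma then space" everywhere, so it splits into three items
print(header.split(', '))
# displays: ['activity', 'duration', 'importance']

# This row uses commas without spaces, so ', ' is never found:
# split() returns a list containing the whole line as one item
print(row.split(', '))
# displays: ['homework,4h,very important']

# Splitting on the comma alone does cut this row into three items
print(row.split(','))
# displays: ['homework', '4h', 'very important']
```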
To summarize
- We had a single multi-line string
- We split it at each line break, to get a list of lines
- We know how to split a line at each "comma then space", to get a list of words
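The summary above can be sketched as two steps in a row. This is only a recap of what was already shown; the variable name first_line is chosen for the illustration.

```python
csvData = '''
activity, duration, importance
homework,4h,very important
Valorant,4h, important
Reading,4h, important
Gardening,2h,important
'''

# Step 1: split the multi-line string at each line break
lines = csvData.split('\n')

# Step 2: take one line and split it at each "comma then space"
# lines[0] is the empty string before the first line break, so we use lines[1]
first_line = lines[1]
items = first_line.split(', ')

print(items)
# displays: ['activity', 'duration', 'importance']
```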
Automate with for
By using the for keyword to create a loop, we can automate the process. In the previous part, we split one single line. But we have several, and depending on the example chosen there could be 5, 10, 100, etc.
The for item in list: instruction creates a loop that performs one iteration for each object in the list. For example:
for item in ["cat", "dog", "bird"]:
    print(item)
# This simple code will have the effect of displaying:
cat
dog
bird

# We avoid naming a variable list, which would hide Python's built-in list type
animals = ["fish", "bee", "leopard"]
for word in animals:
    print(word)
# This code will have the effect of displaying:
fish
bee
leopard
Based on this example and on the two previous parts, we can now split all the lines one after the other instead of splitting a single line. The following code is the complete final example seen in class on Tuesday:
csvData = '''
activity, duration, importance
homework,4h,very important
Valorant,4h, important
Reading,4h, important
Gardening,2h,important
'''
lines = csvData.split('\n')
for line in lines:
    print(line.split(', '))
# This code will have the effect of displaying:
['']
['activity', 'duration', 'importance']
['homework,4h,very important']
['Valorant,4h', 'important']
['Reading,4h', 'important']
['Gardening,2h,important']
['']
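In the output above, only the first line is fully split, because it is the only one that uses "comma then space" everywhere. The other lines contain at least one comma with no space after it. One possible way around this, which was not covered in class and is offered only as a sketch, is to split on the comma alone and then remove the surrounding spaces with strip(). Skipping the empty lines with continue is also an assumption about the result we want.

```python
csvData = '''
activity, duration, importance
homework,4h,very important
Valorant,4h, important
Reading,4h, important
Gardening,2h,important
'''

lines = csvData.split('\n')
for line in lines:
    # Assumption: empty lines carry no data, so we skip them
    if line == '':
        continue
    items = []
    # Split on the comma alone, then strip() removes spaces around each word
    for item in line.split(','):
        items.append(item.strip())
    print(items)
# displays:
# ['activity', 'duration', 'importance']
# ['homework', '4h', 'very important']
# ['Valorant', '4h', 'important']
# ['Reading', '4h', 'important']
# ['Gardening', '2h', 'important']
```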
At the very end of the demonstration we also saw how to save the values instead of simply displaying them, but one thing at a time: this simple example already gives us enough to review.