Visualizing Data with Sparklines
quick text based charts
2024-02-20
Edward Tufte is one of my favorite designers. His print style has been emulated on the web via Tufte CSS.
https://edwardtufte.github.io/tufte-css/
He also advocates for the use of sparklines:
A sparkline is a small intense, simple, word-sized graphic with typographic resolution. Sparklines mean that graphics are no longer cartoonish special occasions with captions and boxes, but rather sparkline graphics can be everywhere a word or number can be: embedded in a sentence, table, headline, map, spreadsheet, graphic.
Here is a sparkline chart I made this morning.
2020: [bars lost in extraction]
2021: [bars lost in extraction]
2022: ▄▆▄▂▂▃▁▁▅█▆▃
2023: [bars lost in extraction]
2024: [bars lost in extraction]
I have a small database where I jot down notes on "things I consume" such as movies, television shows, podcasts, and albums. (I made this database to complement my record keeping on Goodreads, where I track all my reading. This is my "everything else" database.) The graph above quickly, tersely shows post frequency by month for the years since I've been keeping notes.
The chart takes advantage of Unicode characters U+2581 through U+2588: ▁▂▃▄▅▆▇█. Eight bars. So if you map your data to a range of numbers 1 through 8, then you can print out the corresponding Unicode character.
This little Ruby script does the trick just fine:
#!/usr/bin/env ruby
bar = ('▁'..'█').to_a
numbers = ARGV.map(&:to_f)
min, max = numbers.minmax
div = (max - min) / (bar.size - 1)
puts min == max ? bar.last*numbers.size : numbers.map{|num| bar[((num - min) / div).to_i]}.join
Then you can:
$ sparkline 5 6 7 8 9 10 11 12 11 10 9 8 7 6 5
▁▂▃▄▅▆▇█▇▆▅▄▃▂▁
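To make the scaling step easier to follow, here is a minimal sketch of the same idea written as a function. The names `BARS` and `sparkline` are my own for illustration, and the glyphs are built from their code points rather than from a string range, so this is a restatement under those assumptions rather than the script above verbatim.

```ruby
# Eight block glyphs, U+2581 through U+2588, built from their code points.
BARS = (0x2581..0x2588).map { |cp| [cp].pack('U') }

# Illustrative helper: scale each number onto one of the eight bars.
def sparkline(numbers)
  min, max = numbers.minmax
  # All values equal: there is no range to scale, so print full bars.
  return BARS.last * numbers.size if min == max

  # Split the data range into seven steps so that
  # min lands on index 0 and max lands on index 7.
  div = (max - min).to_f / (BARS.size - 1)
  numbers.map { |n| BARS[((n - min) / div).to_i] }.join
end

puts sparkline([5, 6, 7, 8, 9, 10, 11, 12, 11, 10, 9, 8, 7, 6, 5])
# => ▁▂▃▄▅▆▇█▇▆▅▄▃▂▁
```

With a minimum of 5 and a maximum of 12, `div` works out to exactly 1, so each value simply lands on the bar at index `value - 5`.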
To make my chart, I first got a range of dates from my database:
$ # [[years]]
$ recsel db/database.rec -P created -C \
| xargs -I {} gdate -d"{}" +"%Y" \
| uniq
2020
2021
2022
2023
2024
Great! I'll want to iterate over those to do something with them…
for year in `[[years]]`
do
# do something..
done
(I'll elide some code snippets in [[double square brackets]] for ease of reading. When you read "[[years]]" here, you can mentally substitute the recsel | xargs | uniq pipeline above.)
Now, I know that I want a list of posts per month per year. I can start by getting entries from my database.
$ export year="2022" && recsel db/database.rec \
-P created \
-C \
-e "created >> '$(gdate -d"$year-01-01")' && created << '$(gdate -d"$year-12-31 +1 days")'"
Wed, 19 Jan 2022 10:59:19 -0600
Wed, 19 Jan 2022 11:00:57 -0600
Wed, 19 Jan 2022 11:01:34 -0600
...
Sat, 31 Dec 2022 11:25:54 -0700
Sat, 31 Dec 2022 11:25:54 -0700
Sat, 31 Dec 2022 11:25:54 -0700
Cool, I format those as YYYY-MM and count them:
$ # [[recsel]]
$ export year="2022" && recsel db/database.rec \
-P created \
-C \
-e "created >> '$(gdate -d"$year-01-01")' && created << '$(gdate -d"$year-12-31 +1 days")'" \
| while read -r d; do gdate -d "$d" +"%Y-%m"; done \
| uniq -c \
| sed 's/^ *//'
6 2022-01
10 2022-02
7 2022-03
2 2022-04
2 2022-05
4 2022-06
9 2022-09
13 2022-10
11 2022-11
4 2022-12
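For comparison, the format-and-count step can be sketched in Ruby. This assumes the timestamps arrive as RFC 2822 strings like the ones printed above. `posts_per_month` is a hypothetical helper name, and `Enumerable#tally` needs Ruby 2.7 or newer.

```ruby
require 'time' # adds Time.rfc2822

# Illustrative helper: bucket each timestamp by "YYYY-MM" and count the buckets.
# The month is taken in the offset written in the string; gdate would
# convert to the local time zone first.
def posts_per_month(lines)
  lines.map { |line| Time.rfc2822(line).strftime('%Y-%m') }.tally
end

# A few of the timestamps shown earlier.
sample = [
  'Wed, 19 Jan 2022 10:59:19 -0600',
  'Wed, 19 Jan 2022 11:00:57 -0600',
  'Wed, 19 Jan 2022 11:01:34 -0600',
  'Sat, 31 Dec 2022 11:25:54 -0700'
]

posts_per_month(sample).each { |month, count| puts "#{count} #{month}" }
```

One difference worth knowing: `uniq -c` only collapses adjacent duplicates, so the shell version relies on the records already being in date order, while `tally` counts regardless of order.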
Great! But I need to fill in the gaps: I need the months in which I had zero posts too. What I'll do is seq 1 12 and then format it:
$ # [[months]]
$ export year="2022" && \
for month in `seq 1 12`; do printf "%d %d-%02d\n" 0 $year $month; done
0 2022-01
0 2022-02
0 2022-03
0 2022-04
0 2022-05
0 2022-06
0 2022-07
0 2022-08
0 2022-09
0 2022-10
0 2022-11
0 2022-12
… and then join them!
$ join -j 2 -t' ' -e "0" -o 2.1 -a1 <([[months]]) <([[recsel]])
6
10
7
2
2
4
0
0
9
13
11
4
Quick join breakdown:
- -j 2: join on the second column of the two files (the dates)
- -t' ': the field separator is a space
- -a1: in addition to the regular output, print a line for each unpairable line in file 1
- -e "0": if a field is present in the first file but missing from the second file, print a "0"
- -o 2.1: print the first field in the second file
- <(bla bla bla) <(bla bla bla): redirect these commands to be the "files" input for join
That gives us all twelve months!
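The same gap filling can be sketched as a hash lookup with a default of zero, which is roughly what the join with -a1 and -e "0" achieves. `monthly_counts` is a hypothetical helper, and the hash simply restates the counts printed by the uniq -c step above.

```ruby
# Illustrative helper: one count per calendar month, 0 where no posts exist.
def monthly_counts(year, counts)
  (1..12).map { |month| counts.fetch(format('%d-%02d', year, month), 0) }
end

# Counts copied from the uniq -c output; July and August are absent.
counts_2022 = {
  '2022-01' => 6, '2022-02' => 10, '2022-03' => 7, '2022-04' => 2,
  '2022-05' => 2, '2022-06' => 4, '2022-09' => 9, '2022-10' => 13,
  '2022-11' => 11, '2022-12' => 4
}

puts monthly_counts(2022, counts_2022).join(' ')
# => 6 10 7 2 2 4 0 0 9 13 11 4
```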
So putting it all together, that gives us:
for year in `recsel db/database.rec -P created -C | xargs -I {} gdate -d"{}" +"%Y" | uniq`
do
printf "$year: "
join -j 2 -t' ' -e "0" -o 2.1 -a1 \
<(for month in `seq 1 12`; do printf "%d %d-%02d\n" 0 $year $month; done) \
<(recsel db/database.rec -P created -e "created >> '$(gdate -d"$year-01-01")' && created << '$(gdate -d"$year-12-31 +1 days")'" -C \
| while read -r d; do gdate -d "$d" +"%Y-%m"; done \
| uniq -c \
| sed 's/^ *//') \
| xargs sparkline
done
Which gives us the pretty little graphic we saw earlier:
2020: [bars lost in extraction]
2021: [bars lost in extraction]
2022: ▄▆▄▂▂▃▁▁▅█▆▃
2023: [bars lost in extraction]
2024: [bars lost in extraction]
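If the shell pipeline feels dense, the whole flow can also be sketched in one small Ruby program. This is an illustration under assumptions, not the method used above: it takes the timestamps as an array of RFC 2822 strings instead of calling recsel, and the names `BARS`, `sparkline`, and `yearly_sparklines` are hypothetical.

```ruby
require 'time' # adds Time.rfc2822

# Eight block glyphs, U+2581 through U+2588.
BARS = (0x2581..0x2588).map { |cp| [cp].pack('U') }

# Scale numbers onto the eight bars (same arithmetic as the sparkline script).
def sparkline(numbers)
  min, max = numbers.minmax
  return BARS.last * numbers.size if min == max

  div = (max - min).to_f / (BARS.size - 1)
  numbers.map { |n| BARS[((n - min) / div).to_i] }.join
end

# Build one "YEAR: bars" line per year found in the timestamps.
def yearly_sparklines(date_lines)
  # Count posts per [year, month] pair (Enumerable#tally, Ruby 2.7+).
  per_month = date_lines.map { |line| Time.rfc2822(line) }
                        .map { |t| [t.year, t.month] }
                        .tally
  per_month.keys.map(&:first).uniq.sort.map do |year|
    # Months without posts fall back to 0, like the join step.
    counts = (1..12).map { |month| per_month.fetch([year, month], 0) }
    "#{year}: #{sparkline(counts)}"
  end
end

# A few of the timestamps shown earlier, standing in for the recsel output.
sample = [
  'Wed, 19 Jan 2022 10:59:19 -0600',
  'Wed, 19 Jan 2022 11:00:57 -0600',
  'Wed, 19 Jan 2022 11:01:34 -0600',
  'Sat, 31 Dec 2022 11:25:54 -0700'
]
puts yearly_sparklines(sample)
```

Each year is scaled against its own minimum and maximum, as in the shell loop, so bar heights are comparable within a row but not across rows.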
Further Reading:
- The Tao of Unicode Sparklines: https://blog.jonudell.net/2021/08/05/the-tao-of-unicode-sparklines/
- Sparkline in unicode: https://rosettacode.org/wiki/Sparkline_in_unicode