For a full list of BASHing data blog posts, see the index page.

48 sea levels and a trope for your terminal

The messiest dataset I've ever audited included a field that illustrates a key rule of Fussy Database Management: Never let users enter free-text data in a field unless absolutely necessary.

The field in question was for elevations and it contained 48 different strings that all meant "sea level":

Another 39 strings meant "near sea level":

How to bulk-replace all 48 sea levels with "0 m a.s.l." (for example)? One way would be to save those unique entries in a file (I called the file "sl1"; see first screenshot, above), then store them in an AWK array. When AWK finds one of those entries in the elevation field (field 27 in "table"), it replaces the entry:

awk 'BEGIN {FS=OFS="\t"} FNR==NR {a[$0]; next} $27 in a {$27="0 m a.s.l."} 1' sl1 table > newtable
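To see the two-file idiom at work on something small, here is a minimal sketch using made-up data. The temporary directory, the three lookup strings and the four-record table are all hypothetical, and the elevation sits in field 2 rather than field 27. While AWK reads the first file (FNR==NR), each whole line becomes an array index; in the second file, any elevation that exactly matches an index is replaced.

```shell
# Hypothetical demo files in a temporary directory (not the audited dataset)
tmp=$(mktemp -d)

# A few "sea level" variants, one per line, standing in for the real "sl1"
printf '%s\n' 'sea level' 'Sea Level' '0m' > "$tmp/sl1"

# A tiny tab-separated table with the elevation in field 2
printf 'site1\tsea level\nsite2\t120 m\nsite3\t0m\nsite4\tnear sea level\n' > "$tmp/table"

# Same command as above, with $2 in place of $27
awk 'BEGIN {FS=OFS="\t"} FNR==NR {a[$0]; next} $2 in a {$2="0 m a.s.l."} 1' \
    "$tmp/sl1" "$tmp/table" > "$tmp/newtable"

cat "$tmp/newtable"
```

Only site1 and site3 should change. The match is on the whole field, character for character, so "near sea level" is left alone, and an entry with a stray trailing space would only be caught if that exact string were also in the lookup file.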

(See also bulk replacement in A Data Cleaner's Cookbook)

In movies and on TV over the past 25 years or so, "ACCESS DENIED" usually pops up when the keyboard user fails in a login attempt. I used to think this trope was pure Americana, but you can find it in Australian productions as well (below).

Obviously we need this important message in our terminals, so here it is, obfuscated. Use it double-quoted with echo -e or with printf (add "\n" at the end) for little-visited if/else conditions in your scripts.

\e[41m\x20\e[97m\e[1m\x41\x43\x43\x45\x53\x53\x20\x44\x45\x4e\x49\x45\x44\x21\x20\e[0m
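As a rough sketch of how the string might sit in a script, the following assumes bash, whose builtin printf interprets the \e and \x escapes. The function need_file and the temporary file are hypothetical names invented for this illustration. The hex codes spell out " ACCESS DENIED! ", and the \e[...m sequences set a red background with bold, bright white text, then reset the colours.

```shell
#!/usr/bin/env bash
# The obfuscated message, double-quoted so the backslash escapes stay intact
denied="\e[41m\x20\e[97m\e[1m\x41\x43\x43\x45\x53\x53\x20\x44\x45\x4e\x49\x45\x44\x21\x20\e[0m"

# need_file is a hypothetical helper: succeed only if the argument is a readable file
need_file() {
    if [ -r "$1" ]; then
        printf 'Reading %s\n' "$1"
    else
        # The message is used as the printf format, so its escapes are interpreted;
        # "\n" is added at the end because printf does not append a newline
        printf "$denied\n" >&2
        return 1
    fi
}

tmpf=$(mktemp)                       # a file that certainly exists
need_file "$tmpf"                    # the usual branch
need_file "$tmpf.missing" || true    # the little-visited branch
rm -f "$tmpf"
```

Sending the message to stderr and returning a non-zero status is a design choice for this sketch, not something the string itself requires.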

Last update: 2018-08-11