<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"><title>Bryan Brattlof</title><link href="https://0x42.sh/" rel="alternate"/><link href="https://0x42.sh/feeds/all.atom.xml" rel="self"/><id>https://0x42.sh/</id><updated>2022-04-02T00:00:00+00:00</updated><entry><title>How to Proxy Git Connections</title><link href="https://0x42.sh/how-to-proxy-git-connections/" rel="alternate"/><published>2022-04-02T00:00:00+00:00</published><updated>2022-04-02T00:00:00+00:00</updated><author><name>Bryan Brattlof</name></author><id>tag:0x42.sh,2022-04-02:/how-to-proxy-git-connections/</id><summary type="html">&lt;p&gt;One of the first things you will notice when you start developing open
source software for a larger corporation is the company firewall or
proxy blocking all outgoing traffic not routed through their network
appliance for inspection and review first. This sadly includes all of
our &lt;code&gt;git fetch&lt;/code&gt; traffic.&lt;/p&gt;
&lt;p&gt;Before …&lt;/p&gt;</summary><content type="html">&lt;p&gt;One of the first things you will notice when you start developing open
source software for a larger corporation is the company firewall or
proxy blocking all outgoing traffic not routed through their network
appliance for inspection and review first. This sadly includes all of
our &lt;code&gt;git fetch&lt;/code&gt; traffic.&lt;/p&gt;
&lt;p&gt;Before we can do anything upstream we first need to configure Git to
pull in updates through these corporate firewalls.  This is where the
multipurpose relay tool, &lt;a class="reference external" href="http://www.dest-unreach.org/"&gt;socat&lt;/a&gt;, shines.&lt;/p&gt;
&lt;div class="section" id="proxy-http-s-remotes"&gt;
&lt;h2&gt;Proxy http(s) Remotes&lt;/h2&gt;
&lt;p&gt;Git uses the &lt;a class="reference external" href="https://git-scm.com/book/en/v2/Git-on-the-Server-The-Protocols#_the_http_protocols"&gt;Smart HTTP protocol&lt;/a&gt; to download updates from a remote
whose URI starts with &lt;code&gt;https://*&lt;/code&gt;. These types of connections have
become, by far, the most popular way to use Git today and are the default
protocol used by many popular forges like GitHub or GitLab.&lt;/p&gt;
&lt;p&gt;Internally Git relies on the &lt;a class="reference external" href="https://curl.se/libcurl/"&gt;libcurl&lt;/a&gt; library to handle these HTTP
connections which means Git comes with the ability to proxy these
requests built-in.&lt;/p&gt;
&lt;p&gt;It also means that Git will respect your system's &lt;code&gt;http_proxy&lt;/code&gt; and
&lt;code&gt;https_proxy&lt;/code&gt; environment variables. Simply add something like
this to your &lt;code&gt;~/.bashrc&lt;/code&gt; or execute these lines in your shell
before you need to pull updates from your remote project and Git will
proxy the requests through your corporate firewall appliance
automatically.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$&lt;span class="w"&gt; &lt;/span&gt;grep&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;https\?_proxy&amp;#39;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;~/.bashrc
&lt;span class="nb"&gt;export&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;http_proxy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;http://mega.co.proxy.example.com:8080/
&lt;span class="nb"&gt;export&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;https_proxy&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;http://mega.co.proxy.example.com:8080/
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Alternatively, if you want to keep all your Git configurations in one
place, you can add this to your &lt;code&gt;~/.gitconfig&lt;/code&gt; to configure Git's
proxy settings:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;[http]
    proxy = http://mega.co.proxy.example.com:8080
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;If you only need to proxy the repositories hosted &lt;strong&gt;outside&lt;/strong&gt; of the
company network, you can use a pattern like this to filter the domains
that use the &lt;code&gt;proxy&lt;/code&gt; configuration settings.&lt;/p&gt;
&lt;p&gt;For example any repository hosted on &lt;a class="reference external" href="https://git.kernel.org/"&gt;kernel.org&lt;/a&gt; servers, use:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;[http]
[http &amp;quot;https://git.kernel.org&amp;quot;]
    proxy = http://mega.co.proxy.example.com:8080
&lt;/pre&gt;&lt;/div&gt;
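&lt;p&gt;The reverse setup can also be sketched: proxy everything by default and
exempt only the internal hosts. In this minimal example the hostname
&lt;code&gt;git.mega.co.example.com&lt;/code&gt; is a made-up internal forge, and it
assumes a Git version that treats an empty &lt;code&gt;proxy&lt;/code&gt; value as
&amp;quot;no proxy&amp;quot;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;[http]
    proxy = http://mega.co.proxy.example.com:8080
[http &amp;quot;https://git.mega.co.example.com&amp;quot;]
    # hypothetical internal host: an empty value disables the proxy
    proxy =
&lt;/pre&gt;&lt;/div&gt;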
&lt;p&gt;For more details on proxying HTTP-based Git projects, check out the
&lt;code&gt;--proxy&lt;/code&gt; option in the &lt;a class="reference external" href="https://curl.se/docs/manpage.html"&gt;curl(1)&lt;/a&gt; man page, or the
&lt;code&gt;http.proxy&lt;/code&gt; entry in the &lt;a class="reference external" href="https://www.git-scm.com/docs/git-config"&gt;git-config&lt;/a&gt; documentation.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="proxy-git-remotes"&gt;
&lt;h2&gt;Proxy Git Remotes&lt;/h2&gt;
&lt;p&gt;Another common way to fetch updates from a remote tree is by using its
&lt;code&gt;git://*&lt;/code&gt; URI which uses the &lt;a class="reference external" href="https://git-scm.com/book/en/v2/Git-on-the-Server-The-Protocols"&gt;Git Server Protocol&lt;/a&gt;. It is often used to
share projects that do not require any user authentication as it is the
fastest of the three transfer protocols available.&lt;/p&gt;
&lt;p&gt;To proxy these types of requests we will need Git to run a simple shell
script, like the example below, any time we fetch a remote repository.
This example script will proxy all repositories outside of the
&lt;code&gt;example.com&lt;/code&gt; domain.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$&lt;span class="w"&gt; &lt;/span&gt;cat&lt;span class="w"&gt; &lt;/span&gt;~/bin/gitproxy
&lt;span class="c1"&gt;#!/usr/bin/sh&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;*example.com&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;then&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="c1"&gt;# the repository is outside the company network, proxy it&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;socat&lt;span class="w"&gt; &lt;/span&gt;tcp:mega.co.proxy.example.com:8080:&lt;span class="nv"&gt;$1&lt;/span&gt;:&lt;span class="nv"&gt;$2&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="c1"&gt;# no need to proxy internal network traffic&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;socat&lt;span class="w"&gt; &lt;/span&gt;tcp:&lt;span class="nv"&gt;$1&lt;/span&gt;:&lt;span class="nv"&gt;$2&lt;/span&gt;
&lt;span class="k"&gt;fi&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Then, to have Git run this script, we can set the &lt;code&gt;gitProxy&lt;/code&gt;
setting in our &lt;code&gt;~/.gitconfig&lt;/code&gt; like this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;core&lt;span class="o"&gt;]&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nv"&gt;gitProxy&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;~/bin/gitproxy
&lt;/pre&gt;&lt;/div&gt;
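&lt;p&gt;If the shell logic is not needed, the domain filter can possibly live in
the configuration itself. As far as the git-config documentation describes
it, &lt;code&gt;core.gitProxy&lt;/code&gt; accepts a &lt;code&gt;COMMAND for DOMAIN&lt;/code&gt;
form, may be set several times with the first match winning, and treats the
special command &lt;code&gt;none&lt;/code&gt; as &amp;quot;connect directly&amp;quot;. A sketch, reusing the
script path and the &lt;code&gt;example.com&lt;/code&gt; domain from above:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;[core]
    # internal hosts ending in example.com connect directly
    gitProxy = none for example.com
    # everything else goes through the script
    gitProxy = ~/bin/gitproxy
&lt;/pre&gt;&lt;/div&gt;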
&lt;p&gt;Because this is just a simple shell script, we can make this as
complicated or as simple as we want. For example, you can add more logic
if you use a laptop that may need to work outside of the office.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="proxy-ssh-remotes"&gt;
&lt;h2&gt;Proxy SSH Remotes&lt;/h2&gt;
&lt;p&gt;Finally, if the remote starts with &lt;code&gt;ssh://*&lt;/code&gt; or uses a typical ssh
connection like &lt;code&gt;git&amp;#64;git.sr.ht:*&lt;/code&gt; we are connecting via the &lt;a class="reference external" href="http://git-scm.com/book/en/Git-on-the-Server-The-Protocols#The-SSH-Protocol"&gt;SSH
protocol&lt;/a&gt;. These connections are typically used when you maintain the
remote repository or otherwise need to authenticate yourself before you
can &lt;code&gt;git push&lt;/code&gt; changes to the server.&lt;/p&gt;
&lt;p&gt;In these situations, Git relies on SSH to handle the authentication and
connection to the remote, which means we will need to edit our
&lt;code&gt;~/.ssh/config&lt;/code&gt; file to proxy these types of connections.&lt;/p&gt;
&lt;p&gt;Inside OpenSSH's configuration file, we can use the &lt;code&gt;ProxyCommand&lt;/code&gt;
option for all remotes, excluding any host inside our company's network.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Host&lt;span class="w"&gt; &lt;/span&gt;*&lt;span class="w"&gt; &lt;/span&gt;!*.mega.co.example.com
&lt;span class="w"&gt;    &lt;/span&gt;User&lt;span class="w"&gt; &lt;/span&gt;git
&lt;span class="w"&gt;    &lt;/span&gt;ProxyCommand&lt;span class="w"&gt; &lt;/span&gt;socat&lt;span class="w"&gt; &lt;/span&gt;tcp:mega.co.proxy.example.com:8080:%h:%p
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Or we can manually add the remotes we want proxied. For example any
remote at &lt;a class="reference external" href="https://git.kernel.org"&gt;git.kernel.org&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Host&lt;span class="w"&gt; &lt;/span&gt;git.kernel.org
&lt;span class="w"&gt;    &lt;/span&gt;User&lt;span class="w"&gt; &lt;/span&gt;git
&lt;span class="w"&gt;    &lt;/span&gt;ProxyCommand&lt;span class="w"&gt; &lt;/span&gt;socat&lt;span class="w"&gt; &lt;/span&gt;tcp:mega.co.proxy.example.com:8080:%h:%p
&lt;/pre&gt;&lt;/div&gt;
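&lt;p&gt;If &lt;code&gt;socat&lt;/code&gt; is not installed, the OpenBSD flavour of
&lt;code&gt;nc&lt;/code&gt; can usually fill the same role. A sketch modelled on the
&lt;code&gt;ProxyCommand&lt;/code&gt; example in ssh_config(5), assuming your
&lt;code&gt;nc&lt;/code&gt; supports the &lt;code&gt;-X&lt;/code&gt; and &lt;code&gt;-x&lt;/code&gt; options
(not every netcat variant does):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Host git.kernel.org
    User git
    ProxyCommand nc -X connect -x mega.co.proxy.example.com:8080 %h %p
&lt;/pre&gt;&lt;/div&gt;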
&lt;p&gt;For more information about OpenSSH's config file options, check out the
&lt;a class="reference external" href="https://manpages.ubuntu.com/manpages/xenial/en/man5/ssh_config.5.html"&gt;ssh_config(5)&lt;/a&gt; man page.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="wrapping-up"&gt;
&lt;h2&gt;Wrapping Up&lt;/h2&gt;
&lt;p&gt;I hope these notes were of some use or helped you in some way. I
recently accepted my first open source role working for a larger company
(&amp;gt;10,000 people) and wanted to write these notes down in the hopes it
would help others working in open source projects behind walled gardens.&lt;/p&gt;
&lt;p&gt;As always, if you see something that is incorrect or that I need to update
or clarify, please feel free to &lt;a class="reference external" href="https://0x42.sh/connect/"&gt;send me a note&lt;/a&gt; in whatever way
is convenient for you.&lt;/p&gt;
&lt;p&gt;~Bryan&lt;/p&gt;
&lt;/div&gt;
</content><category term="Notes"/></entry><entry><title>My PGP Cheat Sheet</title><link href="https://0x42.sh/my-pgp-cheat-sheet/" rel="alternate"/><published>2021-11-02T00:00:00+00:00</published><updated>2021-11-02T00:00:00+00:00</updated><author><name>Bryan Brattlof</name></author><id>tag:0x42.sh,2021-11-02:/my-pgp-cheat-sheet/</id><summary type="html">&lt;div class="sidebar"&gt;
&lt;p class="first sidebar-title"&gt;Getting Started with PGP?&lt;/p&gt;
&lt;p class="last"&gt;The Linux Foundation has a public GitHub wiki on &lt;a class="reference external" href="https://github.com/lfit/itpol/blob/master/protecting-code-integrity.md"&gt;Protecting Code
Integrity with PGP&lt;/a&gt; that goes into more detail than what
I've written here and can help you avoid the many pitfalls when
setting up a secure PGP key in the 21&lt;sup&gt;st&lt;/sup&gt; century.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;I use …&lt;/p&gt;</summary><content type="html">&lt;div class="sidebar"&gt;
&lt;p class="first sidebar-title"&gt;Getting Started with PGP?&lt;/p&gt;
&lt;p class="last"&gt;The Linux Foundation has a public GitHub wiki on &lt;a class="reference external" href="https://github.com/lfit/itpol/blob/master/protecting-code-integrity.md"&gt;Protecting Code
Integrity with PGP&lt;/a&gt; that goes into more detail than what
I've written here and can help you avoid the many pitfalls when
setting up a secure PGP key in the 21&lt;sup&gt;st&lt;/sup&gt; century.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;I use &lt;code&gt;gpg&lt;/code&gt; about as often as &lt;code&gt;tar&lt;/code&gt;, which is just rarely
enough to forget how to use it.&lt;/p&gt;
&lt;p&gt;What lies below are my (a Linux kernel contributor) notes on some of the
less used commands to keep your &lt;code&gt;gpg&lt;/code&gt; keys safe and your copy of
the Linux &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Web_of_trust"&gt;Web of Trust&lt;/a&gt; up to date.&lt;/p&gt;
&lt;p&gt;Please feel free to &lt;a class="reference external" href="https://0x42.sh/connect/"&gt;contact me&lt;/a&gt; if anything you read below
is wrong or out of date, or if you find something I should expand on. :)&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Helpful Jump To Points:&lt;/strong&gt;&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;a class="reference internal" href="#why-pgp"&gt;Why PGP?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="reference internal" href="#convert-a-pgp-key-to-an-ssh-key"&gt;Convert A PGP Key To An SSH Key&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="reference internal" href="#locating-a-public-key"&gt;Locating A Public Key&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="reference internal" href="#updating-your-keyring"&gt;Updating Your Keyring&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="reference internal" href="#signing-a-public-key"&gt;Signing A Public Key&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="reference internal" href="#extending-an-expiration-date"&gt;Extending An Expiration Date&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="reference internal" href="#backing-up-a-gnupg-directory"&gt;Backing Up A GnuPG Directory&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="section" id="why-pgp"&gt;
&lt;h2&gt;Why PGP?&lt;/h2&gt;
&lt;p&gt;Today, the Linux kernel development cycle is solely (&lt;a class="reference external" href="https://lwn.net/Articles/860607/"&gt;for now&lt;/a&gt;) an email-based workflow involving pulling changes from
roughly 300 different repositories.  &lt;a class="footnote-reference" href="#footnote-1" id="footnote-reference-1"&gt;[1]&lt;/a&gt; These changes will eventually
make their way into Linus' main tree (&lt;a class="reference external" href="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/"&gt;the mainline&lt;/a&gt;) for the final
approval before being distributed to the masses.&lt;/p&gt;
&lt;table class="docutils footnote" frame="void" id="footnote-1" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#footnote-reference-1"&gt;[1]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;Not every repository needs to be pulled before each release -
though the &lt;a class="reference external" href="https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git"&gt;linux-next repository&lt;/a&gt;, the repository
subsystem maintainers use to resolve merge conflicts before they
reach Linus, &lt;a class="reference external" href="https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/Next/Trees"&gt;tracks around 300&lt;/a&gt;.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;With the existence of &lt;a class="reference external" href="https://www.malwarebytes.com/phishing"&gt;phishing&lt;/a&gt;, &lt;a class="reference external" href="https://www.malwarebytes.com/spoofing"&gt;spoofing&lt;/a&gt; and other &lt;a class="reference external" href="https://www.malwarebytes.com/social-engineering"&gt;social-engineering&lt;/a&gt;
attacks, any one of these pull requests could conceivably bring
malicious code via a spoofed email, fooling either Linus or any one of
the subsystem maintainers.&lt;/p&gt;
&lt;p&gt;After the widespread credential stealing attack that &lt;a class="reference external" href="https://lwn.net/Articles/458099/"&gt;compromised much
of the core kernel.org infrastructure&lt;/a&gt; in 2011, the decision
was made to develop a PGP based &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Web_of_trust"&gt;Web of Trust&lt;/a&gt; to give developers a way
to independently verify other repositories without a central authority.&lt;/p&gt;
&lt;p&gt;Nowadays, pull requests must have a signed tag from a PGP key the
maintainer trusts, not only to validate the pull request but also to attest
to the changes being made, like a &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Chain_of_custody"&gt;chain of custody&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="convert-a-pgp-key-to-an-ssh-key"&gt;
&lt;h2&gt;Convert A PGP Key To An SSH Key&lt;/h2&gt;
&lt;p&gt;Each PGP key can have one or more roles: Signing, Encryption,
Authentication or Certification. One &amp;quot;neat&amp;quot; feature about authentication
keys is they can be used as an OpenSSH key, allowing you to store your
SSH and PGP keys together on the same &lt;a class="reference external" href="https://www.nitrokey.com/"&gt;NitroKey&lt;/a&gt;.&lt;/p&gt;
&lt;div class="sidebar"&gt;
&lt;p class="first sidebar-title"&gt;Be Careful&lt;/p&gt;
&lt;p class="last"&gt;Before setting up a NitroKey or any Smartcard make sure you create an
offline backup of your PGP keys (&lt;a class="reference internal" href="#backing-up-a-gnupg-directory"&gt;Backing Up A GnuPG Directory&lt;/a&gt;)
allowing you to recover after a catastrophe.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Any time you need to give out your public SSH key, just use this
command:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ gpg --export-ssh-key hello@bryanbrattlof.com
ssh-rsa AAAAB3NzaC1yc2E ...
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Linode has a good write-up on how to get started &lt;a class="reference external" href="https://www.linode.com/docs/guides/gpg-key-for-ssh-authentication/"&gt;using your PGP key with
SSH&lt;/a&gt;.&lt;/p&gt;
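&lt;p&gt;For &lt;code&gt;ssh&lt;/code&gt; itself to use the key, &lt;code&gt;gpg-agent&lt;/code&gt;
generally has to stand in for the usual SSH agent. A rough sketch of the two
pieces involved, assuming a GnuPG 2.x install with default paths; adapt it to
however your shell and session start the agent:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ cat ~/.gnupg/gpg-agent.conf
enable-ssh-support
$ grep SSH_AUTH_SOCK ~/.bashrc
export SSH_AUTH_SOCK=$(gpgconf --list-dirs agent-ssh-socket)
&lt;/pre&gt;&lt;/div&gt;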
&lt;/div&gt;
&lt;div class="section" id="locating-a-public-key"&gt;
&lt;h2&gt;Locating A Public Key&lt;/h2&gt;
&lt;div class="sidebar"&gt;
&lt;p class="first sidebar-title"&gt;Do You Like Reading?&lt;/p&gt;
&lt;p class="last"&gt;The full documentation on how to use the Linux kernel's PGP keyring
can be found here on &lt;a class="reference external" href="https://korg.docs.kernel.org/pgpkeys.html"&gt;the docs.kernel.org website&lt;/a&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Around June of 2019 &lt;a class="footnote-reference" href="#footnote-2" id="footnote-reference-2"&gt;[2]&lt;/a&gt;, the &lt;a class="reference external" href="https://www.rossde.com/PGP/pgp_keyserv.html"&gt;public SKS Keyserver network&lt;/a&gt;
used to publish public PGP keys was &lt;a class="reference external" href="https://gist.github.com/rjhansen/67ab921ffb4084c865b3618d6955275f"&gt;attacked in a way that could not be
easily fixed&lt;/a&gt;. The attacker(s) attached thousands of valid but
useless signatures to well known PGP keys to bloat the complexity of the
Web of Trust graph and render &lt;code&gt;gpg&lt;/code&gt; unusable for anyone
unfortunate enough to download a &amp;quot;poisoned&amp;quot; key.&lt;/p&gt;
&lt;table class="docutils footnote" frame="void" id="footnote-2" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#footnote-reference-2"&gt;[2]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;This was a known problem for more than a decade.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;As a workaround, the Linux community published a repository of all the
well-known contributors to the kernel called &lt;a class="reference external" href="https://git.kernel.org/pub/scm/docs/kernel/pgpkeys.git/"&gt;pgpkeys.git&lt;/a&gt;, which we can
use to send and receive updates to our PGP keys within the kernel community.&lt;/p&gt;
&lt;ol class="arabic simple"&gt;
&lt;li&gt;Before we can find a key, we need to clone the kernel's PGP keyring
repository:&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ git clone https://git.kernel.org/pub/scm/docs/kernel/pgpkeys.git korg-keys
&lt;/pre&gt;&lt;/div&gt;
&lt;div class="sidebar"&gt;
&lt;p class="first sidebar-title"&gt;For the Cautious:&lt;/p&gt;
&lt;p class="last"&gt;Each commit is signed by the maintainer, allowing us to &lt;code&gt;git verify-commit&lt;/code&gt;
to validate each change. (&lt;a class="reference external" href="https://git-scm.com/docs/git-verify-commit"&gt;git-verify-commit&lt;/a&gt;) If you already have
Linus' key in your keyring you should be able to verify Konstantin's,
the maintainer of the repository.&lt;/p&gt;
&lt;/div&gt;
&lt;ol class="arabic simple" start="2"&gt;
&lt;li&gt;To find someone's PGP key, &lt;code&gt;git grep&lt;/code&gt; for the key in the
repository:&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ git grep -l torvalds *.asc
keys/79BE3E4300411886.asc
&lt;/pre&gt;&lt;/div&gt;
&lt;ol class="arabic simple" start="3"&gt;
&lt;li&gt;Then import the key into our keyring:&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ gpg --import keys/79BE3E4300411886.asc
&lt;/pre&gt;&lt;/div&gt;
&lt;ol class="arabic simple" start="4"&gt;
&lt;li&gt;&lt;strong&gt;Alternatively:&lt;/strong&gt; we could import all keys currently in the
repository:&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ gpg --import keys/*.asc
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Done!&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;With the key imported, we will need a way to update these keys from the
PGP repository. See the &lt;a class="reference internal" href="#updating-your-keyring"&gt;Updating Your Keyring&lt;/a&gt; section below.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="updating-your-keyring"&gt;
&lt;h2&gt;Updating Your Keyring&lt;/h2&gt;
&lt;p&gt;After the public SKS keyserver network was abandoned and the Linux
kernel developers' PGP keyring repository was created, any update to a
kernel developer's PGP key should be uploaded to the repository so it
can be shared with the community.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;To submit changes&lt;/strong&gt; to your keys, use this command to email your
updates to the mailing list:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ gpg -a --export [Email] | mail -s [Email] keys@linux.kernel.org
&lt;/pre&gt;&lt;/div&gt;
&lt;div class="sidebar"&gt;
&lt;p class="first sidebar-title"&gt;A Note on PGP Etiquette:&lt;/p&gt;
&lt;p class="last"&gt;It is considered poor form to publish updates to someone else's
public key.  When you sign someone's public key, encrypt it, so only
they can use it, and send your changes directly to them so they can
decide on how to publish your changes.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;To download updates&lt;/strong&gt; from the keyring repository, &lt;a class="reference external" href="https://git.kernel.org/pub/scm/docs/kernel/pgpkeys.git/tree/scripts/korg-refresh-keys"&gt;a helpful script
was added&lt;/a&gt;, allowing us to download any new changes
using cron, systemd, or invoking the script manually.&lt;/p&gt;
&lt;p&gt;I prefer to let systemd run the script:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ cat ~/.config/systemd/user/korg-refresh-keys.timer
[Timer]
OnCalendar=daily
Persistent=yes

[Install]
WantedBy=timers.target
&lt;/pre&gt;&lt;/div&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ cat ~/.config/systemd/user/korg-refresh-keys.service
[Service]
ExecStart=%h/bin/korg-refresh-keys -q
Type=oneshot
&lt;/pre&gt;&lt;/div&gt;
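&lt;p&gt;The two units do nothing until the timer is enabled. Assuming the file
names above, something like this should activate it for the current user and
show when it will next fire:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ systemctl --user daemon-reload
$ systemctl --user enable --now korg-refresh-keys.timer
$ systemctl --user list-timers korg-refresh-keys.timer
&lt;/pre&gt;&lt;/div&gt;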
&lt;p&gt;So I can see how the script is doing at any time by running:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ journalctl --user -fu korg-refresh-keys
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;div class="section" id="signing-a-public-key"&gt;
&lt;h2&gt;Signing A Public Key&lt;/h2&gt;
&lt;div class="sidebar"&gt;
&lt;p class="first sidebar-title"&gt;A Quick Remark:&lt;/p&gt;
&lt;p class="last"&gt;An entire essay should be (&lt;a class="reference external" href="https://carouth.com/articles/signing-pgp-keys/"&gt;and has been&lt;/a&gt;) written on how
and when to sign someone's public key. Most of them were written
before pandemics, which &lt;a class="reference external" href="https://lwn.net/Articles/831401/"&gt;has only complicated how key signing works&lt;/a&gt;. All of this is something I don't want to write about here.
(o.O)&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Once you feel confident signing someone's key or UID(s) and attesting to
the validity of their PGP key, you can sign it with the following
command:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ gpg --ask-cert-level --ask-cert-expire --sign-key someone@example.com
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Each situation is different, and everyone (and each community) has their
own methods for signing someone's key.  These are only the default
arguments I use, and I adapt them to the situation.&lt;/p&gt;
&lt;table class="docutils option-list" frame="void" rules="none"&gt;
&lt;col class="option" /&gt;
&lt;col class="description" /&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="option-group" colspan="2"&gt;
&lt;kbd&gt;&lt;span class="option"&gt;--ask-cert-level&lt;/span&gt;&lt;/kbd&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&amp;nbsp;&lt;/td&gt;&lt;td&gt;Prompt for a certification level allowing you to specify how confident
you are about this signature. Useful to signal the difference between
the random person you met at a conference versus your workmate.&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class="option-group" colspan="2"&gt;
&lt;kbd&gt;&lt;span class="option"&gt;--ask-cert-expire&lt;/span&gt;&lt;/kbd&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&amp;nbsp;&lt;/td&gt;&lt;td&gt;Prompt for an expiration time, allowing your signature to expire after
a set amount of time. I like to put an expiration date on my signatures
when possible if only to stop an old email address having a valid
signature from me.&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class="option-group" colspan="2"&gt;
&lt;kbd&gt;&lt;span class="option"&gt;--sign-key &lt;var&gt;&amp;lt;name&amp;gt;&lt;/var&gt;&lt;/span&gt;&lt;/kbd&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&amp;nbsp;&lt;/td&gt;&lt;td&gt;Signs a public key with your secret certification key. This is a
shortcut version of the subcommand &amp;quot;sign&amp;quot; from &lt;code&gt;--edit-key&lt;/code&gt;.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;If you're like me and store your certification key offline in &lt;a class="reference internal" href="#backing-up-a-gnupg-directory"&gt;an
encrypted USB drive&lt;/a&gt;, &lt;strong&gt;(something I
strongly encourage)&lt;/strong&gt; this process will be a little more complicated.&lt;/p&gt;
&lt;p&gt;Don't forget &lt;a class="reference internal" href="#backing-up-a-gnupg-directory"&gt;to update your backup&lt;/a&gt;
after you finish.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="extending-an-expiration-date"&gt;
&lt;h2&gt;Extending An Expiration Date&lt;/h2&gt;
&lt;div class="sidebar"&gt;
&lt;p class="first sidebar-title"&gt;Before We Start:&lt;/p&gt;
&lt;p class="last"&gt;To extend the expiration date of your PGP keys you will need access
to your secret certification key. If you are like me, this requires
your &lt;a class="reference internal" href="#backing-up-a-gnupg-directory"&gt;encrypted offline backup&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;By default, master certification keys have an expiration date set two
years from the date of their creation. This is for security reasons and
to let obsolete keys disappear from the Web of Trust.&lt;/p&gt;
&lt;p&gt;To add one year (from the current date) to the expiration, pass the key's
fingerprint to &lt;code&gt;--quick-set-expire&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ gpg --quick-set-expire [Fingerprint] 1y
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Make sure to &lt;a class="reference internal" href="#backing-up-a-gnupg-directory"&gt;backup your GnuPG directory&lt;/a&gt; as well as &lt;a class="reference internal" href="#updating-your-keyring"&gt;email the kernel developer's keyring mailing
list&lt;/a&gt; to let everyone know about the changes
you have made.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="backing-up-a-gnupg-directory"&gt;
&lt;h2&gt;Backing Up A GnuPG Directory&lt;/h2&gt;
&lt;p&gt;Now that we live squarely in the hyper-connected &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Information_Age"&gt;information age&lt;/a&gt;, the more sensitive data we can store, encrypted
and offline, the better. This includes keeping &lt;strong&gt;all the secret key
material&lt;/strong&gt; used in our PGP keys &lt;strong&gt;off our working computers&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;How much effort you spend backing up your private keys is ultimately
your decision. My goal is to have a &lt;strong&gt;good&lt;/strong&gt; story to tell the mailing
lists explaining how I lost control of my keys. I feel comfortable
having two &lt;a class="reference external" href="https://gitlab.com/cryptsetup/cryptsetup/"&gt;LUKS&lt;/a&gt; encrypted USB drives and a &lt;a class="reference external" href="https://www.jabberwocky.com/software/paperkey/"&gt;paperkey&lt;/a&gt; backup, storing
all of my PGP keys, and allowing me to restore my &lt;a class="reference external" href="https://www.nitrokey.com/"&gt;Nitrokey&lt;/a&gt; if that
device happens to die.&lt;/p&gt;
&lt;div class="sidebar"&gt;
&lt;p class="first sidebar-title"&gt;How to encrypt a USB drive?&lt;/p&gt;
&lt;p class="last"&gt;A full write up on how to setup a LUKS encrypted device can be found
on the &lt;a class="reference external" href="https://gitlab.com/cryptsetup/cryptsetup/-/wikis/FrequentlyAskedQuestions#2-setup"&gt;cryptsetup repository&lt;/a&gt;, though most Linux
distributions will have cryptsetup integrated already.&lt;/p&gt;
&lt;/div&gt;
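&lt;p&gt;For the paper copy, &lt;code&gt;paperkey&lt;/code&gt; reads an exported secret key
on standard input and writes a printable text file. A minimal sketch, where
&lt;code&gt;[Fingerprint]&lt;/code&gt; and the output file name are placeholders, to be
run wherever the secret key material is actually available:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ gpg --export-secret-key [Fingerprint] | paperkey --output paperkey-backup.txt
&lt;/pre&gt;&lt;/div&gt;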
&lt;p&gt;If you are like me and prefer to keep your main certification key safely
stored in an encrypted drive and off your working computer, you will need
to mount your encrypted device first:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ mkdir /media/device/gnupg-backup
$ cryptsetup luksOpen [DEVICE] gnupg-backup-enc
$ mount /dev/mapper/gnupg-backup-enc /media/device/gnupg-backup
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Then we can tell &lt;code&gt;gpg&lt;/code&gt; to use it:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ export GNUPGHOME=/media/device/gnupg-backup
$ gpg --list-secret-keys
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You should now see that &lt;code&gt;sec#&lt;/code&gt; has been replaced with &lt;code&gt;sec&lt;/code&gt;
indicating that both the public and the secret key material are
available for your main certification key.&lt;/p&gt;
&lt;p&gt;Now is a good time to make changes that require our main key like
&lt;a class="reference internal" href="#extending-an-expiration-date"&gt;Extending An Expiration Date&lt;/a&gt;, &lt;a class="reference internal" href="#signing-a-public-key"&gt;Signing A Public Key&lt;/a&gt; or running
commands like &lt;code&gt;fsck&lt;/code&gt; on the decrypted volume.&lt;/p&gt;
&lt;p&gt;Once you have finished making your changes, we need to import these
changes back into our everyday working &lt;code&gt;.gnupg&lt;/code&gt; directory:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ gpg --export | gpg --homedir ~/.gnupg --import
$ unset GNUPGHOME
$ gpg --list-secret-keys
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You should now see &lt;code&gt;sec#&lt;/code&gt; and &lt;code&gt;ssb&amp;gt;&lt;/code&gt; have returned.&lt;/p&gt;
&lt;p&gt;Finally, unmount and close our encrypted volume, and return our USB
drive to its safe place.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ umount /media/device/gnupg-backup
$ cryptsetup luksClose gnupg-backup-enc
$ rmdir /media/device/gnupg-backup
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;div class="section" id="wrapping-up"&gt;
&lt;h2&gt;Wrapping Up&lt;/h2&gt;
&lt;p&gt;I will say &lt;code&gt;gpg&lt;/code&gt; is not the easiest tool to use. After 30 years of
development, the time and effort needed to maintain the Web of Trust
seems to be more than many are willing to endure.&lt;/p&gt;
&lt;p&gt;As difficult as it is, &lt;code&gt;gpg&lt;/code&gt; is a great open source tool, giving
developers from all around the world, regardless of timezone, language, or
access to git forges like GitHub, the opportunity to work on one of the
most widely used software projects in the world.&lt;/p&gt;
&lt;p&gt;I hope these notes helped you in some way. If you read anything that
needs to be updated or you feel like I should expand on something,
please feel free to &lt;a class="reference external" href="https://0x42.sh/connect/"&gt;write me an email&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;
</content><category term="Notes"/></entry><entry><title>The Linux Kernel Coding Style</title><link href="https://0x42.sh/the-linux-kernel-coding-style/" rel="alternate"/><published>2021-05-28T00:00:00+00:00</published><updated>2021-05-28T00:00:00+00:00</updated><author><name>bryan brattlof</name></author><id>tag:0x42.sh,2021-05-28:/the-linux-kernel-coding-style/</id><summary type="html">&lt;p&gt;Like the (very) old saying goes: &lt;em&gt;&amp;quot;Beauty is in the eye of the
beholder&amp;quot;&lt;/em&gt;. This old (Greek maybe?) saying even holds true for the
software we write. The Linux kernel, like any sufficiently large
software project, with thousands of developers each having their own
ideas of beautifully readable code, the …&lt;/p&gt;</summary><content type="html">&lt;p&gt;Like the (very) old saying goes: &lt;em&gt;&amp;quot;Beauty is in the eye of the
beholder&amp;quot;&lt;/em&gt;. This old (Greek maybe?) saying even holds true for the
software we write. In the Linux kernel, as in any sufficiently large
software project with thousands of developers each having their own
ideas of beautifully readable code, the issue of how the code looks
eventually becomes a topic of discussion.&lt;/p&gt;
&lt;p&gt;These coding style standards, while sometimes annoying, are essential to
letting developers worry about developing rather than trying to decipher
another developer's &lt;em&gt;&amp;quot;beautifully&amp;quot;&lt;/em&gt; written code.&lt;/p&gt;
&lt;p&gt;So for the 4&lt;sup&gt;th&lt;/sup&gt; challenge in &lt;a class="reference external" href="https://eudyptula-challenge.org/"&gt;The Eudyptula Challenge&lt;/a&gt; we'll
return our focus to the &lt;a class="reference external" href="https://0x42.sh/the-hello-world-kernel-module/"&gt;Hello World Kernel Module&lt;/a&gt; we
created in the 1&lt;sup&gt;st&lt;/sup&gt; challenge, bringing it up to the (charmingly
unique) Linux kernel coding standard of what beautifully written C code
looks like.&lt;/p&gt;
&lt;div class="sidebar"&gt;
&lt;p class="first sidebar-title"&gt;Brief Aside:&lt;/p&gt;
&lt;p&gt;Before we begin, this is the 4&lt;sup&gt;th&lt;/sup&gt; write-up &lt;a class="reference external" href="https://0x42.sh/eudyptula-challenge/"&gt;in a series&lt;/a&gt; as I work through &lt;a class="reference external" href="https://eudyptula-challenge.org/"&gt;The Eudyptula Challenge&lt;/a&gt;.
If you wish to work on The Eudyptula Challenge yourself before you
read my notes (recommended), you can use &lt;a class="reference external" href="https://git.sr.ht/~bryanb/eudyptula"&gt;my git repository&lt;/a&gt; which has all 20 tasks and
the code I used to complete each one.&lt;/p&gt;
&lt;p class="last"&gt;Or you can start with the 1&lt;sup&gt;st&lt;/sup&gt; challenge of this series I've
published &lt;a class="reference external" href="https://0x42.sh/eudyptula-challenge/"&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;&lt;em&gt;Enjoy!&lt;/em&gt;&lt;/p&gt;
&lt;div class="section" id="task-no-4"&gt;
&lt;h2&gt;Task No.4&lt;/h2&gt;
&lt;blockquote&gt;
&lt;p&gt;Wonderful job in making it this far, I hope you have been having fun.
Oh, you're getting bored, just booting and installing kernels?  Well,
time for some pedantic things to make you feel that those kernel builds
are actually fun!&lt;/p&gt;
&lt;p&gt;Part of the job of being a kernel developer is recognizing the proper
Linux kernel coding style. The full description of this coding style
can be found in the kernel itself, in the &lt;code&gt;Documentation/CodingStyle&lt;/code&gt;
file.  I'd recommend going and reading that right now, it's pretty
simple stuff, and something that you are going to need to know and
understand.  There is also a tool in the kernel source tree in the
&lt;code&gt;scripts/&lt;/code&gt; directory called &lt;code&gt;checkpatch.pl&lt;/code&gt; that can be used
to test for adhering to the coding style rules, as kernel programmers are
lazy and prefer to let scripts do their work for them...&lt;/p&gt;
&lt;p&gt;And why a coding standard at all?  That's because of your brain (yes,
yours, not mine, remember, I'm just some dumb shell scripts).  Once your
brain learns the patterns, the information contained really starts to
sink in better.  So it's important that everyone follow the same
standard so that the patterns become consistent. In other words, you
want to make it really easy for other people to find the bugs in your
code, and not be confused and distracted by the fact that you happen to
prefer 5 spaces instead of tabs for indentation. Of course you would
never prefer such a thing, I'd never accuse you of that, it was just an
example, please forgive my impertinence!&lt;/p&gt;
&lt;p&gt;Anyway, the tasks for this round all deal with the Linux kernel coding
style. Attached to this message are two kernel modules that do not
follow the proper Linux kernel coding style rules. Please fix both of
them up, and send it back to me in such a way that does follow the
rules.&lt;/p&gt;
&lt;p&gt;What, you recognize one of these modules?  Imagine that, perhaps I was
right to accuse you of the using a &lt;em&gt;&amp;quot;wrong&amp;quot;&lt;/em&gt; coding style :)&lt;/p&gt;
&lt;p&gt;Yes, the logic in the second module is crazy, and probably wrong, but
don't focus on that, just look at the patterns here, and fix up the
coding style, do not remove lines of code.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Attachments:&lt;/strong&gt;&lt;/p&gt;
&lt;ol class="arabic simple"&gt;
&lt;li&gt;&lt;a class="reference external" href="https://git.sr.ht/~bryanb/eudyptula/tree/ba717131952f892d88459a265915563796683347/item/tasks/01/hello-world.c"&gt;hello_world.c&lt;/a&gt; - &lt;strong&gt;Note:&lt;/strong&gt; This should be the module you submitted
from the &lt;a class="reference external" href="https://0x42.sh/the-hello-world-kernel-module/"&gt;First Eudyptula Challenge&lt;/a&gt;. If you've
been following along, this is the file we submitted.&lt;/li&gt;
&lt;li&gt;&lt;a class="reference external" href="https://git.sr.ht/~bryanb/eudyptula/tree/ba717131952f892d88459a265915563796683347/item/tasks/04/coding_style.c"&gt;coding_style.c&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p class="attribution"&gt;&amp;mdash;Little Penguin&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;As the Little Penguin was saying, the Linux &lt;a class="reference external" href="https://www.kernel.org/doc/html/latest/process/coding-style.html"&gt;style guide&lt;/a&gt; is here not
for &lt;strong&gt;your&lt;/strong&gt; benefit, but to help other developers, maintainers, and
reviewers understand and evaluate the code you write faster. The kernel
works at a &lt;strong&gt;furious&lt;/strong&gt; pace, receiving on average &lt;strong&gt;~10 patches every
hour&lt;/strong&gt;, a pace that is only accelerating as more people and companies
begin to depend on the project for the new device, radio, microwave or
any other IoT gadget they're working on.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;TLDR:&lt;/strong&gt; The style guide helps everyone review your code which improves
the chances of your patch being accepted.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="executive-summary-of-the-linux-kernel-coding-style"&gt;
&lt;h2&gt;Executive Summary of the &lt;a class="reference external" href="https://www.kernel.org/doc/html/latest/process/coding-style.html"&gt;Linux Kernel Coding Style&lt;/a&gt;&lt;/h2&gt;
&lt;p&gt;The &lt;a class="reference external" href="https://www.kernel.org/doc/html/latest/process/coding-style.html"&gt;style guide&lt;/a&gt; is a very straightforward document, the majority of
it dealing with whitespace and naming things (something every project
has to deal with regardless of the number of developers it has). I'll list
the &lt;em&gt;&amp;quot;executive overview&amp;quot;&lt;/em&gt; here (for the SEO points 🙂) though you
should really take a moment to skim the Linux coding &lt;a class="reference external" href="https://www.kernel.org/doc/html/latest/process/coding-style.html"&gt;style guide&lt;/a&gt; for
the latest updates.&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;strong&gt;tabs are 8 characters, one statement per line.&lt;/strong&gt; Historically tab
keys were used &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Tab_key"&gt;to help tabulate charts&lt;/a&gt; on typewriters. Now they
help maintainers keep track of your control blocks.&lt;/li&gt;
&lt;/ul&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p class="first"&gt;&lt;strong&gt;80 characters per line.&lt;/strong&gt; An old throwback to when screens only
displayed 80 characters per line. &lt;em&gt;&amp;quot;Now-O-Days&amp;quot;&lt;/em&gt;, with modern 48&amp;quot; 8k
monitors, the 80 character hard limit is a bit softer.  Today we can
get away with a few lines under 100 characters if it improves
readability.  One exception to line wrapping is the message string in
&lt;code&gt;printk()&lt;/code&gt; statements: keep it on one line, however long, so we can
easily &lt;code&gt;grep&lt;/code&gt; for it when debugging.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p class="first"&gt;&lt;strong&gt;brackets.&lt;/strong&gt; Don't worry about brackets if you only need one line.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;ENOMEM&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;For multiple lines, place the bracket at the end of your if (switch,
do, for, while, ...) statements.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="n"&gt;pr_debug&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;Failed to allocate memory&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;ENOMEM&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Functions, though, have their opening bracket on the next line.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;do_something_cool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;something_cool&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;DONE&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p class="first"&gt;&lt;strong&gt;spaces.&lt;/strong&gt; Most of the time, spaces go after each keyword. Some
notable exceptions to this are &lt;code&gt;sizeof&lt;/code&gt;, &lt;code&gt;typeof&lt;/code&gt;,
&lt;code&gt;alignof&lt;/code&gt;, and &lt;code&gt;__attribute__&lt;/code&gt; which are treated somewhat
like functions in the kernel.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p class="first"&gt;&lt;strong&gt;naming.&lt;/strong&gt; Don't worry about descriptive variable names inside a
function. &lt;code&gt;tmp&lt;/code&gt; or &lt;code&gt;err&lt;/code&gt; are perfectly acceptable.
However, global variables (to be used sparingly) should have
descriptive names. Don't use &lt;em&gt;MixedCase!&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p class="first"&gt;&lt;strong&gt;typedefs.&lt;/strong&gt; Just &lt;em&gt;Don't&lt;/em&gt;. Typedefs only hurt readability. You can't
immediately tell if it's a &lt;code&gt;struct&lt;/code&gt; or an integer of some
specific size, or a pointer (to a pointer, to a pointer...  to...).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p class="first"&gt;&lt;strong&gt;commenting your code.&lt;/strong&gt; If you're explaining &lt;em&gt;how&lt;/em&gt; the code works,
you've written it wrong. Comments are for &lt;em&gt;what&lt;/em&gt; your code does if the
function's name didn't make this clear already.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These bullets take care of the majority of what a &lt;strong&gt;new&lt;/strong&gt; kernel
developer will deal with on a day-to-day basis, and if you're ever
unsure about something you can always &lt;code&gt;git grep &amp;lt;pattern&amp;gt;&lt;/code&gt; to see
how others have coded similar things in the past.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="scripts-checkpatch-pl"&gt;
&lt;h2&gt;scripts/checkpatch.pl&lt;/h2&gt;
&lt;p&gt;Fortunately, previous developers created a script to check for &lt;strong&gt;some&lt;/strong&gt;
of these style blunders and alleviate the burden put on maintainers to
point out these minor coding style violations. This script is one of
many helpful scripts inside the Linux source tree we &lt;a class="reference external" href="https://0x42.sh/building-the-linux-kernel/#cloning"&gt;downloaded in the
second challenge&lt;/a&gt;.  The one we will be using today is called
&lt;code&gt;scripts/checkpatch.pl&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;At the root of the Linux source code, use the &lt;code&gt;--help&lt;/code&gt; or
&lt;code&gt;--version&lt;/code&gt; options to see the usage instructions and get a
complete list of all the options available.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ ./scripts/checkpatch.pl --help
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;For this Eudyptula Challenge, we will remain focused on the
&lt;code&gt;--file&lt;/code&gt; or &lt;code&gt;-f&lt;/code&gt; option to analyze the files the little
penguin gave us (&lt;a class="reference external" href="https://git.sr.ht/~bryanb/eudyptula/tree/ba717131952f892d88459a265915563796683347/item/tasks/01/hello-world.c"&gt;hello_world.c&lt;/a&gt; and &lt;a class="reference external" href="https://git.sr.ht/~bryanb/eudyptula/tree/ba717131952f892d88459a265915563796683347/item/tasks/04/coding_style.c"&gt;coding_style.c&lt;/a&gt;) for style
violations. I won't go into the specific changes we need to make (a task
for you to complete). Instead, I will go through how to use
&lt;code&gt;checkpatch.pl&lt;/code&gt; to find some of the issues with the files the Little
Penguin gave us.&lt;/p&gt;
&lt;p&gt;We can check our file using the &lt;code&gt;-f&lt;/code&gt; option followed by the file
we want to check:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ ./scripts/checkpatch.pl -f ~/eudyptula/tasks/04/coding_style.c
...
total: 1 errors, 1 warnings, 1 checks, 35 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

eudyptula/tasks/04/coding_style.c has style problems, please review.
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Then use your favorite editor to fix the issues &lt;code&gt;checkpatch.pl&lt;/code&gt;
printed out.&lt;/p&gt;
&lt;p&gt;You can even use &lt;code&gt;--strict&lt;/code&gt; or the &lt;code&gt;--subjective&lt;/code&gt; option to
enable more subjective tests and &lt;code&gt;--codespell&lt;/code&gt; to check the file
for misspelled words.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ ./scripts/checkpatch.pl --strict --codespell -f ~/eudyptula ...
&lt;/pre&gt;&lt;/div&gt;
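&lt;p&gt;If you are checking more than one file, it can be handy to let a script decide
whether the summary is clean. Here is a minimal sketch, assuming the
&lt;code&gt;total:&lt;/code&gt; summary line looks like the one printed above.
&lt;code&gt;checkpatch_clean&lt;/code&gt; is a hypothetical helper name, not something shipped
with the kernel, and it deliberately ignores the &lt;code&gt;checks&lt;/code&gt; count.&lt;/p&gt;

```shell
# Hypothetical helper: reads checkpatch.pl output on stdin.
# Succeeds when the last "total:" line reports zero errors and zero warnings,
# otherwise prints that summary line and fails.
checkpatch_clean() {
    summary=$(grep '^total:' | tail -n 1)
    case "$summary" in
        "total: 0 errors, 0 warnings,"*) return 0 ;;
        *) echo "$summary"; return 1 ;;
    esac
}
```

&lt;p&gt;It would be used as
&lt;code&gt;./scripts/checkpatch.pl --strict -f coding_style.c | checkpatch_clean&lt;/code&gt;,
which stays quiet when there is nothing left to fix.&lt;/p&gt;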
&lt;div class="sidebar"&gt;
&lt;p class="first sidebar-title"&gt;Keep in mind!&lt;/p&gt;
&lt;p class="last"&gt;&lt;code&gt;checkpatch.pl&lt;/code&gt; will find &lt;strong&gt;some&lt;/strong&gt; of the style issues. Consult
the &lt;a class="reference external" href="https://www.kernel.org/doc/html/latest/process/coding-style.html"&gt;style guide&lt;/a&gt; to find all the needed changes.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Once you've made your changes, and feel the code looks good (it's hard
when you can't remove any lines) use &lt;code&gt;git format-patch HEAD~&lt;/code&gt; and
send in your changes.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Good Luck!&lt;/em&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="wrapping-up"&gt;
&lt;h2&gt;Wrapping Up&lt;/h2&gt;
&lt;p&gt;If you made it here, you may be surprised to see how little there was to
this challenge. Essentially this &lt;em&gt;&amp;quot;boils down&amp;quot;&lt;/em&gt; to reading the Linux
&lt;a class="reference external" href="https://www.kernel.org/doc/html/latest/process/coding-style.html"&gt;style guide&lt;/a&gt; and practicing it. Though, so far in my software
career, I have found that true for all coding standards.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;PS:&lt;/strong&gt; I made around 39 changes in all, though this may change
depending on &lt;em&gt;when&lt;/em&gt; you work on this.  Hope that helps :)&lt;/p&gt;
&lt;/div&gt;
</content><category term="Notes"/><category term="Eudyptula Challenge"/></entry><entry><title>Boston Parking Tickets</title><link href="https://0x42.sh/boston-parking-tickets/" rel="alternate"/><published>2021-02-09T00:00:00+00:00</published><updated>2021-02-09T00:00:00+00:00</updated><author><name>bryan brattlof</name></author><id>tag:0x42.sh,2021-02-09:/boston-parking-tickets/</id><summary type="html">&lt;p&gt;Last year (mid December 2020) I created a &lt;a class="reference external" href="https://www.foia.gov/"&gt;Freedom of Information Act&lt;/a&gt; request for all parking tickets issued in Boston from
2011 to the end of 2020. Eventually I was given 40 CSV files that I've combined
into a simple torrent you can download here:&lt;/p&gt;
&lt;blockquote&gt;
&lt;a class="reference external" href="https://git.sr.ht/~bryanb/boston-parking-tickets/blob/canon/data/boston-parking-tickets-2011-2020.tar.gz.torrent"&gt;boston-parking-tickets-2011-2020.tar.gz&lt;/a&gt;&lt;/blockquote&gt;
&lt;p&gt;Please feel …&lt;/p&gt;</summary><content type="html">&lt;p&gt;Last year (mid December 2020) I created a &lt;a class="reference external" href="https://www.foia.gov/"&gt;Freedom of Information Act&lt;/a&gt; request for all parking tickets issued in Boston from
2011 to the end of 2020. Eventually I was given 40 CSV files that I've combined
into a simple torrent you can download here:&lt;/p&gt;
&lt;blockquote&gt;
&lt;a class="reference external" href="https://git.sr.ht/~bryanb/boston-parking-tickets/blob/canon/data/boston-parking-tickets-2011-2020.tar.gz.torrent"&gt;boston-parking-tickets-2011-2020.tar.gz&lt;/a&gt;&lt;/blockquote&gt;
&lt;p&gt;Please feel free to &lt;a class="reference external" href="https://0x42.sh/connect/"&gt;send me an email&lt;/a&gt;
if you don't wish to use BitTorrent, and I'll do my best to send you a copy
using a different protocol.&lt;/p&gt;
&lt;p&gt;These &lt;strong&gt;(very messy)&lt;/strong&gt; files have data on every ticket: the time and date it was
issued, the violation and fine total, how much was paid, the license plate number
and state, and the car's make, style and color.
Also included is the hand-entered location of where the ticket was issued.&lt;/p&gt;
&lt;p&gt;I say messy, because each ticket is manually entered on very small screens,
often by parking attendants wearing gloves during winter while someone is
telling them about their very bad day. &lt;em&gt;Understandably there are a lot of typos
and a lot of cleaning to do.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;All of the code for these examples, along with the torrent file, is available in
a &lt;a class="reference external" href="https://git.sr.ht/~bryanb/boston-parking-tickets"&gt;git repository here&lt;/a&gt;.
My goal is to implement various data cleaning techniques to see how well I can
prepare this data for processing. What follows is a little bit of data
exploration.&lt;/p&gt;
&lt;div class="section" id="tickets-issued-in-each-year"&gt;
&lt;h2&gt;Tickets Issued in each Year&lt;/h2&gt;
&lt;img alt="tickets grouped by year." class="right" src="https://0x42.sh/boston-parking-tickets/tickets-per-year.png" /&gt;
&lt;p&gt;Boston police officers issued 13,023,114 parking tickets inside the Boston city
limits between January 1&lt;sup&gt;st&lt;/sup&gt; 2011 and December 31&lt;sup&gt;st&lt;/sup&gt; 2020. If we
exclude 2020 and its &lt;a class="reference external" href="https://www.bostonglobe.com/2020/03/26/metro/mayor-walsh-just-relaxed-some-boston-parking-rules-heres-what-they-are/"&gt;relaxed parking rules&lt;/a&gt;, Boston issues on average
1,367,606 (±51,848) parking tickets each year.&lt;/p&gt;
&lt;p&gt;Surprisingly there wasn't a significant change in the number of tickets issued
during each year even as Boston's &lt;a class="reference external" href="https://www.census.gov/quickfacts/fact/table/bostoncitymassachusetts,US/PST045219"&gt;population continues to grow&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I had assumed this would correlate with Boston's population growth, however a
simple linear fit shows there are 3,358 fewer tickets being issued each year,
well within the margin of error of the 1.4 million tickets issued on average.&lt;/p&gt;
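&lt;p&gt;For anyone who wants to reproduce the yearly average and spread quoted above,
here is a rough sketch of the arithmetic. It assumes you have already reduced the
CSV files to one &lt;code&gt;year count&lt;/code&gt; pair per line (that input format and the
&lt;code&gt;yearly_stats&lt;/code&gt; name are my own invention for illustration), and it uses
the population standard deviation, which may not be the exact variant behind the
numbers in this post.&lt;/p&gt;

```shell
# Hypothetical helper: reads "year count" pairs on stdin, one per line,
# and prints the mean and the population standard deviation, both rounded.
yearly_stats() {
    awk '
        { n += 1; sum += $2; sumsq += $2 * $2 }
        END {
            if (n == 0) exit 1
            mean = sum / n
            var = sumsq / n - mean * mean
            if (0 > var) var = 0    # guard against tiny negative rounding errors
            printf "%.0f %.0f\n", mean, sqrt(var)
        }
    '
}
```

&lt;p&gt;Feeding it the yearly totals with 2020 left out gives the two numbers to compare
against the average and the ± range above.&lt;/p&gt;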
&lt;/div&gt;
&lt;div class="section" id="tickets-issued-in-each-month"&gt;
&lt;h2&gt;Tickets Issued in each Month&lt;/h2&gt;
&lt;img alt="tickets grouped by the month they were issued." class="right" src="https://0x42.sh/boston-parking-tickets/tickets-per-month.png" /&gt;
&lt;p&gt;Another interesting thing I noticed (as a southerner): if we plot the number
of tickets issued by month, we can clearly see a dip in tickets during the
winter months.&lt;/p&gt;
&lt;p&gt;I have no evidence to support this, however, I assume the snow-covered streets
of Boston during the winter reduce the available metered parking spaces, or
deter the 20,000 or so people who would otherwise park their cars while snow
plows are actively roaming.&lt;/p&gt;
&lt;p&gt;On average there is a decline of 20,000 tickets during the winter months. If we
exclude 2020 again (shown here as red dots), there are on average 117,000 tickets
issued each month.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="tickets-issued-by-day-of-month"&gt;
&lt;h2&gt;Tickets Issued by Day of Month&lt;/h2&gt;
&lt;p&gt;Drilling further into when tickets are issued, if we look at the day of the month
on which each ticket was issued (excluding the 31&lt;sup&gt;st&lt;/sup&gt; day of the 7 months that have
31 days) we can see a pretty steady ticketing rate. Again the red dots represent
data from 2020 and are not included in the white shaded standard deviation range.&lt;/p&gt;
&lt;img alt="tickets grouped by day of the month" src="https://0x42.sh/boston-parking-tickets/tickets-by-day-of-month.png" /&gt;
&lt;p&gt;We can also clearly see the 572 days with under 1,000 tickets issued. As we'll
see in &lt;a class="reference internal" href="#tickets-issued-by-day-of-week"&gt;the next section&lt;/a&gt;, this is mostly
due (80% of the 572 days) to the relaxed parking rules on Sundays when most
parking meters are turned off.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="tickets-issued-by-day-of-week"&gt;
&lt;h2&gt;Tickets Issued by Day of Week&lt;/h2&gt;
&lt;img alt="tickets grouped by day of week" class="right" src="https://0x42.sh/boston-parking-tickets/tickets-by-day-of-week.png" /&gt;
&lt;p&gt;Like I was saying in the last section, &lt;a class="reference external" href="https://www.boston.gov/departments/parking-clerk/how-do-parking-meters-work"&gt;parking on Sundays and City holidays is
free&lt;/a&gt;.
When we split the tickets issued by day of week we can see just how great this
policy is for people who fail to feed their meters.&lt;/p&gt;
&lt;p&gt;It's also interesting to see the reduction (roughly 1,000 on average) in tickets
issued on Mondays. As of right now I don't have a good explanation for what could
cause this.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="tickets-by-violation"&gt;
&lt;h2&gt;Tickets by Violation&lt;/h2&gt;
&lt;img alt="number of tickets for each violation" class="right" src="https://0x42.sh/boston-parking-tickets/tickets-by-violation.png" /&gt;
&lt;p&gt;When we look at which types of violations are ticketed on average each year, we
can start to see why Sundays and City holidays have such a huge impact on the
number of tickets issued each day.&lt;/p&gt;
&lt;p&gt;First place, with 25.7% (3,348,515) of all tickets issued, goes to unpaid parking
meters, most of which are disabled on Sundays.&lt;/p&gt;
&lt;p&gt;This is followed by an ever-shrinking list of significantly less common violations.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="the-to-do-list"&gt;
&lt;h2&gt;The To-Do List&lt;/h2&gt;
&lt;p&gt;There is still a large amount of cleaning work I would like to do in the future.
There are currently many misspelled states, vehicle makes and models, and ticket
locations or cross streets indicated with different symbols, all of which makes
classifying this data a fun and difficult task.&lt;/p&gt;
&lt;p&gt;As of right now though, I'll publish this dataset with the promise to see what
insights we can glean from it in the future.&lt;/p&gt;
&lt;/div&gt;
</content><category term="Datasets"/></entry><entry><title>Cgit, Nginx &amp; Gitolite: A Personal Git Server</title><link href="https://0x42.sh/cgit-nginx-gitolite-a-personal-git-server/" rel="alternate"/><published>2021-01-12T00:00:00+00:00</published><updated>2021-01-12T00:00:00+00:00</updated><author><name>bryan brattlof</name></author><id>tag:0x42.sh,2021-01-12:/cgit-nginx-gitolite-a-personal-git-server/</id><summary type="html">&lt;div class="sidebar"&gt;
&lt;p class="first sidebar-title"&gt;Ugh... He's So Lame: &lt;em&gt;(2023-04-08)&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;As I (and my responsibilities) continue to grow older, I find myself
having less and less time to do the things I used to (write and
maintain servers just for fun). And while I continue to want to do
these things, I have to be …&lt;/p&gt;&lt;/div&gt;</summary><content type="html">&lt;div class="sidebar"&gt;
&lt;p class="first sidebar-title"&gt;Ugh... He's So Lame: &lt;em&gt;(2023-04-08)&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;As I (and my responsibilities) continue to grow older, I find myself
having less and less time to do the things I used to (write and
maintain servers just for fun). And while I continue to want to do
these things, I have to be realistic and prioritize the things I wish
to do with the ever decreasing time I have to do them.&lt;/p&gt;
&lt;p class="last"&gt;All of that to say, I've &lt;em&gt;&amp;quot;sold-out&amp;quot;&lt;/em&gt; and transitioned to paying
someone else (&lt;a class="reference external" href="https://sr.ht/getting-started"&gt;sourcehut&lt;/a&gt;) to worry about maintaining my publicly
accessible repositories for me and allowing me to prioritize other
exciting and novel things.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;I've been on an &lt;em&gt;&amp;quot;own my online presence&amp;quot;&lt;/em&gt; kick for more than a year now. So for
this (overly protracted) essay, I thought I'd publish my notes on how I created
my own Git server.&lt;/p&gt;
&lt;p&gt;There are many open source projects like &lt;a class="reference external" href="https://gitea.io/en-us/"&gt;Gitea&lt;/a&gt; or &lt;a class="reference external" href="https://gitlab.com/"&gt;GitLab&lt;/a&gt; to make hosting your
own git projects effortless; however I wanted a much simpler (read: old
school) setup. I ended up with something that uses many of the same projects
that the &lt;a class="reference external" href="https://www.kernel.org/"&gt;Linux Organization&lt;/a&gt; uses to publish the &lt;a class="reference external" href="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/"&gt;Linux Kernel&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The server (as of this writing) uses &lt;a class="reference external" href="http://www.releases.ubuntu.com/20.04/"&gt;Ubuntu's 20.04.1 LTS (Focal Fossa)&lt;/a&gt;
running on &lt;a class="reference external" href="https://www.digitalocean.com/"&gt;Digital Ocean's&lt;/a&gt; hardware (&lt;a class="reference external" href="https://m.do.co/c/b0f6f650ad4e"&gt;referral-link&lt;/a&gt;). I wholeheartedly
support and recommend you choose a different setup. Diversity in people and in
tech stack is and always will be a great thing.&lt;/p&gt;
&lt;p&gt;What lies below can be broken into 3 main topics:&lt;/p&gt;
&lt;ol class="arabic simple"&gt;
&lt;li&gt;&lt;a class="reference internal" href="#the-start"&gt;The Start&lt;/a&gt; prepares a newly minted server for git hosting
duties. Creating a new admin user, locking down the OpenSSH daemon, and
installing fail2ban.&lt;/li&gt;
&lt;li&gt;&lt;a class="reference internal" href="#gitolite"&gt;Gitolite&lt;/a&gt; installs and configures the server to allow us (and
colleagues) to have more fine-grained control over who has access to
&lt;code&gt;git push|fetch&lt;/code&gt; on the server.&lt;/li&gt;
&lt;li&gt;And &lt;a class="reference internal" href="#cgit"&gt;Cgit&lt;/a&gt;, &lt;a class="reference internal" href="#fastcgi-wrapper"&gt;fcgiwrap&lt;/a&gt;, and &lt;a class="reference internal" href="#nginx"&gt;Nginx&lt;/a&gt;
to create a web-server to view our published projects.&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="line-block"&gt;
&lt;div class="line"&gt;In the end, you'll have a server much like &lt;a class="reference external" href="https://bryanbrattlof.com/500/"&gt;this one&lt;/a&gt;.&lt;/div&gt;
&lt;div class="line"&gt;&lt;em&gt;Enjoy!&lt;/em&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;div class="section" id="the-start"&gt;
&lt;h2&gt;The Start&lt;/h2&gt;
&lt;p&gt;I often find security &lt;em&gt;&amp;quot;best practices&amp;quot;&lt;/em&gt; are a lot like driving down the
highway. Some people speeding past you are &lt;em&gt;&amp;quot;obviously&amp;quot;&lt;/em&gt; just moments away from
a major data breach, while the others you're passing are &lt;em&gt;&amp;quot;clearly&amp;quot;&lt;/em&gt; so worried
about the entire data-center burning down, they couldn't possibly get anything
else done. Everyone thinks everyone else has lost their marbles.&lt;/p&gt;
&lt;p&gt;So with that in mind, here are a few steps I took to secure my newly minted
server. Please feel free to use only the &lt;em&gt;&amp;quot;best practices&amp;quot;&lt;/em&gt; you deem appropriate
for your mission.&lt;/p&gt;
&lt;p&gt;Or just &lt;a class="reference internal" href="#gitolite"&gt;skip to the &amp;quot;installing Gitolite&amp;quot;&lt;/a&gt; part directly.&lt;/p&gt;
&lt;div class="section" id="admin-user"&gt;
&lt;h3&gt;Admin User&lt;/h3&gt;
&lt;p&gt;For whatever reason, be it for security or protecting the server from my
stupidity, one of the first things I do when creating a new server is add a new
user for my general admin tasks.&lt;/p&gt;
&lt;p&gt;Adding a new user is remarkably easy to do on an Ubuntu system:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ adduser limb
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You'll be prompted to answer a few questions, including creating a new UNIX
password. This will be the password you'll need to &lt;code&gt;sudo -i&lt;/code&gt; and gain
&lt;code&gt;root&lt;/code&gt; permissions, so make it a good one, or use tools like &lt;a class="reference external" href="https://www.passwordstore.org/"&gt;Pass&lt;/a&gt;, or
&lt;a class="reference external" href="https://bitwarden.com/"&gt;BitWarden&lt;/a&gt; to help you remember.&lt;/p&gt;
&lt;p&gt;Then give our new &lt;code&gt;limb&lt;/code&gt; user &lt;code&gt;sudo&lt;/code&gt; permissions:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ usermod -aG sudo limb
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;I've also largely eliminated all password based authentication when signing
into servers, relying on open source smart cards like &lt;a class="reference external" href="https://www.nitrokey.com/"&gt;NitroKey&lt;/a&gt; for
authentication. If you're interested in doing the same, we'll need to set up
&lt;code&gt;.ssh/authorized_keys&lt;/code&gt; for our &lt;code&gt;limb&lt;/code&gt; user.&lt;/p&gt;
&lt;p&gt;Just replace &lt;code&gt;key&lt;/code&gt; with your public ssh key:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ mkdir /home/limb/.ssh
$ echo &amp;quot;key&amp;quot; &amp;gt; /home/limb/.ssh/authorized_keys
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Next, set the &lt;code&gt;.ssh&lt;/code&gt; directory's file permissions so the &lt;code&gt;ssh&lt;/code&gt;
daemon can read the files:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ chown -R limb:limb /home/limb/.ssh
$ chmod 700 /home/limb/.ssh
$ chmod 644 /home/limb/.ssh/authorized_keys
&lt;/pre&gt;&lt;/div&gt;
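&lt;p&gt;To double check the result, something like GNU &lt;code&gt;stat&lt;/code&gt; can print the
mode and owner of both paths. This is only a quick sanity check, assuming the
&lt;code&gt;limb&lt;/code&gt; username and the paths used above:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ # print the octal mode, owner:group and path of each file
$ stat -c &amp;#39;%a %U:%G %n&amp;#39; /home/limb/.ssh /home/limb/.ssh/authorized_keys
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The directory should report &lt;code&gt;700&lt;/code&gt; and the key file &lt;code&gt;644&lt;/code&gt;, both
owned by &lt;code&gt;limb:limb&lt;/code&gt;.&lt;/p&gt;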
&lt;div class="sidebar"&gt;
&lt;p class="first sidebar-title"&gt;Keep in Mind:&lt;/p&gt;
&lt;p class="last"&gt;If the &lt;code&gt;authorized_keys&lt;/code&gt; file or the &lt;code&gt;.ssh&lt;/code&gt; directory's
permissions are set too permissively (eg: &lt;code&gt;0777&lt;/code&gt;) the SSH daemon will
refuse to load the files.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;If everything worked, after you restart the &lt;code&gt;ssh&lt;/code&gt; daemon (&lt;code&gt;service
ssh restart&lt;/code&gt;) you will now be able to log in as the administrator user:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ ssh limb@host
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;div class="section" id="openssh"&gt;
&lt;h3&gt;OpenSSH&lt;/h3&gt;
&lt;p&gt;Git and Gitolite (&lt;a class="reference internal" href="#gitolite"&gt;installed in the next sections&lt;/a&gt;) will need us
to keep port 22 open, allowing us to &lt;code&gt;git push&lt;/code&gt; from anywhere on the
internet. This open port will eventually attract &lt;em&gt;&amp;quot;a lot&amp;quot;&lt;/em&gt; of attention from
bots who endlessly scour the internet looking for vulnerable servers,
mindlessly stuffing passwords, hoping one password will eventually let them in.&lt;/p&gt;
&lt;p&gt;We can eliminate all worry about weak or compromised passwords by disabling all
password based authentication, relying solely on &lt;a class="reference external" href="https://cryptography.io/en/latest/hazmat/primitives/asymmetric/"&gt;asymmetric cryptography&lt;/a&gt;, or
&lt;em&gt;&amp;quot;ssh keys&amp;quot;&lt;/em&gt;. Just use your favorite text editor to open
&lt;code&gt;/etc/ssh/sshd_config&lt;/code&gt; and ensure these lines exist somewhere in it:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;PubkeyAuthentication yes
PasswordAuthentication no
&lt;/pre&gt;&lt;/div&gt;
&lt;div class="sidebar"&gt;
&lt;p class="first sidebar-title"&gt;New to ssh keys?&lt;/p&gt;
&lt;p class="last"&gt;Digital Ocean has a nice write-up on &lt;a class="reference external" href="https://www.digitalocean.com/community/tutorials/how-to-set-up-ssh-keys-on-ubuntu-20-04"&gt;how to get started&lt;/a&gt; with ssh keys.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;While we're here, a large majority &lt;a class="footnote-reference" href="#footnote-1" id="footnote-reference-1"&gt;[1]&lt;/a&gt; of these bots are interested in logging
in as the &lt;code&gt;root&lt;/code&gt; user. If you created a new admin account in &lt;a class="reference internal" href="#admin-user"&gt;the previous
section&lt;/a&gt; and ensured you can login using your public key, you
can also disable &lt;code&gt;root&lt;/code&gt; logins entirely with this line in the config:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;PermitRootLogin no
&lt;/pre&gt;&lt;/div&gt;
&lt;table class="docutils footnote" frame="void" id="footnote-1" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#footnote-reference-1"&gt;[1]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;Some simple &lt;em&gt;&amp;quot;bash-fu&amp;quot;&lt;/em&gt; on my &lt;code&gt;/var/log/auth.log&lt;/code&gt; shows ~93.58% of
the roughly 15,000 login attempts since I started this server tried to
log in as &lt;code&gt;root&lt;/code&gt;. Second place was the user &lt;code&gt;git&lt;/code&gt; (including
legitimate logins) at ~1.82%.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
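&lt;p&gt;Before restarting anything, it can be worth asking &lt;code&gt;sshd&lt;/code&gt; to check the
file, since a typo here can lock you out of the server. A minimal sketch,
assuming you're still the &lt;code&gt;root&lt;/code&gt; user and the binary lives at
&lt;code&gt;/usr/sbin/sshd&lt;/code&gt; (the usual location on Ubuntu):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ # test mode: only checks the config for errors, silent when all is well
$ /usr/sbin/sshd -t

$ # extended test mode: dump the effective config and pick out our options
$ /usr/sbin/sshd -T | grep -Ei &amp;#39;^(pubkeyauthentication|passwordauthentication|permitrootlogin) &amp;#39;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;I'd also keep the current session open and try a second login from another
terminal before logging out.&lt;/p&gt;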
&lt;p&gt;If you uploaded your public key to your VPS provider, most of these changes
should have already been configured for you. But on the off chance you had to
make some changes, restart the &lt;code&gt;ssh&lt;/code&gt; service to load the new config
changes in:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ service ssh restart
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;div class="section" id="uncomplicated-firewall"&gt;
&lt;h3&gt;Uncomplicated FireWall&lt;/h3&gt;
&lt;p&gt;Depending on your VPS provider, they may also have a firewall system built into
their admin panel allowing you to apply rules simply by adding tags to a server.
However, I enjoy keeping all my firewall rules inside each box, if only for the
same reason I &lt;em&gt;keep all my socks in the left-hand drawer,&lt;/em&gt; so everything stays
organized and in the same place.&lt;/p&gt;
&lt;p&gt;You can install &lt;code&gt;ufw&lt;/code&gt; using the Advanced Packaging Tool:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ apt install ufw
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Right now, the only thing we have enabled is &lt;code&gt;ssh&lt;/code&gt; which uses port 22. To
allow port 22 through &lt;code&gt;ufw&lt;/code&gt; just use the following command:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ ufw allow ssh
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;and then turn the firewall on:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ ufw enable
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;and &lt;strong&gt;voilà!&lt;/strong&gt; You have a firewall.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="fail2ban"&gt;
&lt;h3&gt;fail2ban&lt;/h3&gt;
&lt;p&gt;Even though we've turned off password based authentication &lt;a class="reference internal" href="#openssh"&gt;in a previous
section&lt;/a&gt;, we will still see a significant number of bots
wasting our compute cycles trying to log in. And while the likelihood of this
being successful is &lt;em&gt;zero&lt;/em&gt; when rounded to any order of magnitude, the bots
will nevertheless continue to pilfer a non-zero amount of CPU if given the
opportunity.&lt;/p&gt;
&lt;p&gt;To stop the most brazen of these bots, tools like &lt;a class="reference external" href="https://github.com/fail2ban/fail2ban"&gt;Fail2Ban&lt;/a&gt;, which creates
temporary firewall rules to block IP addresses that repeatedly fail to
authenticate with &lt;code&gt;ssh&lt;/code&gt;, are a great compromise between usefulness and
annoyance.&lt;/p&gt;
&lt;p&gt;The Advanced Packaging Tool can again help us install &lt;code&gt;fail2ban&lt;/code&gt;.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ apt install fail2ban
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Once installed, the &lt;code&gt;sshd&lt;/code&gt; &amp;quot;jail&amp;quot; will come pre-enabled for you. If you
wish to make any changes, you will need to make a copy of the &lt;code&gt;fail2ban&lt;/code&gt;
config file:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ cp /etc/fail2ban/jail.conf /etc/fail2ban/jail.local
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Then add your changes to &lt;code&gt;jail.local&lt;/code&gt; so they will persist after an
upgrade.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;fail2ban&lt;/code&gt; does a great job documenting what each option does in the
config file. Some of the changes I made are:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;because I use &lt;code&gt;ufw&lt;/code&gt; to manage my firewall, I changed
&lt;code&gt;banaction = ufw&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;enabled &lt;code&gt;bantime.increment&lt;/code&gt; to increase the duration of a ban based on
how many times the IP address has been banned previously.&lt;/li&gt;
&lt;li&gt;enabled &lt;code&gt;bantime.rndtime&lt;/code&gt; to &lt;em&gt;&amp;quot;randomize&amp;quot;&lt;/em&gt; the length of a ban,
preventing bots from knowing exactly when they can resume their assault.&lt;/li&gt;
&lt;li&gt;enabled &lt;code&gt;bantime.maxtime&lt;/code&gt; so I won't need to unban IP addresses (if
you're unfortunate enough to share an IP with a bot).&lt;/li&gt;
&lt;li&gt;lowered &lt;code&gt;bantime&lt;/code&gt;, &lt;code&gt;findtime&lt;/code&gt; and &lt;code&gt;maxretry&lt;/code&gt; allowing me to
issue small bans that increase in severity as the IP address continues to
antagonize.&lt;/li&gt;
&lt;/ul&gt;
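&lt;p&gt;Put together, the relevant part of &lt;code&gt;jail.local&lt;/code&gt; could look roughly like
the fragment below. The option names are the ones listed above; the numbers
(all in seconds) are placeholders I picked for illustration, so tune them to
your own tolerance:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;[DEFAULT]
# hand bans to ufw instead of writing firewall rules directly
banaction = ufw

# small first ban: 3 failures within 5 minutes earns a 5 minute ban
bantime  = 300
findtime = 300
maxretry = 3

# grow the ban each time the same IP address comes back
bantime.increment = true
# add up to 5 random minutes so the end of a ban is hard to predict
bantime.rndtime = 300
# never ban for longer than one week, so nobody is locked out forever
bantime.maxtime = 604800
&lt;/pre&gt;&lt;/div&gt;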
&lt;p&gt;Once you're satisfied with your changes, start &lt;code&gt;fail2ban&lt;/code&gt; using
&lt;code&gt;systemd&lt;/code&gt;.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ systemctl enable fail2ban
$ systemctl start fail2ban
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And &lt;strong&gt;Done!&lt;/strong&gt;&lt;/p&gt;
&lt;div class="sidebar" id="jackass"&gt;
&lt;p class="first sidebar-title"&gt;Keep in Mind:&lt;/p&gt;
&lt;p class="last"&gt;Any &lt;em&gt;&amp;quot;script-kiddie&amp;quot;&lt;/em&gt; (read: jackass) that runs a pen-test script they found
on Reddit from a large network (eg: college, Starbucks) could ban everyone on
that network from your server. Make sure your ban rules have a way to forgive,
unless you enjoy playing sys-admin.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Depending on how &lt;em&gt;&amp;quot;popular&amp;quot;&lt;/em&gt; you are on the internet, you should start to see
&lt;code&gt;NOTICE&lt;/code&gt; lines in &lt;code&gt;/var/log/fail2ban.log&lt;/code&gt; of misbehaving bots and
the equivalent firewall rules in &lt;code&gt;ufw&lt;/code&gt;.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ cat /var/log/fail2ban.log | grep &amp;#39;NOTICE&amp;#39; | tail -1
...  [125553]: NOTICE  [sshd] Ban 156.155.159.161

$ ufw status
Status: active

To                         Action      From
--                         ------      ----
Anywhere                   REJECT      156.155.159.161
22/tcp                     ALLOW       Anywhere
22/tcp (v6)                ALLOW       Anywhere (v6)
&lt;/pre&gt;&lt;/div&gt;
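&lt;p&gt;If a ban ever catches someone it shouldn't, &lt;code&gt;fail2ban-client&lt;/code&gt; can show
and lift it by hand. A short sketch, assuming the jail is named &lt;code&gt;sshd&lt;/code&gt;
as in the log line above and reusing the example address from that output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ # list the addresses currently banned by the sshd jail
$ fail2ban-client status sshd

$ # forgive one address before its ban expires
$ fail2ban-client set sshd unbanip 156.155.159.161
&lt;/pre&gt;&lt;/div&gt;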
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="section" id="gitolite"&gt;
&lt;h2&gt;Gitolite&lt;/h2&gt;
&lt;p&gt;Installing &lt;a class="reference external" href="https://gitolite.com/gitolite/index.html"&gt;Gitolite&lt;/a&gt; is amazingly simple: there are no
binaries to compile or daemons to monitor.&lt;/p&gt;
&lt;p&gt;At its core, Gitolite is just a collection of &lt;a class="reference external" href="https://www.perl.org/"&gt;Perl&lt;/a&gt; scripts that run after
someone signs into the server using the &lt;code&gt;ssh&lt;/code&gt; daemon &lt;a class="reference internal" href="#openssh"&gt;we configured in the
previous sections&lt;/a&gt;. Once installed, Gitolite will give us more
fine-grained control over who has &lt;code&gt;git push|fetch&lt;/code&gt; permissions to each
repository. I encourage you to check out &lt;a class="reference external" href="https://gitolite.com/gitolite/basic-admin.html"&gt;Gitolite's amazing documentation&lt;/a&gt; if only to see how capable Gitolite can be.&lt;/p&gt;
&lt;div class="section" id="step-1-create-the-git-user"&gt;
&lt;h3&gt;Step: 1 - Create The Git User&lt;/h3&gt;
&lt;p&gt;Before we install Gitolite, we'll need to create a new user for everyone to log
into and to run Gitolite's Perl scripts. I typically use the username &lt;code&gt;git&lt;/code&gt;
for this, but feel free to replace &lt;code&gt;git&lt;/code&gt; with the username that you feel is
more appropriate.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ adduser --system --group --disabled-password --home /var/lib/git git
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This creates a new system-user on the server called &lt;code&gt;git&lt;/code&gt;. Because this is
not a &lt;em&gt;&amp;quot;normal&amp;quot;&lt;/em&gt; user, there will be no aging information in &lt;code&gt;/etc/shadow&lt;/code&gt;,
which is convenient when nobody will be monitoring this account.&lt;/p&gt;
&lt;p&gt;We also used the &lt;code&gt;--home&lt;/code&gt; option to set the &lt;code&gt;$HOME&lt;/code&gt; variable to
&lt;code&gt;/var/lib/git&lt;/code&gt;. This is where we will eventually put Gitolite's
configuration files and our Git repositories. Feel free to adjust this to wherever
you prefer; I've seen many use &lt;code&gt;/home/git&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;I included the &lt;code&gt;--disabled-password&lt;/code&gt; to disable any password based access
into our new user. In &lt;a class="reference internal" href="#openssh"&gt;the previous sections&lt;/a&gt;, we've disabled all
password based authentication into the server and Gitolite requires ssh keys for
authentication, so disabling passwords for our user is a smart move.&lt;/p&gt;
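&lt;p&gt;One thing worth checking at this point is the login shell the new account
received. Depending on your &lt;code&gt;adduser&lt;/code&gt; defaults, a system user may be given
a &lt;em&gt;&amp;quot;nologin&amp;quot;&lt;/em&gt; shell, and as far as I know the &lt;code&gt;ssh&lt;/code&gt; daemon runs Gitolite
through the account's shell, so it needs a real one. A hedged sketch, assuming
the &lt;code&gt;git&lt;/code&gt; username from above and that &lt;code&gt;/bin/bash&lt;/code&gt; is the shell
you want:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ # the last colon-separated field is the login shell
$ getent passwd git

$ # only needed if that field points at a nologin or false shell
$ usermod --shell /bin/bash git
&lt;/pre&gt;&lt;/div&gt;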
&lt;/div&gt;
&lt;div class="section" id="step-2-install-gitolite"&gt;
&lt;h3&gt;Step: 2 - Install Gitolite&lt;/h3&gt;
&lt;p&gt;Because Gitolite is just a bunch of Perl scripts, I prefer to install Gitolite
from &lt;a class="reference external" href="https://github.com/sitaramc/gitolite"&gt;the source&lt;/a&gt;. As we will see, installing Gitolite from
source also has the benefit of making upgrades and adding custom patches in the
future extremely easy.&lt;/p&gt;
&lt;p&gt;This also means we'll need to install Gitolite's dependencies ourselves:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ apt install perl git
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;When we (or a colleague) sign in to the server using the &lt;code&gt;git&lt;/code&gt; user, we
will automatically run Gitolite's Perl scripts, which means these scripts must
be executable by our &lt;code&gt;git&lt;/code&gt; user. So, to make managing file permissions
easier, we'll use our &lt;code&gt;git&lt;/code&gt; user for the rest of the installation process.&lt;/p&gt;
&lt;p&gt;Log into our &lt;code&gt;git&lt;/code&gt; user with the &amp;quot;substitute user&amp;quot; command: (assuming
you're the &lt;code&gt;root&lt;/code&gt; user)&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ su - git
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Then clone Gitolite's source code into the &lt;code&gt;$HOME&lt;/code&gt; directory: (this should
be &lt;code&gt;/var/lib/git&lt;/code&gt; unless &lt;a class="reference internal" href="#step-1-create-the-git-user"&gt;you changed it&lt;/a&gt;
when we setup the &lt;code&gt;git&lt;/code&gt; user above)&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ git clone https://github.com/sitaramc/gitolite
&lt;/pre&gt;&lt;/div&gt;
&lt;div class="sidebar"&gt;
&lt;p class="first sidebar-title"&gt;🌠 The More You Know:&lt;/p&gt;
&lt;p class="last"&gt;If you want to use a particular version of Gitolite or want to add custom
patches, &lt;code&gt;cd&lt;/code&gt; into the &lt;code&gt;$HOME/gitolite&lt;/code&gt; directory and &lt;code&gt;git
checkout&lt;/code&gt; the desired tag, branch, or commit. All changes will be picked up
immediately the next time someone logs in.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="section" id="step-3-setup-gitolite"&gt;
&lt;h3&gt;Step: 3 - Setup Gitolite&lt;/h3&gt;
&lt;p&gt;To set up Gitolite on the server, we'll need to assign Gitolite an admin who will
have full control over editing Gitolite's configuration repository. This will
most likely be you.&lt;/p&gt;
&lt;p&gt;Still as the &lt;code&gt;git&lt;/code&gt; user, use &lt;a class="reference external" href="https://www.gnu.org/software/emacs/"&gt;your favorite text editor&lt;/a&gt; to
create a new file with your desired username in the &lt;code&gt;$HOME&lt;/code&gt; directory and
copy your &lt;strong&gt;public&lt;/strong&gt; ssh key into it. For example, my file would be called
&lt;code&gt;bryanbrattlof.pub&lt;/code&gt; and look like this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ cat $HOME/bryanbrattlof.pub
ssh-ed25519 AAAAC3NzaC1l ...
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Then run Gitolite's sanity checks and setup script to finish the installation:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ $HOME/gitolite/src/gitolite setup -pk bryanbrattlof.pub
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Färdig!&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;If everything went well, you should now be able to clone Gitolite's
configuration repository:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ git clone git@host:gitolite-admin
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;I recommend consulting &lt;a class="reference external" href="https://gitolite.com/gitolite/basic-admin.html"&gt;Gitolite's incredible documentation&lt;/a&gt; to understand how to properly configure access and add hooks to all
of your projects.&lt;/p&gt;
&lt;div class="sidebar"&gt;
&lt;p class="first sidebar-title"&gt;🤬 Uh Oh:&lt;/p&gt;
&lt;p class="last"&gt;If you're asked for a password when you try to clone &lt;code&gt;gitolite-admin&lt;/code&gt;
then something has gone wrong. This is usually a permission issue. Again,
consult &lt;a class="reference external" href="https://gitolite.com/gitolite/"&gt;Gitolite's superb documentation&lt;/a&gt; for some of the
more common troubleshooting advice.&lt;/p&gt;
&lt;/div&gt;
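&lt;p&gt;To give a rough idea of what day-to-day administration looks like, adding a
repository is an edit to &lt;code&gt;conf/gitolite.conf&lt;/code&gt; inside the cloned
&lt;code&gt;gitolite-admin&lt;/code&gt; repository followed by a push. In this sketch
&lt;code&gt;my-project&lt;/code&gt; is a made-up repository name, and the first two stanzas
are roughly what a fresh install starts with:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;repo gitolite-admin
    RW+     =   bryanbrattlof

repo testing
    RW+     =   @all

# hypothetical new repository: full access for me, read-only for other users
repo my-project
    RW+     =   bryanbrattlof
    R       =   @all
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Gitolite creates the new repository on the server when the change is
pushed:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ git add conf/gitolite.conf
$ git commit -m &amp;quot;add my-project&amp;quot;
$ git push
&lt;/pre&gt;&lt;/div&gt;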
&lt;/div&gt;
&lt;div class="section" id="step-4-add-profile"&gt;
&lt;h3&gt;Step: 4 - Add ~/.profile&lt;/h3&gt;
&lt;p&gt;While not technically needed for Gitolite to function properly, I find adding
gitolite to our &lt;code&gt;git&lt;/code&gt; user's &lt;code&gt;$PATH&lt;/code&gt; is a great quality of life
improvement on the rare days I need to play system administrator.&lt;/p&gt;
&lt;p&gt;First, as our &lt;code&gt;git&lt;/code&gt; user, create a new &lt;code&gt;$HOME/bin&lt;/code&gt; directory:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ mkdir -p $HOME/bin
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Next, create the &lt;code&gt;$HOME/.profile&lt;/code&gt; file and add the &lt;code&gt;$HOME/bin&lt;/code&gt;
directory to our &lt;code&gt;$PATH&lt;/code&gt;.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;PATH=$HOME/bin:$PATH
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The &lt;a class="reference external" href="https://www.gnu.org/software/bash/"&gt;Bourne Again Shell&lt;/a&gt; will automatically read
&lt;code&gt;$HOME/.profile&lt;/code&gt; when someone starts a new login session.&lt;/p&gt;
&lt;p&gt;Now, we can use Gitolite's &lt;code&gt;install&lt;/code&gt; script to add a symbolic link inside
our &lt;code&gt;$HOME/bin&lt;/code&gt; folder:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ gitolite/install -ln $HOME/bin
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Done!&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Log out and back in to the &lt;code&gt;git&lt;/code&gt; user (or use
&lt;code&gt;source $HOME/.profile&lt;/code&gt;) to pick up the changes. If everything was done
correctly, you won't need to type the full path to Gitolite anymore.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ whereis gitolite
gitolite: /var/lib/git/bin/gitolite
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="section" id="cgit"&gt;
&lt;h2&gt;Cgit&lt;/h2&gt;
&lt;p&gt;&lt;a class="reference external" href="https://git.zx2c4.com/cgit/about/"&gt;Cgit&lt;/a&gt; is a script (written in C) that uses the &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Common_Gateway_Interface"&gt;Common Gateway
Interface (CGI)&lt;/a&gt; specification to give people a web view of our
projects. This is convenient when you don't have access to your terminal or just
want to look up (or show off) some changes to a project.&lt;/p&gt;
&lt;p&gt;It operates as a back-end (much like &lt;a class="reference external" href="https://www.php.net/"&gt;PHP&lt;/a&gt;) to a webserver (we'll
install Nginx &lt;a class="reference internal" href="#nginx"&gt;in the next sections&lt;/a&gt;) that will parse our repositories
and return a web-page for our webserver to distribute.&lt;/p&gt;
&lt;p&gt;To get an idea for what Cgit will look like, some of the more popular projects
that use Cgit are the &lt;a class="reference external" href="https://git.kernel.org"&gt;Linux&lt;/a&gt; and &lt;a class="reference external" href="https://cgit.freebsd.org/"&gt;FreeBSD&lt;/a&gt; kernels, along with &lt;a class="reference external" href="https://git.zx2c4.com/?q=wireguard"&gt;Wireguard&lt;/a&gt; and
&lt;a class="reference external" href="https://git.zx2c4.com/cgit/about/"&gt;Cgit&lt;/a&gt; itself.&lt;/p&gt;
&lt;div class="section" id="step-1-install-cgit"&gt;
&lt;h3&gt;Step: 1 - Install Cgit&lt;/h3&gt;
&lt;p&gt;Just like with Gitolite, I prefer to install Cgit &lt;a class="reference external" href="https://git.zx2c4.com/cgit/"&gt;from source&lt;/a&gt;
so I can add personal patches and quickly change what version is running on the
server. This also means we'll need to install the dependencies ourselves:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ apt install libc6 liblua5.1-0 zlib1g \
  python3-docutils python3-markdown python3-pygments
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;We'll also need the &lt;code&gt;build-essential&lt;/code&gt; packages to install the &lt;code&gt;gcc&lt;/code&gt;
and &lt;code&gt;make&lt;/code&gt; tools needed to compile Cgit after we've cloned the project:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ apt install build-essential
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Then, as the &lt;code&gt;root&lt;/code&gt; user in the &lt;code&gt;/root&lt;/code&gt; directory, clone the Cgit
project:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ git clone https://git.zx2c4.com/cgit
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Because Cgit uses parts of Git's source code (included as a &lt;a class="reference external" href="https://git-scm.com/book/en/v2/Git-Tools-Submodules"&gt;submodule&lt;/a&gt;), we'll need to use &lt;code&gt;git submodule&lt;/code&gt; to download the
remaining code from the Git project.&lt;/p&gt;
&lt;p&gt;After you &lt;code&gt;cd&lt;/code&gt; into &lt;code&gt;cgit&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ git submodule init    # register the git submodule in .git/config
$ git submodule update  # clone/fetch and checkout correct git version
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;div class="section" id="step-2-build-cgit"&gt;
&lt;h3&gt;Step: 2 - Build Cgit&lt;/h3&gt;
&lt;p&gt;With a full copy of Cgit on the server, we can now create some patches to
customize it for our use-case. We'll start with creating &lt;code&gt;cgit.conf&lt;/code&gt;
inside the &lt;code&gt;cgit&lt;/code&gt; project we just cloned, to tell &lt;code&gt;make&lt;/code&gt; where we
want to install the Cgit binaries.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;CGIT_SCRIPT_PATH = /var/www/html/cgit/cgi
CGIT_CONFIG = /var/www/html/cgit/cgitrc
CACHE_ROOT = /var/www/html/cgit/cache
prefix = /var/www/html/cgit
libdir = $(prefix)
filterdir = $(libdir)/filters
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Because this is a version controlled project, we can commit our changes to save
our work:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ git add -f cgit.conf
$ git commit -m &amp;quot;installation path changes&amp;quot;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Some additional changes I made:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;Updated the &lt;code&gt;cgit.png&lt;/code&gt; and &lt;code&gt;favicon.ico&lt;/code&gt; icons&lt;/li&gt;
&lt;li&gt;Changed the &lt;code&gt;pygments&lt;/code&gt; highlighting style to &lt;em&gt;&amp;quot;algol_nu&amp;quot;&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;Removed the &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Data_URI_scheme"&gt;Data URI&lt;/a&gt; icons from the tab menu&lt;/li&gt;
&lt;li&gt;Limited the &lt;code&gt;max-width&lt;/code&gt; of readme pages to &lt;code&gt;95ch&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Added &lt;code&gt;padding: 1em;&lt;/code&gt; to code-blocks&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;When you're satisfied with your changes, use &lt;code&gt;make&lt;/code&gt; to compile and
install Cgit:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ make &amp;amp;&amp;amp; make install
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;If everything went well, when you execute Cgit from the terminal, a web-page
should print out:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ ./cgit
Content-Type: text/html; charset=UTF-8
Last-Modified: Tue, 12 Jan 2021 22:35:43 GMT
Expires: Tue, 12 Jan 2021 22:40:43 GMT

&amp;lt;!DOCTYPE html&amp;gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And &lt;strong&gt;Done!&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;We can further customize Cgit's behavior using the &lt;code&gt;cgitrc&lt;/code&gt; file located
at &lt;code&gt;/var/www/html/cgit/cgitrc&lt;/code&gt;. Feel free to check out &lt;a class="reference external" href="https://git.zx2c4.com/cgit/tree/cgitrc.5.txt"&gt;the man page&lt;/a&gt; for a complete description of what every option does.&lt;/p&gt;
&lt;p&gt;Some of the options I used:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;set &lt;code&gt;scan-path&lt;/code&gt; to the location of our repositories
&lt;code&gt;/var/lib/git/repositories&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;set &lt;code&gt;project-list&lt;/code&gt; to the location of the &lt;code&gt;projects.list&lt;/code&gt; file
Gitolite creates, adding descriptions and categories to the list of
repositories on Cgit's index page&lt;/li&gt;
&lt;/ul&gt;
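&lt;p&gt;As a starting point, a &lt;code&gt;cgitrc&lt;/code&gt; with just those two options might look
like the fragment below. The &lt;code&gt;projects.list&lt;/code&gt; path is an assumption
based on Gitolite's default of writing it to the &lt;code&gt;git&lt;/code&gt; user's
&lt;code&gt;$HOME&lt;/code&gt;, so adjust it if yours lives elsewhere:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;# only list the repositories named in Gitolite's projects.list;
# the man page asks for this to be set before scan-path
project-list=/var/lib/git/projects.list

# the directory where Gitolite keeps the bare repositories
scan-path=/var/lib/git/repositories
&lt;/pre&gt;&lt;/div&gt;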
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="section" id="fastcgi-wrapper"&gt;
&lt;h2&gt;FastCGI Wrapper&lt;/h2&gt;
&lt;p&gt;Cgit, which uses code from Git, was designed to let users run a command (eg:
&lt;code&gt;git push&lt;/code&gt;) then exit, allowing our computers to reclaim the used
resources between each call. Nginx uses a faster protocol (&lt;a class="reference external" href="https://en.wikipedia.org/wiki/FastCGI"&gt;FastCGI&lt;/a&gt;) which calls
the same program multiple times without exiting.&lt;/p&gt;
&lt;p&gt;However, because Cgit was designed to exit after every run, it will never give
back its used resources and will continue to take more, quickly exhausting all
of the computer's available resources. This is why we need &lt;code&gt;fcgiwrap&lt;/code&gt;,
which speaks FastCGI to Nginx and starts a fresh Cgit process for each
request.&lt;/p&gt;
&lt;p&gt;Thankfully this is easy to install. The Advanced Packaging Tool can, once
again, help:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ apt install fcgiwrap
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Then just enable and start the service:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ systemctl enable fcgiwrap
$ systemctl start fcgiwrap
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And that's a &lt;strong&gt;wrap!&lt;/strong&gt;&lt;/p&gt;
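&lt;p&gt;To confirm the wrapper is actually listening, a quick look at the socket
can help. This sketch assumes the Ubuntu package ships a
&lt;code&gt;fcgiwrap.socket&lt;/code&gt; unit and creates its socket at
&lt;code&gt;/run/fcgiwrap.socket&lt;/code&gt;, so treat both names as things to verify on your
own system:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ # is systemd holding the socket open for us?
$ systemctl status fcgiwrap.socket

$ # does the socket file exist where the web server will look for it?
$ ls -l /run/fcgiwrap.socket
&lt;/pre&gt;&lt;/div&gt;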
&lt;/div&gt;
&lt;div class="section" id="nginx"&gt;
&lt;h2&gt;Nginx&lt;/h2&gt;
&lt;p&gt;With &lt;a class="reference internal" href="#cgit"&gt;Cgit&lt;/a&gt; and the &lt;a class="reference internal" href="#fastcgi-wrapper"&gt;FastCGI Wrapper&lt;/a&gt; installed, we can now turn our attention
to Nginx, which can be installed using the Advanced Packaging Tool:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ apt install nginx
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Next, create a new configuration file in the &lt;code&gt;/etc/nginx/sites-enabled&lt;/code&gt;
directory, replacing &lt;code&gt;git.bryanbrattlof.com&lt;/code&gt; with your domain. The
minimum configuration file you'll need for Cgit to work will look something
like this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;server {
    server_name  git.bryanbrattlof.com;

    listen [::]:80;
    listen 80;

    access_log  /var/log/nginx/cgit-access.log;
    error_log   /var/log/nginx/cgit-error.log;

    root /var/www/html/cgit/cgi;
    try_files $uri @cgit;

    location @cgit {
        include          fastcgi_params;
        fastcgi_param    SCRIPT_FILENAME /var/www/html/cgit/cgi/cgit.cgi;
        fastcgi_pass     unix:/run/fcgiwrap.socket;

        fastcgi_param    PATH_INFO    $uri;
        fastcgi_param    QUERY_STRING $args;
        fastcgi_param    HTTP_HOST    $server_name;
    }
}
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Feel free to add more to this. I've added a custom &lt;a class="reference external" href="https://bryanbrattlof.com/500/"&gt;5xx page&lt;/a&gt;, caching headers, as well as
recommendations from &lt;a class="reference external" href="https://observatory.mozilla.org/analyze/git.bryanbrattlof.com"&gt;Mozilla Observatory&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Once satisfied, start the Nginx service and open port 80 in the firewall:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ service nginx start
$ ufw allow http
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;If something went wrong, or if you ever change the configuration file, you can
use &lt;code&gt;nginx -t&lt;/code&gt; to check the configuration for errors and
&lt;code&gt;nginx -s reload&lt;/code&gt; to have the running Nginx server reload it.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ nginx -t &amp;amp;&amp;amp; nginx -s reload
&lt;/pre&gt;&lt;/div&gt;
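&lt;p&gt;A quick way to see whether Nginx, fcgiwrap and Cgit are talking to each
other is to request the index page from the server itself. This assumes
&lt;code&gt;curl&lt;/code&gt; is installed; sending the &lt;code&gt;Host&lt;/code&gt; header by hand means it
works even before DNS points at the box:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ # fetch only the response headers for Cgit's index page
$ curl -I -H &amp;#39;Host: git.bryanbrattlof.com&amp;#39; http://localhost/
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Anything in the 5xx range is a hint to check
&lt;code&gt;/var/log/nginx/cgit-error.log&lt;/code&gt;.&lt;/p&gt;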
&lt;/div&gt;
&lt;div class="section" id="wrapping-up"&gt;
&lt;h2&gt;Wrapping Up&lt;/h2&gt;
&lt;p&gt;By now we should have a working git server. However, like most creative things,
&lt;em&gt;&amp;quot;90% done ... 90% left to go.&amp;quot;&lt;/em&gt; There is truly an endless supply of things you
can and should add or configure to make your server more secure and accessible.
If you're the type that likes to learn, then you'll likely find this as fun and
rewarding an experience as I did.&lt;/p&gt;
&lt;p&gt;Some of the extra things I added:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;Installed &lt;a class="reference external" href="https://certbot.eff.org/"&gt;certbot&lt;/a&gt; to install and manage a free SSL certificate from &lt;a class="reference external" href="https://letsencrypt.org/"&gt;Let's
Encrypt&lt;/a&gt;. This has largely been mandatory for any public server for around 5
years now&lt;/li&gt;
&lt;li&gt;Created a &lt;a class="reference external" href="https://www.borgbackup.org/"&gt;Borg&lt;/a&gt; based backup script with a Borg specific subscription to
&lt;a class="reference external" href="https://rsync.net/products/attic.html"&gt;rsync.net&lt;/a&gt; to backup my projects. Useful when &lt;a class="reference internal" href="#jackass"&gt;that jackass we talked about&lt;/a&gt; finds a 0-day&lt;/li&gt;
&lt;li&gt;Created a &lt;a class="reference external" href="https://git-scm.com/book/en/v2/Git-on-the-Server-Git-Daemon"&gt;Git Daemon&lt;/a&gt; service to allow people to clone my projects using
the &lt;code&gt;git://&lt;/code&gt; protocol, if they prefer&lt;/li&gt;
&lt;li&gt;Placed a bunch of &lt;a class="reference external" href="https://healthchecks.io/"&gt;Healthchecks&lt;/a&gt; Pings in the scripts and services required to
keep everything running. Fail2Ban, Borg Backup, Certbot, all will alert me
when &lt;code&gt;cron&lt;/code&gt; or &lt;code&gt;systemd&lt;/code&gt; fall over&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;All of which should be given their own essay as they can be used in all your
setups, not just in this fully open-source and free (as in libre) git server.&lt;/p&gt;
&lt;!-- LocalWords:  config sshd PubkeyAuthentication PermitRootLogin fu --&gt;
&lt;!-- LocalWords:  PasswordAuthentication auth FireWall ufw pre sys SSL --&gt;
&lt;!-- LocalWords:  Git's Certbot Healthchecks libre VPS readme Cgit's --&gt;
&lt;/div&gt;
</content><category term="Notes"/></entry><entry><title>The Great Influenza</title><link href="https://0x42.sh/the-great-influenza/" rel="alternate"/><published>2020-12-07T00:00:00+00:00</published><updated>2020-12-07T00:00:00+00:00</updated><author><name>bryan brattlof</name></author><id>tag:0x42.sh,2020-12-07:/the-great-influenza/</id><summary type="html">&lt;a class="reference external image-reference" href="https://bookshop.org/books/the-great-influenza-the-story-of-the-deadliest-pandemic-in-history/9780143036494"&gt;
&lt;img alt="The cover of The Great Influenza by John M. Barry." class="right" src="https://0x42.sh/the-great-influenza/the-great-influenza.png" /&gt;
&lt;/a&gt;
&lt;p&gt;When living in New Orleans (&lt;a class="reference external" href="https://www.openstreetmap.org/search?query=new%20orleans#map=12/29.9496/-90.0652"&gt;Louisiana&lt;/a&gt;) one of the many &lt;em&gt;&amp;quot;iconic&amp;quot;&lt;/em&gt;
things visiting guests would request to see (before pandemics existed of course)
was the &lt;a class="reference external" href="https://www.neworleans.com/things-to-do/attractions/cemeteries/"&gt;above ground cemeteries&lt;/a&gt;. I was regularly surprised during each visit by both
how heavily touristed New Orleans' cemeteries are, and how much death the 1918 …&lt;/p&gt;</summary><content type="html">&lt;a class="reference external image-reference" href="https://bookshop.org/books/the-great-influenza-the-story-of-the-deadliest-pandemic-in-history/9780143036494"&gt;
&lt;img alt="The cover of The Great Influenza by John M. Barry." class="right" src="https://0x42.sh/the-great-influenza/the-great-influenza.png" /&gt;
&lt;/a&gt;
&lt;p&gt;When living in New Orleans (&lt;a class="reference external" href="https://www.openstreetmap.org/search?query=new%20orleans#map=12/29.9496/-90.0652"&gt;Louisiana&lt;/a&gt;) one of the many &lt;em&gt;&amp;quot;iconic&amp;quot;&lt;/em&gt;
things visiting guests would request to see (before pandemics existed of course)
was the &lt;a class="reference external" href="https://www.neworleans.com/things-to-do/attractions/cemeteries/"&gt;above ground cemeteries&lt;/a&gt;. I was regularly surprised during each visit by both
how heavily touristed New Orleans' cemeteries are, and how much death the 1918
influenza season brought to the city.&lt;/p&gt;
&lt;p&gt;New Orleans' hot and humid swamps had always brought disease to the historically
significant port city. In 1853 alone an estimated 8,000 people died from Yellow
Fever. And while 8,000 is a shockingly high number, there is something different
to seeing the aisles of 3,489 vaults (roughly 1% of New Orleans' population in
1918) for the people who died from influenza. It's much like the real life
version of &lt;a class="reference external" href="https://www.washingtonpost.com/graphics/2020/national/coronavirus-deaths-neighborhood/"&gt;&amp;quot;what if all C&amp;#64;\/!$-19 deaths happened in your neighborhood&amp;quot;&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Between that experience and reading &lt;a class="reference external" href="http://www.johnmbarry.com/"&gt;John M. Barry's&lt;/a&gt; book &lt;a class="reference external" href="https://bookshop.org/books/the-great-influenza-the-story-of-the-deadliest-pandemic-in-history/9780143036494"&gt;The Great Influenza&lt;/a&gt;, it's easy to
see how humbling a pandemic can be. Even today, without a vaccine available, the
best advice doctors can give is little more than &lt;a class="reference external" href="https://www.cdc.gov/coronavirus/2019-ncov/prevent-getting-sick/cloth-face-cover-guidance.html"&gt;wear a mask and avoid other
people&lt;/a&gt;, the same advice given by doctors in 1918. It
would take scientists another 15 years, until 1933, to discover that influenza
was caused by a virus rather than the bacterium they had blamed during the 1918
pandemic. Today, even with the amazing advances with mRNA, and the many other
tools 100 years of research has given us, science is still much slower than we
wish.&lt;/p&gt;
&lt;p&gt;Though, after reading the book that inspired &lt;a class="reference external" href="https://en.wikipedia.org/wiki/George_W._Bush"&gt;George W. Bush&lt;/a&gt; and his Health Secretary,
&lt;a class="reference external" href="https://en.wikipedia.org/wiki/Mike_Leavitt"&gt;Mike Leavitt&lt;/a&gt;, to work on and
publish a &lt;a class="reference external" href="https://www.govtrack.us/congress/bills/109/s3678/summary"&gt;multi-billion dollar pandemic-preparation bill&lt;/a&gt;, giving us a
playbook on how to navigate today's pandemic, it's clear how capable we are when
we collectively focus on a goal (and history).&lt;/p&gt;
&lt;p&gt;Doctors, as Barry describes, who were just recently expected to have what we
today would think of as a traditional medical school education, thanks to the
amazing work of schools like &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Johns_Hopkins_University"&gt;Johns Hopkins University&lt;/a&gt;, and &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Rockefeller_University"&gt;The
Rockefeller University&lt;/a&gt;
founded decades before the pandemic in 1918, showed up day after day and helped
the millions of people too sick to feed themselves all while making huge leaps
in scientific discovery, that would lay the groundwork for what we rely on
during the current pandemic and expect in modern medical universities.&lt;/p&gt;
&lt;p&gt;It was truly amazing (in a morbid way) to read the 1918 influenza season
play out in the book while watching its parallels in the news today. And it was
amazing to see how determinedly the first responders of the &lt;em&gt;&amp;quot;lost generation&amp;quot;&lt;/em&gt;
reacted to major cities all but consumed by the overwhelming death brought by
influenza. The book recounts one nurse's memories:&lt;/p&gt;
&lt;blockquote&gt;
She remembered that at the peak of the epidemic the nurses wrapped more than
one living patient in winding sheets and put toe tags on the boys’ left big
toe. It saved time, and the nurses were utterly exhausted. The toe tags were
shipping tags, listing the sailor’s name, rank, and hometown. She remembered
bodies “stacked in the morgue from floor to ceiling like cord wood.” In her
nightmares she wondered “what it would feel like to be that boy who was at
the bottom of the pile” ...&lt;/blockquote&gt;
&lt;p&gt;Only to witness people today &lt;a class="reference external" href="https://time.com/5812569/covid-19-new-york-morgues/"&gt;say almost the same thing&lt;/a&gt; about our current
pandemic.&lt;/p&gt;
&lt;p&gt;At first I was a little uneasy starting this book in the middle of a pandemic
and presidential election. I thought that the parallels in the news of
relearning what we already knew 100 years ago would be too depressing for me to
read. And while I learned a great deal about the beginning of &lt;em&gt;&amp;quot;modern medical
schools&amp;quot;&lt;/em&gt;, the &lt;em&gt;&amp;quot;Captain America&amp;quot;&lt;/em&gt; levels of bravery and sacrifice fighting a
world war during the deadliest influenza season in history, the greatest
takeaway I had from this read was how capable we humans are when we put our
minds to a goal, both good and bad.&lt;/p&gt;
</content><category term="Books"/></entry><entry><title>Calling Bullshit</title><link href="https://0x42.sh/calling-bullshit/" rel="alternate"/><published>2020-10-30T00:00:00+00:00</published><updated>2020-10-30T00:00:00+00:00</updated><author><name>bryan brattlof</name></author><id>tag:0x42.sh,2020-10-30:/calling-bullshit/</id><summary type="html">&lt;a class="reference external image-reference" href="https://bookshop.org/books/calling-bullshit-the-art-of-skepticism-in-a-data-driven-world/"&gt;
&lt;img alt="The cover of Calling Bullshit by Carl T. Bergstrom and Jevin D. West" class="right" src="https://0x42.sh/calling-bullshit/calling-bullshit.png" /&gt;
&lt;/a&gt;
&lt;p&gt;Turn on the news, log into social media, or just start talking to someone at the
store and you're sure to run into it.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Complete and utter bullshit.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Some of it is easy to spot: The &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Flat_Earth"&gt;earth is flat&lt;/a&gt;, &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Area_51#UFO_and_other_conspiracy_theories"&gt;Area 51 is experimenting on aliens&lt;/a&gt;, &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Chemtrail_conspiracy_theory"&gt;airlines are spraying
chemicals on …&lt;/a&gt;&lt;/p&gt;</summary><content type="html">&lt;a class="reference external image-reference" href="https://bookshop.org/books/calling-bullshit-the-art-of-skepticism-in-a-data-driven-world/"&gt;
&lt;img alt="The cover of Calling Bullshit by Carl T. Bergstrom and Jevin D. West" class="right" src="https://0x42.sh/calling-bullshit/calling-bullshit.png" /&gt;
&lt;/a&gt;
&lt;p&gt;Turn on the news, log into social media, or just start talking to someone at the
store and you're sure to run into it.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Complete and utter bullshit.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Some of it is easy to spot: The &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Flat_Earth"&gt;earth is flat&lt;/a&gt;, &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Area_51#UFO_and_other_conspiracy_theories"&gt;Area 51 is experimenting on aliens&lt;/a&gt;, &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Chemtrail_conspiracy_theory"&gt;airlines are spraying
chemicals on us&lt;/a&gt;,
the &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Loch_Ness_Monster"&gt;loch ness monster&lt;/a&gt; or
&lt;a class="reference external" href="https://en.wikipedia.org/wiki/Bigfoot"&gt;bigfoot exists&lt;/a&gt;, &lt;a class="reference external" href="https://www.reddit.com/r/AskReddit/comments/7s36ub/what_conspiracy_theory_do_you_100_buy_into_and_why/dt1xdja/"&gt;Mattress Firms are
laundering money&lt;/a&gt;, &lt;a class="reference external" href="https://en.wikipedia.org/wiki/High-frequency_Active_Auroral_Research_Program#Conspiracy_theories"&gt;Alaska is
trying to control everyone's minds&lt;/a&gt;, and the
list goes on … for a depressingly long time.&lt;/p&gt;
&lt;p&gt;Other times, bullshit can be tricky. &lt;a class="reference external" href="https://www.callingbullshit.org/case_studies/case_study_foodstamp_fraud.html"&gt;Rampant food-stamps fraud&lt;/a&gt;, &lt;a class="reference external" href="https://xkcd.com/1161/"&gt;kills 99.9% of germs&lt;/a&gt;, &lt;a class="reference external" href="https://www.npr.org/sections/thesalt/2015/05/28/410313446/why-a-journalist-scammed-the-media-into-spreading-bad-chocolate-science"&gt;chocolate as a weight loss supplement&lt;/a&gt;,
&lt;a class="reference external" href="https://www.callingbullshit.org/case_studies/case_study_musician_mortality.html"&gt;rap music is deadly&lt;/a&gt;. Outside of scientific or journalistic
circles, you would be forgiven for believing some of these claims or falling for
their bullshit.&lt;/p&gt;
&lt;p&gt;And that's exactly what Carl Bergstrom's and Jevin West's new book, &lt;a class="reference external" href="https://bookshop.org/books/calling-bullshit-the-art-of-skepticism-in-a-data-driven-world/"&gt;Calling
Bullshit&lt;/a&gt;, is trying to fix. &lt;a class="reference external" href="http://ctbergstrom.com/"&gt;Carl Bergstrom&lt;/a&gt; is a professor of Biology at the University of
Washington who studies how information flows through scholarly circles. &lt;a class="reference external" href="https://jevinwest.org/"&gt;Jevin
West&lt;/a&gt; is a data scientist and associate professor at
the University of Washington who focuses on misinformation in science and
society. Together they created a fun and relevant &amp;quot;how-to&amp;quot; book on calling out
the bullshit we run into in our everyday lives.&lt;/p&gt;
&lt;p&gt;In their book, Carl and Jevin argue that, rather than spending weeks fact
checking every claim you hear, all you need to &amp;quot;call bullshit&amp;quot; on the vast
majority of these false claims is to remain skeptical, use basic logic, and
occasionally run a quick web search. Even the intimidating bullshit that hides behind
a wall of statistics uses the same basic fallacies that give all bullshit away;
we just need practice spotting them. The best part of Carl's and Jevin's
techniques is you don't have to be a data scientist, professional statistician
or math prodigy to use them. And with the rise of &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Attention_economy"&gt;the attention economy&lt;/a&gt; this book couldn't have
come at a better time.&lt;/p&gt;
&lt;p&gt;This book was fun, timely (especially for the upcoming US election), and
enlightening. The book walks through case studies ranging from debunking
food-stamp fraud to papers claiming to have built AI to detect criminality,
guiding us through how to spot the various forms of &amp;quot;new-school&amp;quot;, data
saturated bullshit we see everywhere today: misleading visuals, equating
correlation and causation, unrepresentative data, selection biases, problems
with small sample sizes, and more. They even &lt;a class="reference external" href="https://www.callingbullshit.org/"&gt;set up a website&lt;/a&gt; to help us practice spotting bullshit,
like the fake images generated with cutting edge machine learning tools.&lt;/p&gt;
&lt;p&gt;Whatever we call it, and whether it fools us or not, lies, spin, fake news,
conspiracy theories and bullshit are everywhere. Bullshit has been with us forever and
will likely remain when it all ends. The reason it's so prevalent is fairly
simple. As an Italian programmer famously tweeted:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The bullshit asymmetry: the amount of energy needed to refute bullshit is an
order of magnitude bigger than to produce it.&lt;/p&gt;
&lt;p class="attribution"&gt;&amp;mdash;Alberto Brandolini (&lt;a class="reference external" href="https://twitter.com/ziobrando/status/289635060758507521"&gt;&amp;#64;ziobrando&lt;/a&gt;)&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I really enjoyed this book. Even if you're not into math and statistics, this
book does an amazing job imparting the skills needed to spot logical fallacies
using relevant examples and a ton of humor, which makes the book such a fun
read that I would recommend it to everyone.&lt;/p&gt;
</content><category term="Books"/></entry><entry><title>Adding OpenStreetMaps To Matplotlib</title><link href="https://0x42.sh/adding-openstreetmaps-to-matplotlib/" rel="alternate"/><published>2020-10-21T00:00:00+00:00</published><updated>2020-10-21T00:00:00+00:00</updated><author><name>bryan brattlof</name></author><id>tag:0x42.sh,2020-10-21:/adding-openstreetmaps-to-matplotlib/</id><summary type="html">&lt;div class="sidebar"&gt;
&lt;p class="first sidebar-title"&gt;It's Dead Jim:&lt;/p&gt;
&lt;p&gt;OpenStreetMap uses expensive hardware amazingly donated by community
members. Recently they've started to block requests that abuse this
hardware which, unfortunately, includes this script.&lt;/p&gt;
&lt;p class="last"&gt;As someone who enjoys hosting publicly accessible computers on the
internet, I completely understand this position and wish to honor it.
Therefore these …&lt;/p&gt;&lt;/div&gt;</summary><content type="html">&lt;div class="sidebar"&gt;
&lt;p class="first sidebar-title"&gt;It's Dead Jim:&lt;/p&gt;
&lt;p&gt;OpenStreetMap uses expensive hardware amazingly donated by community
members. Recently they've started to block requests that abuse this
hardware which, unfortunately, includes this script.&lt;/p&gt;
&lt;p class="last"&gt;As someone who enjoys hosting publicly accessible computers on the
internet, I completely understand this position and wish to honor it.
Therefore these scripts will remain broken. You are welcome to fix
them but I encourage you to read the &lt;a class="reference external" href="https://operations.osmfoundation.org/policies/tiles/"&gt;tile server
usage policy&lt;/a&gt; first to
ensure you do not abuse this wonderful asset.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Adding a map to your visuals is a great way to quickly understand the geographic
information you're trying to investigate. Thankfully there are quite a few
packages and libraries (like &lt;a class="reference external" href="https://geopandas.org/"&gt;geopandas&lt;/a&gt;, &lt;a class="reference external" href="https://scitools.org.uk/cartopy/docs/latest/"&gt;cartopy&lt;/a&gt;, &lt;a class="reference external" href="https://github.com/rossant/smopy"&gt;smopy&lt;/a&gt;, &lt;a class="reference external" href="https://github.com/python-visualization/folium"&gt;folium&lt;/a&gt;, &lt;a class="reference external" href="https://github.com/MatthewDaws/TileMapBase"&gt;tilemapbase&lt;/a&gt;, or &lt;a class="reference external" href="https://ipyleaflet.readthedocs.io/en/latest/"&gt;ipyleaflet&lt;/a&gt;) that can make creating these
visuals fairly straightforward and easy in your jupyter notebooks or whatever
stack you're using.&lt;/p&gt;
&lt;p&gt;For this essay though, I'll walk through the process of adding a base-map from
OpenStreetMap to your matplotlib visuals without using any of these libraries.
In the end, we'll have a visual much like this (very messy) scatter-plot of
buses as they service route 16 in New Orleans.&lt;/p&gt;
&lt;img alt="A plot of the ~466,000 position reports for buses servicing route 16 as they work their way up and down South Claiborne Avenue, between South Carrollton Avenue and Harrah's near the French Quarter." src="https://0x42.sh/adding-openstreetmaps-to-matplotlib/route-16-plot.png" /&gt;
&lt;div class="sidebar"&gt;
&lt;p class="first sidebar-title"&gt;Quick Note:&lt;/p&gt;
&lt;p&gt;All the code used to generate these visuals is available in a &lt;a class="reference external" href="https://git.sr.ht/~bryanb/norta"&gt;git repository
here&lt;/a&gt;. If you see any issues, have a
question or idea, please feel free to &lt;a class="reference external" href="https://0x42.sh/connect/"&gt;send me an email&lt;/a&gt;.&lt;/p&gt;
&lt;p class="last"&gt;However the data used here is too large for me to comfortably publish online
directly, so I've published the torrent file in the git repository that you
&lt;a class="reference external" href="https://git.sr.ht/~bryanb/norta/blob/canon/data/bus.log.tar.gz.torrent"&gt;can download here&lt;/a&gt;. If you don't like using BitTorrent
please feel
free to &lt;a class="reference external" href="https://0x42.sh/connect/"&gt;send me an email&lt;/a&gt; and I'll
do my best to send you a copy.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="how-it-works"&gt;
&lt;h2&gt;How it Works&lt;/h2&gt;
&lt;p&gt;The ability to add our base-map to our &lt;a class="reference external" href="https://matplotlib.org/"&gt;matplotlib&lt;/a&gt;
visuals relies on matplotlib's &lt;a class="reference external" href="https://matplotlib.org/3.2.1/api/_as_gen/matplotlib.pyplot.imshow.html"&gt;imshow() function&lt;/a&gt;,
which internally uses the &lt;a class="reference external" href="https://python-pillow.org/"&gt;Pillow library&lt;/a&gt; to
display images, or any other two dimensional scalar data we want (like &lt;a class="reference external" href="https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html"&gt;numpy
arrays&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;For example, if we download &lt;a class="reference external" href="https://0x42.sh/pages/hi/profile.png"&gt;my self-portrait&lt;/a&gt;, we can add the image to a plot using code
like this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;matplotlib&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pyplot&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;

&lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;imread&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;path/to/my/self-portrait.png&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;imshow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Resulting in a matplotlib visual that looks like this:&lt;/p&gt;
&lt;img alt="A matplotlib plot of a self portrait (stick figure drawing) of Bryan Brattlof" src="https://0x42.sh/adding-openstreetmaps-to-matplotlib/self-portrait-plot.png" /&gt;
&lt;/div&gt;
&lt;div class="section" id="creating-the-map"&gt;
&lt;h2&gt;Creating the Map&lt;/h2&gt;
&lt;p&gt;The easiest way to generate the base-map for &lt;code&gt;plt.imshow()&lt;/code&gt; is to use a
mapping service. These mapping services use enormous amounts of raw computing
power to take the &lt;a class="reference external" href="https://wiki.openstreetmap.org/wiki/Planet.osm"&gt;terabytes of map data&lt;/a&gt; and render a map for us.
Today there are quite a few services available online. The one I enjoy working
with (and the one we will be using in this essay) is a free, community
maintained service called &lt;a class="reference external" href="https://www.openstreetmap.org/about"&gt;OpenStreetMap&lt;/a&gt;.&lt;/p&gt;
&lt;div class="sidebar"&gt;
&lt;p class="first sidebar-title"&gt;Before We Begin:&lt;/p&gt;
&lt;p class="last"&gt;OpenStreetMap uses (expensive) hardware that was amazingly donated by
community members. Please &lt;strong&gt;do not abuse this asset&lt;/strong&gt; or you will have a
great many people angry with you. I encourage you to read the &lt;a class="reference external" href="https://operations.osmfoundation.org/policies/tiles/"&gt;tile server
usage policy&lt;/a&gt; to
avoid this happening.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="section" id="tile-servers"&gt;
&lt;h2&gt;Tile Servers&lt;/h2&gt;
&lt;p&gt;To make updating and sharing their work easier, OpenStreetMap (and virtually all
other mapping services) have split their maps into billions of tiny (256 pixel)
sections, called tiles, that we can download individually from their tile servers.&lt;/p&gt;
&lt;p&gt;OpenStreetMap has &lt;a class="reference external" href="https://wiki.openstreetmap.org/wiki/Tile_servers"&gt;quite a few tile servers&lt;/a&gt; that style or prioritize
different map features, with some having a slightly different
&lt;a class="reference external" href="https://en.wikipedia.org/wiki/API"&gt;API&lt;/a&gt; to request tiles. For example, the
&lt;a class="reference external" href="http://maps.stamen.com/toner/#12/29.9722/-90.1167"&gt;Stamen Toner Map&lt;/a&gt;, which I
used in the first visual, is the one I prefer for its simple color palette.&lt;/p&gt;
&lt;p&gt;For this essay though, we'll use the default tile server's API to request tiles:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;URL = &amp;quot;https://tile.openstreetmap.org/{z}/{x}/{y}.png&amp;quot;.format
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This string formatting function will replace the &lt;code&gt;{z}&lt;/code&gt;, &lt;code&gt;{x}&lt;/code&gt;, and
&lt;code&gt;{y}&lt;/code&gt; with the tile coordinates and zoom level of the tile we want to
download, where:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;code&gt;{z}&lt;/code&gt; is the &amp;quot;zoom&amp;quot; level ranging from 0 to 18. Zoom 0 is the most
&amp;quot;zoomed out&amp;quot; and needs only one tile to depict &lt;a class="reference external" href="https://tile.openstreetmap.org/0/0/0.png"&gt;the entire world&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;{x}&lt;/code&gt; is the number of tiles from the left most tile of the map.&lt;/li&gt;
&lt;li&gt;and &lt;code&gt;{y}&lt;/code&gt; is the number of tiles from the top most tile of the map.&lt;/li&gt;
&lt;/ul&gt;
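&lt;p&gt;To make the zoom level a little more concrete, here is a minimal sketch,
assuming the usual tiling scheme where every zoom level doubles the number of
tiles along each axis. The helper name &lt;code&gt;tiles_per_axis()&lt;/code&gt; is made
up for this illustration and is not part of any library:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;
```python
def tiles_per_axis(zoom):
    """number of tiles along one axis of the map at this zoom level

    (hypothetical helper, assuming the tile count doubles per zoom level)
    """
    return 2 ** zoom


for zoom in (0, 1, 16):
    n = tiles_per_axis(zoom)
    # valid {x} and {y} values run from 0 to n - 1
    print(zoom, n, n * n)
```
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;At zoom 0 this gives the single tile mentioned above, while zoom 16 already
needs 65,536 tiles along each axis.&lt;/p&gt;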
&lt;p&gt;Both &lt;code&gt;{x}&lt;/code&gt; and &lt;code&gt;{y}&lt;/code&gt; depend on the zoom level &lt;code&gt;{z}&lt;/code&gt; we've
chosen, with larger zoom levels requiring more tiles to render the map. To
understand how to calculate which tiles we need for our data-set, we'll need to
understand how mapping projections work.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="map-projections"&gt;
&lt;h2&gt;Map Projections&lt;/h2&gt;
&lt;p&gt;Without going too deep into mapping projections, OpenStreetMap (along with many
other mapping services) needed a way to convert &lt;span class="formula"&gt;(&lt;i&gt;lat&lt;/i&gt;, &lt;i&gt;lon&lt;/i&gt;)&lt;/span&gt; coordinates
into planar &lt;span class="formula"&gt;(&lt;i&gt;x&lt;/i&gt;, &lt;i&gt;y&lt;/i&gt;)&lt;/span&gt; coordinates that work with their maps. Sadly there is
no perfect way to do this.&lt;/p&gt;
&lt;p&gt;Google (and everyone else eventually) settled on a variant of the &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Mercator_projection"&gt;Mercator
Projection&lt;/a&gt; called the
&lt;a class="reference external" href="https://en.wikipedia.org/wiki/Web_Mercator_projection"&gt;Web Mercator Projection&lt;/a&gt;
which simplifies the conversion by assuming the earth is a perfect sphere (it's
not). This can (and does) lead to confusion in the final visuals, which is why many
official bodies refuse to accept this standard.&lt;/p&gt;
&lt;p&gt;The advantage of assuming the earth is a perfect sphere is that the equation to
convert our GPS coordinates into Web Mercator coordinates is fairly
straightforward. The &lt;a class="reference external" href="https://wiki.openstreetmap.org/wiki/Main_Page"&gt;OpenStreetMap Wiki&lt;/a&gt; has the algorithm available
in &lt;a class="reference external" href="https://wiki.openstreetmap.org/wiki/Mercator"&gt;multiple programming languages&lt;/a&gt;. Here is the one for Python:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;math&lt;/span&gt;
&lt;span class="n"&gt;TILE_SIZE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;256&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;point_to_pixels&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lon&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lat&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;zoom&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="sd"&gt;&amp;quot;&amp;quot;&amp;quot;convert gps coordinates to web mercator&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
    &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;zoom&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;TILE_SIZE&lt;/span&gt;
    &lt;span class="n"&gt;lat&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;radians&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lat&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;lon&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mf"&gt;180.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mf"&gt;360.0&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tan&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lat&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cos&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lat&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pi&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mf"&gt;2.0&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
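&lt;p&gt;As a quick sanity check (a sketch, with the function repeated so the snippet
runs on its own): at zoom 0 the whole world fits in a single 256 pixel tile, so
the point where the equator meets the prime meridian should land in the middle
of it, and each extra zoom level should double the result.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;
```python
import math

TILE_SIZE = 256


def point_to_pixels(lon, lat, zoom):
    """convert gps coordinates to web mercator (repeated from above)"""
    r = math.pow(2, zoom) * TILE_SIZE
    lat = math.radians(lat)

    x = int((lon + 180.0) / 360.0 * r)
    y = int((1.0 - math.log(math.tan(lat) + (1.0 / math.cos(lat))) / math.pi) / 2.0 * r)

    return x, y


# zoom 0: the whole world is one tile, so (0, 0) is its center pixel
print(point_to_pixels(0.0, 0.0, 0))  # (128, 128)

# zoom 1: the map is twice as wide and twice as tall
print(point_to_pixels(0.0, 0.0, 1))  # (256, 256)
```
&lt;/pre&gt;&lt;/div&gt;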
&lt;/div&gt;
&lt;div class="section" id="downloading-a-tile"&gt;
&lt;h2&gt;Downloading A Tile&lt;/h2&gt;
&lt;p&gt;Now we can use the &lt;code&gt;point_to_pixels()&lt;/code&gt; function to convert the GPS
coordinates in our data-set into the number of pixels from the top-left corner
of the OSM map at any &lt;code&gt;zoom&lt;/code&gt; level. For example, the French Quarter of
New Orleans:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;zoom&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;
&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;point_to_pixels&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;90.064279&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;29.95863&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;zoom&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Dividing the number of pixels by &lt;code&gt;TILE_SIZE&lt;/code&gt; will then give us the
&lt;code&gt;{x}&lt;/code&gt; and &lt;code&gt;{y}&lt;/code&gt; that we need for the &lt;code&gt;URL()&lt;/code&gt; function we
created &lt;a class="reference internal" href="#tile-servers"&gt;a few sections ago&lt;/a&gt; for the OpenStreetMap API.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;x_tiles&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_tiles&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;TILE_SIZE&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;TILE_SIZE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;We can then use these tile indices, along with the &lt;a class="reference external" href="https://requests.readthedocs.io/en/master/"&gt;requests&lt;/a&gt; and &lt;a class="reference external" href="https://python-pillow.org/"&gt;Pillow&lt;/a&gt; libraries, to download a tile from the
OpenStreetMap tile servers:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;io&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BytesIO&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;PIL&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;requests&lt;/span&gt;

&lt;span class="c1"&gt;# format the url&lt;/span&gt;
&lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;URL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;x_tiles&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;y_tiles&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;zoom&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# make the request&lt;/span&gt;
&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;raise_for_status&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c1"&gt;# just in case&lt;/span&gt;
    &lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BytesIO&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="c1"&gt;# plot the tile&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;imshow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Producing a tile of Jackson Square in the French Quarter of New Orleans:&lt;/p&gt;
&lt;img alt="the tile" src="https://0x42.sh/adding-openstreetmaps-to-matplotlib/french-quarter-plot.png" /&gt;
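&lt;p&gt;The arithmetic in this section can be checked without touching the tile
servers. The sketch below re-creates &lt;code&gt;point_to_pixels()&lt;/code&gt; from the standard
Web Mercator formula and assumes &lt;code&gt;TILE_SIZE&lt;/code&gt; is 256; the
&lt;code&gt;point_to_tile()&lt;/code&gt; helper is a hypothetical name introduced only for this
illustration:&lt;/p&gt;

```python
import math

TILE_SIZE = 256  # assumption: the standard OpenStreetMap tile edge in pixels


def point_to_pixels(lon, lat, zoom):
    """Convert GPS coordinates to Web Mercator pixel coordinates."""
    r = math.pow(2, zoom) * TILE_SIZE
    lat = math.radians(lat)

    x = int((lon + 180.0) / 360.0 * r)
    y = int((1.0 - math.log(math.tan(lat) + (1.0 / math.cos(lat))) / math.pi) / 2.0 * r)

    return x, y


def point_to_tile(lon, lat, zoom):
    """Hypothetical helper: the {x} and {y} tile indices containing a point."""
    x, y = point_to_pixels(lon, lat, zoom)
    # floor division drops the pixel offset inside the tile
    return x // TILE_SIZE, y // TILE_SIZE


# the French Quarter example from this section
print(point_to_tile(-90.064279, 29.95863, 16))
```

&lt;p&gt;Floor division gives the same result as &lt;code&gt;int(x / TILE_SIZE)&lt;/code&gt; here because
pixel coordinates are never negative.&lt;/p&gt;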
&lt;/div&gt;
&lt;div class="section" id="stitching-tiles-together"&gt;
&lt;h2&gt;Stitching Tiles Together&lt;/h2&gt;
&lt;p&gt;To download all the tiles needed for our visual, we'll need to calculate the
limits of the data we'll be using. There are many valid ways to do
this. For simplicity, though, I'll calculate the
&lt;span class="formula"&gt;&lt;i&gt;min&lt;/i&gt;&lt;/span&gt; and &lt;span class="formula"&gt;&lt;i&gt;max&lt;/i&gt;&lt;/span&gt; of both the &lt;code&gt;lat&lt;/code&gt; and &lt;code&gt;lon&lt;/code&gt; columns in
my &lt;a class="reference external" href="https://pandas.pydata.org/"&gt;pandas&lt;/a&gt; DataFrame:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;top&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bot&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lat&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lat&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;min&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;lef&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rgt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lon&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lon&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;max&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This gives us a bounding box (in GPS coordinates) that encompasses our entire
data-set.&lt;/p&gt;
&lt;div class="sidebar"&gt;
&lt;p class="first sidebar-title"&gt;NOTE:&lt;/p&gt;
&lt;p class="last"&gt;This is a good time to adjust our &lt;code&gt;zoom&lt;/code&gt; level to download just enough
tiles. Please do not anger the community by &lt;a class="reference internal" href="#creating-the-map"&gt;downloading large collections of
tiles&lt;/a&gt; at one time.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Next, just like we did in &lt;a class="reference internal" href="#downloading-a-tile"&gt;the last section&lt;/a&gt;, we'll use
the &lt;code&gt;point_to_pixels()&lt;/code&gt; function to convert our GPS coordinates into Web
Mercator coordinates.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;zoom&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;13&lt;/span&gt;
&lt;span class="n"&gt;x0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y0&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;point_to_pixels&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lef&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;top&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;zoom&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;x1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;point_to_pixels&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rgt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bot&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;zoom&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;We can then divide these pixel coordinates by &lt;code&gt;TILE_SIZE&lt;/code&gt; to calculate the
minimum and maximum tile indices we'll need to download for both the &lt;code&gt;{x}&lt;/code&gt; and
&lt;code&gt;{y}&lt;/code&gt; arguments for the API:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;x0_tile&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y0_tile&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x0&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;TILE_SIZE&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y0&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;TILE_SIZE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;x1_tile&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y1_tile&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ceil&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x1&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;TILE_SIZE&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ceil&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y1&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;TILE_SIZE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;div class="sidebar"&gt;
&lt;p class="first sidebar-title"&gt;NOTE:&lt;/p&gt;
&lt;p class="last"&gt;We use &lt;code&gt;math.ceil()&lt;/code&gt; to round up, ensuring small fractions of a tile
will still be downloaded.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;As a precaution, we'll add an &lt;code&gt;assert&lt;/code&gt; statement to limit the number of
tiles we can download and save us from the embarrassment of accidentally
burdening OpenStreetMap tile servers.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x1_tile&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;x0_tile&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y1_tile&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;y0_tile&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;That&amp;#39;s too many tiles!&amp;quot;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Now that we've calculated which tiles we need to download from OpenStreetMap, we
can use the built-in &lt;a class="reference external" href="https://docs.python.org/3/library/itertools.html"&gt;itertools&lt;/a&gt; &lt;code&gt;product()&lt;/code&gt; function
to loop through every tile, downloading and saving the tiles to a single large
Pillow image using &lt;a class="reference external" href="https://pillow.readthedocs.io/en/stable/reference/Image.html#PIL.Image.Image.paste"&gt;Pillow's paste() function&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;itertools&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;product&lt;/span&gt;

&lt;span class="c1"&gt;# full size image we&amp;#39;ll add tiles to&lt;/span&gt;
&lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;RGB&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x1_tile&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;x0_tile&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;TILE_SIZE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y1_tile&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;y0_tile&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;TILE_SIZE&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="c1"&gt;# loop through every tile inside our bounded box&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x_tile&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_tile&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;product&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x0_tile&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x1_tile&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y0_tile&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y1_tile&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;URL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;x_tile&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;y_tile&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;zoom&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;raise_for_status&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c1"&gt;# just in case&lt;/span&gt;
        &lt;span class="n"&gt;tile_img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BytesIO&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="c1"&gt;# add each tile to the full size image&lt;/span&gt;
    &lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;paste&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;im&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;tile_img&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;box&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;x_tile&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;x0_tile&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;TILE_SIZE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y_tile&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;y0_tile&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;TILE_SIZE&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;imshow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Resulting in a plot like this:&lt;/p&gt;
&lt;img alt="A plot of New Orleans using the script we just developed to stitch multiple tiles together into one continuous map that we can place under our scatter plot in the next section." src="https://0x42.sh/adding-openstreetmaps-to-matplotlib/the-basemap.png" /&gt;
&lt;p&gt;The eagle-eyed among us will notice the image is too large for the visual we
want to create. This is because of the &lt;code&gt;math.ceil()&lt;/code&gt; and &lt;code&gt;int()&lt;/code&gt;
functions we used to round the pixel coordinates into &lt;code&gt;{x}&lt;/code&gt; and
&lt;code&gt;{y}&lt;/code&gt; tiles we used above. To get our image back to size we'll need to
crop out the fractions of tiles not inside our bounding box.&lt;/p&gt;
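&lt;p&gt;The tile bookkeeping in this section can also be exercised offline before
making any requests. This is only a sketch: it re-creates
&lt;code&gt;point_to_pixels()&lt;/code&gt; from the standard Web Mercator formula, assumes
&lt;code&gt;TILE_SIZE&lt;/code&gt; is 256, and uses a made-up bounding box. The
&lt;code&gt;tiles_for_bbox()&lt;/code&gt; helper is a hypothetical name, not part of any library:&lt;/p&gt;

```python
import math
from itertools import product

TILE_SIZE = 256  # assumption: the standard OpenStreetMap tile edge in pixels


def point_to_pixels(lon, lat, zoom):
    """Convert GPS coordinates to Web Mercator pixel coordinates."""
    r = math.pow(2, zoom) * TILE_SIZE
    lat = math.radians(lat)

    x = int((lon + 180.0) / 360.0 * r)
    y = int((1.0 - math.log(math.tan(lat) + (1.0 / math.cos(lat))) / math.pi) / 2.0 * r)

    return x, y


def tiles_for_bbox(lef, top, rgt, bot, zoom):
    """Hypothetical helper: list every (x, y) tile touching a bounding box."""
    x0, y0 = point_to_pixels(lef, top, zoom)
    x1, y1 = point_to_pixels(rgt, bot, zoom)

    # round the top-left corner down and the bottom-right corner up
    x0_tile, y0_tile = int(x0 / TILE_SIZE), int(y0 / TILE_SIZE)
    x1_tile, y1_tile = math.ceil(x1 / TILE_SIZE), math.ceil(y1 / TILE_SIZE)

    tiles = list(product(range(x0_tile, x1_tile), range(y0_tile, y1_tile)))
    size = ((x1_tile - x0_tile) * TILE_SIZE, (y1_tile - y0_tile) * TILE_SIZE)
    return tiles, size


# made-up bounding box around New Orleans, for illustration only
tiles, size = tiles_for_bbox(-90.14, 30.02, -90.03, 29.92, zoom=13)
print(len(tiles), "tiles, stitched image size:", size)
```

&lt;p&gt;Printing the tile count first makes it easy to lower &lt;code&gt;zoom&lt;/code&gt; before a
single request reaches the tile servers.&lt;/p&gt;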
&lt;/div&gt;
&lt;div class="section" id="cropping-the-basemap"&gt;
&lt;h2&gt;Cropping the Basemap&lt;/h2&gt;
&lt;p&gt;To help my human-eyed brethren, I added some lines to our previous graphic to
help understand what's going on. Essentially some fraction of each tile we've
downloaded (outlined in black) will be used in our final visual (outlined in
red) that we calculated in &lt;a class="reference internal" href="#stitching-tiles-together"&gt;the last section&lt;/a&gt;. Our
goal for this section is to trim the fraction of tiles outside of our red square.&lt;/p&gt;
&lt;img alt="A plot of New Orleans with black lines outlining each tile we downloaded from the tile servers overlaid with a red line representing the section of the map we wish to keep after we crop the image." src="https://0x42.sh/adding-openstreetmaps-to-matplotlib/basemap-cropping-lines.png" /&gt;
&lt;p&gt;To curtail our oversize image, we'll use Pillow's &lt;a class="reference external" href="https://pillow.readthedocs.io/en/stable/reference/Image.html#PIL.Image.Image.crop"&gt;Image.crop()&lt;/a&gt;
function, which takes a tuple &lt;code&gt;(left, top, right, bottom)&lt;/code&gt; measured in
pixels from the top-left corner of the image being cropped.&lt;/p&gt;
&lt;p&gt;&lt;a class="reference internal" href="#stitching-tiles-together"&gt;From our work above&lt;/a&gt;, we know the pixel
coordinates of the red square are defined as &lt;code&gt;x0, y0&lt;/code&gt; and &lt;code&gt;x1, y1&lt;/code&gt;.
We can then multiply the tile coordinates &lt;code&gt;x0_tile, y0_tile&lt;/code&gt; by
&lt;code&gt;TILE_SIZE&lt;/code&gt; to find the pixel coordinates for the top-left corner of the
current (oversize) basemap:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;x0_tile&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;TILE_SIZE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y0_tile&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;TILE_SIZE&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Now it's just a simple process of subtracting the pixel coordinates we just
calculated for our oversize image from the edges of our red square to crop it to
our desired size:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;crop&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;
    &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x0&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;  &lt;span class="c1"&gt;# left&lt;/span&gt;
    &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y0&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;  &lt;span class="c1"&gt;# top&lt;/span&gt;
    &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;  &lt;span class="c1"&gt;# right&lt;/span&gt;
    &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt; &lt;span class="c1"&gt;# bottom&lt;/span&gt;

&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;imshow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Resulting in our final (properly sized) basemap for our visual:&lt;/p&gt;
&lt;img alt="A, now properly sized, plot of New Orleans using the cropping script we just developed to resize our basemap to the proper size for our visual." src="https://0x42.sh/adding-openstreetmaps-to-matplotlib/basemap-cropped.png" /&gt;
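&lt;p&gt;The crop box is easiest to sanity-check with small numbers. The sketch below
assumes &lt;code&gt;TILE_SIZE&lt;/code&gt; is 256 and uses made-up pixel coordinates;
&lt;code&gt;crop_box()&lt;/code&gt; is a hypothetical helper that only does the arithmetic and
never touches an image:&lt;/p&gt;

```python
TILE_SIZE = 256  # assumption: the standard OpenStreetMap tile edge in pixels


def crop_box(x0, y0, x1, y1, x0_tile, y0_tile):
    """Hypothetical helper: the (left, top, right, bottom) box for Image.crop()."""
    # global pixel coordinates of the stitched image's top-left corner
    x, y = x0_tile * TILE_SIZE, y0_tile * TILE_SIZE

    # shift the bounding box so it is measured from that corner
    return (int(x0 - x), int(y0 - y), int(x1 - x), int(y1 - y))


# made-up pixel coordinates: the box starts 100 px right of and 50 px below
# the corner of tile (3, 2) and spans 400 by 300 pixels
print(crop_box(868, 562, 1268, 862, x0_tile=3, y0_tile=2))
```

&lt;p&gt;With these made-up numbers the box works out to &lt;code&gt;(100, 50, 500, 350)&lt;/code&gt;,
a 400 by 300 pixel region. Every value is non-negative and the right and bottom
edges are larger than the left and top edges, which is what a valid crop box
should look like.&lt;/p&gt;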
&lt;/div&gt;
&lt;div class="section" id="plotting-the-data"&gt;
&lt;h2&gt;Plotting The Data&lt;/h2&gt;
&lt;p&gt;Finally, with our basemap created, we can plot our data just like any other
visual, with some key exceptions. We can start by setting up a &lt;a class="reference external" href="https://matplotlib.org/3.3.1/api/_as_gen/matplotlib.pyplot.subplot.html"&gt;matplotlib
subplots()&lt;/a&gt;
figure and a &lt;a class="reference external" href="https://matplotlib.org/3.3.1/api/_as_gen/matplotlib.axes.Axes.scatter.html"&gt;scatter()&lt;/a&gt;
plot for the &lt;code&gt;lat&lt;/code&gt; and &lt;code&gt;lon&lt;/code&gt; columns in our pandas DataFrame:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;fig&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ax&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subplots&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;scatter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lon&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lat&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;alpha&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;red&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Then we'll add an extra argument to the &lt;code&gt;imshow()&lt;/code&gt; function to properly
locate our image in the final visual. The &lt;code&gt;extent&lt;/code&gt; argument is used to
move an image to a &lt;a class="reference external" href="https://matplotlib.org/3.3.1/tutorials/intermediate/imshow_extent.html"&gt;particular region in dataspace&lt;/a&gt;.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;imshow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;extent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lef&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rgt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bot&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;top&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Next, we'll lock down the &lt;span class="formula"&gt;&lt;i&gt;x&lt;/i&gt;&lt;/span&gt; and &lt;span class="formula"&gt;&lt;i&gt;y&lt;/i&gt;&lt;/span&gt; axes to the limits we defined
&lt;a class="reference internal" href="#stitching-tiles-together"&gt;a few sections ago&lt;/a&gt; by using the
&lt;code&gt;set_ylim()&lt;/code&gt; and &lt;code&gt;set_xlim()&lt;/code&gt; functions.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_ylim&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bot&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;top&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_xlim&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lef&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rgt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;All of this work will produce a simple graphic with a (gorgeous) basemap of
buses servicing New Orleans' Route 16.&lt;/p&gt;
&lt;img alt="The final visual we've been working toward, depicting the roughly 400,000 position reports of buses as they service Route 16 of New Orleans." src="https://0x42.sh/adding-openstreetmaps-to-matplotlib/final-visual.png" /&gt;
&lt;/div&gt;
</content><category term="Tips"/><category term="NORTA"/></entry><entry><title>Tribe</title><link href="https://0x42.sh/tribe/" rel="alternate"/><published>2020-09-08T00:00:00+00:00</published><updated>2020-09-08T00:00:00+00:00</updated><author><name>bryan brattlof</name></author><id>tag:0x42.sh,2020-09-08:/tribe/</id><summary type="html">&lt;a class="reference external image-reference" href="https://bookshop.org/books/tribe-on-homecoming-and-belonging/9781455566389"&gt;
&lt;img alt="The cover of Tribe by Sebastian Junger" class="right" src="https://0x42.sh/tribe/tribe.png" /&gt;
&lt;/a&gt;
&lt;p&gt;We are currently living in the most connected time in human history. Today,
&lt;a class="reference external" href="https://ourworldindata.org/internet"&gt;half of the world's population&lt;/a&gt; has
some type of access to the internet and the limitless information it contains.
Together with the advent of social media, geography is no longer a limitation to
finding and belonging to …&lt;/p&gt;</summary><content type="html">&lt;a class="reference external image-reference" href="https://bookshop.org/books/tribe-on-homecoming-and-belonging/9781455566389"&gt;
&lt;img alt="The cover of Tribe by Sebastian Junger" class="right" src="https://0x42.sh/tribe/tribe.png" /&gt;
&lt;/a&gt;
&lt;p&gt;We are currently living in the most connected time in human history. Today,
&lt;a class="reference external" href="https://ourworldindata.org/internet"&gt;half of the world's population&lt;/a&gt; has
some type of access to the internet and the limitless information it contains.
Together with the advent of social media, geography is no longer a limitation to
finding and belonging to a community, allowing the most obscure groups to exist
and thrive. In short, with the internet, it has never been easier to belong.&lt;/p&gt;
&lt;p&gt;Surprisingly though, we are also living in the loneliest time in human history.
&lt;a class="reference external" href="https://www.multivu.com/players/English/8294451-cigna-us-loneliness-survey/docs/IndexReport_1524069371598-173525450.pdf"&gt;An online study of 20,000 Americans&lt;/a&gt;
above the age of 18 found that 43% of respondents feel they lack companionship,
43% feel their relationships are meaningless, 43% feel isolated from others, and
39% are no longer close to anyone. The study went on to find that the strongest
correlation was between how frequently people have meaningful face-to-face
interactions and how lonely they feel and, contrary to popular belief, it found no
correlation between how lonely you feel and how often you use social media. This
suggests that while social media is a great benefit to the world, it cannot
replace the benefits we get from belonging to an (&lt;a class="reference external" href="https://www.dictionary.com/browse/irl"&gt;IRL&lt;/a&gt;) tribe.&lt;/p&gt;
&lt;p&gt;&lt;a class="reference external" href="https://www.sebastianjunger.com/"&gt;Sebastian Junger's&lt;/a&gt; book &lt;a class="reference external" href="https://bookshop.org/books/tribe-on-homecoming-and-belonging/9781455566389"&gt;Tribe&lt;/a&gt;
is a (thin, 138 page) thought-provoking book that blends anthropology and
psychology to discuss the obstacles military veterans and other people returning
from war-torn regions and other high stress environments face when returning
home. Junger argues that the tight-knit bonds formed in these intense
environments are what we have historically depended on to survive and thrive as a
species since the Stone Age, and what our increasingly individualistic society
needs to encourage in order to reduce the obstacles our veterans face as they
struggle to find the same bonds they formed during war. These &lt;em&gt;&amp;quot;small groups
defined by clear purpose and understanding&amp;quot;&lt;/em&gt; are what we need to feel purposeful
and happy, and unfortunately they are in increasingly short supply in our modern world.
Tribe essentially turns the question around from asking what is wrong with our
veterans, to asking what is wrong with us.&lt;/p&gt;
&lt;p&gt;I personally enjoyed this book. Junger, an award-winning journalist and special
correspondent for ABC News, has spent more time than most in intense environments,
allowing him to make the observations needed for the claims in his book. However,
I would have liked to see more examples taken from history, anthropology and
psychology experts, or more interviews with people returning home, to make the
book less of a journalistic story and more of an anthropological investigation.
Overall, the very short book makes a complex topic very accessible and
introduced me to a new possible explanation as to why we are collectively
struggling to belong when we have never been more &amp;quot;connected.&amp;quot;&lt;/p&gt;
</content><category term="Books"/></entry><entry><title>Why We Sleep</title><link href="https://0x42.sh/why-we-sleep/" rel="alternate"/><published>2020-08-27T00:00:00+00:00</published><updated>2020-08-27T00:00:00+00:00</updated><author><name>bryan brattlof</name></author><id>tag:0x42.sh,2020-08-27:/why-we-sleep/</id><summary type="html">&lt;a class="reference external image-reference" href="https://bookshop.org/books/why-we-sleep-unlocking-the-power-of-sleep-and-dreams/9781501144318"&gt;
&lt;img alt="The cover of Why We Sleep by Matthew Walker" class="right" src="https://0x42.sh/why-we-sleep/why-we-sleep.png" /&gt;
&lt;/a&gt;
&lt;p&gt;If you're anything like me, the fundamental processes all living organisms need
to sustain life do not come naturally to us. Things like knowing when to eat
food, drink fluids, or just sleeping are often completely forgotten when our
mind finds something else to occupy its time. It's this intense …&lt;/p&gt;</summary><content type="html">&lt;a class="reference external image-reference" href="https://bookshop.org/books/why-we-sleep-unlocking-the-power-of-sleep-and-dreams/9781501144318"&gt;
&lt;img alt="The cover of Why We Sleep by Matthew Walker" class="right" src="https://0x42.sh/why-we-sleep/why-we-sleep.png" /&gt;
&lt;/a&gt;
&lt;p&gt;If you're anything like me, the fundamental processes all living organisms need
to sustain life do not come naturally to us. Things like knowing when to eat
food, drink fluids, or just sleeping are often completely forgotten when our
mind finds something else to occupy its time. It's this intense focus that
allows me to characterize my relationship with sleep as quarrelsome.&lt;/p&gt;
&lt;p&gt;After reading Matthew Walker's book, &lt;a class="reference external" href="https://bookshop.org/books/why-we-sleep-unlocking-the-power-of-sleep-and-dreams/9781501144318"&gt;Why We Sleep&lt;/a&gt;, it's easy to see the
damage my adversarial relationship with sleeping causes. Walker, who is the
director of the Center for Human Sleep Science at UC Berkeley, lists some of the
immediately recognizable symptoms from the aftermath of pulling an
&lt;em&gt;&amp;quot;all-nighter&amp;quot;&lt;/em&gt;. In the short term, sleep deprivation destroys your creativity,
problem solving, and decision-making skills, along with inhibiting your memory
and your ability to learn. Chronic lack of sleep also has long term
side-effects, devouring your heart, brain, and mental health, along with your
emotional well-being, immune system, and ultimately your life span.&lt;/p&gt;
&lt;p&gt;Throughout the book Walker does a great job of describing why and how
sleep can be such a cure-all for so many ailments, and why 8 hours in bed does
not equal 8 hours of sleep. The goal of sleep is to bathe our brain in the
restorative neurochemicals generated by the different &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Rapid_eye_movement_sleep"&gt;Rapid-Eye Movement (REM)&lt;/a&gt; and &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Non-rapid_eye_movement_sleep"&gt;Non-Rapid-Eye
Movement (NREM)&lt;/a&gt;
stages of sleep. The 8 hour minimum suggested by doctors ensures you spend
enough time in each stage to receive their full effect. If you (like me) have
trouble sleeping throughout the night, you need to spend more than 8 hours in
bed, giving your brain more opportunities to cycle through the different stages
of sleep. He goes on to explain the importance of each stage of the sleeping
process and why you will never be able to &amp;quot;catch up&amp;quot; on all the lost sleep on
the weekend, a popular myth I believed all throughout my college years.&lt;/p&gt;
&lt;p&gt;Walker also does an amazing job describing the various systems our brain uses to
control the many functions that regulate our lives, including when to sleep.
My favorite section dealt with our &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Circadian_rhythm"&gt;Circadian Rhythm&lt;/a&gt; and how it changes as we age. Walker argues that
teenagers, notorious for sleeping in, might have their brain chemistry to
thank. Our circadian rhythm is an ancient internal timekeeping process in our
brain that regulates many things like our moods, emotions, urine output, core
body temperature, metabolic rate, and numerous hormones. As we age, our
circadian rhythm adjusts as our brain develops, until we finally reach the
typical split between morning people (morning larks) and evening dwellers (night
owls).&lt;/p&gt;
&lt;p&gt;For teenagers, as their brains continue to develop, their circadian rhythms are
skewed to the evening, allowing them to naturally stay awake and alert
throughout much of the night. This makes it almost impossible for them to
wake up before the irresponsibly set 7am school starting bell with a full 8 hours
of much needed sleep. Walker compares this situation to a fully grown adult
going to bed at 5 or 6pm and frustratingly trying to sleep, only to be woken
at 1 or 2am.&lt;/p&gt;
&lt;p&gt;Overall, Matthew Walker did an amazing job creating an enjoyably short
introduction into the world of sleep science. Our brains are such a complex part
of our body that we are only just beginning to understand. I thoroughly enjoyed
reading this book, and I'm sure you will too.&lt;/p&gt;
</content><category term="Books"/></entry><entry><title>Modify The Linux Kernel</title><link href="https://0x42.sh/modify-the-linux-kernel/" rel="alternate"/><published>2020-08-15T00:00:00+00:00</published><updated>2020-08-15T00:00:00+00:00</updated><author><name>bryan brattlof</name></author><id>tag:0x42.sh,2020-08-15:/modify-the-linux-kernel/</id><summary type="html">&lt;p&gt;For the 3&lt;sup&gt;rd&lt;/sup&gt; task (this task) in &lt;a class="reference external" href="https://eudyptula-challenge.org"&gt;The Eudyptula Challenge&lt;/a&gt;, we continue our work from &lt;a class="reference external" href="https://0x42.sh/building-the-linux-kernel/"&gt;the last
challenge&lt;/a&gt;, compiling the
absolute latest Linux Kernel from the source code. This time we focus on
modifying the Makefiles and &lt;code&gt;.config&lt;/code&gt; files that help us compile the
kernel.&lt;/p&gt;
&lt;p&gt;Before we begin, if …&lt;/p&gt;</summary><content type="html">&lt;p&gt;For the 3&lt;sup&gt;rd&lt;/sup&gt; task (this task) in &lt;a class="reference external" href="https://eudyptula-challenge.org"&gt;The Eudyptula Challenge&lt;/a&gt;, we continue our work from &lt;a class="reference external" href="https://0x42.sh/building-the-linux-kernel/"&gt;the last
challenge&lt;/a&gt;, compiling the
absolute latest Linux Kernel from the source code. This time we focus on
modifying the Makefiles and &lt;code&gt;.config&lt;/code&gt; files that help us compile the
kernel.&lt;/p&gt;
&lt;p&gt;Before we begin, if you wish to work on The Eudyptula Challenge yourself before
you read my notes (recommended), you can use &lt;a class="reference external" href="https://git.sr.ht/~bryanb/eudyptula"&gt;my git repository&lt;/a&gt;, which has all 20
tasks and the code I used to complete each one.&lt;/p&gt;
&lt;div class="section" id="task-no-3"&gt;
&lt;h2&gt;Task No.3&lt;/h2&gt;
&lt;blockquote&gt;
&lt;p&gt;Now that you have your custom kernel up and running, it's time to modify
it!&lt;/p&gt;
&lt;p&gt;The tasks for this round is:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;take the kernel git tree from Task 02 and modify the Makefile to
and modify the EXTRAVERSION field.  Do this in a way that the
running kernel (after modifying the Makefile, rebuilding, and
rebooting) has the characters &amp;quot;-eudyptula&amp;quot; in the version string.&lt;/li&gt;
&lt;li&gt;show proof of booting this kernel.  Extra cookies for you by
providing creative examples, especially if done in interpretive
dance at your local pub.&lt;/li&gt;
&lt;li&gt;Send a patch that shows the Makefile modified.  Do this in a manner
that would be acceptable for merging in the kernel source tree.
(Hint, read the file Documentation/SubmittingPatches and follow the
steps there.)&lt;/li&gt;
&lt;/ul&gt;
&lt;p class="attribution"&gt;&amp;mdash;Little Penguin&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Depending on how literally we interpret &lt;em&gt;&amp;quot;modifying the Makefile&amp;quot;&lt;/em&gt;, there are
multiple ways we can accomplish this challenge. One, which we briefly talked
about in &lt;a class="reference external" href="https://0x42.sh/building-the-linux-kernel/"&gt;the last challenge&lt;/a&gt;, is to override variables
in our &lt;code&gt;make&lt;/code&gt; command when compiling the kernel.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="override-directives"&gt;
&lt;h2&gt;Override Directives&lt;/h2&gt;
&lt;p&gt;Just like with a lot of userspace projects, the Linux kernel uses &lt;a class="reference external" href="https://www.gnu.org/software/make/"&gt;GNU make&lt;/a&gt; to compile the various files into its
final form. This means we can use make's ability to pass variable assignments as
command line arguments.&lt;/p&gt;
&lt;p&gt;From Chapter 6 of make's &lt;a class="reference external" href="https://www.gnu.org/software/make/manual/html_node/Override-Directive.html"&gt;documentation&lt;/a&gt;, if we use a command line argument to set
the &lt;code&gt;EXTRAVERSION&lt;/code&gt; variable, then all ordinary assignments to
&lt;code&gt;EXTRAVERSION&lt;/code&gt; within the Makefile (those not marked with the
&lt;code&gt;override&lt;/code&gt; directive) will be ignored. For example, we can
override &lt;code&gt;EXTRAVERSION&lt;/code&gt; by using a &lt;code&gt;make&lt;/code&gt; command like this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ make -j `getconf _NPROCESSORS_ONLN` EXTRAVERSION=-eudyptula
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This will force &lt;code&gt;make&lt;/code&gt; to set &lt;code&gt;EXTRAVERSION&lt;/code&gt; to &lt;code&gt;-eudyptula&lt;/code&gt;
and ignore the value set in the kernel's Makefile, accomplishing our task for The
Little Penguin.&lt;/p&gt;
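&lt;p&gt;To see this override behavior in isolation, here is a minimal sketch that feeds a
throwaway Makefile to &lt;code&gt;make&lt;/code&gt; on standard input. The three-line Makefile, the
sample version number, and the &lt;code&gt;demo_override&lt;/code&gt; function name are made up
for illustration and are not part of the kernel tree:&lt;/p&gt;

```shell
# Hypothetical demo: a tiny Makefile that sets EXTRAVERSION to -rc1 and
# prints a made-up version string. It is piped to make with "-f -".
demo_override() {
    printf 'EXTRAVERSION = -rc1\nall:\n\t@echo 5.8.0$(EXTRAVERSION)\n' |
        make -s -f - "$@"
}

# Only run the demo when make is installed.
if [ -n "$(command -v make)" ]; then
    demo_override                           # prints 5.8.0-rc1
    demo_override EXTRAVERSION=-eudyptula   # prints 5.8.0-eudyptula
fi
```

&lt;p&gt;The second call never touches the Makefile text, yet the command line value
wins over the assignment inside it.&lt;/p&gt;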
&lt;/div&gt;
&lt;div class="section" id="modify-the-makefile"&gt;
&lt;h2&gt;Modify the Makefile&lt;/h2&gt;
&lt;p&gt;Our second option, if you want to take &lt;em&gt;&amp;quot;modifying the Makefile&amp;quot;&lt;/em&gt; literally, is
to do exactly that.&lt;/p&gt;
&lt;p&gt;Simply open the kernel's Makefile, located in the root directory of the source
code we copied from Linus in &lt;a class="reference external" href="https://0x42.sh/building-the-linux-kernel/"&gt;the last challenge&lt;/a&gt;, with your favorite text editor. One of the first five
lines will set &lt;code&gt;EXTRAVERSION&lt;/code&gt;. Change the value to
&lt;code&gt;-eudyptula&lt;/code&gt; and save.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="gd"&gt;-EXTRAVERSION = -rc1&lt;/span&gt;
&lt;span class="gi"&gt;+EXTRAVERSION = -eudyptula&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
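&lt;p&gt;If you would rather not open an editor, the same change can be sketched with
&lt;code&gt;sed&lt;/code&gt;. The &lt;code&gt;set_extraversion&lt;/code&gt; helper is a hypothetical name, and it
assumes the line is written as &lt;code&gt;EXTRAVERSION =&lt;/code&gt; at the start of a line. It
prints the modified Makefile to standard output instead of editing the file, so
the result can be reviewed first:&lt;/p&gt;

```shell
# Hypothetical helper: print the Makefile given as $1 with its
# EXTRAVERSION line replaced by the value given as $2.
set_extraversion() {
    sed "s/^EXTRAVERSION =.*/EXTRAVERSION = $2/" "$1"
}

# Example, run from the root of the kernel tree when a Makefile exists:
# show the first five lines of the would-be result.
if [ -f Makefile ]; then
    set_extraversion Makefile -eudyptula | head -n 5
fi
```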
&lt;p&gt;Now that the kernel's Makefile will append &lt;code&gt;-eudyptula&lt;/code&gt; to the kernel's
version string by default, we can simplify our &lt;code&gt;make&lt;/code&gt; command to build the
kernel into:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ make -j `getconf _NPROCESSORS_ONLN`
&lt;/pre&gt;&lt;/div&gt;
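&lt;p&gt;Once the rebuilt kernel is installed and booted, the task's requirement that the
running kernel has &lt;em&gt;&amp;quot;-eudyptula&amp;quot;&lt;/em&gt; in its version string can be checked with a
few lines of shell. This is only a sketch, and &lt;code&gt;has_eudyptula&lt;/code&gt; is a
made-up helper name:&lt;/p&gt;

```shell
# Hypothetical helper: succeed when the release string given as $1
# contains -eudyptula.
has_eudyptula() {
    case "$1" in
        *-eudyptula*) return 0 ;;
        *) return 1 ;;
    esac
}

# Compare against the release reported by the running kernel.
if has_eudyptula "$(uname -r)"; then
    echo "running the eudyptula kernel: $(uname -r)"
else
    echo "still on another kernel: $(uname -r)"
fi
```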
&lt;/div&gt;
&lt;div class="section" id="modify-menuconfig"&gt;
&lt;h2&gt;Modify menuconfig&lt;/h2&gt;
&lt;p&gt;Our third option and arguably the least accurate of the three ways to
interpret &lt;em&gt;&amp;quot;modifying the Makefile&amp;quot;&lt;/em&gt; is to use the ncurses configuration tool
that we briefly talked about in &lt;a class="reference external" href="https://0x42.sh/building-the-linux-kernel/"&gt;the last challenge&lt;/a&gt;. I say this is the least
accurate because this method modifies the &lt;code&gt;CONFIG_LOCALVERSION&lt;/code&gt; variable
and not the &lt;code&gt;EXTRAVERSION&lt;/code&gt; variable. It still adds &lt;code&gt;-eudyptula&lt;/code&gt; to
our kernel's version string, just not technically in the correct place.&lt;/p&gt;
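&lt;p&gt;To make the difference in placement concrete, here is a rough sketch of how the
pieces are joined into the version string. It assumes a simplified layout of
version, patch level, sublevel, &lt;code&gt;EXTRAVERSION&lt;/code&gt;, then the local version. The
&lt;code&gt;kernel_release&lt;/code&gt; function and the sample numbers are illustrative only:&lt;/p&gt;

```shell
# Hypothetical helper: join VERSION, PATCHLEVEL, SUBLEVEL, EXTRAVERSION
# and the local version (in that order) into one version string.
kernel_release() {
    printf '%s.%s.%s%s%s\n' "$1" "$2" "$3" "$4" "$5"
}

kernel_release 5 8 0 -eudyptula ''      # EXTRAVERSION edited: 5.8.0-eudyptula
kernel_release 5 8 0 -rc1 -eudyptula    # local version set: 5.8.0-rc1-eudyptula
```

&lt;p&gt;With the local version approach, whatever &lt;code&gt;EXTRAVERSION&lt;/code&gt; already holds
stays in front of our suffix.&lt;/p&gt;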
&lt;p&gt;We can start the ncurses menu with the following terminal command:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ make menuconfig
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;where you should see a screen that looks something like this:&lt;/p&gt;
&lt;img alt="The menuconfig home screen" src="https://0x42.sh/modify-the-linux-kernel/menuconfig-home.png" /&gt;
&lt;p&gt;From here, navigate to the &lt;strong&gt;General setup&lt;/strong&gt; option and press &lt;strong&gt;ENTER&lt;/strong&gt; to move
into the next menu. Next, use the arrow keys to highlight the &lt;strong&gt;Local version&lt;/strong&gt;
option and press &lt;strong&gt;ENTER&lt;/strong&gt;. A new menu will appear letting you enter a value.&lt;/p&gt;
&lt;img alt="The menuconfig Local version input dialog" src="https://0x42.sh/modify-the-linux-kernel/menuconfig-localversion.png" /&gt;
&lt;p&gt;Type in &lt;strong&gt;-eudyptula&lt;/strong&gt; and press &lt;strong&gt;TAB&lt;/strong&gt; to move our cursor to the &lt;strong&gt;&amp;lt;OK&amp;gt;&lt;/strong&gt;
option and press &lt;strong&gt;ENTER&lt;/strong&gt; to return to the &lt;strong&gt;General setup&lt;/strong&gt; menu. If
everything went according to plan you should see &lt;em&gt;&amp;quot;-eudyptula&amp;quot;&lt;/em&gt; set in the Local
version menu:&lt;/p&gt;
&lt;img alt="The General setup menu showing -eudyptula as the local version" src="https://0x42.sh/modify-the-linux-kernel/menuconfig-eudyptula.png" /&gt;
&lt;p&gt;From here, press &lt;strong&gt;TAB&lt;/strong&gt; a few more times to move our cursor to highlight the
&lt;strong&gt;&amp;lt;SAVE&amp;gt;&lt;/strong&gt; option at the bottom. Then press &lt;strong&gt;ENTER&lt;/strong&gt; to save the changes we made
to our &lt;code&gt;.config&lt;/code&gt; file.&lt;/p&gt;
&lt;p&gt;After we've saved our changes, press &lt;strong&gt;TAB&lt;/strong&gt; a couple more times to highlight
the &lt;strong&gt;&amp;lt;EXIT&amp;gt;&lt;/strong&gt; option and &lt;strong&gt;ENTER&lt;/strong&gt; to exit the ncurses menu.&lt;/p&gt;
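&lt;p&gt;Back at the shell prompt, a quick check can confirm the value was written. This
sketch assumes the setting is stored in &lt;code&gt;.config&lt;/code&gt; as a quoted
&lt;code&gt;CONFIG_LOCALVERSION&lt;/code&gt; line, and &lt;code&gt;localversion_of&lt;/code&gt; is a made-up
helper name:&lt;/p&gt;

```shell
# Hypothetical helper: print the quoted value of CONFIG_LOCALVERSION
# from the config file given as $1 (prints nothing if it is not set).
localversion_of() {
    sed -n 's/^CONFIG_LOCALVERSION="\(.*\)"$/\1/p' "$1"
}

# Example, run from the root of the kernel tree after saving.
if [ -f .config ]; then
    localversion_of .config
fi
```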
&lt;p&gt;Finally, with the changes to our &lt;code&gt;.config&lt;/code&gt; file saved, we can re-compile
our kernel with the same simplified command &lt;a class="reference internal" href="#modify-the-makefile"&gt;from above&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ make -j `getconf _NPROCESSORS_ONLN`
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
</content><category term="Notes"/><category term="Eudyptula Challenge"/></entry><entry><title>Prepared</title><link href="https://0x42.sh/prepared/" rel="alternate"/><published>2020-08-09T00:00:00+00:00</published><updated>2020-08-09T00:00:00+00:00</updated><author><name>bryan brattlof</name></author><id>tag:0x42.sh,2020-08-09:/prepared/</id><summary type="html">&lt;a class="reference external image-reference" href="https://www.amazon.com/Prepared-What-Kids-Need-Fulfilled/dp/1984826069"&gt;
&lt;img alt="The cover of Prepared by Diane Tavenner" class="right" src="https://0x42.sh/prepared/prepared-diane-tavenner.png" /&gt;
&lt;/a&gt;
&lt;p&gt;When you think back to your primary or secondary school days was there
something you would change? If you could change absolutely anything, create your
own curriculum, add or remove some lectures, assign less homework, what would
your dream school look like? Would you change school start times? Allowing
students …&lt;/p&gt;</summary><content type="html">&lt;a class="reference external image-reference" href="https://www.amazon.com/Prepared-What-Kids-Need-Fulfilled/dp/1984826069"&gt;
&lt;img alt="The cover of Prepared by Diane Tavenner" class="right" src="https://0x42.sh/prepared/prepared-diane-tavenner.png" /&gt;
&lt;/a&gt;
&lt;p&gt;When you think back to your primary or secondary school days, was there
something you would change? If you could change absolutely anything, create your
own curriculum, add or remove some lectures, assign less homework, what would
your dream school look like? Would you change school start times, allowing
students to &lt;a class="reference external" href="https://www.cdc.gov/features/school-start-times/index.html"&gt;get enough sleep&lt;/a&gt; and, in my case,
preventing students from waking up at an ungodly hour trying to catch a 6am bus?&lt;/p&gt;
&lt;p&gt;Everyone involved in a child's education, including parents, teachers,
administrators, and even the students themselves, wants every student to
succeed. As a country we have devised countless initiatives and tests to
identify and help a student who falls behind on their journey to developing the
tools needed for a successful life. Yet as of 2018, there were an &lt;a class="reference external" href="https://nces.ed.gov/programs/coe/indicator_coj.asp"&gt;estimated
2.1 million (~5.3%) American children&lt;/a&gt;, between the ages of 16 and 25, who
would be labeled &lt;em&gt;&amp;quot;dropout&amp;quot;&lt;/em&gt;. It is a title that, according to the &lt;a class="reference external" href="https://www.bls.gov/emp/chart-unemployment-earnings-education.htm"&gt;Bureau of Labor
Statistics&lt;/a&gt;,
gives them half the earning power of someone with a college degree and makes them 5
times more likely to go without a job. This raises the question: if everyone is
helping our next generation succeed, why are so many failing to do so?&lt;/p&gt;
&lt;p&gt;Diane Tavenner, the author of the book &lt;a class="reference external" href="https://www.amazon.com/Prepared-What-Kids-Need-Fulfilled/dp/1984826069"&gt;Prepared&lt;/a&gt; and founder
of &lt;a class="reference external" href="https://summitps.org/"&gt;Summit Public Schools&lt;/a&gt;, which operates some of
the top-performing schools in the nation, looks to have found the answer to
ensuring every child has a successful life. Her goal with Summit is to go
beyond teaching the math, writing, and science that students will need to get
into college, and to also teach what students will need to live a great life:
life skills like the ability to learn and research new topics, manage their time
wisely, build self-confidence, and find the initial direction they
wish to pursue with their lives.&lt;/p&gt;
&lt;p&gt;As she explains in her book, Summit is continuously refining and improving the
teaching model that focuses on three key elements with their students:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;strong&gt;self-directed learning&lt;/strong&gt;: All students are responsible (with support from
their teachers) for setting their own learning goals, planning how to learn
the information, testing their knowledge, and assessing their performance afterward.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;project-based learning&lt;/strong&gt;: A hands-on, problem-oriented, self-discovery
method of learning that allows students to deeply explore a topic they're
curious about.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;mentoring&lt;/strong&gt;: Beyond the typical school guidance counselor, each student in
Summit has a dedicated mentor who meets regularly with them to help and guide
them as they achieve their personal and academic goals.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These skills are incredibly important in today's workforce. When all of
humanity's information is just a fingertip away, we must learn more than just
reading, writing, and arithmetic. We need life skills like time management and
creative &amp;amp; critical thinking, along with the drive to accomplish our goals,
to truly succeed in this information-saturated world.&lt;/p&gt;
&lt;p&gt;The majority of the book is dedicated to the teachers and administrators
overseeing our children's education. However, in the final section, Diane gives
parents some advice on how to encourage their children's independent growth.
Most of the section can be distilled into the belief that parents should be
mentoring, not directing. If a child is into computers, support them. When they
change their mind (and they will) support them. In this part of their lives they
are exploring their world and finding their place in it. The best thing a parent
can do is let them explore, as scary as that may be.&lt;/p&gt;
</content><category term="Books"/></entry><entry><title>Building The Linux Kernel</title><link href="https://0x42.sh/building-the-linux-kernel/" rel="alternate"/><published>2020-07-31T00:00:00+00:00</published><updated>2020-07-31T00:00:00+00:00</updated><author><name>bryan brattlof</name></author><id>tag:0x42.sh,2020-07-31:/building-the-linux-kernel/</id><summary type="html">&lt;p&gt;Previously, for the 1&lt;sup&gt;st&lt;/sup&gt; task in &lt;a class="reference external" href="https://eudyptula-challenge.org"&gt;The Eudyptula Challenge&lt;/a&gt;, we built a Loadable Kernel Module (LKM)
that adds &lt;em&gt;&amp;quot;Hello World&amp;quot;&lt;/em&gt; to our kernel's message buffer any time the module is
installed. You can read my notes all &lt;a class="reference external" href="https://0x42.sh/the-hello-world-kernel-module/"&gt;about that here&lt;/a&gt; if you want to
learn more.&lt;/p&gt;
&lt;p&gt;Or, if you …&lt;/p&gt;</summary><content type="html">&lt;p&gt;Previously, for the 1&lt;sup&gt;st&lt;/sup&gt; task in &lt;a class="reference external" href="https://eudyptula-challenge.org"&gt;The Eudyptula Challenge&lt;/a&gt;, we built a Loadable Kernel Module (LKM)
that adds &lt;em&gt;&amp;quot;Hello World&amp;quot;&lt;/em&gt; to our kernel's message buffer any time the module is
installed. You can read my notes all &lt;a class="reference external" href="https://0x42.sh/the-hello-world-kernel-module/"&gt;about that here&lt;/a&gt; if you want to
learn more.&lt;/p&gt;
&lt;p&gt;Or, if you wish to work on The Eudyptula Challenge yourself before you read my
notes (recommended), you can use &lt;a class="reference external" href="https://git.sr.ht/~bryanb/eudyptula"&gt;my git repository&lt;/a&gt;, which has all 20 tasks and the
code I used to complete each one.&lt;/p&gt;
&lt;p&gt;For this challenge, the 2&lt;sup&gt;nd&lt;/sup&gt; task of The Eudyptula Challenge, we will
use the Kbuild system once again. This time we will use the Kbuild system to
compile the latest version of the Linux Kernel. Then after compiling it (this
takes some time), we install and boot from it.&lt;/p&gt;
&lt;div class="section" id="task-no-2"&gt;
&lt;h2&gt;Task No.2&lt;/h2&gt;
&lt;blockquote&gt;
&lt;p&gt;Now that you have written your first kernel module, it's time to take
off the training wheels and move on to building a custom kernel.  No
more distro kernels for you, for this task you must run your own kernel.
And use git!  Exciting isn't it!  No, oh, ok...&lt;/p&gt;
&lt;p&gt;The tasks for this round is:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;download Linus's latest git tree from git.kernel.org (you have to
figure out which one is his, it's not that hard, just remember what
his last name is and you should be fine.)&lt;/li&gt;
&lt;li&gt;build it, install it, and boot it.  You can use whatever kernel
configuration options you wish to use, but you must enable
CONFIG_LOCALVERSION_AUTO=y.&lt;/li&gt;
&lt;li&gt;show proof of booting this kernel.  Bonus points for you if you do
it on a &amp;quot;real&amp;quot; machine, and not a virtual machine (virtual machines
are acceptable, but come on, real kernel developers don't mess
around with virtual machines, they are too slow.  Oh yeah, we aren't
real kernel developers just yet.  Well, I'm not anyway, I'm just a
script...)  Again, proof of running this kernel is up to you, I'm
sure you can do well.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Hint, you should look into the 'make localmodconfig' option, and base
your kernel configuration on a working distro kernel configuration.
Don't sit there and answer all 1625 different kernel configuration
options by hand, even I, a foolish script, know better than to do that!&lt;/p&gt;
&lt;p&gt;After doing this, don't throw away that kernel and git tree and
configuration file.  You'll be using it for later tasks, a working
kernel configuration file is a precious thing, all kernel developers
have one they have grown and tended to over the years.  This is the
start of a long journey with yours, don't discard it like was a broken
umbrella, it deserves better than that.&lt;/p&gt;
&lt;p class="attribution"&gt;&amp;mdash;Little Penguin&lt;/p&gt;
&lt;/blockquote&gt;
&lt;/div&gt;
&lt;div class="section" id="build-tools"&gt;
&lt;h2&gt;Build Tools&lt;/h2&gt;
&lt;p&gt;Just like mechanics, plumbers, and all the other skilled trades, having the
right tools for the job makes a world of difference. This &lt;em&gt;&amp;quot;tools maketh man&amp;quot;&lt;/em&gt;
proverb rings true even for software developers, even though most of our tools
are less tangible.&lt;/p&gt;
&lt;p&gt;The tools we will need to properly build the kernel vary wildly depending on
many factors, like your hardware, kernel version, how we plan to install the
kernel, as well as the Linux distribution on your computer. All the &lt;em&gt;minimum&lt;/em&gt;
requirements we will need to build the kernel are listed in &lt;a class="reference external" href="https://www.kernel.org/doc/html/latest/process/changes.html"&gt;the documentation
here&lt;/a&gt;, though chances
are you will need more.&lt;/p&gt;
&lt;p&gt;The tools I needed to build version 5.8 of the kernel on my (charmingly retro)
Lenovo T430, running Ubuntu's 18.04 LTS (Bionic Beaver) release, are easily
installed with:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ sudo apt install build-essential libncurses-dev bison flex libssl-dev libelf-dev
&lt;/pre&gt;&lt;/div&gt;
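&lt;p&gt;After installing, a short loop can confirm the main build commands are actually
on the &lt;code&gt;PATH&lt;/code&gt;. This is a sketch only: &lt;code&gt;missing_tools&lt;/code&gt; is a
hypothetical helper, and the command names checked (&lt;code&gt;gcc&lt;/code&gt;, &lt;code&gt;make&lt;/code&gt;,
&lt;code&gt;bison&lt;/code&gt;, &lt;code&gt;flex&lt;/code&gt;) are my assumption of what those packages provide,
not a complete list of requirements:&lt;/p&gt;

```shell
# Hypothetical helper: print each command named in the arguments that
# cannot be found on the PATH.
missing_tools() {
    for tool in "$@"; do
        if [ -z "$(command -v "$tool")" ]; then
            printf '%s\n' "$tool"
        fi
    done
}

# Prints nothing when all four commands are installed.
missing_tools gcc make bison flex
```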
&lt;/div&gt;
&lt;div class="section" id="cloning"&gt;
&lt;h2&gt;Cloning&lt;/h2&gt;
&lt;p&gt;When you first navigate to the &lt;a class="reference external" href="https://git.kernel.org"&gt;kernel's git repositories&lt;/a&gt;, you will be greeted with hundreds of projects all
having something to do with archiving or improving the Linux Kernel in some way.
To help simplify the process of finding Linus' repository, which has the latest
patches, we can search through the projects listed (using &lt;strong&gt;Ctrl+F&lt;/strong&gt;) for
&lt;strong&gt;&amp;quot;torvalds/&amp;quot;&lt;/strong&gt;.&lt;/p&gt;
&lt;div class="sidebar"&gt;
&lt;p class="first sidebar-title"&gt;NOTE:&lt;/p&gt;
&lt;p class="last"&gt;If you still have problems finding Linus' repository, you can cheat with &lt;a class="reference external" href="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/"&gt;this
URL&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Once you have found the URL, we can use &lt;code&gt;git&lt;/code&gt; to download a copy of Linus'
Linux repository using the typical &lt;code&gt;git clone&lt;/code&gt; command. If you are unsure
about how to clone a repository, the &lt;a class="reference external" href="https://git-scm.com/doc"&gt;git documentation&lt;/a&gt; has a great article about &lt;a class="reference external" href="https://git-scm.com/book/en/v2/Git-Basics-Getting-a-Git-Repository"&gt;getting started with git&lt;/a&gt;.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ git clone https://git.kernel.org/pub/scm/linux ...
&lt;/pre&gt;&lt;/div&gt;
&lt;div class="sidebar"&gt;
&lt;p class="first sidebar-title"&gt;NOTE:&lt;/p&gt;
&lt;p class="last"&gt;The Linux Project is big (~3GB). If you have a slower internet connection,
you can add &lt;code&gt;--depth=256&lt;/code&gt; to the &lt;code&gt;git clone&lt;/code&gt; command above to
create a &lt;a class="reference external" href="https://www.git-scm.com/docs/git-clone#Documentation/git-clone.txt---depthltdepthgt"&gt;&amp;quot;shallow clone&amp;quot;&lt;/a&gt;
which limits the number of older patches you need to download and still have
a functioning copy of the kernel's source code.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="section" id="configuring"&gt;
&lt;h2&gt;Configuring&lt;/h2&gt;
&lt;p&gt;With a copy of the kernel's source code in hand, our next task is to set the
many thousands of configuration options needed for the &lt;a class="reference external" href="https://www.kernel.org/doc/html/latest/kbuild/index.html"&gt;Kbuild System&lt;/a&gt; to properly
compile our kernel. Each configuration option, like the
&lt;code&gt;CONFIG_LOCALVERSION_AUTO&lt;/code&gt; the Little Penguin mentioned in &lt;a class="reference internal" href="#task-no-2"&gt;the challenge
message above&lt;/a&gt;, informs the Kbuild System of the drivers and
features needed to be installed to function properly with our computer's hardware.&lt;/p&gt;
&lt;p&gt;However, instead of setting roughly 8700 config options by hand, one at a time
(a &lt;em&gt;very&lt;/em&gt; error-prone, time-consuming process), we can copy the configuration
file from the Linux distribution currently running on our computer as a sort of
starting point.&lt;/p&gt;
&lt;div class="section" id="configuration-plagiarism"&gt;
&lt;h3&gt;Configuration Plagiarism&lt;/h3&gt;
&lt;p&gt;The first step in copying our distribution's configuration file is to find it
in our &lt;code&gt;/boot&lt;/code&gt; directory.&lt;/p&gt;
&lt;p&gt;There are multiple configuration files inside our &lt;code&gt;/boot&lt;/code&gt; directory. The
one our computer is using depends on the kernel version our computer is using.
For example my computer, using kernel version &lt;code&gt;4.15.0-112&lt;/code&gt;, will have a
configuration file named &lt;code&gt;config-4.15.0-112-generic&lt;/code&gt; in the &lt;code&gt;/boot&lt;/code&gt;
directory.&lt;/p&gt;
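&lt;p&gt;Before copying anything, it can help to confirm which file matches the running
kernel. The sketch below assumes the &lt;code&gt;config-&lt;/code&gt; naming scheme described
above, which not every distribution follows, and &lt;code&gt;distro_config_path&lt;/code&gt; is a
made-up helper name:&lt;/p&gt;

```shell
# Hypothetical helper: build the path of the distribution config file
# for the kernel release given as $1.
distro_config_path() {
    printf '/boot/config-%s\n' "$1"
}

# Look for the config file that belongs to the running kernel.
config=$(distro_config_path "$(uname -r)")
if [ -f "$config" ]; then
    echo "found $config"
else
    echo "no config file at $config"
fi
```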
&lt;p&gt;To copy your distribution's config file into the root directory of the kernel,
use this command:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ cp /boot/config-`uname -r` .config
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Next, use &lt;code&gt;yes&lt;/code&gt; to feed an empty answer, which accepts the default value, to
every new configuration option added between the
release of our distribution's kernel (they usually lag behind a few versions)
and the bleeding edge we've downloaded from Linus.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ yes &amp;quot;&amp;quot; | make oldconfig
&lt;/pre&gt;&lt;/div&gt;
&lt;div class="sidebar"&gt;
&lt;p class="first sidebar-title"&gt;NOTE:&lt;/p&gt;
&lt;p class="last"&gt;Combined, these commands are roughly equivalent to &lt;code&gt;make olddefconfig&lt;/code&gt; found in
&lt;a class="reference external" href="https://www.kernel.org/doc/html/latest/admin-guide/README.html#configuring-the-kernel"&gt;the kernel documentation&lt;/a&gt;.
I find it nice to see &lt;em&gt;&amp;quot;how the sausage is made&amp;quot;&lt;/em&gt; though.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="section" id="configuring-by-hand"&gt;
&lt;h3&gt;Configuring By Hand&lt;/h3&gt;
&lt;p&gt;While not technically needed to complete this Eudyptula challenge, this is a
good place to talk about how we can customize our newly copied configuration
file for our kernels. While there are a few, largely similar, commands that help
us edit the &lt;code&gt;.config&lt;/code&gt; file, the command I prefer is an &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Ncurses"&gt;ncurses&lt;/a&gt;-based menu that can run
on most systems, including remote servers.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ make menuconfig
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;It produces a color menu, full of radio-lists and dialogues, that looks
something like this:&lt;/p&gt;
&lt;img alt="A screenshot of the menuconfig dialog for the 5.8.0-rc7 kernel." src="https://0x42.sh/building-the-linux-kernel/ncurses.png" /&gt;
&lt;p&gt;Each menu option allows us to enable and disable certain features in our
kernels, along with anything that depends on that option. For example, if we
disable &lt;strong&gt;Networking&lt;/strong&gt;, all network-related options, like &lt;strong&gt;Amateur Radio&lt;/strong&gt;,
will also be disabled.&lt;/p&gt;
&lt;div class="sidebar"&gt;
&lt;p class="first sidebar-title"&gt;NOTE:&lt;/p&gt;
&lt;p class="last"&gt;It is easy to remove an option from our kernel that will ultimately leave us
with a broken kernel. When in doubt, leave the option enabled unless you know
what you are removing.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The full list of commands used to edit the kernel's configuration file is
available to you in the &lt;a class="reference external" href="https://www.kernel.org/doc/html/latest/admin-guide/README.html#configuring-the-kernel"&gt;kernel documentation&lt;/a&gt;, if you want to learn more.
They largely use different &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Graphical_user_interface"&gt;GUIs&lt;/a&gt; to edit our
&lt;code&gt;.config&lt;/code&gt; file.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="section" id="building"&gt;
&lt;h2&gt;Building&lt;/h2&gt;
&lt;p&gt;With all the preparations complete, we can start to work on compiling the kernel
into a compressed image ready for our computers. Just like in userspace
applications, a simple &lt;code&gt;make&lt;/code&gt; command is all that is needed to build
our kernel:&lt;/p&gt;
&lt;div class="sidebar"&gt;
&lt;p class="first sidebar-title"&gt;WARNING:&lt;/p&gt;
&lt;p class="last"&gt;Building the kernel will take a significant amount of time.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ make -j `getconf _NPROCESSORS_ONLN` CONFIG_LOCALVERSION_AUTO=y LOCALVERSION=-eudyptula
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;A bare &lt;code&gt;make&lt;/code&gt; command, without the added arguments, will also
build our kernel. However, the kernel is such a large project that using the
&lt;code&gt;-j&lt;/code&gt; flag to compile multiple files at the same time will dramatically
reduce the total time needed to compile.&lt;/p&gt;
&lt;div class="sidebar"&gt;
&lt;p class="first sidebar-title"&gt;NOTE:&lt;/p&gt;
&lt;p class="last"&gt;&lt;code&gt;getconf _NPROCESSORS_ONLN&lt;/code&gt; reports the number of processors
currently online on your computer. You can replace this with any number you wish. For example,
&lt;code&gt;make -j4&lt;/code&gt; will allow &lt;code&gt;make&lt;/code&gt; to run 4 jobs at a time,
consuming 4 of your CPU's cores. You can read more about how &lt;a class="reference external" href="https://www.gnu.org/software/make/manual/html_node/Parallel.html"&gt;Parallel
Execution&lt;/a&gt;
works in the &lt;code&gt;make&lt;/code&gt; documentation.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;We are also passing two extra configuration options with the &lt;code&gt;make&lt;/code&gt;
command above:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="nv"&gt;CONFIG_LOCALVERSION_AUTO&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;y&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;LOCALVERSION&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;-eudyptula
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Both of these configuration options will add information to our kernel's version
string, helping us prove to The Little Penguin we installed the latest kernel
published by Linus Torvalds.&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;The &lt;code&gt;CONFIG_LOCALVERSION_AUTO&lt;/code&gt; option will append enough characters from
the latest &lt;a class="reference external" href="https://git-scm.com/book/en/v2/Git-Internals-Git-Objects"&gt;commit id&lt;/a&gt;
to be unique in our kernel's version name. For me this was &lt;code&gt;g1df0d8960499&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;LOCALVERSION&lt;/code&gt; will also add &lt;code&gt;-eudyptula&lt;/code&gt; to that same version string&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;producing a kernel named something like this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;linux-image-5.8.0-rc4-eudyptula-00381-g1df0d8960499
&lt;/pre&gt;&lt;/div&gt;
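&lt;p&gt;To check the version string without waiting for a full build, the kernel's
top-level Makefile offers a &lt;code&gt;kernelrelease&lt;/code&gt; target. A small sketch,
assuming it is run from the same source tree and with the same
&lt;code&gt;LOCALVERSION&lt;/code&gt; we passed to the build:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ make -s kernelrelease LOCALVERSION=-eudyptula
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The exact suffix depends on the commit that is checked out, so expect your
output to differ from mine.&lt;/p&gt;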
&lt;/div&gt;
&lt;div class="section" id="installing"&gt;
&lt;h2&gt;Installing&lt;/h2&gt;
&lt;p&gt;Our final step to completing this challenge is to install the kernel
along with its supporting modules into our &lt;code&gt;/boot&lt;/code&gt; and &lt;code&gt;/lib/modules&lt;/code&gt;
folders. This will allow our boot loader, &lt;a class="reference external" href="https://www.gnu.org/software/grub/"&gt;GRUB&lt;/a&gt;, to boot our freshly
compiled Linux kernel the next time we reboot.&lt;/p&gt;
&lt;p&gt;Each Linux distribution installs the kernel in its own way. For example, &lt;a class="reference external" href="https://wiki.archlinux.org/index.php/Kernel/Traditional_compilation#Installation"&gt;Arch
Linux's documentation&lt;/a&gt;
tells us to manually copy the kernel (with &lt;code&gt;cp&lt;/code&gt;) into our &lt;code&gt;/boot&lt;/code&gt; directory
and &lt;em&gt;not&lt;/em&gt; to use &lt;code&gt;make install&lt;/code&gt;. All this to say &lt;em&gt;&amp;quot;your mileage
may vary&amp;quot;&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;For most Linux distributions we can install our kernel with a simple &lt;code&gt;make&lt;/code&gt;
command:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ sudo make modules_install install
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Along with installing all the kernel's modules and the kernel itself, on many
distributions this command will also regenerate the GRUB configuration (for
example &lt;code&gt;/boot/grub/grub.cfg&lt;/code&gt; with GRUB 2), so we don't have
to manually update GRUB ourselves.&lt;/p&gt;
&lt;div class="section" id="ubuntu-debian-weirdness"&gt;
&lt;h3&gt;Ubuntu/Debian Weirdness&lt;/h3&gt;
&lt;p&gt;If your Linux distribution has trouble with the &lt;code&gt;make install&lt;/code&gt; command
above, an alternative is to bundle our Linux kernel into a package. This method
is slower (a real pain when developing drivers); however, packages usually have
more success installing on certain systems.&lt;/p&gt;
&lt;div class="sidebar"&gt;
&lt;p class="first sidebar-title"&gt;NOTE:&lt;/p&gt;
&lt;p class="last"&gt;There are so many Linux distributions that I cannot possibly test them all. On my
hardware running Ubuntu 18.04 LTS, the following commands worked.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;For Debian and Ubuntu based distributions we can use &lt;code&gt;deb-pkg&lt;/code&gt; to generate
the packages:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ make deb-pkg
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This command will generate up to 5 Debian packages:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;code&gt;linux-image-version&lt;/code&gt; which contains the kernel image and the associated
modules.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;linux-headers-version&lt;/code&gt; has the header files required to build external
modules.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;linux-firmware-image-version&lt;/code&gt; contains the firmware files needed by
some drivers (this package could be missing when you build from the kernel
sources provided by Debian).&lt;/li&gt;
&lt;li&gt;&lt;code&gt;linux-image-version-dbg&lt;/code&gt; which contains the debugging symbols for the
kernel image and its modules&lt;/li&gt;
&lt;li&gt;and &lt;code&gt;linux-libc-dev&lt;/code&gt; holds headers relevant to some user-space libraries.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;All of these packages can easily be installed with a simple &lt;code&gt;dpkg&lt;/code&gt; command:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ sudo dpkg -i ../*.deb
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;And that's it!&lt;/strong&gt; Reboot your computer, run &lt;code&gt;uname -r&lt;/code&gt;, and you should
see something like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ uname -r
5.8.0-rc4-eudyptula-00381-g1df0d8960499
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This output will prove to The Little Penguin you've successfully compiled and
installed your first (of many, I'm sure) custom Linux kernel. &lt;strong&gt;Great Job!&lt;/strong&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
</content><category term="Notes"/><category term="Eudyptula Challenge"/></entry><entry><title>The Hello World Kernel Module</title><link href="https://0x42.sh/the-hello-world-kernel-module/" rel="alternate"/><published>2020-07-15T00:00:00+00:00</published><updated>2020-07-15T00:00:00+00:00</updated><author><name>bryan brattlof</name></author><id>tag:0x42.sh,2020-07-15:/the-hello-world-kernel-module/</id><summary type="html">&lt;p&gt;As a brief introduction to this (very long) essay. What lies below are
my notes while completing the 1&lt;sup&gt;st&lt;/sup&gt; in a set of 20 tasks from
&lt;a class="reference external" href="https://eudyptula-challenge.org"&gt;The Eudyptula Challenge&lt;/a&gt;. Each
task, emailed one at a time, starting with building a &amp;quot;hello world&amp;quot;
kernel module (this essay) and progressed in …&lt;/p&gt;</summary><content type="html">&lt;p&gt;As a brief introduction to this (very long) essay. What lies below are
my notes while completing the 1&lt;sup&gt;st&lt;/sup&gt; in a set of 20 tasks from
&lt;a class="reference external" href="https://eudyptula-challenge.org"&gt;The Eudyptula Challenge&lt;/a&gt;. The
tasks were emailed one at a time, starting with building a &amp;quot;hello world&amp;quot;
kernel module (this essay) and progressing in difficulty until we
ultimately submit patches to the main tree of the Linux kernel. The
ultimate goal of The Eudyptula Challenge is to get new developers
comfortable with the somewhat unique world of kernel development by
separating the &amp;quot;on-boarding&amp;quot; process into focused manageable tasks.&lt;/p&gt;
&lt;p&gt;Sadly, The Eudyptula Challenge is no longer accepting new
applicants. However, if you wish to work on the tasks yourself, I've
published the 20 tasks I've managed to find along with the code I used
to &amp;quot;complete&amp;quot; them in a &lt;a class="reference external" href="https://git.sr.ht/~bryanb/eudyptula"&gt;git repo here&lt;/a&gt;.&lt;/p&gt;
&lt;div class="section" id="task-no-1"&gt;
&lt;h2&gt;Task No.1&lt;/h2&gt;
&lt;blockquote&gt;
&lt;p&gt;Write a Linux kernel module, and stand-alone Makefile, that when loaded
prints to the kernel debug log level, &amp;quot;Hello World!&amp;quot;  Be sure to make
the module be able to be unloaded as well.&lt;/p&gt;
&lt;p&gt;The Makefile should build the kernel module against the source for the
currently running kernel, or, use an environment variable to specify
what kernel tree to build it against.&lt;/p&gt;
&lt;p&gt;Please show proof of this module being built, and running, in your
kernel.  What this proof is is up to you, I'm sure you can come up with
something.  Also be sure to send the kernel module you wrote, along with
the Makefile you created to build the module.&lt;/p&gt;
&lt;p class="attribution"&gt;&amp;mdash;Little Penguin&lt;/p&gt;
&lt;/blockquote&gt;
&lt;/div&gt;
&lt;div class="section" id="what-is-a-module"&gt;
&lt;h2&gt;What Is A Module&lt;/h2&gt;
&lt;p&gt;A kernel module is a piece of code designed to be loaded and unloaded on
demand by our kernels. For example, the device drivers for your
keyboard or a network card are a type of module. By separating the
kernel into individual software components, we can keep the overall
size of the kernel small, letting Linux fit into the smallest of
embedded systems. Some kernel modules, like the one we'll be building,
can even be installed without the need to recompile and reboot our
kernel, making upgrades easy, and saving us a lot of time.&lt;/p&gt;
&lt;p&gt;If you have access to a Linux machine, you can find the modules that
are currently loaded into the kernel by using the &lt;code&gt;lsmod&lt;/code&gt;
command, which gets its information from &lt;code&gt;/proc/modules&lt;/code&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="chiseling-a-cave-module"&gt;
&lt;h2&gt;Chiseling A Cave Module&lt;/h2&gt;
&lt;p&gt;Every kernel module must have at least two functions, one that will be
called when we install the module and another function to remove it
from the kernel. Back in the pre v2.3 era (&lt;a class="reference external" href="https://en.wikipedia.org/wiki/Linux_kernel_version_history"&gt;early 2000s&lt;/a&gt;) this
could only be done with a &amp;quot;start&amp;quot; function, called
&lt;code&gt;init_module()&lt;/code&gt; and an &amp;quot;end&amp;quot; function, called
&lt;code&gt;cleanup_module()&lt;/code&gt;. There are more modern (and preferred)
methods available to us today. However, some developers still use
these, so they make a great starting point.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="cp"&gt;#include&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="cpf"&gt;&amp;lt;linux/kernel.h&amp;gt;&lt;/span&gt;&lt;span class="c1"&gt;  /* for KERN_DEBUG */&lt;/span&gt;
&lt;span class="cp"&gt;#include&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="cpf"&gt;&amp;lt;linux/module.h&amp;gt;&lt;/span&gt;&lt;span class="c1"&gt;  /* for all kernel modules */&lt;/span&gt;

&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;init_module&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="n"&gt;printk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;KERN_DEBUG&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;Hello World.&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="cm"&gt;/* init_module loaded successfully */&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;cleanup_module&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="n"&gt;printk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;KERN_DEBUG&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;oh, the rest is silence.&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Typically &lt;code&gt;init_module()&lt;/code&gt; is used to register handlers or
otherwise hook the module into the rest of the kernel, for example on behalf of a device. The
&lt;code&gt;cleanup_module()&lt;/code&gt; function will then undo those changes, allowing the
module to be removed safely from the kernel. The declarations of both functions
(as of version 5.7) can be found on line 75, along with everything
else we need, in &lt;code&gt;linux/module.h&lt;/code&gt; of &lt;a class="reference external" href="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/module.h?h=v5.7#n75"&gt;the source code&lt;/a&gt;.&lt;/p&gt;
&lt;div class="section" id="printk-printf"&gt;
&lt;h3&gt;printk() != printf()&lt;/h3&gt;
&lt;p&gt;To print &lt;code&gt;Hello World&lt;/code&gt; on &lt;em&gt;&amp;quot;the kernel debug log level&amp;quot;&lt;/em&gt;, we'll
need to use another, very old, function called &lt;code&gt;printk()&lt;/code&gt;. Unlike
the &lt;code&gt;printf()&lt;/code&gt; commonly used in userspace applications,
&lt;code&gt;printk()&lt;/code&gt; is not designed to communicate to the user (or say
hello to worlds). It's a logging mechanism used to give warnings and
to log messages. This is why each &lt;code&gt;printk()&lt;/code&gt; statement also
comes with a priority. There are currently 8 defined priorities we can
use ranging from &lt;code&gt;KERN_DEBUG&lt;/code&gt; to &lt;code&gt;KERN_EMERG&lt;/code&gt;. You can see
them all, and their definitions, currently (version 5.7) in
&lt;code&gt;linux/kern_levels.h&lt;/code&gt; in &lt;a class="reference external" href="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/kern_levels.h?h=v5.7"&gt;the source code&lt;/a&gt;.&lt;/p&gt;
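&lt;p&gt;These priorities also matter when reading the log back. As a rough sketch,
assuming the &lt;code&gt;dmesg&lt;/code&gt; from util-linux (the one most distributions
ship), we can filter by level and check which levels reach the console:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ dmesg --level=debug | tail -5      # only messages logged at KERN_DEBUG
$ cat /proc/sys/kernel/printk        # first number is the console log level
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Messages whose priority number is not lower than that console log level stay
in the log buffer but are not printed to the console, which is why debug
messages are usually only visible through &lt;code&gt;dmesg&lt;/code&gt;.&lt;/p&gt;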
&lt;p&gt;Pay attention to the single argument passed to &lt;code&gt;printk()&lt;/code&gt;. Looking
into the &lt;a class="reference external" href="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/printk/printk.c?h=v5.7#n2054"&gt;source code&lt;/a&gt;
shows that &lt;code&gt;printk(const char *fmt, ...)&lt;/code&gt; accepts one format
string, followed by any extra arguments needed to format it. Our
&amp;quot;Hello World&amp;quot; statement from above
doesn't need formatting and therefore passes no extra arguments:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;printk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;KERN_DEBUG&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;Hello World.&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The &lt;code&gt;KERN_DEBUG&lt;/code&gt; macro will expand to &lt;code&gt;&amp;quot;\001&amp;quot; &amp;quot;7&amp;quot;&lt;/code&gt;,
turning our statement into:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;printk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;&lt;/span&gt;&lt;span class="se"&gt;\001&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;7&amp;quot;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;Hello World.&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The C compiler will then combine the adjacent string literals to produce
our formatted string for the kernel to log:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;printk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;&lt;/span&gt;&lt;span class="se"&gt;\001&lt;/span&gt;&lt;span class="s"&gt;7Hello World.&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Even though &lt;code&gt;printk()&lt;/code&gt; is falling out of style with modern Linux
maintainers, as &lt;a class="reference internal" href="#pr-debug"&gt;we will see in later sections&lt;/a&gt;, there
is a lot more to read about how to work with &lt;code&gt;printk()&lt;/code&gt; and
format specifiers in the kernel in &lt;a class="reference external" href="https://www.kernel.org/doc/html/latest/core-api/printk-formats.html"&gt;the documentation here&lt;/a&gt;
if you're into that kind of stuff.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="section" id="making-a-kernel-module"&gt;
&lt;h2&gt;Making A Kernel Module&lt;/h2&gt;
&lt;p&gt;Much like how kernel modules are a little different than userspace
application modules, the Makefiles that compile the kernel are also a
bit different than Makefiles in userspace.&lt;/p&gt;
&lt;p&gt;Originally, as the Linux code-base grew, so did its Makefiles. As they
continued to grow in complexity, they eventually became a burden to
maintain. Fortunately a solution, called the &lt;em&gt;&amp;quot;kbuild system&amp;quot;&lt;/em&gt;, was
created and accepted into the kernel to help organize and simplify the
kernel's building process. If you are interested, there is an entire
section about the &lt;a class="reference external" href="https://www.kernel.org/doc/html/latest/kbuild/index.html"&gt;kbuild system in the
documentation.&lt;/a&gt;&lt;/p&gt;
&lt;div class="section" id="kbuild-makefile"&gt;
&lt;h3&gt;Kbuild Makefile&lt;/h3&gt;
&lt;p&gt;Just like Makefiles in userspace, we can start a Kbuild Makefile by
creating a new file called &lt;em&gt;…wait for it…&lt;/em&gt; &lt;code&gt;Makefile&lt;/code&gt; in the
same folder as our &lt;code&gt;hello-world.c&lt;/code&gt; module we made in &lt;a class="reference internal" href="#chiseling-a-cave-module"&gt;the
sections above&lt;/a&gt;.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ ls -l
total 8
-rw-rw-r-- 1 me us 903 Jul  5 00:00 hello-world.c
-rw-rw-r-- 1 me us 167 Jul  5 00:00 Makefile
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;We can alternatively use the name &lt;code&gt;Kbuild&lt;/code&gt; to
indicate to other developers that the Makefile is intended to run
using the kbuild system. &lt;code&gt;Makefile&lt;/code&gt; is the preferred name, but
interestingly, if both &lt;code&gt;Makefile&lt;/code&gt; and &lt;code&gt;Kbuild&lt;/code&gt;
files exist in the same directory, the &lt;code&gt;Kbuild&lt;/code&gt; file will be
used. (&lt;a class="reference external" href="https://www.kernel.org/doc/html/latest/kbuild/makefiles.html#the-kbuild-files"&gt;source&lt;/a&gt;)&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="goal-definitions"&gt;
&lt;h3&gt;Goal Definitions&lt;/h3&gt;
&lt;p&gt;The &amp;quot;heart&amp;quot; of the kbuild system uses lines called &lt;a class="reference external" href="https://www.kernel.org/doc/html/latest/kbuild/makefiles.html#goal-definitions"&gt;&amp;quot;goal definitions&amp;quot;&lt;/a&gt; to define all the various target files, special
compilation options, and any sub-directories to enter. When we compile
the kernel (with its thousands of Makefiles) the goal definitions are
collected and used to build all the various documentation files,
modules, and other files we need for our particular kernel.&lt;/p&gt;
&lt;p&gt;The simplest Kbuild Makefile we can write for our module contains a
single line:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="nv"&gt;obj-m&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;hello-world.o
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;code&gt;obj-m&lt;/code&gt; tells kbuild that our &lt;code&gt;hello-world.o&lt;/code&gt; object file
is a &lt;a class="reference external" href="https://www.kernel.org/doc/html/latest/kbuild/makefiles.html#loadable-module-goals-obj-m"&gt;loadable kernel module (LKM)&lt;/a&gt;
that can be loaded and unloaded at any time without needing to reboot
the kernel. This line will also tell the kbuild system to look for
files in our directory named &lt;code&gt;hello-world.c&lt;/code&gt; or
&lt;code&gt;hello-world.S&lt;/code&gt; to compile into the &lt;code&gt;hello-world.o&lt;/code&gt; object
file, before building the kernel object file &lt;code&gt;hello-world.ko&lt;/code&gt;
we'll use to load into our kernel.&lt;/p&gt;
&lt;div class="sidebar"&gt;
&lt;p class="first sidebar-title"&gt;NOTE:&lt;/p&gt;
&lt;p class="last"&gt;There are a few different types of goal definitions, like
&lt;code&gt;obj-y&lt;/code&gt;, which defines built-in objects that are linked directly
into the kernel image (and are not loadable by us). The
definitions for all the goal definitions can be found in the
&lt;a class="reference external" href="https://www.kernel.org/doc/html/latest/kbuild/makefiles.html#goal-definitions"&gt;documentation here&lt;/a&gt; if you wish to learn more.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="section" id="convenience-targets"&gt;
&lt;h3&gt;Convenience Targets&lt;/h3&gt;
&lt;p&gt;For the pure convenience of it, we can add extra &lt;a class="reference external" href="https://www.gnu.org/software/make/manual/html_node/Phony-Targets.html"&gt;phony targets&lt;/a&gt;
to our Kbuild Makefile to easily compile our module for the kernel
currently running on our computer, simplifying the task of compiling
our module down to just typing &lt;code&gt;make&lt;/code&gt; into our terminals:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="si"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;MAKE&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;-C&lt;span class="w"&gt; &lt;/span&gt;/lib/modules/&lt;span class="k"&gt;$(&lt;/span&gt;shell&lt;span class="w"&gt; &lt;/span&gt;uname&lt;span class="w"&gt; &lt;/span&gt;-r&lt;span class="k"&gt;)&lt;/span&gt;/build&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;M&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;$(&lt;/span&gt;PWD&lt;span class="k"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;modules
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And &lt;code&gt;make clean&lt;/code&gt; to clean up everything afterwards:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="nf"&gt;clean&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="si"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;MAKE&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;-C&lt;span class="w"&gt; &lt;/span&gt;/lib/modules/&lt;span class="k"&gt;$(&lt;/span&gt;shell&lt;span class="w"&gt; &lt;/span&gt;uname&lt;span class="w"&gt; &lt;/span&gt;-r&lt;span class="k"&gt;)&lt;/span&gt;/build&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;M&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;$(&lt;/span&gt;PWD&lt;span class="k"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;clean
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Both of these phony targets use the &lt;code&gt;-C&lt;/code&gt; option to move out of
our current directory and into our kernel's source directory. There
&lt;code&gt;make&lt;/code&gt; can find and use the topmost kbuild Makefile, which
takes the &lt;code&gt;M&lt;/code&gt; option to locate the folder we are currently
working in, and builds the files defined using the &lt;code&gt;obj-m&lt;/code&gt; goal
definition &lt;a class="reference internal" href="#goal-definitions"&gt;we set up above&lt;/a&gt;.&lt;/p&gt;
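&lt;p&gt;The task also allows an environment variable to pick the kernel tree. The
same &lt;code&gt;make&lt;/code&gt; invocation can be sketched directly in the shell.
&lt;code&gt;KDIR&lt;/code&gt; is a name I made up for this example and is not something
kbuild itself reads:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ KDIR=${KDIR:-/lib/modules/$(uname -r)/build}    # default: the running kernel
$ make -C &amp;quot;$KDIR&amp;quot; M=&amp;quot;$PWD&amp;quot; modules
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Setting &lt;code&gt;KDIR&lt;/code&gt; beforehand to another prepared kernel tree builds
the module against that tree instead of the running kernel.&lt;/p&gt;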
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="section" id="installing-a-kernel-module"&gt;
&lt;h2&gt;Installing A Kernel Module&lt;/h2&gt;
&lt;p&gt;Just as with our userspace applications, kernel modules need to be
compiled. Using the kbuild system, along with &lt;a class="reference internal" href="#convenience-targets"&gt;our convenience targets
above&lt;/a&gt;, we can compile our kernel module by
issuing the &lt;code&gt;make&lt;/code&gt; command, and if all goes well, you should see
an output similar to this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ make
make -C /lib/modules/4.15.0-108-generic/build M=/home/me/src/eudyptula ...
/tasks/01 modules
make[1]: Entering directory &amp;#39;/usr/src/linux-headers-4.15.0-108-generic&amp;#39;
  CC [M]  /home/me/src/eudyptula/tasks/01/hello-world.o
  Building modules, stage 2.
  MODPOST 1 modules
WARNING: modpost: missing MODULE_LICENSE() in /home/me/src/eudyptula ...
see include/linux/module.h for more information
  CC      /home/me/src/eudyptula/tasks/01/hello-world.mod.o
  LD [M]  /home/me/src/eudyptula/tasks/01/hello-world.ko
make[1]: Leaving directory &amp;#39;/usr/src/linux-headers-4.15.0-108-generic&amp;#39;
&lt;/pre&gt;&lt;/div&gt;
&lt;div class="sidebar"&gt;
&lt;p class="first sidebar-title"&gt;NOTE:&lt;/p&gt;
&lt;p class="last"&gt;Ignore the &lt;code&gt;WARNING&lt;/code&gt; about the missing &lt;code&gt;MODULE_LICENSE()&lt;/code&gt;
for now. It is a feature to warn users of non open-source code in
modules. We will address that issue in &lt;a class="reference internal" href="#kernel-taint"&gt;the following
sections.&lt;/a&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;The &lt;code&gt;.ko&lt;/code&gt; extension was introduced around kernel version 2.6 to
help differentiate between userspace object files and kernel object
files, which contain a &lt;code&gt;.modinfo&lt;/code&gt; section to hold extra metadata
about the module. We can use the &lt;code&gt;modinfo&lt;/code&gt; command
to see and interpret the contents of the section:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ modinfo hello-world.ko
filename:       /home/me/src/eudyptula/tasks/01/hello-world.ko
srcversion:     18005133D4ECFCDD12928D8
depends:
retpoline:      Y
name:           hello_world
vermagic:       4.15.0-108-generic SMP mod_unload
&lt;/pre&gt;&lt;/div&gt;
&lt;div class="section" id="installing-the-module"&gt;
&lt;h3&gt;Installing the Module&lt;/h3&gt;
&lt;p&gt;With our &lt;code&gt;hello-world.c&lt;/code&gt; module freshly compiled, we can insert
it into our kernel using the &lt;code&gt;insmod&lt;/code&gt; command as &lt;code&gt;root&lt;/code&gt; or
another user with &lt;code&gt;sudo&lt;/code&gt; privileges:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$&lt;span class="w"&gt; &lt;/span&gt;sudo&lt;span class="w"&gt; &lt;/span&gt;insmod&lt;span class="w"&gt; &lt;/span&gt;hello-world.ko
&lt;/pre&gt;&lt;/div&gt;
&lt;div class="sidebar"&gt;
&lt;p class="first sidebar-title"&gt;NOTE:&lt;/p&gt;
&lt;p class="last"&gt;Don't worry about kernel taint messages. We will address that
in &lt;a class="reference internal" href="#kernel-taint"&gt;the next section.&lt;/a&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Congratulations!&lt;/strong&gt; You have created your first kernel module! A
quick inspection of the kernel's diagnostic messages, using
&lt;code&gt;dmesg&lt;/code&gt;, should show our &lt;code&gt;Hello World.&lt;/code&gt; message:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ dmesg | tail -1
[241745.247591] Hello World.
&lt;/pre&gt;&lt;/div&gt;
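&lt;p&gt;To double check that the module is really loaded, here is a quick sketch using
the &lt;code&gt;lsmod&lt;/code&gt; command from earlier. Note the underscore: the kernel
reports the module as &lt;code&gt;hello_world&lt;/code&gt;, the name shown by
&lt;code&gt;modinfo&lt;/code&gt; above:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ lsmod | grep hello_world
$ grep hello_world /proc/modules
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Both commands read the same information, so either one printing a line for
&lt;code&gt;hello_world&lt;/code&gt; is enough.&lt;/p&gt;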
&lt;/div&gt;
&lt;div class="section" id="removing-the-module"&gt;
&lt;h3&gt;Removing the Module&lt;/h3&gt;
&lt;p&gt;After the well deserved pat-on-the-back and when you are ready to
continue, we can uninstall our module with the &lt;code&gt;rmmod&lt;/code&gt; command
as &lt;code&gt;root&lt;/code&gt; or someone with &lt;code&gt;sudo&lt;/code&gt; privileges:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ sudo rmmod hello_world
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The only indication we've uninstalled our module will be in
&lt;code&gt;dmesg&lt;/code&gt; from our &lt;code&gt;printk()&lt;/code&gt; statement in the
&lt;code&gt;cleanup_module()&lt;/code&gt; function.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ dmesg | tail -1
[241751.401232] oh, the rest is silence.
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="section" id="kernel-taint"&gt;
&lt;h2&gt;Kernel Taint&lt;/h2&gt;
&lt;p&gt;There are plenty of ways &lt;a class="reference external" href="https://www.kernel.org/doc/html/latest/admin-guide/tainted-kernels.html#table-for-decoding-tainted-state"&gt;we can taint our kernel&lt;/a&gt;.
Don't worry too much about this, though. Most of the time it is
completely fine to run a tainted kernel. When something happens that
could be important to an investigation later on, a kernel will mark
itself as &amp;quot;tainted&amp;quot;. Usually the event that caused the kernel to
become tainted is the problem being investigated.&lt;/p&gt;
&lt;p&gt;We can find our kernel's tainted state by reading our
&lt;code&gt;/proc/sys/kernel/tainted&lt;/code&gt; file. Every way we can taint our
kernels is assigned one bit in a bit-field, meaning any value other
than &lt;code&gt;0&lt;/code&gt; indicates our kernel is tainted. To decode the
bit-field value, we can use the &lt;code&gt;tools/debugging/kernel-chktaint&lt;/code&gt;
script found in &lt;a class="reference external" href="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/plain/tools/debugging/kernel-chktaint"&gt;the source code&lt;/a&gt;.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ tools/debugging/kernel-chktaint
Kernel is &amp;quot;tainted&amp;quot; for the following reasons:
 * proprietary module was loaded (#0)
 * kernel issued warning (#9)
 * externally-built (&amp;#39;out-of-tree&amp;#39;) module was loaded  (#12)
 * unsigned module was loaded (#13)
For a more detailed explanation of the various taint flags see
 Documentation/admin-guide/tainted-kernels.rst in the the Linux kernel sources
 or https://kernel.org/doc/html/latest/admin-guide/tainted-kernels.html
Raw taint value as int/string: 12801/&amp;#39;P        W  OE    &amp;#39;
&lt;/pre&gt;&lt;/div&gt;
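&lt;p&gt;As a rough sketch of how the bit-field works (this is not how
&lt;code&gt;kernel-chktaint&lt;/code&gt; itself is written), a small user-space C
program can test each bit of the raw value &lt;code&gt;12801&lt;/code&gt; from the
output above. The value is hard-coded here purely for illustration.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;#include &amp;lt;stdio.h&amp;gt;

int main(void)
{
        /* Raw taint value taken from the kernel-chktaint output above. */
        unsigned long taint = 12801;
        unsigned int bit;

        /* Test each bit in turn; a set bit means that taint flag is active. */
        for (bit = 0; bit &amp;lt; 8 * sizeof(taint); bit++)
                if (taint &amp;amp; (1UL &amp;lt;&amp;lt; bit))
                        printf(&amp;quot;bit %u is set\n&amp;quot;, bit);

        return 0;
}
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This reports bits 0, 9, 12 and 13, matching the four reasons listed by
the script, since 12801 = 1 + 512 + 4096 + 8192.&lt;/p&gt;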
&lt;div class="section" id="licensing-documentation"&gt;
&lt;h3&gt;Licensing &amp;amp; Documentation&lt;/h3&gt;
&lt;p&gt;One of the ways we can taint our kernels is by loading proprietary
modules or modules that use licenses not compatible with the General
Public License (GPL) (bit &lt;code&gt;0&lt;/code&gt; in &lt;a class="reference external" href="https://www.kernel.org/doc/html/latest/admin-guide/tainted-kernels.html#more-detailed-explanation-for-tainting"&gt;the tainting list&lt;/a&gt;). Modules
that don't use the &lt;code&gt;MODULE_LICENSE()&lt;/code&gt; macro will also be
considered proprietary and taint our kernel, if loaded (this is why we
saw &lt;a class="reference internal" href="#installing-the-module"&gt;the warning above&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;There are many documentation macros defined in
&lt;code&gt;linux/module.h&lt;/code&gt;. Some of the basics I added are:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;MODULE_LICENSE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;MIT&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;MODULE_AUTHOR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;Bryan Brattlof &amp;lt;email@example.com&amp;gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;MODULE_DESCRIPTION&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;A Hello World Driver&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;MODULE_SUPPORTED_DEVICE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;testdevice&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Once we add our module's license, author and other information to the
end of our &lt;code&gt;hello-world.c&lt;/code&gt; module and compile it
again using &lt;code&gt;make&lt;/code&gt;, the &lt;code&gt;WARNING&lt;/code&gt; should be gone:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$&lt;span class="w"&gt; &lt;/span&gt;make
make&lt;span class="w"&gt; &lt;/span&gt;-C&lt;span class="w"&gt; &lt;/span&gt;/lib/modules/4.15.0-108-generic/build&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;M&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/home/me/src/eudyptula&lt;span class="w"&gt; &lt;/span&gt;...
make&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;:&lt;span class="w"&gt; &lt;/span&gt;Entering&lt;span class="w"&gt; &lt;/span&gt;directory&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;/usr/src/linux-headers-4.15.0-108-generic&amp;#39;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;CC&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;M&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;/home/me/src/eudyptula/tasks/01/hello-world.o
&lt;span class="w"&gt;  &lt;/span&gt;Building&lt;span class="w"&gt; &lt;/span&gt;modules,&lt;span class="w"&gt; &lt;/span&gt;stage&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;.
&lt;span class="w"&gt;  &lt;/span&gt;MODPOST&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;modules
&lt;span class="w"&gt;  &lt;/span&gt;CC&lt;span class="w"&gt;      &lt;/span&gt;/home/me/src/eudyptula/tasks/01/hello-world.mod.o
&lt;span class="w"&gt;  &lt;/span&gt;LD&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;M&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;/home/me/src/eudyptula/tasks/01/hello-world.ko
make&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;:&lt;span class="w"&gt; &lt;/span&gt;Leaving&lt;span class="w"&gt; &lt;/span&gt;directory&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;/usr/src/linux-headers-4.15.0-108-generic&amp;#39;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;There were many reasons to add this system to the kernel. For
example, it gives developers a way to easily find who maintains a
module, describe what the module does, and state what license the code is
released under. It also provides an easy method to inform users when
they are using non-open-source software.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="section" id="updating-the-module"&gt;
&lt;h2&gt;Updating the Module&lt;/h2&gt;
&lt;p&gt;Everything in my notes, to this point, was needed to complete the 1&lt;sup&gt;st&lt;/sup&gt; task assigned to us by the Little Penguin. However, just
like with every software project, the Linux kernel is constantly
adding new features and adopting new coding styles, ensuring that
my notes will become obsolete as soon as I've written them.&lt;/p&gt;
&lt;p&gt;With that said, the sections below, while not technically needed to
complete the task, are my notes on the macros and functions I saw in
the &lt;code&gt;drivers&lt;/code&gt; directory of the Linux source code that I found
particularly interesting. These functions are mostly stylistic changes
or they introduce functionality that improves efficiency and
modularity of the Linux kernel in some way.&lt;/p&gt;
&lt;div class="section" id="module-init-module-exit"&gt;
&lt;h3&gt;module_init() &amp;amp; module_exit()&lt;/h3&gt;
&lt;p&gt;The &lt;code&gt;module_init()&lt;/code&gt; and &lt;code&gt;module_exit()&lt;/code&gt; macros,
introduced in version 2.4 of the kernel and defined in
&lt;code&gt;linux/init.h&lt;/code&gt; of the &lt;a class="reference external" href="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/init.h?h=v5.7"&gt;source code&lt;/a&gt;,
let us rename our &amp;quot;start&amp;quot; and &amp;quot;end&amp;quot; functions to whatever we
wish. In this example, I've chosen to rename the &amp;quot;start&amp;quot; function to
&lt;code&gt;hello_world_init()&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="gd"&gt;-int init_module(void)&lt;/span&gt;
&lt;span class="gi"&gt;+static int __init hello_world_init(void)&lt;/span&gt;
&lt;span class="w"&gt; &lt;/span&gt;{
&lt;span class="w"&gt; &lt;/span&gt;        printk(KERN_DEBUG &amp;quot;Hello World.\n&amp;quot;);
&lt;span class="w"&gt; &lt;/span&gt;        return 0; /* init_module loaded successfully */
&lt;span class="w"&gt; &lt;/span&gt;}
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And renamed the &amp;quot;exit&amp;quot; function to &lt;code&gt;hello_world_exit()&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="gd"&gt;-void cleanup_module(void)&lt;/span&gt;
&lt;span class="gi"&gt;+static void __exit hello_world_exit(void)&lt;/span&gt;
&lt;span class="w"&gt; &lt;/span&gt;{
&lt;span class="w"&gt; &lt;/span&gt;        printk(KERN_DEBUG &amp;quot;oh, the rest is silence.\n&amp;quot;);
&lt;span class="w"&gt; &lt;/span&gt;}
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The kernel will then use the &lt;code&gt;module_init()&lt;/code&gt; macro to find the
function to execute when the module is installed, and
&lt;code&gt;module_exit()&lt;/code&gt; to find the function that cleans up before the
module is removed.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;module_init&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hello_world_init&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;module_exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hello_world_exit&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;To avoid compile errors, both the &lt;code&gt;module_init()&lt;/code&gt; and
&lt;code&gt;module_exit()&lt;/code&gt; macros must be placed below the definitions of
our newly named &amp;quot;start&amp;quot; and &amp;quot;end&amp;quot; functions.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="init-exit"&gt;
&lt;h3&gt;__init &amp;amp; __exit&lt;/h3&gt;
&lt;p&gt;I also introduced two macros to our &amp;quot;start&amp;quot; and &amp;quot;end&amp;quot; functions &lt;a class="reference internal" href="#module-init-module-exit"&gt;above&lt;/a&gt; called &lt;code&gt;__init&lt;/code&gt; and
&lt;code&gt;__exit&lt;/code&gt;. These macros, defined in &lt;code&gt;linux/init.h&lt;/code&gt; of the
&lt;a class="reference external" href="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/init.h?h=v5.7#n48"&gt;source code&lt;/a&gt;, help reduce memory
used by the kernel depending on how the module is installed.&lt;/p&gt;
&lt;p&gt;For built-in modules, where our module cannot be removed from the
kernel without recompiling and restarting, the &lt;code&gt;__init&lt;/code&gt; macro
tells the compiler to place our module's &amp;quot;start&amp;quot; function into a
special section inside the compiled kernel. After the module is loaded
and our &amp;quot;start&amp;quot; function has finished, the kernel will never have to
run the code again until reboot, so this special section can be freed,
saving memory.&lt;/p&gt;
&lt;p&gt;The same is true for the &lt;code&gt;__exit&lt;/code&gt; macro. For built-in modules,
the module cannot be removed from the kernel without recompiling and
restarting, so the kernel will never need to run our module's &amp;quot;exit&amp;quot;
function to safely remove it. This means our &amp;quot;exit&amp;quot; function
can safely be left out of the compiled kernel.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="pr-debug"&gt;
&lt;h3&gt;pr_debug()&lt;/h3&gt;
&lt;p&gt;In the beginning there was &lt;code&gt;printk()&lt;/code&gt;, and the kernel's
diagnostic messages structure was formless. The lack of any format for
&lt;code&gt;printk()&lt;/code&gt; messages is one of a number of reasons why developers
are replacing &lt;code&gt;printk()&lt;/code&gt; statements with their newer
equivalents. Depending on what section of the kernel we are in, there
are newer functions that have some benefits for us.&lt;/p&gt;
&lt;p&gt;For example, the &lt;code&gt;pr_debug()&lt;/code&gt; function is less
syntactically verbose than &lt;code&gt;printk(KERN_DEBUG ...)&lt;/code&gt;. It
also allows us to take advantage of the &lt;a class="reference external" href="https://lwn.net/Articles/434833/"&gt;dynamic debugging interface&lt;/a&gt;, which gives developers a
uniform control interface for debugging kernel messages while avoiding
cluttering the kernel log.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;static int __init hello_world_init(void)
&lt;span class="w"&gt; &lt;/span&gt;{
&lt;span class="gd"&gt;-    printk(KERN_DEBUG &amp;quot;Hello World.\n&amp;quot;);&lt;/span&gt;
&lt;span class="gi"&gt;+    pr_debug(&amp;quot;Hello World.\n&amp;quot;);&lt;/span&gt;
&lt;span class="w"&gt; &lt;/span&gt;    return 0; /* means init_module loaded successfully */
&lt;span class="w"&gt; &lt;/span&gt;}
&lt;/pre&gt;&lt;/div&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;static void __exit hello_world_exit(void)
&lt;span class="w"&gt; &lt;/span&gt;{
&lt;span class="gd"&gt;-    printk(KERN_DEBUG &amp;quot;oh, the rest is silence.\n&amp;quot;);&lt;/span&gt;
&lt;span class="gi"&gt;+    pr_debug(&amp;quot;oh, the rest is silence.\n&amp;quot;);&lt;/span&gt;
&lt;span class="w"&gt; &lt;/span&gt;}
&lt;/pre&gt;&lt;/div&gt;
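&lt;p&gt;Putting the changes from this section together, the whole module might
look something like the sketch below. The header list is an assumption,
the metadata is copied from the snippet earlier
(&lt;code&gt;MODULE_SUPPORTED_DEVICE()&lt;/code&gt; is left out because newer kernels
no longer provide it), and keep in mind that &lt;code&gt;pr_debug()&lt;/code&gt; only
prints when &lt;code&gt;DEBUG&lt;/code&gt; is defined or dynamic debugging is switched
on for the call site.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;#include &amp;lt;linux/init.h&amp;gt;
#include &amp;lt;linux/kernel.h&amp;gt;
#include &amp;lt;linux/module.h&amp;gt;

/* Runs once when the module is loaded. */
static int __init hello_world_init(void)
{
        pr_debug(&amp;quot;Hello World.\n&amp;quot;);
        return 0; /* 0 tells the kernel the module loaded successfully */
}

/* Runs when the module is removed. */
static void __exit hello_world_exit(void)
{
        pr_debug(&amp;quot;oh, the rest is silence.\n&amp;quot;);
}

/* Registered below the definitions of the functions they refer to. */
module_init(hello_world_init);
module_exit(hello_world_exit);

MODULE_LICENSE(&amp;quot;MIT&amp;quot;);
MODULE_AUTHOR(&amp;quot;Bryan Brattlof &amp;lt;email@example.com&amp;gt;&amp;quot;);
MODULE_DESCRIPTION(&amp;quot;A Hello World Driver&amp;quot;);
&lt;/pre&gt;&lt;/div&gt;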
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="section" id="wrapping-up"&gt;
&lt;h2&gt;Wrapping Up&lt;/h2&gt;
&lt;p&gt;If you made it here, all I can say is you are a very brave person, and
I'm glad my notes were able to help you in some way. If you see any
issues or have a question, please feel free to contact me, or better
yet &lt;a class="reference external" href="https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies"&gt;subscribe to the kernel newbies mailing list&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;For the next challenge, we will be building the Linux Kernel from scratch, as
well as installing and booting from it. If you want to work on this challenge
before you read my notes (recommended), I've published a copy of the challenges
in a &lt;a class="reference external" href="https://git.sr.ht/~bryanb/eudyptula"&gt;git repo here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Next&lt;/strong&gt;: My notes on &lt;a class="reference external" href="https://0x42.sh/building-the-linux-kernel/"&gt;How to build the Linux Kernel&lt;/a&gt; from scratch.&lt;/p&gt;
&lt;/div&gt;
</content><category term="Notes"/><category term="Eudyptula Challenge"/></entry><entry><title>Humble Pi</title><link href="https://0x42.sh/humble-pi/" rel="alternate"/><published>2020-06-30T00:00:00+00:00</published><updated>2020-06-30T00:00:00+00:00</updated><author><name>bryan brattlof</name></author><id>tag:0x42.sh,2020-06-30:/humble-pi/</id><summary type="html">&lt;a class="reference external image-reference" href="https://www.amazon.com/Humble-Pi-Comedy-Maths-Errors/dp/0241360234"&gt;
&lt;img alt="The cover of Humble Pi by Matt Parker" class="right" src="https://0x42.sh/humble-pi/humble-pi.png" /&gt;
&lt;/a&gt;
&lt;p&gt;We all make mistakes … especially &lt;em&gt;math&lt;/em&gt; mistakes. Either we forgot to carry the
one, used &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Imperial_units"&gt;&amp;quot;freedom units&amp;quot;&lt;/a&gt;
instead of metric, lost the decimal point, or found some other creative way to
make a mistake. Usually the consequences are small enough that we can chuckle
when we find out &lt;a class="reference external" href="https://www.snopes.com/fact-check/bloomberg-political-ads/"&gt;Michael Bloomberg …&lt;/a&gt;&lt;/p&gt;</summary><content type="html">&lt;a class="reference external image-reference" href="https://www.amazon.com/Humble-Pi-Comedy-Maths-Errors/dp/0241360234"&gt;
&lt;img alt="The cover of Humble Pi by Matt Parker" class="right" src="https://0x42.sh/humble-pi/humble-pi.png" /&gt;
&lt;/a&gt;
&lt;p&gt;We all make mistakes … especially &lt;em&gt;math&lt;/em&gt; mistakes. Either we forgot to carry the
one, used &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Imperial_units"&gt;&amp;quot;freedom units&amp;quot;&lt;/a&gt;
instead of metric, lost the decimal point, or found some other creative way to
make a mistake. Usually the consequences are small enough that we can chuckle
when we find out &lt;a class="reference external" href="https://www.snopes.com/fact-check/bloomberg-political-ads/"&gt;Michael Bloomberg can't afford to give everyone $1
million&lt;/a&gt;. However,
for the engineers and software developers among us, these mistakes can have real
consequences.&lt;/p&gt;
&lt;p&gt;For the vast majority of us, math is not an easy subject, nor does it come
naturally to us. One of my favorite examples of this (also in the book) happened
in 1852. &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Radhanath_Sikdar"&gt;Radhanath Sikdar&lt;/a&gt;, a
young mathematician at the time, was compiling measurements from multiple
different observations for a mountain peak named &amp;quot;Peak XV&amp;quot; for the &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Great_Trigonometrical_Survey"&gt;Great
Trigonometrical Survey&lt;/a&gt;, a project to survey the entire Indian subcontinent.&lt;/p&gt;
&lt;p&gt;While Sikdar was calculating the measurements, he came to the conclusion that
Peak XV was the tallest mountain on record, standing 29,000ft (8,839.2m) tall.
Excited, he gave his report to &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Andrew_Scott_Waugh"&gt;Andrew Scott Waugh&lt;/a&gt;, the director of the
survey at that time, who spent a few years validating the data. However, when
the time came to publish their findings, knowing the public was largely bad at
math and would incorrectly assume 29,000ft was a rounded number and not the exact
height they spent years calculating, they decided to add 2ft to the peak, giving
it the more precise &lt;em&gt;feeling&lt;/em&gt;, but completely wrong, height of 29,002ft
(8,839.8m). This would also lead to Waugh being playfully credited as the
first person to put two feet on top of the mountain that would eventually be named
Mount Everest.&lt;/p&gt;
&lt;p&gt;Even though we know everyone makes mistakes, and math does not come naturally to
us, we still have an extremely hard time admitting to our errors. &lt;a class="reference external" href="http://standupmaths.com/"&gt;Matt Parker&lt;/a&gt; and his book &lt;a class="reference external" href="https://www.amazon.com/Humble-Pi-Comedy-Maths-Errors/dp/0241360234"&gt;Humble Pi&lt;/a&gt;
believe that, as our world grows in complexity, accidents will continue
to happen. Instead of hiding them in hopes that no one will notice, we should
build systems that encourage us to learn from our mistakes, much like the United
States' National Aeronautics and Space Administration (NASA) and National
Transportation Safety Board (NTSB), which both famously publicize their investigations
into the failures and accidents we see all over the news. Matt Parker's book does
a great job of showing us mistakes can be both funny and a great teaching tool
for our next generation of engineers and everyone else who continues to explore
the boundaries of this world.&lt;/p&gt;
</content><category term="Books"/></entry><entry><title>Probably Approximately Correct</title><link href="https://0x42.sh/probably-approximately-correct/" rel="alternate"/><published>2020-06-17T00:00:00+00:00</published><updated>2020-06-17T00:00:00+00:00</updated><author><name>bryan brattlof</name></author><id>tag:0x42.sh,2020-06-17:/probably-approximately-correct/</id><summary type="html">&lt;a class="reference external image-reference" href="https://www.amazon.com/Probably-Approximately-Correct-Algorithms-Prospering/dp/0465032710/"&gt;
&lt;img alt="The cover of Probably Approximately Correct by Leslie Valiant" class="right" src="https://0x42.sh/probably-approximately-correct/probably-approximately-correct.png" /&gt;
&lt;/a&gt;
&lt;p&gt;It was &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Charles_Darwin"&gt;Charles Darwin&lt;/a&gt; who
taught us the foundational and now widely accepted concept of evolution,
stating that all life has, over time, descended from a common ancestor through a
process of natural selection, much like how we selectively bred wolves to
produce hundreds of the domesticated dog breeds …&lt;/p&gt;</summary><content type="html">&lt;a class="reference external image-reference" href="https://www.amazon.com/Probably-Approximately-Correct-Algorithms-Prospering/dp/0465032710/"&gt;
&lt;img alt="The cover of Probably Approximately Correct by Leslie Valiant" class="right" src="https://0x42.sh/probably-approximately-correct/probably-approximately-correct.png" /&gt;
&lt;/a&gt;
&lt;p&gt;It was &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Charles_Darwin"&gt;Charles Darwin&lt;/a&gt; who
taught us the foundational and now widely accepted concept of evolution,
stating that all life has, over time, descended from a common ancestor through a
process of natural selection, much like how we selectively bred wolves to
produce hundreds of the domesticated dog breeds we see today. The &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Struggle_for_existence"&gt;struggle for
existence&lt;/a&gt; has only
allowed the best of us to produce offspring, resulting in a natural process of
selection, where only the fittest among us will survive. This evolutionary
process tells us why, over millennia, giraffes got taller, sloths got
slower, alligators got scarier and we doubly wise men became &lt;em&gt;(kinda)&lt;/em&gt; smarter.&lt;/p&gt;
&lt;p&gt;As life continues to evolve, it also continues to grow in its complexity. If
you want to succeed in an interview, or choose a life partner, or (if you're
like my &lt;a class="reference external" href="https://jhbrattlof.com"&gt;wife&lt;/a&gt;) decide on a restaurant, you can be
sure there is no equation that could guarantee you success. Even if we could
collect all the relevant information we needed to answer these questions
(assuming we even knew what that relevant information was) there's still no
sure-fire way to combine the information in a way to yield an answer for us.
Yet, every day, billions of us answer these questions as we go about our
lives. It's these &amp;quot;guesses,&amp;quot; &amp;quot;hunches,&amp;quot; and &amp;quot;gut instincts&amp;quot; that &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Leslie_Valiant"&gt;Dr. Leslie
Valiant&lt;/a&gt;, a world-renowned
theoretical computer scientist, and his book are interested in.&lt;/p&gt;
&lt;p&gt;Historically, computers were only suited for solving the theory-&lt;em&gt;full&lt;/em&gt;
questions in our lives, like modeling fluid flows or &lt;a class="reference external" href="https://maps.esri.com/rc/sat/index.html"&gt;calculating the orbits of
satellites&lt;/a&gt; using Newton's laws.
These theory-full questions, typically mathematical or scientific theories, like
Einstein's famous equation &lt;em&gt;E=mc&lt;/em&gt;&lt;sup&gt;2&lt;/sup&gt;, have clearly defined inputs and
instructions telling us how they operate to accurately calculate a solution to
our questions. More recently though, computers have been growing &amp;quot;softer&amp;quot;
skills, ones that require them to answer more theory-less questions, which have
no well-defined formula to calculate, like deciding on a movie we'd like to
watch, or how to safely drive us home from the pub.&lt;/p&gt;
&lt;p&gt;Computers today learn to answer these theory-less questions much like how we
learned about our world in our infancy (or adolescence for some), by drawing
general lessons from a particular experience to better answer the next question
life throws. Unlike the theory-full algorithms, expertise is not given from the
designer of the equation to the student or computer, but extracted from the
experiences gained from experimentation or through evolution by the student.
These are the equations for learning: rather than answering in &lt;em&gt;E=mc&lt;/em&gt;&lt;sup&gt;2&lt;/sup&gt;
absolutes, these algorithms produce probably approximately correct answers.&lt;/p&gt;
&lt;p&gt;&lt;a class="reference external" href="https://www.amazon.com/Probably-Approximately-Correct-Algorithms-Prospering/dp/0465032710/"&gt;Probably Approximately Correct&lt;/a&gt; aims to explain the current
state and future goals of our attempts to find a mathematical definition of the
learning algorithms that we have, through evolution, used to prosper in this
(sometimes excessively) complex world. With just the minuscule hint of our
current understanding of these learning algorithms, we have revolutionized and
transformed our world (mostly) for the better. Through the power of learning
algorithms that can organize the endless supply of websites, the least educated
person today has immeasurably more knowledge at their fingertips than the most
educated had just a few decades ago. With a more thorough understanding of these
learning algorithms, we stand to reshape our civilization even more dramatically
in the decades to come.&lt;/p&gt;
</content><category term="Books"/></entry><entry><title>The Mosquito</title><link href="https://0x42.sh/the-mosquito/" rel="alternate"/><published>2020-05-28T00:00:00+00:00</published><updated>2020-05-28T00:00:00+00:00</updated><author><name>bryan brattlof</name></author><id>tag:0x42.sh,2020-05-28:/the-mosquito/</id><summary type="html">&lt;a class="reference external image-reference" href="https://www.amazon.com/Mosquito-Human-History-Deadliest-Predator/dp/1524743410"&gt;
&lt;img alt="The cover of The Mosquito by Timothy C. Winegard" class="right" src="https://0x42.sh/the-mosquito/the-mosquito.png" /&gt;
&lt;/a&gt;
&lt;p&gt;For thousands of years, we humans were terrorized by a deadly disease, without
effective treatments or a clue what actually caused malaria. Hindu texts
from the 6&lt;sup&gt;th&lt;/sup&gt; century BC, Mesopotamian tablets from 2000 BC, and Chinese
documents from around 2700 BC all make references to something we are almost …&lt;/p&gt;</summary><content type="html">&lt;a class="reference external image-reference" href="https://www.amazon.com/Mosquito-Human-History-Deadliest-Predator/dp/1524743410"&gt;
&lt;img alt="The cover of The Mosquito by Timothy C. Winegard" class="right" src="https://0x42.sh/the-mosquito/the-mosquito.png" /&gt;
&lt;/a&gt;
&lt;p&gt;For thousands of years, we humans were terrorized by a deadly disease, without
effective treatments or a clue what actually caused malaria. Hindu texts
from the 6&lt;sup&gt;th&lt;/sup&gt; century BC, Mesopotamian tablets from 2000 BC, and Chinese
documents from around 2700 BC all make references to something we are almost
certain was malaria. It wasn't until the early Greeks, including &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Homer"&gt;Homer&lt;/a&gt; (850 BC) and &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Hippocrates"&gt;Hippocrates&lt;/a&gt; (400 BC), that surviving documents clearly
described the enlarged spleens and malarial fevers of people living
in marshy areas. This observation would help popularize &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Miasma_theory"&gt;miasma theory&lt;/a&gt; and eventually give malaria
its name, which literally means &amp;quot;bad air&amp;quot; in medieval
Italian.&lt;/p&gt;
&lt;p&gt;Even with remedies like &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Cinchona"&gt;cinchona bark&lt;/a&gt;, from a genus of around 23 species of
plants &amp;quot;discovered&amp;quot; in the early-1600s in tropical regions of the Andes
Mountains, it would take more than 200 years for Charles Louis
Alphonse Laveran in 1880 to discover the parasites causing the malarial
fevers, and another 17 years for Ronald Ross in 1897 to finally incriminate the
mosquito as the delivery system for malaria, proving once and for all that,
despite all the Hollywood movies about snakes on planes, man-eating crocodiles,
or sharknados, the mosquito is by far our deadliest predator.&lt;/p&gt;
&lt;p&gt;&lt;a class="reference external" href="https://www.coloradomesa.edu/directory/social-behavioral-sciences/timothy-winegard.html"&gt;Timothy Winegard's&lt;/a&gt; book &lt;a class="reference external" href="https://www.amazon.com/Mosquito-Human-History-Deadliest-Predator/dp/1524743410"&gt;The Mosquito&lt;/a&gt; focuses on our collective history as a species and what the
mosquito has done to change it, which turns out to be quite a lot. It starts
with King Tut (1324 BC), who (probably) died at the hands of a mosquito carrying
malaria, moves on to the invisible army of mosquitoes in the &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Pontine_Marshes"&gt;Pontine Marshes&lt;/a&gt; literally sucking the life out
of Rome's enemies as they invaded, and includes more modern events like the
mind-boggling death toll it took to colonize the Americas and Dr. Seuss' creative
advertisement campaigns needed to remind soldiers to protect themselves from
mosquitoes in World War II.&lt;/p&gt;
&lt;p&gt;A particularly interesting event (for me) covered in the book happened in 1698,
when five ships began their journey in Scotland, planning to ride the Trade
Winds to the &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Dari%C3%A9n_Province"&gt;Darién region&lt;/a&gt; of Panama. These ships
were loaded with trading goods, like wigs, woolen socks and blankets,
mother-of-pearl combs, Bibles, twenty-five thousand pairs of leather shoes, and
even a printing press. Their goal was to establish a trading post that would
help the fiercely independent kingdom of Scotland, struggling from years of
famine, to secure a colony on the Isthmus of Panama on the Gulf of Darién.&lt;/p&gt;
&lt;p&gt;This plan was &lt;strong&gt;insanely&lt;/strong&gt; popular in the struggling kingdom, attracting all
forms of investors from poor farmers to members of the national parliament, who
collectively invested approximately 20% of &lt;strong&gt;all the money&lt;/strong&gt; circulating in
Scotland at that time with the &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Company_of_Scotland"&gt;Company of Scotland&lt;/a&gt;, which was establishing the
trading post in the new world. Mercantilism, the accepted economic theory at the
time, meant that instead of simply competing with England or Spain, who, thanks
to their colonies in the new world, were literally moving boatloads of money,
Scotland would need to take market share from these larger empires in order to
stay relevant (and independent from England).&lt;/p&gt;
&lt;p&gt;The Scots' plan did not go well...&lt;/p&gt;
&lt;p&gt;To quote Winegard, &amp;quot;The words that are repeated to the point of nausea in the
diaries, letters, and accounts of the Scottish settlers are mosquitoes, fever,
ague, and death&amp;quot;. What the 1200 colonists didn't anticipate was the deadly
swarms of malaria- and yellow-fever-carrying mosquitoes waiting to feast on the
Scottish settlers when they arrived. Having never experienced malaria or yellow
fever in the temperate climate of Scotland, the colonists had little resistance
and quickly fell sick. After six months, with nearly half of the colonists dead,
the survivors still able to move under their own power (six colonists were left
on the beach) picked up and fled.&lt;/p&gt;
&lt;p&gt;To make matters worse, 300 additional colonists in a resupply mission and
another 1000 colonists in a second expedition left in 1699, before news of the
complete failure could reach Scotland. These later expeditions largely met the
same fate as the first. Out of the combined 2500 colonists only a few hundred
survived the expedition to the New World.&lt;/p&gt;
&lt;p&gt;The monumental failure of the Darién scheme left Scotland, its landed
aristocracy, mercantile elites, and nobles in overwhelming debt, which was cited
as one of the motivations for signing the &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Acts_of_Union_1707"&gt;Acts of Union&lt;/a&gt; with England in 1707. The
deadly mosquitoes of Darién, one blood-meal at a time, changed world history and
forced a reluctant Scotland to forfeit their independence and join England to
create Great Britain.&lt;/p&gt;
&lt;p&gt;Even today with the advent of pesticides like DDT, modern medicines like
&lt;a class="reference external" href="https://www.cdc.gov/malaria/travelers/drugs.html"&gt;antimalarial drugs&lt;/a&gt;, and
other repellents like &lt;a class="reference external" href="https://www.who.int/mediacentre/news/releases/2007/pr43/en/"&gt;insecticide-treated nets&lt;/a&gt;, we are still
not 100% protected from malaria. In 2018 alone, the &lt;a class="reference external" href="https://www.who.int/malaria/media/world-malaria-report-2018/en/"&gt;WHO found&lt;/a&gt; somewhere
between 206 and 258 &lt;em&gt;MILLION&lt;/em&gt; malaria infections worldwide, with 93% (213
million) of these infections taking place in Africa. Of the 405,000 deaths, a
horrifying 67% (272,000) were children under the age of five. While
these terrifying numbers are still 260 million too big, they show how, even with
the advances we've made from Hippocrates' time, our deadliest predator continues
to be a 0.002 gram insect.&lt;/p&gt;
</content><category term="Books"/></entry><entry><title>Removing Git LFS From A Project</title><link href="https://0x42.sh/removing-git-lfs-from-a-project/" rel="alternate"/><published>2020-05-01T00:00:00+00:00</published><updated>2020-05-01T00:00:00+00:00</updated><author><name>bryan brattlof</name></author><id>tag:0x42.sh,2020-05-01:/removing-git-lfs-from-a-project/</id><summary type="html">&lt;p&gt;&lt;a class="reference external" href="https://git-lfs.github.com/"&gt;Git Large File Storage (LFS)&lt;/a&gt; is an extension to
Git that separates large, frequently changing, binary files and saves them in a
separate storage location outside of the normal Git project. LFS replaces these
binary files with smaller text files, called pointers, that hold information
about the original file and …&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;a class="reference external" href="https://git-lfs.github.com/"&gt;Git Large File Storage (LFS)&lt;/a&gt; is an extension to
Git that separates large, frequently changing, binary files and saves them in a
separate storage location outside of the normal Git project. LFS replaces these
binary files with smaller text files, called pointers, that hold information
about the original file and how to download them. These pointers look something
like this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;version https://git-lfs.github.com/spec/v1
oid sha256:d7cbc07ce9b78a89764c0ac5e9c8e1b9dbdeb42c30d8396cba8f75aace5ba225
size 145930
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Git-LFS uses &lt;a class="reference external" href="https://git-scm.com/book/en/v2/Customizing-Git-Git-Hooks"&gt;git hooks&lt;/a&gt; to transparently replace and create these pointers
whenever we make a commit and saves the binary data outside of Git until the
next push where it will sync with your LFS server. When everything is working
correctly, we should never see these pointer files.&lt;/p&gt;
&lt;p&gt;When we say &lt;em&gt;&amp;quot;remove LFS from a project&amp;quot;&lt;/em&gt; we really mean removing the hooks
from our project, which, if not done correctly, will leave these pointer files
all throughout our project's history with no way to download the actual files
they pointed to, preventing us from ever building an older release of our
project again.&lt;/p&gt;
&lt;p&gt;To successfully uninstall LFS, we'll need to rewrite the project's history,
replacing all the LFS pointer files with the actual file they pointed to,
adding them back into our project's history. In the end, the project will look
as if LFS was never installed in the project at all.&lt;/p&gt;
&lt;div class="section" id="step-0-make-a-backup"&gt;
&lt;h2&gt;Step: 0 - Make A Backup&lt;/h2&gt;
&lt;p&gt;Before we begin, a warning: rewriting history is &lt;strong&gt;remarkably dangerous&lt;/strong&gt;, with many
opportunities to silently break something you'll only find out about after it's
too late. &lt;strong&gt;Make a backup before you start&lt;/strong&gt; and save it until you're absolutely
sure the project's history is correct.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;git&lt;span class="w"&gt; &lt;/span&gt;clone&lt;span class="w"&gt; &lt;/span&gt;--mirror&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;$URL&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;backup&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;cd&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;backup
git&lt;span class="w"&gt; &lt;/span&gt;lfs&lt;span class="w"&gt; &lt;/span&gt;fetch&lt;span class="w"&gt; &lt;/span&gt;--all
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Replacing &lt;code&gt;$URL&lt;/code&gt; with the location of the project, these two commands
will:&lt;/p&gt;
&lt;ol class="arabic simple"&gt;
&lt;li&gt;Create a bare copy of the project and navigate into it.&lt;/li&gt;
&lt;li&gt;Download all the files in LFS from the remote LFS server.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This will give you a complete backup of the project, including branches and
other &lt;em&gt;refs&lt;/em&gt; like tags and notes, as well as download all the files from LFS,
ensuring any screw-ups with the next step are recoverable.&lt;/p&gt;
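&lt;p&gt;As a rough sanity check on that backup, the sketch below walks an LFS object store and confirms that each object is named after the SHA-256 of its own contents. It assumes the objects sit under &lt;code&gt;lfs/objects&lt;/code&gt; inside the bare copy and that GNU &lt;code&gt;sha256sum&lt;/code&gt; is installed; &lt;code&gt;verify_lfs_objects&lt;/code&gt; is a made-up helper name for illustration, not part of Git or LFS.&lt;/p&gt;

```shell
# Hypothetical helper: report any file under an LFS object store whose name
# does not match the SHA-256 of its contents (assumes GNU sha256sum).
verify_lfs_objects() {
    find "$1" -type f | while IFS= read -r OBJECT; do
        EXPECTED=$(basename "$OBJECT")
        # Skip anything that is not named like a 64-character object id.
        [ "${#EXPECTED}" -eq 64 ] || continue
        ACTUAL=$(sha256sum "$OBJECT" | cut -d " " -f 1)
        if [ "$EXPECTED" != "$ACTUAL" ]; then
            echo "corrupt: $OBJECT"
        fi
    done
}
```

&lt;p&gt;Running &lt;code&gt;verify_lfs_objects lfs/objects&lt;/code&gt; from inside the backup should print nothing if every downloaded object is intact.&lt;/p&gt;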
&lt;/div&gt;
&lt;div class="section" id="step-1-rewrite-history"&gt;
&lt;h2&gt;Step: 1 - Rewrite History&lt;/h2&gt;
&lt;p&gt;In the working copy of our project, &lt;code&gt;filter-branch&lt;/code&gt; can easily replace
all the LFS pointer files for us. However, this step will also change all the
names of the commits &lt;a class="reference external" href="https://git-scm.com/book/en/v2/Git-Internals-Git-Objects"&gt;(git objects)&lt;/a&gt; in your project, so references to specific
commits like &lt;code&gt;Fixed in a4749f3&lt;/code&gt; will break. If you have a lot of these
types of commits you might want to use more advanced tools like &lt;a class="reference external" href="https://github.com/newren/git-filter-repo"&gt;filter-repo&lt;/a&gt;.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;git&lt;span class="w"&gt; &lt;/span&gt;filter-branch&lt;span class="w"&gt; &lt;/span&gt;-f&lt;span class="w"&gt; &lt;/span&gt;--prune-empty&lt;span class="w"&gt; &lt;/span&gt;--tree-filter&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;
&lt;span class="s1"&gt;[ -f .gitattributes ] &amp;amp;&amp;amp;  git rm -f .gitattributes&lt;/span&gt;
&lt;span class="s1"&gt;find . -type f | while IFS= read -r FILE; do&lt;/span&gt;
&lt;span class="s1"&gt;   if head -2 &amp;quot;$FILE&amp;quot; | grep -q &amp;quot;^oid sha256:&amp;quot;; then&lt;/span&gt;
&lt;span class="s1"&gt;      POINTER=$(cat &amp;quot;$FILE&amp;quot;)&lt;/span&gt;
&lt;span class="s1"&gt;      echo -n &amp;quot;$POINTER&amp;quot; | git lfs smudge &amp;gt; &amp;quot;$FILE&amp;quot;&lt;/span&gt;
&lt;span class="s1"&gt;      git add &amp;quot;$FILE&amp;quot;&lt;/span&gt;
&lt;span class="s1"&gt;   fi&lt;/span&gt;
&lt;span class="s1"&gt;done&amp;#39;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;--tag-name-filter&lt;span class="w"&gt; &lt;/span&gt;cat&lt;span class="w"&gt; &lt;/span&gt;--&lt;span class="w"&gt; &lt;/span&gt;--all
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This uses &lt;code&gt;git filter-branch --tree-filter&lt;/code&gt; to go through each commit for
every branch, and uses &lt;code&gt;find&lt;/code&gt; and &lt;code&gt;grep&lt;/code&gt; to list every LFS pointer
file so we can use &lt;code&gt;git lfs smudge&lt;/code&gt; to replace the pointers with the LFS
data, before committing the changes and moving to the next commit in the history.&lt;/p&gt;
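&lt;p&gt;One way to spot-check the result is to look for leftover pointers in a checkout of the rewritten history. The sketch below reuses the filter's heuristic (an &lt;code&gt;oid sha256:&lt;/code&gt; line within the first two lines of a file); both function names are hypothetical helpers, not Git or LFS commands.&lt;/p&gt;

```shell
# Hypothetical helper: succeeds when FILE looks like an LFS pointer,
# using the same "oid sha256:" heuristic as the tree filter.
is_lfs_pointer() {
    head -n 2 "$1" | grep -q "^oid sha256:"
}

# Print every file under DIR (default: current directory) that still looks
# like a pointer, skipping Git's own metadata in .git.
list_lfs_pointers() {
    find "${1:-.}" -type f ! -path "*/.git/*" | while IFS= read -r FILE; do
        if is_lfs_pointer "$FILE"; then
            echo "$FILE"
        fi
    done
}
```

&lt;p&gt;Calling &lt;code&gt;list_lfs_pointers&lt;/code&gt; at the top of the working copy should print nothing once the rewrite has worked. It only inspects the files currently checked out, so repeat it on a few older tags or branches before trusting the whole history.&lt;/p&gt;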
&lt;/div&gt;
&lt;div class="section" id="step-2-remove-hooks-filters"&gt;
&lt;h2&gt;Step: 2 - Remove Hooks &amp;amp; Filters&lt;/h2&gt;
&lt;p&gt;If this is the only project that uses LFS, you can remove the &lt;code&gt;--local&lt;/code&gt;
option from the following command to also remove the filters from your global
Git config file &lt;code&gt;~/.gitconfig&lt;/code&gt;.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;git lfs uninstall --local
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Once the pointer files have been replaced, we can remove the hooks and filters
LFS used to read the pointer files every time we &lt;em&gt;fetched&lt;/em&gt;, &lt;em&gt;pushed&lt;/em&gt; or
&lt;em&gt;committed&lt;/em&gt; changes to the project, with this simple built-in LFS command.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="step-3-remove-the-lfs-cache"&gt;
&lt;h2&gt;Step: 3 - Remove the LFS cache&lt;/h2&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;rm -r .git/lfs
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Finally, if this is your working copy of the project, you can save some disk space
by removing the LFS cache folder inside the git directory of your project.&lt;/p&gt;
&lt;/div&gt;
</content><category term="Tips"/></entry><entry><title>Upheaval</title><link href="https://0x42.sh/upheaval/" rel="alternate"/><published>2020-01-27T00:00:00+00:00</published><updated>2020-01-27T00:00:00+00:00</updated><author><name>bryan brattlof</name></author><id>tag:0x42.sh,2020-01-27:/upheaval/</id><summary type="html">&lt;a class="reference external image-reference" href="https://www.amazon.com/Upheaval-Turning-Points-Nations-Crisis/dp/0316409138"&gt;
&lt;img alt="The cover of Upheaval by Jared Diamond" class="right" src="https://0x42.sh/upheaval/jared-diamond-upheaval.png" /&gt;
&lt;/a&gt;
&lt;p&gt;&lt;a class="reference external" href="http://jareddiamond.org/Jared_Diamond/Welcome.html"&gt;Jared Diamond&lt;/a&gt; is one of
those prolific authors with so much clout to his name that when he writes a new
book, &lt;em&gt;everyone knows about it&lt;/em&gt;. That's why when he published his new book called
&lt;a class="reference external" href="https://www.amazon.com/Upheaval-Turning-Points-Nations-Crisis/dp/0316409138"&gt;Upheaval&lt;/a&gt; all of the algorithms I trust easily shoved this
straight to the top of …&lt;/p&gt;</summary><content type="html">&lt;a class="reference external image-reference" href="https://www.amazon.com/Upheaval-Turning-Points-Nations-Crisis/dp/0316409138"&gt;
&lt;img alt="The cover of Upheaval by Jared Diamond" class="right" src="https://0x42.sh/upheaval/jared-diamond-upheaval.png" /&gt;
&lt;/a&gt;
&lt;p&gt;&lt;a class="reference external" href="http://jareddiamond.org/Jared_Diamond/Welcome.html"&gt;Jared Diamond&lt;/a&gt; is one of
those prolific authors with so much clout to his name that when he writes a new
book, &lt;em&gt;everyone knows about it&lt;/em&gt;. That's why, when he published his new book,
&lt;a class="reference external" href="https://www.amazon.com/Upheaval-Turning-Points-Nations-Crisis/dp/0316409138"&gt;Upheaval&lt;/a&gt;, all of the algorithms I trust shoved it
straight to the top of my &amp;quot;recommended reading&amp;quot; list, and with good reason.&lt;/p&gt;
&lt;p&gt;This book, inspired by his wife, Marie Cohen, who works as a psychologist, uses
crisis therapy techniques developed to help people through periods of crisis in
their own lives (like the death of a loved one) and adapts them to gain
understanding of how nations going through crisis can successfully navigate that
critical time.&lt;/p&gt;
&lt;p&gt;Diamond begins the first part of the book by taking the 12 factors therapists have
identified as indicators of whether someone will succeed in resolving a
personal crisis (like acknowledging you're in a crisis), and adapting them into 12
success factors to apply to case studies of countries going through a crisis
at different times in history:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;a class="reference external" href="https://en.wikipedia.org/wiki/Winter_War"&gt;Finland during World War II&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="reference external" href="https://en.wikipedia.org/wiki/Perry_Expedition"&gt;Japan after the arrival of U.S. Commodore Matthew Perry in 1853&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="reference external" href="https://en.wikipedia.org/wiki/1973_Chilean_coup_d%27%C3%A9tat"&gt;Chile's coup in the 1970s&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="reference external" href="https://en.wikipedia.org/wiki/Indonesian_mass_killings_of_1965%E2%80%9366"&gt;Indonesia's Communist Purge in 1965&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="reference external" href="https://en.wikipedia.org/wiki/Reconstruction_of_Germany"&gt;Rebuilding Germany after World War II&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="reference external" href="https://en.wikipedia.org/wiki/Post-war_immigration_to_Australia"&gt;Australia's identity crisis during the 1950s&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;While all the case studies were interesting, my favorite case study (largely
because I knew so little about the situation) was how Finland, which shared a
1,000-mile border with the Soviet Union, was able to successfully bridge the
desires of Stalin, who at the time was invading &lt;em&gt;all&lt;/em&gt; his neighboring countries
(Finland was the only Soviet neighboring country not occupied by the &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Soviet_Union"&gt;USSR&lt;/a&gt;), and the &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Allies_of_World_War_II"&gt;Allied Powers&lt;/a&gt;, who largely abandoned
Finland during the war.&lt;/p&gt;
&lt;p&gt;After Finland refused &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Joseph_Stalin"&gt;Joseph Stalin's&lt;/a&gt; demands, the Soviets invaded on November 30, 1939,
kicking off what would eventually become known as the &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Winter_War"&gt;Winter War&lt;/a&gt;, which &lt;em&gt;officially&lt;/em&gt; lasted until March 13,
1940 and &lt;em&gt;unofficially&lt;/em&gt; continued throughout all of &lt;a class="reference external" href="https://en.wikipedia.org/wiki/World_War_II"&gt;World War II&lt;/a&gt;.&lt;/p&gt;
&lt;a class="reference external image-reference" href="https://www.nationsonline.org/oneworld/map/Finland-map.htm"&gt;
&lt;img alt="A political map of Finland and its shared 1,000 mile border with Russia. Shamelessly taken from the Nations Online Project" class="right" src="https://0x42.sh/upheaval/map-of-finland.png" /&gt;
&lt;/a&gt;
&lt;p&gt;The Finnish volunteers used creative tactics to fight the Soviets, taking
advantage of the Finns' small population and their knowledge of the terrain,
ensuring the Soviets would have to pay a high price to capture Finland (8
Soviets per Finn killed).&lt;/p&gt;
&lt;p&gt;However, the size of the Soviet army meant that defeat was still a very real
possibility and Finland would need outside help defending itself. So the Finns
asked for help from the Allied Powers, and controversially, after the Allies
refused (they were busy fighting Nazi Germany) they asked for and received help
from &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Nazi_Germany"&gt;Nazi Germany&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Finland would eventually find that Nazi Germany wasn't an ally they wanted and
began to realize their main problem was something totally out of their control,
their &lt;em&gt;geography&lt;/em&gt;, (one of the success factors). So they began to refuse support
to the Nazis (possibly the turning point in the &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Siege_of_Leningrad"&gt;Siege of Leningrad&lt;/a&gt;) and started to evaluate
their situation.&lt;/p&gt;
&lt;p&gt;The Soviet army was so large in comparison to Finland's that if the Soviets
really wanted to control Finland, they could. And although Finland to this day
remains a &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Sisu"&gt;very proud nation&lt;/a&gt;, it is also
realistic. Instead of ignoring the Soviets (as they had been doing up to World
War II), they began negotiating with them and persuaded the Soviets that they
would gain nothing by occupying Finland.&lt;/p&gt;
&lt;p&gt;Amazingly, this approach allowed Finland to maintain its independence, and
Finland eventually became an important source of western technology for the Soviets
during the &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Cold_War"&gt;Cold War&lt;/a&gt; (making Finland
a powerful friend to the Soviets and ensuring Finland's independence). However,
true to all compromises, it also came with drawbacks.&lt;/p&gt;
&lt;p&gt;On top of having to drive &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Automotive_industry_in_the_Soviet_Union"&gt;crappy Soviet cars&lt;/a&gt; during the Cold War, Finnish
newspapers were expected to censor themselves from reporting on Soviet abuses
(a large compromise for a liberal democratic nation like Finland) in order to
keep from offending Soviet sensibilities.&lt;/p&gt;
&lt;p&gt;During the Cold War, western diplomats would coin the term &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Finlandization"&gt;&amp;quot;Finlandization&amp;quot;&lt;/a&gt; to mean weaker countries
(Finland) pandering to stronger ones (USSR), but Diamond reminds us that these
diplomats were from countries that don't share the same geography as Finland
and never offered help when Finland was desperately trying to remain
independent during the Soviet invasion in World War II.&lt;/p&gt;
&lt;p&gt;In the last sections of the book, Diamond switches from looking at historical
examples of nations in crisis and begins focusing on the current and future
crises we're dealing with, everything from &lt;a class="reference external" href="https://climate.nasa.gov"&gt;climate change&lt;/a&gt; to &lt;a class="reference external" href="https://www.pewresearch.org/politics/2017/10/05/the-partisan-divide-on-political-values-grows-even-wider/"&gt;political polarization&lt;/a&gt;,
and applies the 12 success factors so that we might come out better at the end
(much like Finland was able to after World War II).&lt;/p&gt;
&lt;p&gt;Overall, while Jared Diamond doesn't go as far as to predict if we'll be able
to successfully navigate through our future challenges, he does do an amazing job
showing how other nations have creatively solved their biggest issues and
provides a 12-step process to successfully navigate ours if we choose to follow
it.&lt;/p&gt;
</content><category term="Books"/></entry><entry><title>A Gentleman In Moscow</title><link href="https://0x42.sh/a-gentleman-in-moscow/" rel="alternate"/><published>2020-01-05T00:00:00+00:00</published><updated>2020-01-05T00:00:00+00:00</updated><author><name>bryan brattlof</name></author><id>tag:0x42.sh,2020-01-05:/a-gentleman-in-moscow/</id><summary type="html">&lt;a class="reference external image-reference" href="https://www.amazon.com/A-Gentleman-in-Moscow/dp/0143110438"&gt;
&lt;img alt="The cover of A Gentleman in Moscow by Amor Towles" class="right" src="https://0x42.sh/a-gentleman-in-moscow/a-gentleman-in-moscow.png" /&gt;
&lt;/a&gt;
&lt;p&gt;At this point in my reading career (amateur status), most of the books I read
can easily be classified as non-fiction, and when I do read fiction, it's rarely
beyond a comic strip like &lt;a class="reference external" href="https://xkcd.com/"&gt;xkcd&lt;/a&gt;, &lt;a class="reference external" href="https://sarahcandersen.com/"&gt;Sarah's Scribbles&lt;/a&gt; or &lt;a class="reference external" href="https://www.gocomics.com/calvinandhobbes"&gt;Calvin &amp;amp; Hobbes&lt;/a&gt;. However, after watching Jordan read &lt;a class="reference external" href="http://www.amortowles.com/amor-towles-bio/"&gt;Amor Towles'&lt;/a&gt; A Gentleman in …&lt;/p&gt;</summary><content type="html">&lt;a class="reference external image-reference" href="https://www.amazon.com/A-Gentleman-in-Moscow/dp/0143110438"&gt;
&lt;img alt="The cover of A Gentleman in Moscow by Amor Towles" class="right" src="https://0x42.sh/a-gentleman-in-moscow/a-gentleman-in-moscow.png" /&gt;
&lt;/a&gt;
&lt;p&gt;At this point in my reading career (amateur status), most of the books I read
can easily be classified as non-fiction, and when I do read fiction, it's rarely
beyond a comic strip like &lt;a class="reference external" href="https://xkcd.com/"&gt;xkcd&lt;/a&gt;, &lt;a class="reference external" href="https://sarahcandersen.com/"&gt;Sarah's Scribbles&lt;/a&gt; or &lt;a class="reference external" href="https://www.gocomics.com/calvinandhobbes"&gt;Calvin &amp;amp; Hobbes&lt;/a&gt;. However, after watching Jordan read &lt;a class="reference external" href="http://www.amortowles.com/amor-towles-bio/"&gt;Amor Towles'&lt;/a&gt; A Gentleman in Moscow, it was
easy to add this great book to my reading list.&lt;/p&gt;
&lt;p&gt;At one point Jordan burst into laughter (not &lt;em&gt;too&lt;/em&gt; uncommon) envisioning the
Count, our main character, witnessing guests at the hotel dropping different
objects (including an egg) from the second floor balcony of the Metropol's
ballroom to test the accuracy of &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Newton%27s_law_of_universal_gravitation"&gt;Newton's law of universal gravitation&lt;/a&gt; they
had learned about in school. Towles' ability to illustrate scenes like this and
his quirky attention to detail bring the Count to life in such a way that, by the end
of the book you really feel like friends.&lt;/p&gt;
&lt;p&gt;This surprisingly upbeat book follows the life of Count Alexander Ilyich Rostov,
an unrepentant (according to the Bolshevik tribunal) Russian aristocrat,
sentenced to life under house arrest in &lt;a class="reference external" href="https://metropol-moscow.ru/en/"&gt;Moscow’s Metropol Hotel&lt;/a&gt; in 1922, after the &lt;a class="reference external" href="https://en.wikipedia.org/wiki/October_Revolution"&gt;Great October Socialist
Revolution&lt;/a&gt; when the
&lt;a class="reference external" href="https://en.wikipedia.org/wiki/Bolsheviks"&gt;Bolshevik Party&lt;/a&gt;, founded by
&lt;a class="reference external" href="https://en.wikipedia.org/wiki/Vladimir_Lenin"&gt;Vladimir Lenin&lt;/a&gt;, took control of
what would become the &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Soviet_Union"&gt;Soviet Union&lt;/a&gt;,
then under &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Russian_Provisional_Government"&gt;provisional governmental&lt;/a&gt; control after &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Grand_Duke_Michael_Alexandrovich_of_Russia"&gt;Grand Duke Michael&lt;/a&gt;
declined to take power after &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Nicholas_II_of_Russia"&gt;Tsar Nicholas II&lt;/a&gt; abdicated in 1917.&lt;/p&gt;
&lt;p&gt;And while the book is fictional (and the Count as far as I know), the events
happening outside of the very real &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Hotel_Metropol_Moscow"&gt;Metropol Hotel&lt;/a&gt; are true to the interesting (and volatile) Russian
history in the 20&lt;sup&gt;th&lt;/sup&gt; century. The book stays fairly focused on the
Count's life inside the hotel; he feels like a man stuck in time with the
Metropol, watching life go on around them, which allows large historical events,
like &lt;a class="reference external" href="https://en.wikipedia.org/wiki/World_War_II"&gt;World War II&lt;/a&gt;, to get little
mention outside of the footnotes (a great story by themselves). So you don't
need to be a &lt;a class="reference external" href="https://www.dictionary.com/browse/russophile"&gt;Russophile&lt;/a&gt; to
understand this addicting and fun book.&lt;/p&gt;
&lt;p&gt;Without spoiling too much of this rich story, my favorite scenes of the book
were of the Count, who, together with the cook and &lt;a class="reference external" href="https://www.merriam-webster.com/dictionary/ma%C3%AEtre%20d'"&gt;maître d’&lt;/a&gt;, has managed to
secretly collect all of the ingredients needed for a decadent seafood stew,
called &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Bouillabaisse"&gt;bouillabaisse&lt;/a&gt;, to share
amongst themselves. After they've prepared the dish, they spend the afternoon
telling stories and reliving memories. This moment in the book, together with
Towles' ability to bring these moments to life, really does make you feel like
you've missed a wonderful afternoon with friends.&lt;/p&gt;
&lt;p&gt;Overall &lt;a class="reference external" href="https://www.amazon.com/A-Gentleman-in-Moscow/dp/0143110438"&gt;A Gentleman in Moscow&lt;/a&gt; is such a fun read with such
a rich and amazing story spanning the full spectrum of romance, fun, politics,
and thrills, making this book easy to recommend to friends for their 2020
reading.&lt;/p&gt;
</content><category term="Books"/></entry><entry><title>Scraping the MLB for Science &amp; Glory</title><link href="https://0x42.sh/scraping-the-mlb-for-science-glory/" rel="alternate"/><published>2019-09-14T00:00:00+00:00</published><updated>2019-09-14T00:00:00+00:00</updated><author><name>bryan brattlof</name></author><id>tag:0x42.sh,2019-09-14:/scraping-the-mlb-for-science-glory/</id><summary type="html">&lt;p&gt;If you’ve been following baseball long enough you might have noticed nobody
likes baseball anymore, and &lt;a class="reference external" href="https://www.usatoday.com/story/sports/mlb/columnist/bob-nightengale/2018/06/20/mlb-bad-baseball-attendance-strikeouts/718162002/"&gt;everyone&lt;/a&gt; has an &lt;a class="reference external" href="https://www.nydailynews.com/sports/baseball/ny-mlb-attendance-20190824-btrdnjepunas3cejlknbjovjhu-story.html"&gt;opinion&lt;/a&gt; or &lt;a class="reference external" href="https://www.npr.org/2019/08/07/748972052/opinion-speeding-up-baseball-to-save-it"&gt;stance&lt;/a&gt; or &lt;a class="reference external" href="https://www.usatoday.com/story/sports/mlb/columnist/bob-nightengale/2019/08/19/mlb-baseballs-old-timers-decry-state-modern-game/2047025001/"&gt;view&lt;/a&gt; or &lt;a class="reference external" href="https://www.cbc.ca/sports/the-buzzer/the-buzzer-whats-wrong-with-baseball-1.5020638"&gt;insight&lt;/a&gt; into &lt;a class="reference external" href="https://bleacherreport.com/articles/2791455-i-find-it-very-difficult-to-watch-why-mlb-greats-think-baseballs-in-trouble"&gt;how&lt;/a&gt; to &lt;a class="reference external" href="https://www.mlb.com/news/commissioner-rob-manfred-talks-pace-of-play-c266818890"&gt;fix&lt;/a&gt; it.&lt;/p&gt;
&lt;p&gt;As someone who likes baseball &lt;em&gt;(and statistics)&lt;/em&gt;, I want to analyze these
trends myself and find out if …&lt;/p&gt;</summary><content type="html">&lt;p&gt;If you’ve been following baseball long enough you might have noticed nobody
likes baseball anymore, and &lt;a class="reference external" href="https://www.usatoday.com/story/sports/mlb/columnist/bob-nightengale/2018/06/20/mlb-bad-baseball-attendance-strikeouts/718162002/"&gt;everyone&lt;/a&gt; has an &lt;a class="reference external" href="https://www.nydailynews.com/sports/baseball/ny-mlb-attendance-20190824-btrdnjepunas3cejlknbjovjhu-story.html"&gt;opinion&lt;/a&gt; or &lt;a class="reference external" href="https://www.npr.org/2019/08/07/748972052/opinion-speeding-up-baseball-to-save-it"&gt;stance&lt;/a&gt; or &lt;a class="reference external" href="https://www.usatoday.com/story/sports/mlb/columnist/bob-nightengale/2019/08/19/mlb-baseballs-old-timers-decry-state-modern-game/2047025001/"&gt;view&lt;/a&gt; or &lt;a class="reference external" href="https://www.cbc.ca/sports/the-buzzer/the-buzzer-whats-wrong-with-baseball-1.5020638"&gt;insight&lt;/a&gt; into &lt;a class="reference external" href="https://bleacherreport.com/articles/2791455-i-find-it-very-difficult-to-watch-why-mlb-greats-think-baseballs-in-trouble"&gt;how&lt;/a&gt; to &lt;a class="reference external" href="https://www.mlb.com/news/commissioner-rob-manfred-talks-pace-of-play-c266818890"&gt;fix&lt;/a&gt; it.&lt;/p&gt;
&lt;p&gt;As someone who likes baseball &lt;em&gt;(and statistics)&lt;/em&gt;, I want to analyze these
trends myself and find out if I need a new “national pastime”, or if these
&lt;a class="reference external" href="https://en.wikipedia.org/wiki/Charlatan"&gt;charlatans&lt;/a&gt; are just talking out
their &lt;em&gt;…er…&lt;/em&gt; padding their word counts.&lt;/p&gt;
&lt;p&gt;So for this essay, we'll begin a multi-step journey: collecting baseball
statistics from the MLB so we can analyze the cause of &amp;quot;baseball's declining
fan base&amp;quot; in later essays.&lt;/p&gt;
&lt;div class="section" id="the-plan"&gt;
&lt;h2&gt;The Plan&lt;/h2&gt;
&lt;p&gt;Just like &lt;a class="reference external" href="https://www.history.nasa.gov/moondec.html"&gt;landing on the moon&lt;/a&gt;,
blind dates, or playing with data science, having a plan &lt;em&gt;before&lt;/em&gt; you start is
&lt;em&gt;usually&lt;/em&gt; a good idea.&lt;/p&gt;
&lt;p&gt;As for our plan, we'll need to define a scientific question from the internet's
&lt;em&gt;&amp;quot;opinions&amp;quot;&lt;/em&gt; about baseball's decline before we can find sources of information
we'll use to answer it.&lt;/p&gt;
&lt;div class="section" id="the-question"&gt;
&lt;h3&gt;The Question&lt;/h3&gt;
&lt;p&gt;While there are literally &lt;em&gt;TONS&lt;/em&gt; &lt;a class="footnote-reference" href="#footnote-1" id="footnote-reference-1"&gt;[1]&lt;/a&gt; of articles with opinions on &lt;a class="reference external" href="https://duckduckgo.com/?q=%22how+to+fix+baseball%22&amp;amp;t=canonical&amp;amp;ia=web"&gt;&amp;quot;how to fix
baseball&amp;quot;&lt;/a&gt;, the main claim I want to answer is:&lt;/p&gt;
&lt;img alt="Are strikeouts making baseball boring?" src="https://0x42.sh/scraping-the-mlb-for-science-glory/are-strikeouts-making-baseball-boring.png" /&gt;
&lt;p&gt;This question seems to have been asked every baseball season, at least as far
back as the point where I stopped searching the internet: a &lt;a class="reference external" href="https://www.chicagotribune.com/news/ct-xpm-1999-03-07-9903070427-story.html"&gt;1999 Chicago Tribune article&lt;/a&gt;
interviewing &lt;a class="reference external" href="https://www.baseball-reference.com/players/w/willite01.shtml"&gt;Ted Williams&lt;/a&gt; about the rise in strikeouts with the &amp;quot;new wave&amp;quot; of power
hitters like &lt;a class="reference external" href="https://www.baseball-reference.com/players/m/mcgwima01.shtml"&gt;Mark McGwire&lt;/a&gt; and &lt;a class="reference external" href="https://www.baseball-reference.com/players/s/sosasa01.shtml"&gt;Sammy Sosa&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Which I get: if no one can hit the ball, then baseball is really 9 dudes throwing
the ball to themselves for 3 hours, which &lt;strong&gt;is&lt;/strong&gt; boring.&lt;/p&gt;
&lt;p&gt;So, with a solid question defined, let's start looking for sources of
information that could answer it.&lt;/p&gt;
&lt;table class="docutils footnote" frame="void" id="footnote-1" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#footnote-reference-1"&gt;[1]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;&amp;quot;literally &lt;em&gt;TONS&lt;/em&gt;&amp;quot; measured by page count from a &lt;a class="reference external" href="https://www.google.com/search?hl=en&amp;amp;q=%22how%20to%20fix%20baseball%22"&gt;google search&lt;/a&gt;,
not by weight, since the internet is estimated to only &lt;a class="reference external" href="https://adamant.typepad.com/seitz/2006/10/weighing_the_we.html"&gt;weigh ~50 grams&lt;/a&gt;, or
0.00005 metric tons.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;div class="section" id="what-data-do-we-need"&gt;
&lt;h3&gt;What Data Do We Need&lt;/h3&gt;
&lt;p&gt;Our question is dealing with trends in pitching and hitting statistics. So we'll
need to find a source of baseball statistics going back to the beginning of
baseball (&lt;a class="reference external" href="https://en.m.wikipedia.org/wiki/Anno_Domini"&gt;anno domini&lt;/a&gt; ~1876) or
as far back as we can find.&lt;/p&gt;
&lt;p&gt;While there are many websites to choose from, I feel the data from the &lt;a class="reference external" href="https://www.mlb.com"&gt;MLB&lt;/a&gt; is a good choice, mainly so we can say &lt;em&gt;&amp;quot;according to
the official MLB data…&amp;quot;&lt;/em&gt; before we start arguing with strangers on the
internet.&lt;/p&gt;
&lt;p&gt;Unfortunately though, as is often the case, the MLB doesn't give us a link to
download their statistics for further analysis.&lt;/p&gt;
&lt;p&gt;So we'll need to build a &amp;quot;web scraper&amp;quot; to download the data directly from their
website, essentially making our own &amp;quot;download&amp;quot; button.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="section" id="some-rules-before-we-start"&gt;
&lt;h2&gt;Some Rules Before We Start&lt;/h2&gt;
&lt;p&gt;Before we begin collecting data from the MLB, I feel I should give a few words
(warnings) about internet etiquette:&lt;/p&gt;
&lt;ol class="arabic simple"&gt;
&lt;li&gt;Respect the &lt;a class="reference external" href="http://gdx.mlb.com/components/copyright.txt"&gt;MLB's terms and conditions&lt;/a&gt;. The MLB are &lt;a class="reference external" href="https://twitter.com/Jomboy_/status/1151971547147583488?s=20"&gt;savages in the box&lt;/a&gt; and can defy the laws of physics
to destroy you if you mess with their copyright.&lt;/li&gt;
&lt;li&gt;Don't stress their servers. Our scrapers can easily make thousands of
requests per second which their servers will struggle to fulfill. (See rule
#1 for why)&lt;/li&gt;
&lt;li&gt;The code &lt;strong&gt;will&lt;/strong&gt; break. Websites change their layouts like I change my
&lt;em&gt;…er…&lt;/em&gt; socks. Make sure you save the data you've downloaded so you only have
to do this once.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Now with all the etiquette out of the way, let's start collecting some data.&lt;/p&gt;
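&lt;p&gt;Rule #2 is easier to follow when the code enforces it for us. The sketch below is one possible way to do that, assuming a hypothetical &lt;code&gt;Throttle&lt;/code&gt; helper (my own invention, not part of any library) that sleeps just long enough to keep a minimum gap between two calls.&lt;/p&gt;

```python
import time


class Throttle:
    """Hypothetical helper: keep at least `interval` seconds between calls."""

    def __init__(self, interval, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self.clock = clock      # injectable so the helper can be tested
        self.sleep = sleep
        self.last_call = None

    def wait(self):
        if self.last_call is not None:
            elapsed = self.clock() - self.last_call
            # sleep only for the time still owed (never a negative amount)
            self.sleep(max(0.0, self.interval - elapsed))
        self.last_call = self.clock()
```

&lt;p&gt;Calling &lt;code&gt;wait()&lt;/code&gt; on a shared &lt;code&gt;Throttle(1.0)&lt;/code&gt; before every request would cap a scraper at roughly one request per second.&lt;/p&gt;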
&lt;/div&gt;
&lt;div class="section" id="collecting-some-data"&gt;
&lt;h2&gt;Collecting Some Data&lt;/h2&gt;
&lt;p&gt;Back &amp;quot;&lt;a class="reference external" href="https://www.urbandictionary.com/define.php?term=back%20in%20the%20day"&gt;in the day&lt;/a&gt;&amp;quot; this step would require &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Web_scraping"&gt;&amp;quot;web scraping&amp;quot;&lt;/a&gt;, using Python libraries like
&lt;a class="reference external" href="https://www.crummy.com/software/BeautifulSoup/"&gt;BeautifulSoup&lt;/a&gt;, to parse the
raw HTML for the data we want.&lt;/p&gt;
&lt;p&gt;Nowadays, with the increasing popularity of tools like &lt;a class="reference external" href="https://angularjs.org/"&gt;AngularJS&lt;/a&gt; and &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Representational_state_transfer#Applied_to_web_services"&gt;RESTful APIs&lt;/a&gt;, websites are using
&lt;a class="reference external" href="https://www.javascript.com/"&gt;JavaScript&lt;/a&gt; to render the web-page in our
browsers (client-side) instead of on their servers (server-side).&lt;/p&gt;
&lt;p&gt;The MLB is one of these &amp;quot;nowadays&amp;quot; websites.&lt;/p&gt;
&lt;img alt="Source code of mlb.com's team pitching stats page, which is missing its content because the JavaScript hasn't downloaded it yet." src="https://0x42.sh/scraping-the-mlb-for-science-glory/data-should-be-here.png" /&gt;
&lt;p&gt;With a client-side app, the static HTML our browsers download is a template
(skin) for the data. The JavaScript will download the data using the RESTful API
and convert it into something we can easily read by applying the template.&lt;/p&gt;
&lt;p&gt;As complicated as that sounds, because the data returned by the API is in a form
our computers can use, it's often &lt;em&gt;easier&lt;/em&gt; than scraping the web-page directly.&lt;/p&gt;
&lt;p&gt;We just have to find the API.&lt;/p&gt;
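&lt;p&gt;To make the template-plus-data idea concrete, here is a toy sketch of what the page's JavaScript does, written in Python. Everything in it is made up for illustration: the template text, the field names and the values are assumptions, not the MLB's.&lt;/p&gt;

```python
import json
from string import Template

# the "static HTML": a template with holes where the data goes
row_template = Template('$team allowed $whip walks and hits per inning')

# the "API response": made-up JSON, only here to illustrate the idea
api_response = '[{"team": "AAA", "whip": "1.10"}, {"team": "BBB", "whip": "1.25"}]'

# what the client-side app does: parse the data, then fill in the template
rows = json.loads(api_response)
rendered = [row_template.substitute(row) for row in rows]

print('\n'.join(rendered))
```

&lt;p&gt;A scraper only cares about the parsed &lt;code&gt;rows&lt;/code&gt;, so it can skip the rendering step entirely.&lt;/p&gt;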
&lt;div class="section" id="finding-the-data"&gt;
&lt;h3&gt;Finding The Data&lt;/h3&gt;
&lt;p&gt;Because our browsers are doing all the work of building the MLB's client-side
web-page, we can use our browser's built-in developer tools to find where the
data is coming from.&lt;/p&gt;
&lt;p&gt;Depending on which browser you prefer, open your developer tools:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;How to open &lt;a class="reference external" href="https://developers.google.com/web/tools/chrome-devtools/#open"&gt;Chrome's Developer Tools.&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;How to open &lt;a class="reference external" href="https://developer.mozilla.org/en-US/docs/Learn/Common_questions/What_are_browser_developer_tools#How_to_open_the_devtools_in_your_browser"&gt;Firefox's Developer Tools.&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;With the developer tools open:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;Find and open the &lt;strong&gt;Network&lt;/strong&gt; tab.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For Firefox 69.0 the Network tab is in &lt;strong&gt;Tools → Web Developer → Network&lt;/strong&gt; or if
you're into key bindings Ctrl+Shift+E.&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;Next, go to the &lt;a class="reference external" href="https://www.mlb.com/stats/team"&gt;MLB's stats page&lt;/a&gt; and
refresh the page with the Network tab open.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You should see a list of all the requests your browser made while building the
MLB's web-page. Something similar to this (I'm using Firefox 69.0):&lt;/p&gt;
&lt;img alt="Screenshot of the Network tab in Firefox's Developer Tools showing the 128 requests made by the mlb.com client side web application." src="https://0x42.sh/scraping-the-mlb-for-science-glory/mlb-network-waterfall.png" /&gt;
&lt;p&gt;&lt;a class="reference external" href="https://time.gov/"&gt;If you're in the year 2019&lt;/a&gt;, our browsers made 128
individual requests to their servers &lt;a class="reference external" href="https://httparchive.org/reports/page-weight#reqTotal"&gt;(54 more than average)&lt;/a&gt; for information to build the
web-page. That's 127 more requests than I'm willing to look through by hand, so
let's start filtering these requests down:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;Click on the &lt;strong&gt;XHR&lt;/strong&gt; icon to filter out unrelated requests.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The requests we're interested in are XHR (&lt;a class="reference external" href="https://developer.mozilla.org/en-US/docs/Web/API/XMLHttpRequest"&gt;XMLHttpRequest&lt;/a&gt;) objects. These are
requests made by the web-page's JavaScript to get data (usually in &lt;a class="reference external" href="https://en.wikipedia.org/wiki/XML"&gt;XML&lt;/a&gt; or &lt;a class="reference external" href="https://en.wikipedia.org/wiki/JSON"&gt;JSON&lt;/a&gt; form) from an API.&lt;/p&gt;
&lt;img alt="Screenshot of the Network tab in Firefox's Developer Tools showing the 23 filtered requests after applying the XHR filter." src="https://0x42.sh/scraping-the-mlb-for-science-glory/mlb-xhr-waterfall.png" /&gt;
&lt;p&gt;With the requests filtered, we can start exploring for the data we want. We know
we're looking for data which usually comes in either XML or JSON format, so we
can ignore the &amp;quot;js&amp;quot; types.&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;Click on each &lt;strong&gt;Response&lt;/strong&gt; looking for the data.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I usually look at responses with the largest size first and work my way down.
After a few attempts, we'll find a JSON blob with keys that match the data we
want.&lt;/p&gt;
&lt;img alt="Screenshot of the JSON response from the MLB's &amp;quot;team_pitching_season_leader_master&amp;quot; endpoint." src="https://0x42.sh/scraping-the-mlb-for-science-glory/mlb-api-response.png" /&gt;
&lt;p&gt;&lt;em&gt;Huzzah!&lt;/em&gt; We've found the API endpoint with the data we need.&lt;/p&gt;
&lt;p id="the-response"&gt;Before we move on, take note of the data's shape in the response. Notice the
list &lt;code&gt;row&lt;/code&gt;, with all the baseball stats, is inside &lt;code&gt;queryResults&lt;/code&gt;
inside &lt;code&gt;team_pitching_season_leader_master&lt;/code&gt;. We'll need this when we
start building the web scraper later on.&lt;/p&gt;
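&lt;p&gt;As a rough sketch of that shape, the snippet below parses a trimmed, made-up payload with the same nesting. The outer key names follow the scraper code in this article; the inner field names (&lt;code&gt;team&lt;/code&gt;, &lt;code&gt;whip&lt;/code&gt;) and their values are invented for illustration.&lt;/p&gt;

```python
import json

# trimmed, made-up payload that mimics the nesting of the response
payload = json.loads('''
{
  "team_pitching_season_leader_master": {
    "queryResults": {
      "row": [
        {"team": "AAA", "whip": "1.10"},
        {"team": "BBB", "whip": "1.25"}
      ]
    }
  }
}
''')

# walk down one key at a time until we reach the list of stats
rows = payload['team_pitching_season_leader_master']['queryResults']['row']

print(len(rows), rows[0])
```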
&lt;p&gt;Our next step is to find how we can recreate the request using Python:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;Switch to that request's &lt;strong&gt;Headers&lt;/strong&gt; tab and copy the URL.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This will show us the URL and the Headers we sent to the server.&lt;/p&gt;
&lt;img alt="Screenshot about the details of the request made to the API to download the stats data we need to answer our question." src="https://0x42.sh/scraping-the-mlb-for-science-glory/mlb-api-request.png" /&gt;
&lt;p&gt;If everything went according to plan, when you&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;Copy the URL into a new browser tab,&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;you should see the same information we found in the Developer Tools just a
moment ago telling us we've found the URL to the API we want.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Good Job!&lt;/em&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="section" id="the-code"&gt;
&lt;h2&gt;The Code&lt;/h2&gt;
&lt;p&gt;With the API discovered, we can begin building the web scraper to download the
pitching statistics that will help us answer our initial question.&lt;/p&gt;
&lt;p&gt;To make things easier on ourselves, let's install some libraries to help us
download the data.&lt;/p&gt;
&lt;p&gt;We'll be using:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;a class="reference external" href="https://2.python-requests.org/en/master/"&gt;requests&lt;/a&gt; to create the HTTP
requests to the MLB's API and return its content.&lt;/li&gt;
&lt;li&gt;and &lt;a class="reference external" href="https://pandas.pydata.org/"&gt;pandas&lt;/a&gt; to help us manipulate and analyze
&lt;a class="reference external" href="https://en.wikipedia.org/wiki/Data_wrangling"&gt;(data wrangle)&lt;/a&gt; later on,
after we've downloaded the data from the MLB.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Both can easily be installed via &lt;code&gt;pip&lt;/code&gt; like this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$ pip install requests pandas
&lt;/pre&gt;&lt;/div&gt;
&lt;div class="section" id="writing-the-code"&gt;
&lt;h3&gt;Writing the Code&lt;/h3&gt;
&lt;p&gt;From our &lt;a class="reference internal" href="#the-response"&gt;exploration above&lt;/a&gt;, we know the URL to our endpoint is this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;team_pitching&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;http://lookup-service-prod.mlb.com/&amp;#39;&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;json/named.team_pitching_season_leader_master.bam?&amp;#39;&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;season=2019&amp;amp;sort_order=&lt;/span&gt;&lt;span class="si"&gt;%27a&lt;/span&gt;&lt;span class="s1"&gt;sc%27&amp;amp;sort_column=%27whip%27&amp;#39;&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;&amp;amp;game_type=%27R%27&amp;amp;sport_code=%27mlb%27&amp;amp;recSP=1&amp;amp;recPP=50&amp;#39;&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;However, this will only download each team's stats for the 2019 season. This API
uses the &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Query_string"&gt;query string parameters&lt;/a&gt;
as a kind of filter for the content it provides to us.&lt;/p&gt;
&lt;p&gt;For example, changing &lt;code&gt;season&lt;/code&gt; from 2019 to 2018 would give us each
team's pitching statistics for the 2018 season, while changing &lt;code&gt;game_type&lt;/code&gt; to
&lt;em&gt;&amp;quot;%27S%27&amp;quot;&lt;/em&gt; would give us statistics for spring training games.&lt;/p&gt;
&lt;p&gt;So let's break &lt;code&gt;team_pitching&lt;/code&gt; down into its component parts, allowing
us to make requests for the different seasons the MLB has in their database.&lt;/p&gt;
&lt;p&gt;We can start by separating the domain, path, and query string parameters into
separate variables like this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;domain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;https://lookup-service-prod.mlb.com&amp;#39;&lt;/span&gt;
&lt;span class="n"&gt;path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;/json/named.team_pitching_season_leader_master.bam&amp;#39;&lt;/span&gt;
&lt;span class="n"&gt;query_string&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;season&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2019&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;sort_order&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;&lt;/span&gt;&lt;span class="si"&gt;%27a&lt;/span&gt;&lt;span class="s1"&gt;sc%27&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;sort_column&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;%27whip%27&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;game_type&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;%27R%27&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;sport_code&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;%27mlb%27&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;recSP&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;recPP&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
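&lt;p&gt;If splitting the URL by hand feels error prone, Python's standard library can do it for us. This is only a sketch: it rebuilds a shortened version of the URL (three of the seven parameters) and then takes it apart again with &lt;code&gt;urllib.parse&lt;/code&gt;. Note that the values come back as decoded strings, so the result is not character-for-character identical to the dictionary above.&lt;/p&gt;

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# rebuild a shortened version of the URL without typing the separators by hand
url = (
    'https://lookup-service-prod.mlb.com'
    '/json/named.team_pitching_season_leader_master.bam?'
    + urlencode({'season': 2019, 'game_type': "'R'", 'recPP': 50})
)

parts = urlsplit(url)
domain = parts.scheme + '://' + parts.netloc
path = parts.path
query_string = dict(parse_qsl(parts.query))  # every value is a decoded string

print(domain)
print(path)
print(query_string)
```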
&lt;p&gt;After looking up what the &lt;em&gt;%27&lt;/em&gt; means (they're URL encoded &lt;strong&gt;'&lt;/strong&gt; apostrophes),
let's make our future selves happy by making a simple function to replace the
&lt;em&gt;%27&lt;/em&gt; (with a note) and turn our &lt;code&gt;query_string&lt;/code&gt; into this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# words are quoted with apostrophes&lt;/span&gt;
&lt;span class="n"&gt;quote&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;&amp;#39;&lt;/span&gt;&lt;span class="si"&gt;{}&lt;/span&gt;&lt;span class="s2"&gt;&amp;#39;&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;

&lt;span class="n"&gt;query_string&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;season&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2019&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;sort_order&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;quote&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;asc&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;sort_column&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;quote&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;whip&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;game_type&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;quote&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;R&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;sport_code&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;quote&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;mlb&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;recSP&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;&amp;#39;recPP&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
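&lt;p&gt;We can check that the apostrophes really turn back into &lt;em&gt;%27&lt;/em&gt; on the way out. The sketch below uses the standard library's &lt;code&gt;urlencode&lt;/code&gt; on a shortened copy of the parameters; I'm assuming requests applies equivalent percent-encoding to whatever we pass as &lt;code&gt;params&lt;/code&gt;.&lt;/p&gt;

```python
from urllib.parse import urlencode

# same helper as above: wrap a word in apostrophes
quote = "'{}'".format

params = {
    'season': 2019,
    'sort_order': quote('asc'),
    'game_type': quote('R'),
}

# urlencode percent-encodes each apostrophe as %27
encoded = urlencode(params)

print(encoded)
```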
&lt;p&gt;With our URL broken down, we can make a simple &lt;code&gt;for&lt;/code&gt; loop to go through
each year (1876 - 2019), making a request to the MLB to download the pitching
statistics for that season and save the results into a list creatively called
&lt;code&gt;data&lt;/code&gt;.&lt;/p&gt;
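&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;time&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;requests&lt;/span&gt;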

&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;  &lt;span class="c1"&gt;# we&amp;#39;ll save all the stats here&lt;/span&gt;

&lt;span class="c1"&gt;# go through each season and make a request&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;season&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1876&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2019&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# rule #2 don&amp;#39;t hammer the servers&lt;/span&gt;

    &lt;span class="c1"&gt;# update the query_string for the correct season&lt;/span&gt;
    &lt;span class="n"&gt;query_string&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;season&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;season&lt;/span&gt;

    &lt;span class="c1"&gt;# make the request to the MLB&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;domain&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;query_string&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# raise an error if the MLB gives&lt;/span&gt;
    &lt;span class="c1"&gt;# an invalid response back&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;raise_for_status&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# pull the list of stats out of the response&lt;/span&gt;
    &lt;span class="n"&gt;stats&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;
        &lt;span class="s1"&gt;&amp;#39;team_pitching_season_leader_master&amp;#39;&lt;/span&gt;
    &lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;queryResults&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;row&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="c1"&gt;# add the season to each row, and&lt;/span&gt;
    &lt;span class="c1"&gt;# append it to our &amp;#39;data&amp;#39; list&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;team&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;team&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;season&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;season&lt;/span&gt;
        &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;team&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
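&lt;p&gt;A loop of this length takes a couple of minutes, and an error halfway through means starting over. One way to soften that, in the spirit of rule #3, is to save each season to disk as soon as it arrives. The sketch below assumes a hypothetical &lt;code&gt;cached_fetch&lt;/code&gt; helper and uses a stand-in &lt;code&gt;fake_fetch&lt;/code&gt; instead of a real request so it runs offline; neither is part of requests.&lt;/p&gt;

```python
import json
import tempfile
from pathlib import Path


def cached_fetch(season, fetch, cache_dir):
    """Return one season's rows, calling `fetch` only if no saved copy exists.

    `fetch` is any function that takes a season and returns JSON-friendly data.
    """
    cache_file = Path(cache_dir) / f'pitching-{season}.json'
    if cache_file.exists():
        return json.loads(cache_file.read_text())

    rows = fetch(season)
    cache_file.write_text(json.dumps(rows))
    return rows


# stand-in for the real request, recording how often it gets called
calls = []


def fake_fetch(season):
    calls.append(season)
    return [{'team': 'AAA', 'season': season}]


with tempfile.TemporaryDirectory() as tmp:
    first = cached_fetch(2019, fake_fetch, tmp)
    second = cached_fetch(2019, fake_fetch, tmp)  # served from disk

print(first == second, calls)
```

&lt;p&gt;In the real scraper, &lt;code&gt;fetch&lt;/code&gt; would wrap the request and the JSON lookup from the loop above, and the cache directory would be a permanent folder rather than a temporary one.&lt;/p&gt;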
&lt;p&gt;And finally, after we've collected all the data from the MLB, we put the data
into a &lt;a class="reference external" href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html"&gt;pandas DataFrame&lt;/a&gt; to start the &amp;quot;data wrangling&amp;quot; phase of
our journey.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;pandas&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;as&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;pd&lt;/span&gt;
&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;For now though, &lt;a class="reference internal" href="#some-rules-before-we-start"&gt;remembering rule #3 above&lt;/a&gt;, we'll
save our data to disk and wait until we have more time to answer the question
&lt;em&gt;&amp;quot;Are strikeouts making baseball boring?&amp;quot;&lt;/em&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to_pickle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;mlb-team-pitching-statistics.pkl&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Good Job Everyone!&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
</content></entry><entry><title>You, Me &amp; Machines... Learning</title><link href="https://0x42.sh/you-me-machines-learning/" rel="alternate"/><published>2019-08-28T00:00:00+00:00</published><updated>2019-08-28T00:00:00+00:00</updated><author><name>bryan brattlof</name></author><id>tag:0x42.sh,2019-08-28:/you-me-machines-learning/</id><summary type="html">&lt;p&gt;In this essay, we’ll introduce the terms and how each component of a neural
network works together to tackle a classic computer vision problem: analyze
thousands of MNIST images of handwritten numbers and sort them (with 97.88%
accuracy) into the digits they represent.&lt;/p&gt;
&lt;img alt="a hand written number 6 on the left side, with an arrow in the middle pointing to a digitally rendered number 6 on the right." src="https://0x42.sh/you-me-machines-learning/handwritten-six-to-type.png" /&gt;
&lt;div class="section" id="the-problem"&gt;
&lt;h2&gt;The Problem&lt;/h2&gt;
&lt;p&gt;For us, our …&lt;/p&gt;&lt;/div&gt;</summary><content type="html">&lt;p&gt;In this essay, we’ll introduce the terms and how each component of a neural
network works together to tackle a classic computer vision problem: analyze
thousands of MNIST images of handwritten numbers and sort them (with 97.88%
accuracy) into the digits they represent.&lt;/p&gt;
&lt;img alt="a hand written number 6 on the left side, with an arrow in the middle pointing to a digitally rendered number 6 on the right." src="https://0x42.sh/you-me-machines-learning/handwritten-six-to-type.png" /&gt;
&lt;div class="section" id="the-problem"&gt;
&lt;h2&gt;The Problem&lt;/h2&gt;
&lt;p&gt;For us, our brains have no trouble looking at low quality (28 by 28 pixel)
images and deciphering their meaning. We’re essentially hardwired to find
patterns in everything we see.&lt;/p&gt;
&lt;img alt="a very low resolution, 28 pixels tall by 28 pixels wide, image of the number 6." src="https://0x42.sh/you-me-machines-learning/low-resolution-6.png" /&gt;
&lt;p&gt;But how do we program a computer to do the same thing? Assuming there is no
mathematical function we can use, and given that a “6” can come in so many
different shapes and forms, we can't rely on a specific part of an image to tell
us what the number is.&lt;/p&gt;
&lt;img alt="105 different, 15 wide by 7 tall, hand written numbers" src="https://0x42.sh/you-me-machines-learning/105-images-of-numbers.png" /&gt;
&lt;p&gt;This is where neural networks come to the rescue.&lt;/p&gt;
&lt;p&gt;The idea behind modern neural networks, and all machine learning applications,
is to analyze data and try to discover general patterns in that data so it can
make predictions about new data it hasn't seen before. &lt;a class="footnote-reference" href="#footnote-1" id="footnote-reference-1"&gt;[1]&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Meaning, we can use a neural network to let our computer discover the patterns
in our images, and then use that pattern to sort each image into the digit it's
depicting.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="the-plan"&gt;
&lt;h2&gt;The Plan&lt;/h2&gt;
&lt;p&gt;So to reach our goal of accurately sorting images into the digits they
represent, we’ll:&lt;/p&gt;
&lt;ol class="arabic simple"&gt;
&lt;li&gt;Download and prepare thousands of images of hand written numbers.&lt;/li&gt;
&lt;li&gt;Build a feed forward neural network to analyze the images and sort them.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Then after we've built and trained the neural network to sort images for us:&lt;/p&gt;
&lt;ol class="arabic simple" start="3"&gt;
&lt;li&gt;Test the neural network using another set of images it's never seen before to
grade how well it classifies images.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;First up is preparing our computer to “learn.”&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="preparing-the-environment"&gt;
&lt;h2&gt;Preparing The Environment&lt;/h2&gt;
&lt;p&gt;Before we get started, I should point out that I’ve published the Jupyter
notebooks I’ve used for writing this on a &lt;a class="reference external" href="https://gitlab.com/bryanbrattlof/mnist-digit-recognizer"&gt;GitLab repository&lt;/a&gt; and &lt;a class="reference external" href="https://colab.research.google.com/drive/10JZBJ4wRTiVabE_aMXNX59O7Y3_KWWc6"&gt;Google
Colab&lt;/a&gt;, so you can play with the code without installing anything on your
computer.&lt;/p&gt;
&lt;p&gt;If you want to set this up on your own computer, and you already have Python
installed, a simple pip command will install all the libraries we’ll need for
this project.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;$&lt;span class="w"&gt; &lt;/span&gt;pip&lt;span class="w"&gt; &lt;/span&gt;install&lt;span class="w"&gt; &lt;/span&gt;keras&lt;span class="w"&gt; &lt;/span&gt;tensorflow&lt;span class="w"&gt; &lt;/span&gt;numpy&lt;span class="w"&gt; &lt;/span&gt;mnist
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;We’ll use &lt;a class="reference external" href="https://www.tensorflow.org/"&gt;TensorFlow&lt;/a&gt; (a powerful deep learning
library published by Google) as a backend to &lt;a class="reference external" href="https://keras.io"&gt;Keras&lt;/a&gt;, a
library that dramatically simplifies the programming required to build a neural
network.&lt;/p&gt;
&lt;p&gt;As is usual for scientific and “big-math” Python projects, we’ll also need
&lt;a class="reference external" href="https://numpy.org/"&gt;NumPy&lt;/a&gt; to give us fast multidimensional array support in
Python.&lt;/p&gt;
&lt;p&gt;And finally we’ll utilize the &lt;a class="reference external" href="http://yann.lecun.com/exdb/mnist/"&gt;MNIST dataset&lt;/a&gt;, which is a subset of a large
database called the &lt;a class="reference external" href="https://www.nist.gov/srd/shop/special-database-catalog"&gt;NIST Special Database&lt;/a&gt;. The MNIST dataset, maintained by
&lt;a class="reference external" href="http://yann.lecun.com/"&gt;Yann LeCun&lt;/a&gt;, contains the 70,000 images of handwritten numbers we’ll use to
train and test our neural network.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="loading-exploring-the-data"&gt;
&lt;h2&gt;Loading &amp;amp; Exploring The Data&lt;/h2&gt;
&lt;p&gt;Now that we’ve built our working environment, we can download the images and
begin preparing them for our neural network. Thankfully the MNIST library makes
downloading thousands of images as simple as importing a module in a new Python
file.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;mnist&lt;/span&gt;

&lt;span class="n"&gt;training_images&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mnist&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;train_images&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;training_labels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mnist&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;train_labels&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;testing_images&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mnist&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;test_images&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;testing_labels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mnist&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;test_labels&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;After the images have downloaded (could take a while on slower connections) we
can see the MNIST library has helpfully put our images into a NumPy array for
us.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;training_images&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;  &lt;span class="c1"&gt;# &amp;lt;class &amp;#39;numpy.ndarray&amp;#39;&amp;gt;&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;training_images&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# (60000, 28, 28)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;training_images&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;min&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;  &lt;span class="c1"&gt;# 0&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;training_images&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;max&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;  &lt;span class="c1"&gt;# 255&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Printing out some statistics about our NumPy array tells us each image (60,000
images in the training set) is in a 28 by 28 pixel matrix, with each pixel
having a value between 0 and 255 (an 8-bit number) to tell our computers how
“on” that pixel should be. The higher the pixel’s value the more “on” that pixel
will be.&lt;/p&gt;
&lt;img alt="a low resolution, 28 pixels wide by 28 pixels tall, image with the value of each pixel superimposed on top." src="https://0x42.sh/you-me-machines-learning/number-6-with-pixel-values.png" /&gt;
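&lt;p&gt;To get a feel for how those numbers become a picture, here is a small sketch that draws a made-up 5 by 5 “image” as text, without NumPy. The pixel values and the four-character brightness ramp are my own choices for illustration, not something the MNIST library provides.&lt;/p&gt;

```python
# a made-up 5x5 "image" using the same 0-255 scale as MNIST
tiny_image = [
    [0,   0, 255, 255,   0],
    [0, 255,   0,   0,   0],
    [0, 255, 255, 128,   0],
    [0, 255,   0, 255,   0],
    [0,   0, 255, 128,   0],
]

# denser characters for higher values: index 0 is "off", index 3 is fully "on"
ramp = ' .+#'

# map each 0-255 value onto one of the four characters in the ramp
lines = [
    ''.join(ramp[value * len(ramp) // 256] for value in row)
    for row in tiny_image
]

print('\n'.join(lines))
```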
&lt;p&gt;And while this is great for our human eyes, this isn’t a great format for our
neural network. Meaning, like all true data scientists, we have to massage some
data.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="preparing-the-data-normalizing"&gt;
&lt;h2&gt;Preparing The Data (Normalizing)&lt;/h2&gt;
&lt;p&gt;An important step in preparing our images for our neural network is to
&amp;quot;&lt;a class="reference external" href="https://developers.google.com/machine-learning/data-prep/transform/normalization"&gt;normalize&lt;/a&gt;&amp;quot; them.&lt;/p&gt;
&lt;p&gt;This (usually) reduces the time it takes to train our neural network, and it
avoids the problems that arise when multiple datasets use different
ranges.&lt;/p&gt;
&lt;p&gt;For our images, we'll only need to scale their range from their current values
[0 - 255] into a more standard range [0 - 1]. We can do this using a min-max
feature scaling function:&lt;/p&gt;
&lt;img alt="x' = (x - x min) / (x max - x min)" src="https://0x42.sh/you-me-machines-learning/min-max-feature-scaling-function.png" /&gt;
&lt;p&gt;And, because our minimum value is 0, we can simplify the formula to just dividing
by our maximum value in our range (255).&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# ensure our numpy arrays use floating point decimals&lt;/span&gt;
&lt;span class="n"&gt;training_images&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;training_images&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;astype&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;float32&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;testing_images&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;testing_images&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;astype&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;float32&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# reduce range to [0 - 1] (normalize)&lt;/span&gt;
&lt;span class="n"&gt;training_images&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;training_images&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;training_images&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;max&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;testing_images&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;testing_images&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;testing_images&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;max&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
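&lt;p&gt;For ranges that don't start at 0, the full min-max formula is needed. Here is a minimal sketch of it, assuming NumPy is installed; the &lt;code&gt;min_max_scale&lt;/code&gt; helper and the tiny &lt;code&gt;sample_pixels&lt;/code&gt; array are made up for illustration and are not part of the project's code.&lt;/p&gt;

```python
import numpy as np


def min_max_scale(values):
    """Rescale an array into the range [0, 1] using min-max feature scaling."""
    values = np.asarray(values, dtype="float32")
    lowest, highest = values.min(), values.max()
    if highest == lowest:
        # a constant array has no range to scale, so avoid dividing by zero
        return np.zeros_like(values)
    return (values - lowest) / (highest - lowest)


# hypothetical 2 by 2 "image" standing in for real MNIST pixels
sample_pixels = np.array([[0, 51], [102, 255]], dtype="uint8")
scaled = min_max_scale(sample_pixels)

print(scaled.min(), scaled.max())
```

&lt;p&gt;Because the smallest value in &lt;code&gt;sample_pixels&lt;/code&gt; is 0, this gives the same result as simply dividing by 255.&lt;/p&gt;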
&lt;/div&gt;
&lt;div class="section" id="reshaping-our-data-flattening"&gt;
&lt;h2&gt;Reshaping our Data (Flattening)&lt;/h2&gt;
&lt;p&gt;Our neural network will expect each image to be a long, one-dimensional list of
pixels. This means we'll need to &amp;quot;flatten&amp;quot; each 28 by 28 image into a single row
of 784 pixels (28 × 28) before we can start training our neural network.&lt;/p&gt;
&lt;p&gt;To “flatten” our images, we simply need to call NumPy’s &lt;a class="reference external" href="https://numpy.org/doc/stable/reference/generated/numpy.reshape.html"&gt;reshape() method&lt;/a&gt; and
specify the dimensions we want our new matrix to have.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# flatten our images into 1 dimension&lt;/span&gt;
&lt;span class="n"&gt;training_images&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;training_images&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reshape&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;784&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;testing_images&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;testing_images&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reshape&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;784&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And just like that we’ve transformed our images from human readable to neural
network readable.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# stats of our normalized data&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;training_images&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# (60000, 784)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;training_images&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;max&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="c1"&gt;# 1&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;training_images&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;min&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="c1"&gt;# 0&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
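&lt;p&gt;The &lt;code&gt;-1&lt;/code&gt; passed to &lt;code&gt;reshape()&lt;/code&gt; asks NumPy to work out that dimension from the total number of values. A small sketch of the same idea, using a made-up &lt;code&gt;tiny_images&lt;/code&gt; array instead of the real dataset:&lt;/p&gt;

```python
import numpy as np

# hypothetical stand-in: 3 tiny "images", each 2 pixels by 2 pixels
tiny_images = np.arange(12).reshape((3, 2, 2))

# -1 lets NumPy infer the number of images from the array's total size
flat_images = tiny_images.reshape((-1, 4))

print(tiny_images.shape)  # (3, 2, 2)
print(flat_images.shape)  # (3, 4)
print(flat_images[0])     # [0 1 2 3]
```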
&lt;/div&gt;
&lt;div class="section" id="one-hot-encoding"&gt;
&lt;h2&gt;One-Hot Encoding&lt;/h2&gt;
&lt;p&gt;Now that we’ve finished preparing our images (questions) for our neural network,
we can begin working on the labels (answers).&lt;/p&gt;
&lt;p&gt;In order to train our neural network, we need to convert our labels from their
perfectly human-readable base-10 digit into a 10-item list format called
“one-hot” encoding.&lt;/p&gt;
&lt;img alt="the number 6 equals a 10 element array, all zero except the 6th element is equal to 1." src="https://0x42.sh/you-me-machines-learning/one-hot-6.png" /&gt;
&lt;p&gt;Without getting into too much detail, our neural network will have 10 output
neurons with each neuron representing the probability of an image representing
any digit in our 0-9 range.&lt;/p&gt;
&lt;p&gt;We need to transform our labels into the probabilities that those 10 output
neurons should output, effectively telling our neural network that we’re 100%
sure this image is a 6.&lt;/p&gt;
&lt;p&gt;To convert our labels into one-hot format, we can use another NumPy function
called &lt;a class="reference external" href="https://numpy.org/doc/stable/reference/generated/numpy.eye.html"&gt;eye()&lt;/a&gt;, which builds an identity matrix. Indexing that matrix with our labels picks out one row per label.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# convert labels to ‘one-hot’ encoding&lt;/span&gt;
&lt;span class="n"&gt;training_labels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;eye&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="n"&gt;training_labels&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;testing_labels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;eye&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="n"&gt;testing_labels&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
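&lt;p&gt;To see why indexing an identity matrix produces one-hot rows, here is a small sketch with three made-up labels (&lt;code&gt;sample_labels&lt;/code&gt; is only for illustration):&lt;/p&gt;

```python
import numpy as np

identity = np.eye(10)                # 10 by 10 matrix with ones on the diagonal
sample_labels = np.array([6, 0, 9])  # hypothetical labels

# row N of the identity matrix is the one-hot encoding of the digit N
one_hot = identity[sample_labels]

print(one_hot.shape)  # (3, 10)
print(one_hot[0])     # [0. 0. 0. 0. 0. 0. 1. 0. 0. 0.]

# argmax recovers the original digit from each one-hot row
print(one_hot.argmax(axis=1))  # [6 0 9]
```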
&lt;p&gt;And with that, we’ve finished preparing our data to be processed by our neural
network. Now’s the time to actually start building it.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="building-the-model"&gt;
&lt;h2&gt;Building The Model&lt;/h2&gt;
&lt;p&gt;The term “model” in machine learning has taken on a few different meanings from
its original definition a few years ago &lt;a class="footnote-reference" href="#footnote-2" id="footnote-reference-2"&gt;[2]&lt;/a&gt;. In Keras, a model is used to
describe the overall structure of our neural network.&lt;/p&gt;
&lt;p&gt;There are &lt;a class="reference external" href="https://keras.io/api/models/"&gt;two types of models&lt;/a&gt;
we can build with Keras:&lt;/p&gt;
&lt;ol class="arabic simple"&gt;
&lt;li&gt;We can define more complex networks using the &lt;a class="reference external" href="https://keras.io/api/models/model/"&gt;Model API&lt;/a&gt;, which is more
verbose to set up and could lead to errors if we’re not
careful.&lt;/li&gt;
&lt;/ol&gt;
&lt;ol class="arabic simple" start="2"&gt;
&lt;li&gt;Or we can use the &lt;a class="reference external" href="https://keras.io/api/models/sequential/"&gt;Sequential API&lt;/a&gt;, which can only define a simple linear
stack of layers, but makes up for this by removing a lot of the complexities
the Model API has.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The type of neural network we’ll be building is called a &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Feedforward_neural_network"&gt;feed-forward neural
network&lt;/a&gt;. It’s essentially a single stack of layers, where each layer of
neurons feeds their information forward to the next layer, making the Sequential
API a perfect fit for our project.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;keras.models&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Sequential&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Sequential&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p id="the-add-method"&gt;After we’ve initialized our Sequential model, we can use the &lt;code&gt;model.add()&lt;/code&gt;
method to add individual layers to it later on.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="adding-layers"&gt;
&lt;h2&gt;Adding Layers&lt;/h2&gt;
&lt;p&gt;As you’ve probably gathered, a neural network is built by using many nodes
called neurons.&lt;/p&gt;
&lt;img alt="a hand-drawn image of a circle, that I'll use to represent a neuron." src="https://0x42.sh/you-me-machines-learning/neuron.png" /&gt;
&lt;p id="anatomy-of-a-neuron"&gt;Each input to a neuron is multiplied by a weight (importance factor) before
being added together along with a bias. Then, to keep this number in a specific
range, the sum is fed into an activation function (we'll learn
about these later) to produce the neuron's output.&lt;/p&gt;
&lt;img alt="4 inputs are multiplied by their 4 weights. Then the sum of these inputs is adjusted by an activation function before becoming the neuron's sole output." src="https://0x42.sh/you-me-machines-learning/inside-the-neuron.png" /&gt;
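&lt;p&gt;As a rough sketch of that calculation for a single neuron with 4 inputs: the names &lt;code&gt;neuron_output&lt;/code&gt; and &lt;code&gt;example_activation&lt;/code&gt;, and all of the numbers, are invented for illustration. Keras does this work for us internally.&lt;/p&gt;

```python
import numpy as np


def example_activation(value):
    """One possible activation: negative sums become 0, others pass through."""
    return max(0.0, value)


def neuron_output(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    weighted_sum = float(np.dot(inputs, weights)) + bias
    return activation(weighted_sum)


# hypothetical values, chosen only to make the arithmetic easy to follow
inputs = [1.0, 2.0, 3.0, 4.0]
weights = [0.5, -0.25, 0.25, 0.0]
bias = 0.5

# (1.0 * 0.5) + (2.0 * -0.25) + (3.0 * 0.25) + (4.0 * 0.0) + 0.5
output = neuron_output(inputs, weights, bias, example_activation)
print(output)  # 1.25
```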
&lt;p&gt;In a feed-forward neural network, neurons are organized into layers, where each
neuron in a layer will only connect to the neurons from adjacent layers, forming
a sequential stack of layers.&lt;/p&gt;
&lt;img alt="3 input neurons are densely connected to 4 neurons in the first hidden layer, which are densely connected to the 4 neurons in the second hidden layer, which finally connect to 3 neurons in the final output layer." src="https://0x42.sh/you-me-machines-learning/neural-network-model.png" /&gt;
&lt;p&gt;Typically each neuron in a layer will connect to every neuron from the adjacent
layer, forming a fully interconnected (dense) layer of connections.&lt;/p&gt;
&lt;img alt="2 neurons on the left each have 2 arrows pointing to 2 neurons on the right, representing a fully interconnected (dense) layer." src="https://0x42.sh/you-me-machines-learning/dense-layer.png" /&gt;
&lt;p&gt;In Keras, we have &lt;a class="reference external" href="https://keras.io/api/layers/core_layers/"&gt;many types of layers we can use in our model&lt;/a&gt;. However, for this project, we’ll
only need to use the &lt;strong&gt;Dense&lt;/strong&gt; layer.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;keras.layers&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Dense&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;div class="section" id="layer-no-1"&gt;
&lt;h2&gt;Layer No.1&lt;/h2&gt;
&lt;p&gt;For the first layer in our model, we'll be connecting the 784 neurons in our
input layer (the pixels in our images) to the 512 neurons in our hidden layer.&lt;/p&gt;
&lt;img alt="the same neural network model, with the input layer and hidden layer highlighted to indicate we're focusing on this part of our neural network." src="https://0x42.sh/you-me-machines-learning/neural-network-build-part-1.png" /&gt;
&lt;p&gt;To add a dense layer to our model, we simply need to initialize it with some
options, and add it to our model’s stack using the &lt;code&gt;add()&lt;/code&gt; method &lt;a class="reference internal" href="#the-add-method"&gt;we
talked about above&lt;/a&gt;.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Dense&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;512&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;activation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;relu&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;input_shape&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;784&lt;/span&gt;&lt;span class="p"&gt;,)))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;The first argument, called &lt;strong&gt;units&lt;/strong&gt; in the documentation, describes the
number of neurons our hidden layer will have. 512 is a somewhat arbitrary
decision I landed on while testing (playing around).&lt;/li&gt;
&lt;/ul&gt;
&lt;ul class="simple" id="first-layer"&gt;
&lt;li&gt;Because this is the first layer in our stack, we need to tell Keras the
&lt;strong&gt;input_shape&lt;/strong&gt; our images will be in when we start training our neural
network.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;activation&lt;/strong&gt; function, &lt;a class="reference internal" href="#anatomy-of-a-neuron"&gt;like we discussed&lt;/a&gt;, is
how the neurons in our hidden layer will &amp;quot;squash&amp;quot; their outputs before
outputting their results.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We’ll be using the &lt;strong&gt;Re&lt;/strong&gt;ctified &lt;strong&gt;L&lt;/strong&gt;inear &lt;strong&gt;U&lt;/strong&gt;nits activation function
(&lt;a class="reference external" href="https://keras.io/activations/#relu"&gt;ReLU&lt;/a&gt;) which leaves any positive number unchanged but transforms any
negative number into a 0 (no learning).&lt;/p&gt;
&lt;img alt="the ReLU activation function F(x) = max(0, x)" src="https://0x42.sh/you-me-machines-learning/relu-activation-function.png" /&gt;
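&lt;p&gt;A minimal NumPy sketch of the same function, applied to a few made-up values (the &lt;code&gt;relu&lt;/code&gt; helper here is only for illustration; Keras ships its own implementation):&lt;/p&gt;

```python
import numpy as np


def relu(values):
    """Element-wise max(0, x): negatives become 0, positives are unchanged."""
    return np.maximum(0, values)


sample_sums = np.array([-2.0, -0.5, 0.0, 1.5])  # hypothetical weighted sums
print(relu(sample_sums).tolist())  # [0.0, 0.0, 0.0, 1.5]
```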
&lt;/div&gt;
&lt;div class="section" id="layer-no-2"&gt;
&lt;h2&gt;Layer No.2&lt;/h2&gt;
&lt;p&gt;Now that we’ve set up our first layer, we’ll need to build the connections to
our output layer, allowing us to get the predictions out of our neural network.&lt;/p&gt;
&lt;div class="admonition note"&gt;
&lt;p class="first admonition-title"&gt;Note&lt;/p&gt;
&lt;p class="last"&gt;We can add more hidden layers here, but I found that having more than
one did nothing to improve the accuracy of our neural network and
slowed training considerably. Feel free to play around with
my Jupyter notebook to see what you find.&lt;/p&gt;
&lt;/div&gt;
&lt;img alt="the same neural network model, with the output layer and hidden layer highlighted to indicate we're focusing on this part of our neural network." src="https://0x42.sh/you-me-machines-learning/neural-network-build-part-2.png" /&gt;
&lt;p&gt;Our last Dense layer will be the output of our neural network. This layer will
need 10 output neurons, one for each digit our image could be.&lt;/p&gt;
&lt;p&gt;For the activation function, we'll use the &lt;a class="reference external" href="https://keras.io/api/layers/activations/"&gt;softmax function&lt;/a&gt;, which allows us
to calculate the activation of the 10 output neurons relative to each other
(e.g., &amp;quot;I'm 20% sure this is a 4&amp;quot;).&lt;/p&gt;
&lt;img alt="the softmax activation function: each item in a matrix is divided by the sum of the matrix to get a percentage of the total matrix." src="https://0x42.sh/you-me-machines-learning/softmax-activation-function.png" /&gt;
&lt;p&gt;Also, because this is the second layer we're adding to our model, we can
omit the &lt;em&gt;input_shape&lt;/em&gt; argument we added in &lt;a class="reference internal" href="#first-layer"&gt;the first layer we built&lt;/a&gt;.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Dense&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;activation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;softmax&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And with that, thanks to the simplicity of Keras, we’ve finished building the
structure of our neural network.&lt;/p&gt;
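&lt;p&gt;To make the softmax activation on the output layer a little more concrete, here is a small NumPy sketch of the standard definition, which exponentiates each value before dividing by the sum. The &lt;code&gt;softmax&lt;/code&gt; helper and the three sample scores are assumptions for illustration, not the code Keras runs.&lt;/p&gt;

```python
import numpy as np


def softmax(values):
    """Turn raw scores into positive numbers that add up to 1."""
    values = np.asarray(values, dtype="float64")
    # subtracting the max leaves the result unchanged but avoids overflow
    exponentials = np.exp(values - values.max())
    return exponentials / exponentials.sum()


# hypothetical raw scores from 3 output neurons
probabilities = softmax([2.0, 1.0, 0.1])

print(probabilities)                # largest score gets the largest share
print(int(probabilities.argmax()))  # 0
```

&lt;p&gt;Because the outputs always add up to 1, each one can be read as how sure the network is about that digit relative to the others.&lt;/p&gt;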
&lt;/div&gt;
&lt;div class="section" id="compiling-the-model"&gt;
&lt;h2&gt;Compiling The Model&lt;/h2&gt;
&lt;p&gt;Now that we're done building the structure of our neural network, we can work on
the methods and algorithms it will use to “learn” as we compile our neural
network.&lt;/p&gt;
&lt;p&gt;In Keras, we can simply call the &lt;code&gt;model.compile()&lt;/code&gt; method with a couple of
options to prepare our neural network for service.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;categorical_crossentropy&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;adam&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;metrics&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;accuracy&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;The &lt;strong&gt;loss&lt;/strong&gt; argument is telling Keras what type of loss function we want to
use. A loss function is essentially a function used to measure how “wrong” our
neural network is during training.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;While Keras supports &lt;a class="reference external" href="https://keras.io/api/losses/"&gt;many types of loss functions&lt;/a&gt;,
typically a loss function is chosen for us based on the decisions we've made
earlier. For example, our neural network is classifying images, and we've
encoded our labels into a &amp;quot;one-hot&amp;quot; format, meaning we need to use a
&lt;strong&gt;categorical crossentropy&lt;/strong&gt; loss function.&lt;/p&gt;
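&lt;p&gt;As a rough sketch of what this loss measures for a single image, assuming the textbook definition (the negative log of the probability given to the correct digit). The helper name and the two sample predictions are made up, and Keras' own implementation handles batches and numerical details differently.&lt;/p&gt;

```python
import numpy as np


def categorical_crossentropy(one_hot_label, predicted_probabilities):
    """Negative log of the probability assigned to the correct class."""
    # clip so we never take the log of exactly 0
    predicted = np.clip(predicted_probabilities, 1e-7, 1.0)
    return float(-np.sum(one_hot_label * np.log(predicted)))


label = np.eye(10)[6]  # one-hot label for the digit 6

# hypothetical prediction that is 91% sure the image is a 6
confident = np.full(10, 0.01)
confident[6] = 0.91

# hypothetical prediction that spreads its guess evenly over all 10 digits
unsure = np.full(10, 0.1)

confident_loss = categorical_crossentropy(label, confident)
unsure_loss = categorical_crossentropy(label, unsure)
print(confident_loss, unsure_loss)
```

&lt;p&gt;The more probability the network puts on the correct digit, the closer the loss gets to 0.&lt;/p&gt;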
&lt;ul class="simple"&gt;
&lt;li&gt;The &lt;strong&gt;optimizer&lt;/strong&gt; is a function used to adjust the weights and biases of our
neurons in order to minimize the value of our loss function.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Much like with the loss function, Keras supports &lt;a class="reference external" href="https://keras.io/api/optimizers/"&gt;many types of optimizer
functions&lt;/a&gt;. For our project, we’ll be using
the ‘adam’ optimizer function, mainly because it’s usually a great optimizer.&lt;/p&gt;
&lt;ul class="simple" id="metric-argument"&gt;
&lt;li&gt;The &lt;strong&gt;metrics&lt;/strong&gt; argument tells Keras how we wish to evaluate how well our
neural network is doing. Usually this will be &lt;code&gt;metrics=[&amp;#39;accuracy&amp;#39;]&lt;/code&gt; but
for more complex networks with multiple output layers this can be different.&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div class="section" id="how-to-train-your-model"&gt;
&lt;h2&gt;How To Train Your Model&lt;/h2&gt;
&lt;p&gt;Now that we have built the structure and defined how our neural network will
“learn”, we can finally start the training process.&lt;/p&gt;
&lt;p&gt;Training our neural network is easily done by calling the &lt;code&gt;fit()&lt;/code&gt; method
as shown below, along with some extra arguments.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;training_images&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# normalized images&lt;/span&gt;
    &lt;span class="n"&gt;training_labels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# one-hot labels&lt;/span&gt;
    &lt;span class="n"&gt;epochs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;64&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The first 2 arguments are for our normalized and flattened images along with the
one-hot encoded labels we’ll be using to train our neural network.&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;The &lt;strong&gt;epochs&lt;/strong&gt; argument tells Keras the number of times we will go through the
entire set of images. Think of this as how many times you want to take the
test.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;batch_size&lt;/strong&gt; is the number of images our neural network should process
before updating each neuron with the results from our optimizer function.&lt;/li&gt;
&lt;/ul&gt;
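&lt;p&gt;To get a feel for how these two arguments interact, here is a quick back-of-the-envelope calculation. It assumes the last, smaller batch of each epoch is still processed, and the variable names are only for illustration.&lt;/p&gt;

```python
import math

training_image_count = 60000
batch_size = 64
epochs = 5

# one weight update per batch; the final batch of an epoch may be smaller
updates_per_epoch = math.ceil(training_image_count / batch_size)
total_updates = updates_per_epoch * epochs

print(updates_per_epoch)  # 938
print(total_updates)      # 4690
```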
&lt;div class="admonition note"&gt;
&lt;p class="first admonition-title"&gt;Note&lt;/p&gt;
&lt;p class="last"&gt;Increasing the batch size is usually an acceptable tradeoff as it
dramatically speeds up the learning process, with a limited effect on
overall accuracy. &lt;a class="footnote-reference" href="#footnote-3" id="footnote-reference-3"&gt;[3]&lt;/a&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Running the &lt;code&gt;model.fit()&lt;/code&gt; above will give us an output like this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Epoch 1/5
60000/60000 [======] - 7s 118us/step - loss: 0.2309 - acc: 0.9330
Epoch 2/5
60000/60000 [======] - 7s 113us/step - loss: 0.0904 - acc: 0.9734
Epoch 3/5
60000/60000 [======] - 7s 114us/step - loss: 0.0588 - acc: 0.9824
Epoch 4/5
60000/60000 [======] - 7s 113us/step - loss: 0.0405 - acc: 0.9871
Epoch 5/5
60000/60000 [======] - 7s 114us/step - loss: 0.0302 - acc: 0.9908
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Because we gave the &lt;code&gt;metrics=[&amp;#39;accuracy&amp;#39;]&lt;/code&gt; argument when we compiled the
neural network, &lt;code&gt;acc:&lt;/code&gt; is printed in this output showing us how accurate
our neural network is after each epoch.&lt;/p&gt;
&lt;p&gt;Looking at the last epoch, we can see our neural network achieved a &lt;strong&gt;99.08%
accuracy&lt;/strong&gt; with our training data. Whoop!!!&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="but-wait-there-s-more"&gt;
&lt;h2&gt;But Wait! There's More!&lt;/h2&gt;
&lt;p&gt;Before we celebrate too hard though, keep in mind this score is from a test our
neural network has already taken 5 times. Like acing a test we’ve already
taken, we can’t be sure if we just memorized the answer to the question, or
really understood the material.&lt;/p&gt;
&lt;p&gt;When our neural network is better at remembering answers than classifying images,
it’s “overfitted,” &lt;a class="footnote-reference" href="#footnote-4" id="footnote-reference-4"&gt;[4]&lt;/a&gt; meaning it started using “patterns” in our data that
don’t help it classify handwritten numbers in general, only the specific
handwritten numbers we gave it.&lt;/p&gt;
&lt;p&gt;To determine if our neural network is overfitting, we’ll use the extra 10,000
images we prepared but we haven't given to our neural network yet.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="the-ultimate-test"&gt;
&lt;h2&gt;The Ultimate Test&lt;/h2&gt;
&lt;p&gt;This time, instead of calling &lt;code&gt;fit()&lt;/code&gt; to train our neural network, we'll
call the &lt;code&gt;evaluate()&lt;/code&gt; method to evaluate our neural network’s accuracy,
supplying the testing images and labels that our neural network
hasn't seen before.&lt;/p&gt;
&lt;p&gt;This way, we can get an accuracy score on how well our neural network can sort
any image of a handwritten number, instead of just our dataset of handwritten
numbers.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;testing_images&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;testing_labels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Running this code will give us the following output:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.07108291857463774&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.9788&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;The first item in the array is the loss function value.&lt;/li&gt;
&lt;li&gt;The second item is our accuracy metric we asked for when we &lt;a class="reference internal" href="#metric-argument"&gt;compiled the
neural network above&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Meaning &lt;strong&gt;we’ve achieved a 97.88% accuracy score on fresh data!&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Not bad for our first neural network.&lt;/p&gt;
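&lt;p&gt;For the curious, an accuracy score like this boils down to checking how often the most confident output neuron matches the label. A small sketch with made-up predictions for 4 images and only 3 classes (none of these numbers come from our trained network):&lt;/p&gt;

```python
import numpy as np

# hypothetical softmax outputs for 4 images over 3 classes
predictions = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
])

# hypothetical one-hot labels for the same 4 images
labels = np.eye(3)[[0, 1, 2, 1]]

# a prediction counts as correct when its largest output matches the label
predicted_classes = predictions.argmax(axis=1)
actual_classes = labels.argmax(axis=1)
accuracy = float(np.mean(predicted_classes == actual_classes))

print(accuracy)  # 0.75
```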
&lt;/div&gt;
&lt;div class="section" id="more-reading"&gt;
&lt;h2&gt;More Reading:&lt;/h2&gt;
&lt;table class="docutils footnote" frame="void" id="footnote-1" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#footnote-reference-1"&gt;[1]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;Approximation by Superpositions of a Sigmoidal Function - &lt;a class="reference external" href="http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.441.7873&amp;amp;rep=rep1&amp;amp;type=pdf"&gt;Cybenko, G.&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table class="docutils footnote" frame="void" id="footnote-2" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#footnote-reference-2"&gt;[2]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;Definition of &amp;quot;model&amp;quot; in machine learning - &lt;a class="reference external" href="https://datascience.stackexchange.com/questions/12909/definition-of-a-model-in-machine-learning/12911"&gt;StackExchange&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table class="docutils footnote" frame="void" id="footnote-3" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#footnote-reference-3"&gt;[3]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;On Large-Batch Training for Deep Learning - &lt;a class="reference external" href="https://arxiv.org/abs/1609.04836"&gt;Cornell University&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table class="docutils footnote" frame="void" id="footnote-4" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#footnote-reference-4"&gt;[4]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;The Definition of &amp;quot;overfitting&amp;quot; - &lt;a class="reference external" href="https://www.lexico.com/en/definition/overfitting"&gt;Oxford Dictionaries&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
</content></entry></feed>