I take an interest in what the UK’s Office for National Statistics puts out, especially around employment and the economy. I’m also learning Jupyter and the Python DS tools, so I’ve taken one of their data series and tidied it up to use in Pandas.

# Category: Learning Notes

Notes to or for myself written while learning something.

# MySQL 8.0 on GitLab CI/CD

I had a few problems getting a MySQL instance up and running in GitLab CI/CD, so here are the errors I found and the steps I took.

# Crawl test page

This page only exists to let me test a web crawler I’m playing with. Please ignore it!

Some links, which you shouldn’t bother following:

# Rendering the Mandelbrot Set

Code on GitHub.

Per Wikipedia, the Mandelbrot set is the set of complex numbers *c* for which a particular function does not diverge when iterated starting from *z* = 0. The particular function is: $f_c(z) = z^2 + c$

The important facts are that the space is the complex plane, and the colour-coding normally seen corresponds to the number of iterations required for the iterated value *z* to be seen to diverge (reach an absolute value of 2 or more, which the code tests as a squared magnitude of at least 4). My code for checking and colouring was:

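    # ITERATION_LIMIT is a module-level constant defined elsewhere in the script.
    def iter_score(re, im):
        # x and y are the real and imaginary parts of z, starting at the origin
        x = 0
        y = 0
        iters = -1
        # x * x + y * y < 4 is the same test as |z| < 2
        while iters < ITERATION_LIMIT and x * x + y * y < 4:
            next_x = x * x - y * y + re
            y = 2 * x * y + im
            x = next_x
            iters += 1
        return iters


    def score_colour_map(iter_score):
        # Use the score directly as a grey level: equal R, G and B values
        grey_val = iter_score
        return [grey_val, grey_val, grey_val]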
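For comparison, here is a minimal sketch of the same escape-time idea using Python's built-in complex type. It is not the code from the repository: `escape_count`, `grey_rows`, the value 255 for `ITERATION_LIMIT`, and the plotted region (real part from -2 to 1, imaginary part from -1 to 1) are all assumptions made for illustration, and the count here is the number of iterations performed, so it is one higher than the score returned above.

```python
ITERATION_LIMIT = 255  # assumed value, chosen so a count fits an 8-bit grey level


def escape_count(c, limit=ITERATION_LIMIT):
    """Count applications of z -> z*z + c made before |z| reaches 2, capped at limit."""
    z = 0j
    count = 0
    while count < limit and abs(z) < 2:
        z = z * z + c
        count += 1
    return count


def grey_rows(width, height):
    """Return height rows of width [r, g, b] triples covering the assumed region."""
    rows = []
    for j in range(height):
        im = 1.0 - 2.0 * j / (height - 1)  # top row is im = 1, bottom row is im = -1
        row = []
        for i in range(width):
            re = -2.0 + 3.0 * i / (width - 1)  # left column is re = -2, right is re = 1
            grey = escape_count(complex(re, im))
            row.append([grey, grey, grey])
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in grey_rows(4, 3):
        print(row)
```

Points that never escape (such as c = 0 or c = -1) come out at the limit, so with this mapping the inside of the set is white and fast-escaping points are nearly black.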

Could have been more inventive on the colouring, or at least switched the rendering to greyscale. As it is, it’s RGB, but only giving greys:

Main Python note was the issue around getting PyPNG installed so I could “import png”. Perhaps not surprisingly, it matters which environment is active when you pip-install something. Presumably the environment includes an indicator of which packages are installed (perhaps a directory name?). Anyway, installing a package with the conda environment active doesn’t help when you’re trying to use the package in PyCharm with its own venv! Lesson learned.
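One way to check which environment a script is really running in is to ask the interpreter itself. The sketch below uses only the standard library; `describe_environment` is a hypothetical helper name, and `png` is the module that PyPNG provides.

```python
import importlib.util
import sys


def describe_environment(package_name):
    """Report the running interpreter and whether it can see a top-level package."""
    spec = importlib.util.find_spec(package_name)
    return {
        "interpreter": sys.executable,  # path of the Python binary in use
        "prefix": sys.prefix,           # root directory of the active environment
        "package_found": spec is not None,
        "package_location": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    # Run this inside PyCharm and inside the conda shell and compare the output.
    print(describe_environment("png"))
```

If the two runs report different interpreters, the package needs installing into the one the script uses, for example by running `python -m pip install pypng` with that same interpreter.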

# Applied Data Science with Python and Jupyter, Alex Galea, 2018 – Chapter 1

Basic system for Jupyter is a web front end to little pockets of code that execute on the backend; setup means getting the server running.

Assume this means that each notebook has its own kernel running on the server? Or not *running*, something more like a session.

Notebooks can be saved out as Python or HTML files.

DataFrames: created by Pandas constructor. Have a describe() method that gives summaries of individual variables. corr() gives a correlation matrix.
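A small sketch of those three calls on made-up data (the column names and values are invented for illustration, not taken from the book):

```python
import pandas as pd

# Toy data: y rises exactly with x, z falls exactly with x
df = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0],
    "y": [2.0, 4.0, 6.0, 8.0],
    "z": [4.0, 3.0, 2.0, 1.0],
})

summary = df.describe()  # count, mean, std, min, quartiles and max per column
corr = df.corr()         # pairwise Pearson correlations between columns

print(summary)
print(corr)
```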

Seaborn pairplot: exactly what I wanted when working on reports at VCS – pairwise plots of variables against each other.

ndarray.reshape: changes the dimensions of an array without changing its data; a value of -1 for one dimension means that its size is inferred from the length of the array and the other dimensions.
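A quick sketch of the -1 behaviour, using a made-up six-element array:

```python
import numpy as np

a = np.arange(6)           # array([0, 1, 2, 3, 4, 5])

column = a.reshape(-1, 1)  # one column; the row count is inferred
grid = a.reshape(2, -1)    # two rows; the column count is inferred

print(column.shape)
print(grid.shape)
```

The single-column form is the shape scikit-learn expects when there is only one feature.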

sklearn.preprocessing.PolynomialFeatures: returns an object capable of transforming arrays or data frames of features. *E.g.* with a degree of 2 and one-dimensional input, the output contains a row for each input value, holding that value to the powers zero, one and two (*i.e.* the number one, the input value, and the input value squared).

sklearn.linear_model.LinearRegression: gives an object which can perform linear regression (multi-linear in the example)
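The two classes fit together roughly as in this sketch. The data is invented (y = 1 + 2x + 3x² exactly), and `fit_intercept=False` is a choice made here because the transformed features already include the constant column.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# One feature, four samples, as a single column
x = np.array([0.0, 1.0, 2.0, 3.0]).reshape(-1, 1)
y = 1.0 + 2.0 * x.ravel() + 3.0 * x.ravel() ** 2

# Each row becomes [1, x, x^2]
poly = PolynomialFeatures(degree=2)
x_poly = poly.fit_transform(x)

# Linear regression on the expanded features recovers the polynomial coefficients
model = LinearRegression(fit_intercept=False)
model.fit(x_poly, y)

print(x_poly)
print(model.coef_)
```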

There’s a bug in the last section, about categorical features. The cell that starts, “# Color-segmented pair plot” contains this:

    sns.pairplot(df[cols], hue='AGE_category',
                 hue_order=['Relatively New', 'Relatively Old',
                            'Very Old'], plot_kws={'alpha': 0.5},
                 diag_kws={'bins': 30})

But this throws an AttributeError: 'Line2D' has no property 'bins'. Removing the parameter diag_kws={'bins': 30} leaves the call running properly.

# Jupyter Experiment

I know all the cool kids have been using it for years. I thought I should give it a try. Currently working through *Applied Data Science with Python and Jupyter*, but this is not from the book.

The desired output of this is the line graph showing the relationship between average weekly deaths in England and Wales, and weekly deaths this year. This is it: