Academic Data

Available data

I have done a fair bit of web scraping to get data into usable formats.

From my GitHub repos
Car Scraper ZA

An automated scraper that collects vehicle adverts from Gumtree in South Africa every day and stores them in this GitHub repo. Have a look at my Shiny app to see the kind of data collected.

Bicycle Advert Scraper

Similar to the car scraper, this repo automatically scrapes adverts for bicycles from Bikehub and Gumtree.

Swedish Job adverts

This repo contains a one percent sample of the enriched jobtech data from jobtechdev.se. They provide this sample, drawn from all jobs advertised in Sweden from 2016 to 2022Q2, to help you get to grips with the data. I simply converted it from JSON to Excel format.
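As a rough illustration of this kind of JSON-to-spreadsheet conversion (not the code used in the repo), the sketch below flattens nested records into one row each. It assumes the input is one JSON object per line, which may not match the actual jobtech file layout, and it writes CSV, which Excel opens, so that it needs only the standard library. The field names and records are hypothetical.

```python
import csv
import io
import json


def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into a single-level dict with dotted keys."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        elif isinstance(value, list):
            # Join lists into one cell so each advert stays on one row.
            flat[full_key] = "; ".join(str(item) for item in value)
        else:
            flat[full_key] = value
    return flat


def json_lines_to_csv(json_lines, out_file):
    """Write one CSV row per JSON record; columns are the union of all keys."""
    rows = [flatten(json.loads(line)) for line in json_lines if line.strip()]
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    # restval fills cells for keys that a given record does not have.
    writer = csv.DictWriter(out_file, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return columns


# Hypothetical records, invented for illustration only.
sample = [
    '{"id": "1", "headline": "Example ad", "employer": {"name": "Example AB"}}',
    '{"id": "2", "headline": "Another ad", "skills": ["R", "SQL"]}',
]
buffer = io.StringIO()
columns = json_lines_to_csv(sample, buffer)
csv_text = buffer.getvalue()
```

Taking the union of keys across records keeps adverts with missing fields aligned under the same columns, which matters when the source schema varies between years.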

Swedish Agriculture and Livestock

This repo is for the digitization of the SCB reports on Agriculture and Animal Husbandry. The reports span from 1865 to 1911. You can have a look at my Shiny app to get a better understanding.

Swedish patent data

I have scraped Google Patents for 10,000 patents that were registered in Sweden, and collected PDF data from The Swedish Patent Registration Authority for further analysis.

Hot Jobs in Sweden from LinkedIn

This repo has a set of data from LinkedIn's collaboration with the World Bank on talent migration. I've written a short report on the data here.

Academic data processing

I have been happy to help a number of my colleagues with scraping data from public websites or processing geographic data. I link these GitHub repositories here, in case you’re looking for inspiration on how to structure a scraping project, or perhaps want to access data on soil suitability or Europe’s urban populations.

Scraping a genealogy wiki of Sweden’s noble families

Here I scraped and structured a database of more than 120,000 individuals belonging to various branches of Sweden’s nobility.

Scraping tax registers from the Stockholm City Archive

Here I scraped an index of records from 1800 to 1880 for residents of Sweden’s most populous city, Stockholm.

Soil suitability calculation

In this project, I helped acquire data on soil suitability for wheat cultivation across Europe, and aggregate the raster spatial data to NUTS 2 and NUTS 3 regions for use as control variables in a study of persistence.
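The idea of aggregating raster cells to regions can be sketched as a zonal mean. This is a minimal toy version, not the method used in the project: it assumes the region boundaries have already been rasterised onto the same grid as the suitability values, and it takes an unweighted mean that ignores differences in cell area. The grid values, region codes, and nodata value are hypothetical.

```python
from collections import defaultdict


def zonal_mean(values, zones, nodata=None):
    """Average raster cell values within each zone.

    `values` and `zones` are equally sized 2D lists. `zones` holds a
    region code per cell, or None for cells outside every region.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for value_row, zone_row in zip(values, zones):
        for value, zone in zip(value_row, zone_row):
            # Skip cells outside all regions and cells with no data.
            if zone is None or value == nodata:
                continue
            totals[zone] += value
            counts[zone] += 1
    return {zone: totals[zone] / counts[zone] for zone in totals}


# Hypothetical 2x3 suitability grid; -1 marks missing data.
suitability = [
    [10, 20, -1],
    [30, 40, 50],
]
# Hypothetical region code for each cell of the same grid.
regions = [
    ["REGION_A", "REGION_A", "REGION_B"],
    ["REGION_A", None, "REGION_B"],
]
means = zonal_mean(suitability, regions, nodata=-1)
```

In practice a GIS library would handle reprojection, partial cell overlap, and area weighting, none of which this sketch attempts.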

European population aggregations

In this project, I helped aggregate the population from cities across Europe to NUTS 2 and NUTS 3 regions, to show the growth of urban populations across time.
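A city-to-region aggregation of this kind amounts to a grouped sum over a lookup table. The sketch below shows one way it might look, assuming each city has already been matched to a single region code. The city names, region codes, years, and population figures are hypothetical placeholders, not values from the project.

```python
from collections import defaultdict


def aggregate_population(city_rows, city_to_region):
    """Sum city populations by (region, year).

    `city_rows` is an iterable of (city, year, population) tuples and
    `city_to_region` maps each city to its region code.
    """
    totals = defaultdict(int)
    for city, year, population in city_rows:
        region = city_to_region.get(city)
        if region is None:
            # Cities without a region match are left out of the totals.
            continue
        totals[(region, year)] += population
    return dict(totals)


# Hypothetical lookup and observations, invented for illustration only.
city_to_region = {"city_1": "REGION_A", "city_2": "REGION_A", "city_3": "REGION_B"}
city_rows = [
    ("city_1", 1800, 1000),
    ("city_2", 1800, 500),
    ("city_3", 1800, 200),
    ("city_1", 1850, 1500),
    ("city_4", 1850, 900),  # no region match, so it is skipped
]
urban_population = aggregate_population(city_rows, city_to_region)
```

Keying the totals by both region and year keeps each period separate, so growth over time can be read off by comparing years within a region.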