Available Data
From my GitHub repos

Title and link | Description |
---|---|
Car Scraper ZA | An automated scraper that collects vehicle adverts from Gumtree in South Africa every day and stores them in this GitHub repo. Have a look at my Shiny app to see the kind of data collected. |
Bicycle Advert Scraper | |
Swedish Job adverts | This repo contains the enriched jobtech data sample from jobtechdev.se: a one percent sample of all jobs advertised in Sweden from 2016 to Q2 2022, provided to help you get to grips with the data. I simply put it into Excel format from |
Swedish Agriculture and Livestock | This repo is for the digitization of the SCB reports on Agriculture and Animal Husbandry, which span 1865 to 1911. You can have a look at my Shiny app to get a better understanding of the data. |
Swedish patent data | I have scraped Google Patents for 10,000 patents that were registered in Sweden, and collected PDF data from the Swedish Patent Registration Authority for further analysis. |
Hot Jobs in Sweden from LinkedIn | This repo has a set of data from LinkedIn's collaboration with the World Bank on talent migration. I've written a short report on the data here. |
Academic Data
Available data
I have done a fair bit of web scraping to get data into usable formats.
Academic data processing
I have been happy to help a number of my colleagues with scraping data from public websites or processing geographic data. I link these GitHub repositories here, in case you’re looking for inspiration on how to structure a scraping project, or perhaps want to access data on soil suitability or Europe’s urban populations.
Scraping a genealogy wiki of Sweden’s noble families
Here I scraped and structured a database of more than 120,000 individuals belonging to various branches of Sweden’s nobility.
Scraping tax registers from the Stockholm City Archive
Here I scraped an index of records from 1800 to 1880 for residents of Sweden’s most populous city, Stockholm.
Soil suitability calculation
In this project, I helped acquire data on soil suitability for wheat cultivation across Europe, and aggregate the raster spatial data to NUTS 2 and NUTS 3 regions for use as control variables in a study of persistence.
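To give a feel for what aggregating raster data to regions involves, here is a minimal sketch of a zonal mean in plain Python. It is not the code from the project, which would normally rely on dedicated GIS tooling. The function name `zonal_mean`, the toy grid values, and the region ids `R1` to `R3` are all made up for illustration.

```python
from collections import defaultdict


def zonal_mean(values, zones):
    """Average raster cell values within each zone.

    `values` and `zones` are equally shaped 2D lists. Each cell in `zones`
    holds the id of the region that the matching cell in `values` falls in.
    Cells with a missing value or a missing zone (None) are skipped.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value_row, zone_row in zip(values, zones):
        for value, zone in zip(value_row, zone_row):
            if value is None or zone is None:
                continue
            sums[zone] += value
            counts[zone] += 1
    return {zone: sums[zone] / counts[zone] for zone in sums}


# Hypothetical 3x3 suitability raster (None marks a cell with no data).
suitability = [
    [0.2, 0.4, None],
    [0.6, 0.8, 1.0],
    [0.0, 0.5, 0.5],
]

# Hypothetical region id per cell (None marks a cell outside every region).
regions = [
    ["R1", "R1", "R2"],
    ["R1", "R2", "R2"],
    ["R3", "R3", None],
]

means = zonal_mean(suitability, regions)
for region, mean in sorted(means.items()):
    print(region, round(mean, 3))
```

The result is one average suitability value per region, which is the shape of data you need when the raster is used as a region-level control variable.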
European population aggregations
In this project, I helped aggregate the population from cities across Europe to NUTS 2 and NUTS 3 regions, to show the growth of urban populations over time.
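The aggregation step described above can be sketched as a simple group-and-sum. This is only an illustration under assumed inputs, not the project's actual code: the function name `aggregate_population`, the city names, the region ids, and the population figures are all hypothetical.

```python
from collections import defaultdict


def aggregate_population(city_records, city_to_region):
    """Sum city populations into (region, year) totals.

    `city_records` is an iterable of (city, year, population) tuples and
    `city_to_region` maps each city to a region id. Cities without a
    region mapping are left out of the totals.
    """
    totals = defaultdict(int)
    for city, year, population in city_records:
        region = city_to_region.get(city)
        if region is None:
            continue  # city could not be matched to a region
        totals[(region, year)] += population
    return dict(totals)


# Made-up records for illustration only.
records = [
    ("city_a", 1800, 10_000),
    ("city_b", 1800, 5_000),
    ("city_a", 1850, 18_000),
    ("city_b", 1850, 7_000),
    ("city_c", 1850, 4_000),
    ("city_d", 1850, 1_000),  # no region mapping, so it is skipped
]

# Hypothetical lookup from city to region id.
lookup = {"city_a": "R1", "city_b": "R1", "city_c": "R2"}

totals = aggregate_population(records, lookup)
for (region, year), population in sorted(totals.items()):
    print(region, year, population)
```

Keeping the year in the grouping key is what lets you follow each region's urban population from one period to the next.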