Author Archives: MantasCode

Iowa’s Liquor Sale Maps and Visualizations 2020

A look at Iowa’s liquor sales over 2.5 years, from January 2018 through June 2020. The data can be found at data.iowa.gov. Click on the images below to enlarge.

Distinct Coordinate by Volume Sold (Liters)

Pandemic Upswing

Top Bottle Sold by Distinct Location

Top Bottle

Top Category Per Distinct Location

A C# console application was used to parse, aggregate, and output the data in the structure needed for Google Charts and Carto. The script below can be used as a reference if you would like to investigate Iowa’s liquor sales data yourself. Note that this particular script only outputs the data source for the last map in this post (Top Category Per Distinct Location, sized by Bottles Sold).

using CsvHelper;
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
using System.Threading.Tasks;
 
namespace IowaSpirits2018plus
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Iowa Spirits..");
            Console.BufferHeight = 4000;
            Dictionary<string, Dictionary<string, int>> dictCoord_Item_Count = new Dictionary<string, Dictionary<string, int>>();
            string lastdate = "";
            using (var reader = new StreamReader("C:\\IOWASPIRITS\\iowaspirits2018plus.csv"))
            using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
            {
                csv.Read();
                csv.ReadHeader();
                while (csv.Read())
                {
                    string date = csv.GetField("Date").ToString();
                    if ( lastdate != date)
                        Console.WriteLine(date);
                    lastdate = date;
                    // GetField returns a string by default; the generic overload converts it to a DateTime.
                    DateTime fDate = csv.GetField<DateTime>("Date");
                    string city = csv.GetField("City").ToString();
                    string categoryName = csv.GetField("Category Name").ToString();
                    string bottlesSold = csv.GetField("Bottles Sold").ToString();
                    int iBottlesSold = int.Parse(csv.GetField("Bottles Sold").ToString());
                    string litersSold = csv.GetField("Volume Sold (Liters)").ToString();
                    string itemDescription = csv.GetField("Item Description").ToString();
                    string storelocation = csv.GetField("Store Location").ToString();
                    storelocation = Regex.Replace(storelocation, "POINT \\(", "");
                    storelocation = Regex.Replace(storelocation, "\\)", "");
                    storelocation = Regex.Replace(storelocation, " ", ",");
 
                    // Look up (or create) the per-category totals for this coordinate.
                    Dictionary<string, int> categoryCounts;
                    if (!dictCoord_Item_Count.TryGetValue(storelocation, out categoryCounts))
                    {
                        categoryCounts = new Dictionary<string, int>();
                        dictCoord_Item_Count.Add(storelocation, categoryCounts);
                    }
                    // Add this row's bottles to the running total for its category
                    // (existingBottlesSold is 0 when the category is new).
                    int existingBottlesSold;
                    categoryCounts.TryGetValue(categoryName, out existingBottlesSold);
                    categoryCounts[categoryName] = existingBottlesSold + iBottlesSold;
                }
            }
            foreach ( string parent_key in dictCoord_Item_Count.Keys)
            {
                Console.Write(parent_key + ",");
                foreach (var item in dictCoord_Item_Count[parent_key].OrderByDescending(r => r.Value))
                {
                    Console.Write("{0}, {1}", item.Key, item.Value);
                    break;
                }
                Console.WriteLine();
            }
        }
    }
}
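For readers who prefer LINQ, the same "top category per coordinate" aggregation can be sketched on in-memory rows. This is only an illustration, not the script used for the maps: the `Sale` class, the `TopCategorySketch` helper, and the sample rows are made up for the example, and reading the CSV is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type for this sketch; the real CSV has many more columns.
class Sale
{
    public string StoreLocation;   // raw text such as "POINT (-93.61 41.60)"
    public string CategoryName;
    public int BottlesSold;
}

static class TopCategorySketch
{
    // Turns "POINT (-93.61 41.60)" into "-93.61,41.60",
    // the same result as the three Regex.Replace calls in the script.
    public static string ToCoordinate(string point)
    {
        return point.Replace("POINT (", "").Replace(")", "").Replace(" ", ",");
    }

    // For each coordinate, sums bottles per category and keeps the largest total.
    public static Dictionary<string, KeyValuePair<string, int>> TopCategoryByLocation(IEnumerable<Sale> sales)
    {
        return sales
            .GroupBy(s => ToCoordinate(s.StoreLocation))
            .ToDictionary(
                g => g.Key,
                g => g.GroupBy(s => s.CategoryName)
                      .Select(c => new KeyValuePair<string, int>(c.Key, c.Sum(s => s.BottlesSold)))
                      .OrderByDescending(kv => kv.Value)
                      .First());
    }

    static void Main()
    {
        // Made-up sample rows, not taken from the Iowa data set.
        var sales = new List<Sale>
        {
            new Sale { StoreLocation = "POINT (-93.61 41.60)", CategoryName = "Category A", BottlesSold = 12 },
            new Sale { StoreLocation = "POINT (-93.61 41.60)", CategoryName = "Category B", BottlesSold = 6 },
            new Sale { StoreLocation = "POINT (-93.61 41.60)", CategoryName = "Category A", BottlesSold = 3 },
            new Sale { StoreLocation = "POINT (-91.53 41.66)", CategoryName = "Category B", BottlesSold = 24 },
        };

        foreach (var entry in TopCategoryByLocation(sales))
            Console.WriteLine("{0},{1}, {2}", entry.Key, entry.Value.Key, entry.Value.Value);
    }
}
```

Grouping in memory like this is fine for a sample, but the streaming dictionary approach in the script avoids holding every row of a multi-year export at once.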

UK Bicycle Theft in Major Cities 2017 to 2020

Revisiting the UK’s bicycle thefts. The data comes from https://data.police.uk/. The date range used was May 2017 through April 2020. All surrounding police forces were included to obtain a comprehensive list of bicycle theft incidents. The previous post from 2016 can be found here. Carto was used to render the maps. Cities include London, Manchester, Birmingham, and Bristol.

London (click to enlarge, 3.12 MB)

London (click to enlarge, 3.73 MB)

Manchester (click to enlarge, 1.07 MB)

Birmingham (click to enlarge, 706 KB)

Bristol (click to enlarge, 656 KB)

If you would like to create your own UK crime maps and are frustrated with the way data.police.uk presents its data, you can use this simple C# console application to aggregate it yourself. Paste the downloaded folders into a directory called LONDON4, and use the script below to generate an output file containing incidents from multiple police forces. Cheers.

C# Console Application

// Requires: using System; using System.Collections.Generic; using System.IO;
string dirPath = @"C:\LONDON4\";
// Each subdirectory holds the files of one download from data.police.uk.
List<string> dirs = new List<string>(Directory.EnumerateDirectories(dirPath));
// Open the output file once, in append mode, instead of once per matching row.
using (StreamWriter output = new StreamWriter("C:\\Airplanes\\____BRITISHBICYCLES.csv", true))
{
    foreach (string dir in dirs)
    {
        Console.WriteLine(dir);
        foreach (string fileName in Directory.GetFiles(dir))
        {
            Console.WriteLine(fileName);
            using (StreamReader file = new StreamReader(fileName))
            {
                string line;
                while ((line = file.ReadLine()) != null)
                {
                    string[] parts = line.Split(',');
                    // Skip rows too short to hold a crime type (index 9).
                    if (parts.Length < 10)
                        continue;
                    string month = parts[1];
                    string longitude = parts[4];
                    string latitude = parts[5];
                    string crimetype = parts[9];
                    // Header rows never equal "Bicycle theft", so they are filtered out here too.
                    if (crimetype == "Bicycle theft")
                    {
                        output.WriteLine(month + "-1, " + longitude + ", " + latitude);
                    }
                }
            }
        }
    }
}
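Once the combined file exists, a quick sanity check is to count incidents per month before loading it into a mapping tool. The sketch below assumes each row has the `month-1, longitude, latitude` shape written by the script above; the `MonthlyTheftSketch` class name and the sample rows are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

static class MonthlyTheftSketch
{
    // Counts rows per month; each row is expected to look like
    // "2017-05-1, <longitude>, <latitude>".
    public static SortedDictionary<string, int> CountByMonth(IEnumerable<string> rows)
    {
        var counts = new SortedDictionary<string, int>(StringComparer.Ordinal);
        foreach (string row in rows)
        {
            string[] parts = row.Split(',');
            if (parts.Length < 3)
                continue; // skip blank or malformed rows

            string month = parts[0].Trim();
            int current;
            counts.TryGetValue(month, out current); // current stays 0 for a new month
            counts[month] = current + 1;
        }
        return counts;
    }

    static void Main()
    {
        // Made-up rows for illustration only.
        var rows = new[]
        {
            "2017-05-1, -0.1276, 51.5072",
            "2017-05-1, -0.1300, 51.5100",
            "2017-06-1, -2.2426, 53.4808",
        };

        foreach (var pair in CountByMonth(rows))
            Console.WriteLine("{0}: {1}", pair.Key, pair.Value);
    }
}
```

To run it against the real output, the sample array could be swapped for `File.ReadLines(...)` pointed at the generated CSV.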

Subreddit Relationships by Strong User Overlap

A look at some strong subreddit relationships stemming from a chosen root subreddit. The relationships below are ordered by User Overlap scores from subredditstats.com, which makes for an interesting way of discovering related subreddits.

AgainstHateSubreddits

ConsumeProduct

Interesting dichotomy of users between AgainstHateSubreddits and ConsumeProduct.


ProtectAndServe

JoeRogan top 5 and 3

Sino top 5 and 3

Coronavirus top 5 and 3

These glimpses use the User Overlap scores from https://subredditstats.com/subreddit-user-overlaps. Look up a subreddit yourself, for fun.

Top 50 most influential subreddit moderators – May 2020

Plotting the number of subreddits each moderator moderates within the top 1,000 most-subscribed subreddits against the aggregate subscriber count of those subreddits.

Click on images below to enlarge.

The data is from http://redditlist.com/ and https://www.reddit.com/. C# code was written to parse both websites and generate output formatted for a Google Charts bubble chart. The script was run on 5/14/2020.

If you would like to parse the moderator usernames of the top 1,000 most subscribed-to subreddits and aggregate their frequency yourself, feel free to download Visual Studio and run the code below in a simple .NET console application. Because the parsing is hard-coded, it will only work as long as redditlist and reddit don’t change their HTML structure; after that it can still serve as a reference.

Crawl redditlist.com
Collect the middle column for the first 8 pages listing 125 each.

Crawl reddit.com
Iterate over each subreddit’s /about/moderators page and collect usernames shown below.

Regular Expression patterns used to collect information:

@"_blank(.*?)</div>"
@">(.*?)</a>"
@"listing-stat'>(.*?)</span>"
@"<span class=""user""><a href=""https://old.reddit.com/user/(.*?)/"
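To see what each pattern captures, here is a small sketch that runs them against two hand-written HTML fragments. The fragments are assumptions invented to fit the patterns, not copies of real redditlist or reddit markup, and the `PatternSketch` class, `examplesub`, and `example_mod` are placeholder names.

```csharp
using System;
using System.Text.RegularExpressions;

static class PatternSketch
{
    // Pulls a subreddit name and subscriber count out of one listing chunk.
    public static bool TryParseListing(string html, out string name, out int subscribers)
    {
        name = null;
        subscribers = 0;

        // Everything between "_blank" and the closing </div> is one listing chunk.
        Match chunk = Regex.Match(html, @"_blank(.*?)</div>", RegexOptions.Singleline);
        if (!chunk.Success)
            return false;

        Match mName = Regex.Match(chunk.Groups[1].Value, @">(.*?)</a>");
        Match mCount = Regex.Match(chunk.Groups[1].Value, @"listing-stat'>(.*?)</span>");
        if (!mName.Success || !mCount.Success)
            return false;

        name = mName.Groups[1].Value;
        // Strip thousands separators before parsing, e.g. "1,234,567".
        return int.TryParse(mCount.Groups[1].Value.Replace(",", ""), out subscribers);
    }

    // Returns the first moderator username found, or null if there is none.
    public static string ParseModerator(string html)
    {
        Match m = Regex.Match(html, @"<span class=""user""><a href=""https://old.reddit.com/user/(.*?)/", RegexOptions.Singleline);
        return m.Success ? m.Groups[1].Value : null;
    }

    static void Main()
    {
        // Hypothetical fragments written for this example.
        string listing = "<div class='listing-item'><a href='https://example.com/r/examplesub' target='_blank'>examplesub</a><span class='listing-stat'>1,234,567</span></div>";
        string moderators = "<span class=\"user\"><a href=\"https://old.reddit.com/user/example_mod/\">example_mod</a></span>";

        string name;
        int subscribers;
        if (TryParseListing(listing, out name, out subscribers))
            Console.WriteLine(name + " " + subscribers);
        Console.WriteLine(ParseModerator(moderators));
    }
}
```

Testing the patterns on a saved fragment like this is a cheap way to notice when either site changes its markup, before running the full crawl.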

C# .NET Console Application

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Net;
using System.Text;
using System.Text.RegularExpressions;
using System.Threading.Tasks;
 
namespace RedditMegaModBubbleParse
{
    class Program
    {
        static void Main(string[] args)
        {
 
            Dictionary<string, int> dictSub_SubscriberCount = new Dictionary<string, int>();
 
            // Parse the top 1,000 subreddit names and subscriber counts
            // (8 pages, keeping the middle column of 125 entries on each page).
            for (int page = 1; page < 9; page++)
            {
                WebClient wc = new WebClient();
                string htmlString = wc.DownloadString("http://redditlist.com/?page=" + page);
                MatchCollection matches = Regex.Matches(htmlString, @"_blank(.*?)</div>", RegexOptions.Singleline);
                int count = 0;
                foreach (Match match in matches)
                {
                    if (count > 125 && count <= 250)
                    {
                        Console.WriteLine(count);
                        string sHtmlChunk = match.Groups[1].Value;
                        Match mSubName = Regex.Match(sHtmlChunk, @">(.*?)</a>");
                        Match mSubscriberCount = Regex.Match(sHtmlChunk, @"listing-stat'>(.*?)</span>");
                        if (mSubName.Success && mSubscriberCount.Success)
                        {
                            string sSubName = mSubName.Groups[1].Value;
                            Console.WriteLine("Subreddit Name : " + sSubName);
                            string sSubscriberCount = mSubscriberCount.Groups[1].Value;
                            int iSubscriberCount = int.Parse(Regex.Replace(sSubscriberCount, ",", ""));
                            Console.WriteLine("Subreddit Count : " + iSubscriberCount);
                            dictSub_SubscriberCount.Add(sSubName, iSubscriberCount);
                        }
                        Console.WriteLine();
                    }
                    count += 1;
                }
            }
 
            Dictionary<string, int> dictModCount = new Dictionary<string, int>();
            // long rather than int: the summed subscriber count of a moderator
            // account can exceed int.MaxValue (about 2.1 billion).
            Dictionary<string, long> dictModSubscriberAggregate = new Dictionary<string, long>();

            int count_sub = 0;
            foreach (string subreddit_name in dictSub_SubscriberCount.Keys)
            {
                try
                {
                    count_sub += 1;
                    Console.WriteLine("*** Subreddit " + count_sub + " ~~~ " + "[" + subreddit_name + "] ***");
                    WebClient wc = new WebClient();
                    string subreddit_clean = Regex.Replace(subreddit_name, " ", "");
                    string url_glue = "https://old.reddit.com/r/" + subreddit_clean + "/about/moderators";
                    Console.WriteLine(url_glue);
                    string htmlString = wc.DownloadString(url_glue);
                    MatchCollection matches = Regex.Matches(htmlString, @"<span class=""user""><a href=""https://old.reddit.com/user/(.*?)/", RegexOptions.Singleline);
                    foreach (Match match in matches)
                    {
                        string modname = match.Groups[1].Value;
                        Console.WriteLine(modname);

                        if (dictModCount.ContainsKey(modname))
                        {
                            // Seen before: one more subreddit, plus its subscribers.
                            dictModCount[modname] += 1;
                            dictModSubscriberAggregate[modname] += dictSub_SubscriberCount[subreddit_name];
                        }
                        else
                        {
                            dictModCount.Add(modname, 1);
                            dictModSubscriberAggregate.Add(modname, dictSub_SubscriberCount[subreddit_name]);
                        }
                    }
                }
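                catch (Exception x)
                {
                    // The moderators page could not be downloaded (for example a private or invite-only subreddit); report it and move on.
                    Console.WriteLine("ERROR ACCESSING SUBREDDIT (possibly invite only): " + x.Message);
                }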
            }
 
            Console.WriteLine();
 
            string masteroutput = "";
            foreach (KeyValuePair<string, int> item in dictModCount.OrderByDescending(key => key.Value))
            {
                Console.WriteLine("['"+item.Key + "', " + item.Value+", "+ dictModSubscriberAggregate[item.Key]+", '1' , "+ dictModSubscriberAggregate[item.Key] + " ],");
                masteroutput += "['" + item.Key + "', " + item.Value + ", " + dictModSubscriberAggregate[item.Key] + ", '1' , " + dictModSubscriberAggregate[item.Key] + " ],\r\n";
            }
 
            //save to C:\Airplanes\
            StreamWriter streamWrite;
            streamWrite = File.AppendText("C:\\Airplanes\\output.txt");
            streamWrite.WriteLine(masteroutput);
            streamWrite.Close();
 
            Console.ReadLine();
 
        }
    }
}
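The final loop of the script emits one JavaScript array literal per moderator, which can then be pasted into a Google Charts data table. A minimal sketch of that formatting step is shown below, assuming the two dictionaries have already been filled. The `BubbleRowSketch` class and sample values are made up, the aggregate is held as a `long` here by choice, and the spacing differs slightly from the script's output.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text;

static class BubbleRowSketch
{
    // Builds one "['name', count, aggregate, '1', aggregate]," row per moderator,
    // ordered by number of subreddits moderated (descending).
    public static string FormatRows(IDictionary<string, int> modCount, IDictionary<string, long> modAggregate)
    {
        var sb = new StringBuilder();
        foreach (var item in modCount.OrderByDescending(kv => kv.Value))
        {
            long aggregate = modAggregate[item.Key];
            sb.Append(string.Format(
                CultureInfo.InvariantCulture,
                "['{0}', {1}, {2}, '1', {3}],\n",
                item.Key, item.Value, aggregate, aggregate));
        }
        return sb.ToString();
    }

    static void Main()
    {
        // Made-up moderators and numbers for illustration only.
        var counts = new Dictionary<string, int> { { "mod_a", 2 }, { "mod_b", 5 } };
        var aggregates = new Dictionary<string, long> { { "mod_a", 100L }, { "mod_b", 3000000000L } };
        Console.Write(FormatRows(counts, aggregates));
    }
}
```

A `StringBuilder` avoids the repeated string concatenation of the original loop, which matters little for a few thousand moderators but keeps the formatting in one place.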

Congrats to the May 2020 top 50 Reddit Megamods
In order of the most subreddits modded within top 1,000 most subscribed-to subreddits.

AutoModerator MAGIC_EYE_BOT BotTerminator Blank-Cheque RepostSentinel cyXie Umbresp AssistantBOT BotDefense metastasis_d awkwardtheturtle LeafSamurai Merari01 IranianGenius commonvanilla GallowBoob N8theGr8 love_the_heat ManWithoutModem greatyellowshark justcool393 Lil_SpazJoekp siouxsie_siouxv2 SuzyModQ pHorniCaiTe babar77 PowerModerator maybesaydie davidreiss666 Sunkisty yummytuber SEO_Nuke PhlogistonAster daninger4995 kjoneslol sloth_on_meth ani625 Tornado9797 sidshembekar RalphiesBoogers RepostSleuthBot Noerdy stuffed02 whyhellomichael qgyh2 Llim ModeratelyHelpfulBot EpicEngineer T_Dumbsford Kesha_Paul

126,241 Heroin Arrests Visualized, Chicago 2001 – 2020

A look into heroin-related arrests spanning almost two decades. The data is collected by the Chicago Police Department and is available in the Public Safety section of the Chicago Data Portal. Click on the static maps below to enlarge and/or play the animation.

Full city map of all incidents.

West Side of Chicago. Heroin enforcement on the West Side is so dense that entire city blocks are illuminated by incident reports.

West Side of Chicago over-time animation (Geo Temporal map). Interesting migration patterns can be seen here.

South Side of Chicago (2133x2917px).

South Side of Chicago over-time animation.

Over the last two decades, CPD’s heroin enforcement has declined.