Author Archives: MantasCode

Ruby: Solution to Project Euler Problem # 28

http://projecteuler.net/problem=28

Number spiral diagonals

Starting with the number 1 and moving to the right in a clockwise direction a 5 by 5 spiral is formed as follows:

21 22 23 24 25
20  7  8  9 10
19  6  1  2 11
18  5  4  3 12
17 16 15 14 13

It can be verified that the sum of the numbers on the diagonals is 101.

What is the sum of the numbers on the diagonals in a 1001 by 1001 spiral formed in the same way?
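
The solution code for this post is not shown above, so the following is only a minimal sketch of one common approach and not necessarily the one the author used. It assumes the observation that a ring with odd side length n has the corners n², n² − (n − 1), n² − 2(n − 1) and n² − 3(n − 1), which add up to 4n² − 6(n − 1). The method name spiral_diagonal_sum is made up for illustration.

```ruby
# Sum the numbers on both diagonals of an odd-sized number spiral.
# (spiral_diagonal_sum is an illustrative name, not from the original post.)
def spiral_diagonal_sum(size)
  unless size.is_a?(Integer) && size > 0 && size.odd?
    raise ArgumentError, "size must be a positive odd integer"
  end

  total = 1 # the centre cell
  # Each ring with side length n contributes its four corners:
  # n^2, n^2 - (n - 1), n^2 - 2(n - 1), n^2 - 3(n - 1)  =>  4n^2 - 6(n - 1)
  3.step(size, 2) do |n|
    total += 4 * n * n - 6 * (n - 1)
  end
  total
end

puts spiral_diagonal_sum(5)    # 101, matching the 5 by 5 example above
puts spiral_diagonal_sum(1001)
```

Checking the result against the 5 by 5 example first is a cheap way to catch an off-by-one in the ring formula before running the full 1001 by 1001 case.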

Ruby: Solution to Project Euler Problem # 25

Recently, I’ve been typing more Ruby than usual. In Ruby, integers have arbitrary precision, so you can work with very big numbers.
http://projecteuler.net/problem=25

1000-digit Fibonacci number

The Fibonacci sequence is defined by the recurrence relation:

F_n = F_{n−1} + F_{n−2}, where F_1 = 1 and F_2 = 1.

The 12th term, F_12, is the first term to contain three digits.

What is the first term in the Fibonacci sequence to contain 1000 digits?
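
The solution code for this post is not shown above either, so here is a minimal sketch of a straightforward approach, which may differ from the author's. It walks the sequence until a term reaches 10^(digits − 1), the smallest number with that many digits. The method name first_fib_with_digits is made up for illustration.

```ruby
# Return the index of the first Fibonacci term with at least `digits` digits.
# (first_fib_with_digits is an illustrative name, not from the original post.)
def first_fib_with_digits(digits)
  threshold = 10**(digits - 1) # smallest number with `digits` digits
  a, b = 1, 1                  # F1 and F2
  index = 1
  # Ruby integers grow as needed, so no big-number library is required.
  while a < threshold
    a, b = b, a + b
    index += 1
  end
  index
end

puts first_fib_with_digits(3)    # 12, matching the example above
puts first_fib_with_digits(1000)
```

Comparing against a numeric threshold avoids converting every term to a string just to count its digits.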

C#: Parse a Sentence Containing a Word from Text using Regular Expressions.

I recently received an email asking how to obtain all sentences containing a specific word, so I made this quick post. The code below shows how to use regular expressions to split text into sentences, then check whether each sentence contains a specific word.

//Requires: using System; using System.Text.RegularExpressions;
//Look for sentences containing the word "bank"
string word = "bank";
//Text String
string fulltext = @"Starting in the early 1960s federal banking regulators interpreted provisions of the Glass–Steagall Act to permit commercial banks and especially commercial bank affiliates to engage in an expanding list and volume of securities activities. By the time the affiliation restrictions in the Glass–Steagall Act were repealed through the Gramm–Leach–Bliley Act of 1999 (GLBA), many commentators argued Glass–Steagall was already “dead.” Most notably, Citibank’s 1998 affiliation with Salomon Smith Barney, one of the largest US securities firms, was permitted under the Federal Reserve Board’s then existing interpretation of the Glass–Steagall Act. President Bill Clinton publicly declared ""the Glass–Steagall law is no longer appropriate."" Many commentators have stated that the GLBA’s repeal of the affiliation restrictions of the Glass–Steagall Act was an important cause of the late-2000s financial crisis.  Some critics of that repeal argue it permitted Wall Street investment banking firms to gamble with their depositors' money that was held in affiliated commercial banks. Others have argued that the activities linked to the financial crisis were not prohibited (or, in most cases, even regulated) by the Glass–Steagall Act. Commentators, including former President Clinton in 2008 and the American Bankers Association in January 2010, have also argued that the ability of commercial banking firms to acquire securities firms (and of securities firms to convert into bank holding companies) helped mitigate the financial crisis.";
 
//Match Collection for every sentence
MatchCollection matchSentences = 
    Regex.Matches(fulltext, @"([A-Z][^\.!?]*[\.!?])");
//Alternative pattern :  @"(\S.+?[.!?])(?=\s+|$)"
 
//counter for sentences.
int foundSentenceWithWord = 0;
foreach (Match sFound in matchSentences)
{
    foreach (Capture capture in sFound.Captures)
    {
        string current_sentence = capture.Value;
        //To skip words like 'bank'er or 'bank'ing, use the word boundary "\b":
        //change the pattern to   @"\b" + Regex.Escape(word) + @"\b"
        //Regex.Escape keeps any special characters in the word literal
        Match matchWordInSentence = 
            Regex.Match(current_sentence, Regex.Escape(word), RegexOptions.IgnoreCase);
        if (matchWordInSentence.Success)
        {
            Console.WriteLine("Sentence Found Containing '" + word+"' :");
            Console.WriteLine(current_sentence); Console.WriteLine();
            foundSentenceWithWord++;
        }
    }
}
Console.WriteLine();
Console.WriteLine("Found " + foundSentenceWithWord 
    + " Sentences Containing the word '" + word + "'");
Console.WriteLine();


C#: Split GIF frames into PNGs using ImageMagick

This post shows how to split a GIF image apart frame by frame and save each frame as a PNG in a sub-folder.

To achieve this I will be using ImageMagick:
http://www.imagemagick.org/

Download and install ImageMagick, then locate the executable named convert.exe.

I created a folder on my C: drive called
C:\ImageConversion\

Place your GIFs into the ImageConversion folder, and copy convert.exe into it as well.

Execute this code:

//Requires: using System; using System.IO; using System.Diagnostics;
//get all files in folder
string[] files = Directory.GetFiles(@"c:\ImageConversion\");
 
//loop through each file
foreach (string file in files)
{
    //check if .gif (case-insensitive, so .GIF matches too)
    string ext = Path.GetExtension(file);
    if (string.Equals(ext, ".gif", StringComparison.OrdinalIgnoreCase))
    {
        Console.WriteLine("Splitting : "+file);
        //Get the name of the file without extension
        string filenameNoExt = Path.GetFileNameWithoutExtension(file);
        //Get the file name with extension
        string filenameExt = Path.GetFileName(file);
        Console.WriteLine();
        //Create a sub-directory with the same name as the GIF's filename
        Directory.CreateDirectory(@"c:\ImageConversion\"+filenameNoExt);
        Process imProcess = new Process();
        //Arguments (paths are quoted so file names with spaces work)
        string im_command = "\"" + @"c:\ImageConversion\" + filenameExt + "\" -scene 1 +adjoin -coalesce \""
            + @"c:\ImageConversion\" + filenameNoExt + @"\" + filenameNoExt + ".png\"";
        imProcess.StartInfo.UseShellExecute = false;
        imProcess.StartInfo.RedirectStandardOutput = true;
        imProcess.StartInfo.FileName = @"c:\ImageConversion\convert";
        imProcess.StartInfo.Arguments = im_command;
        imProcess.Start();
        //Drain the redirected output before waiting; a full buffer could otherwise block the process
        imProcess.StandardOutput.ReadToEnd();
        imProcess.WaitForExit();
    }
}

That’s it, you’re done.
The ImageConversion folder should now contain one sub-folder per GIF, each holding every frame as an individual PNG.

C#: Programmatically download all Images from a website and save them locally.

For this example, I will be downloading all the .gifs from a specific page on imgur. Then, I will save them into a folder called _Images.

This is the URL of the page I will be downloading images from: http://imgur.com/a/GPlx4


Below is C# code that will:
-use WebClient.DownloadString() to download the HTML into a string
-parse the image links out of the returned HTML string using regular expressions
-use WebClient.DownloadFile() to download each parsed image link

WebClient wchtml = new WebClient();
string htmlString = wchtml.DownloadString("http://imgur.com/a/GPlx4");
int mastercount = 0;
Regex regPattern = new Regex(@"http://i\.imgur\.com/(.*?)alt=""", RegexOptions.Singleline);
MatchCollection matchImageLinks = regPattern.Matches(htmlString);
 
foreach (Match img_match in matchImageLinks)
{
    string imgurl = img_match.Groups[1].Value.ToString();
    Regex regx = new Regex("http://([\\w+?\\.\\w+])+([a-zA-Z0-9\\~\\!\\@\\#\\$\\%\\^\\&\\*\\(\\)_\\-\\=\\+\\\\\\/\\?\\.\\:\\;\\'\\,]*)?",
    RegexOptions.IgnoreCase);
    MatchCollection ms = regx.Matches(imgurl);
    foreach (Match m in ms)
    {
        Console.WriteLine("Downloading..  "  + m.Value);
        mastercount++;
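        try
        {
            //C:\_Images must already exist, otherwise DownloadFile throws
            using (WebClient wc = new WebClient())
            {
                wc.DownloadFile(m.Value, @"C:\_Images\bg_" + mastercount + ".gif");
            }
            //pause between requests to go easy on the server
            Thread.Sleep(1000);
        }
        catch (Exception x)
        {
            Console.WriteLine("Failed to download image: " + x.Message);
        }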
        break;
    }
}

Output: the downloaded images are saved in C:\_Images as bg_1.gif, bg_2.gif, and so on.

C#: Parse a website and save specific content as XML.

In this tutorial I will show you how to iterate through a website using a WebClient, save each page’s content into a string, parse it using regular expressions, and save it as XML.

I will be parsing airplane data from airliners.net.

Get started with a .NET C# Console application and create a WebClient.

Download the page at the URL and output the HTML returned.

WebClient wc = new WebClient();
string htmlString = wc.DownloadString("http://www.airliners.net/aircraft-data/stats.main?id=2");
Console.WriteLine(htmlString);

Let’s take a look at both the console output and the source of the page to make sure we got the right HTML back.

Success, we found the airplane title! Now let’s see if we can parse it, and only it, from the entire string of HTML.

Try to see if there is a pattern that’s unique to what you need.

WebClient wc = new WebClient();
string htmlString = wc.DownloadString("http://www.airliners.net/aircraft-data/stats.main?id=2");
Match mTitle = Regex.Match(htmlString, @"<center><h1>(.*?)</h1>");
if (mTitle.Success)
{
    string airplaneTitle = mTitle.Groups[1].Value;
    Console.WriteLine(airplaneTitle);
}

Success.

Now let’s get the Country of Origin and the other categories.

string airplaneCountry = "";
Match mCountry = Regex.Match(htmlString, @"<b>Country of origin</font>(.*?)<p>", RegexOptions.Singleline);
if (mCountry.Success)
{
     airplaneCountry = mCountry.Groups[1].Value;
     Console.WriteLine(airplaneCountry);
}
Console.WriteLine("***************************************************************");
//clean up the captured string
//Remove everything up to and including the last '>' on each line
airplaneCountry = Regex.Replace(airplaneCountry, "(.*?)>", "").Trim();
Console.WriteLine(airplaneCountry);

Now I am tired of typing, so let’s make a method for everything else.

static void Main(string[] args)
{
 
    WebClient wc = new WebClient();
    string htmlString = wc.DownloadString("http://www.airliners.net/aircraft-data/stats.main?id=2");
 
    Console.WriteLine("Name       :"+ parsePattern(@"<center><h1>(.*?)</h1>",htmlString));
    Console.WriteLine("Country    :" + parsePattern(@"<b>Country of origin</font>(.*?)<p>", htmlString));
    Console.WriteLine("Powerplants:" + parsePattern(@"<b>Powerplants</font>(.*?)<p>", htmlString));
    Console.WriteLine("Performance:" + parsePattern(@"<b>Performance</font>(.*?)<p>", htmlString));
    Console.WriteLine("Weights    :" + parsePattern(@"<b>Weights</font>(.*?)<p>", htmlString));
    Console.WriteLine("Dimensions :" + parsePattern(@"<b>Dimensions</font>(.*?)<p>", htmlString));
    Console.WriteLine("Capacity   :" + parsePattern(@"<b>Capacity</font>(.*?)<p>", htmlString));
    Console.WriteLine("Type       :" + parsePattern(@"<b>Type</font>(.*?)<p>", htmlString));
    Console.WriteLine("Production :" + parsePattern(@"<b>Production</font>(.*?)<p>", htmlString));
 
    //lol   :D   Damn this History!
    //Console.WriteLine(parsePattern(@"<b>History</font>(.*?)<table border=0 cellpadding=1 cellspacing=0 >", htmlString));
    //History breaks my method pattern cause there's too many <p>'s
    //Doing History Manually, Due to Different Pattern.
    Match mHistory = Regex.Match(htmlString, @"<b>History</font>(.*?)<table border=0 cellpadding=1 cellspacing=0 >", RegexOptions.Singleline);
    if (mHistory.Success)
    {
        string strContent = mHistory.Groups[1].Value;
        //Google your problems "C# regex to remove html" - thanks stackoverflow
        //Get @"<[^>]*>"
        strContent = Regex.Replace(strContent, @"<[^>]*>", "").Trim();
        Console.WriteLine("History : " + strContent);
    }
}
public static string parsePattern(string pat, string htmlString)
{
    //Singleline lets '.' match newlines, so a category can span several lines
    Match mCategory = Regex.Match(htmlString, pat, RegexOptions.Singleline);
    if (mCategory.Success)
    {
        string strContent = mCategory.Groups[1].Value;
        //strip leftover tag fragments: remove everything up to each '>' on a line
        if (strContent.Contains(">"))
        {
            strContent = Regex.Replace(strContent, "(.*?)>", "").Trim();
        }
        return strContent;
    }
    //no match: return an empty string
    return "";
}

Now let’s save everything into XML :D
I created a folder on my C: drive called C:\Airplanes\
Iterate through each page and save the content.

Here’s the full code to do the whole website:

static void Main(string[] args)
{
    //Since we don't want to hold a glock up to the webserver's dome piece:
    //Let's pretend to be a human who goes to each page within 1 - 3 seconds randomly.
    WebClient wc = new WebClient();
    int pageCount = 1;
    int randomWait = 0;
    Random random = new Random();
    while (true)
    {
        randomWait = random.Next(1000, 3000);
        Thread.Sleep(randomWait);
        string htmlString;
        try
        {
            htmlString = wc.DownloadString("http://www.airliners.net/aircraft-data/stats.main?id=" + pageCount);
        }
        catch (WebException)
        {
            //stop at the first page that fails to download instead of crashing
            break;
        }
        ReadAndAppend(htmlString);
        Console.WriteLine("Saved :" + pageCount);
        pageCount++;
    }
}
public static void ReadAndAppend(string htmlString)
{
    string name = parsePattern(@"<center><h1>(.*?)</h1>", htmlString);
    string country = parsePattern(@"<b>Country of origin</font>(.*?)<p>", htmlString);
    string power = parsePattern(@"<b>Powerplants</font>(.*?)<p>", htmlString);
    string perf = parsePattern(@"<b>Performance</font>(.*?)<p>", htmlString);
    string lb = parsePattern(@"<b>Weights</font>(.*?)<p>", htmlString);
    string dim = parsePattern(@"<b>Dimensions</font>(.*?)<p>", htmlString);
    string cap = parsePattern(@"<b>Capacity</font>(.*?)<p>", htmlString);
    string type = parsePattern(@"<b>Type</font>(.*?)<p>", htmlString);
    string prod = parsePattern(@"<b>Production</font>(.*?)<p>", htmlString);
    string hist = "";
    Match mHistory = Regex.Match(htmlString, @"<b>History</font>(.*?)<table border=0 cellpadding=1 cellspacing=0 >", RegexOptions.Singleline);
    if (mHistory.Success)
    {
        string strContent = mHistory.Groups[1].Value;
        strContent = Regex.Replace(strContent, @"<[^>]*>", "").Trim();
        hist = strContent;
    }
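    //make the xml (SecurityElement.Escape, from System.Security, escapes &, <, > and quotes in the values)
    string xmlString = "<plane>\r\n";
    xmlString += "  <name>" + SecurityElement.Escape(name) + "</name>\r\n";
    xmlString += "  <country>" + SecurityElement.Escape(country) + "</country>\r\n";
    xmlString += "  <power>" + SecurityElement.Escape(power) + "</power>\r\n";
    xmlString += "  <perf>" + SecurityElement.Escape(perf) + "</perf>\r\n";
    xmlString += "  <lb>" + SecurityElement.Escape(lb) + "</lb>\r\n";
    xmlString += "  <dim>" + SecurityElement.Escape(dim) + "</dim>\r\n";
    xmlString += "  <cap>" + SecurityElement.Escape(cap) + "</cap>\r\n";
    xmlString += "  <type>" + SecurityElement.Escape(type) + "</type>\r\n";
    xmlString += "  <prod>" + SecurityElement.Escape(prod) + "</prod>\r\n";
    xmlString += "  <hist>" + SecurityElement.Escape(hist) + "</hist>\r\n";
    xmlString += "</plane>\r\n\r\n";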
 
    //Show me the saves and count
    Console.WriteLine(xmlString);
 
    //save to C:\Airplanes\ (the folder must already exist)
    using (StreamWriter streamWrite = File.AppendText("C:\\Airplanes\\airData.xml"))
    {
        streamWrite.WriteLine(xmlString);
    }
}
 
public static string parsePattern(string pat, string htmlString)
{
    //Singleline lets '.' match newlines, so a category can span several lines
    Match mCategory = Regex.Match(htmlString, pat, RegexOptions.Singleline);
    if (mCategory.Success)
    {
        string strContent = mCategory.Groups[1].Value;
        //strip leftover tag fragments: remove everything up to each '>' on a line
        if (strContent.Contains(">"))
        {
            strContent = Regex.Replace(strContent, "(.*?)>", "").Trim();
        }
        return strContent;
    }
    //no match: return an empty string
    return "";
}

The XML file is created at C:\Airplanes\airData.xml.