Mapping Earthquakes

Maptitude can also be used to plot earthquakes, examine patterns in earthquakes, and even look for correlations with other factors such as oil industry activity. Earthquake catalogs from recent years can be downloaded from the US Geological Survey at http://earthquake.usgs.gov/earthquakes/search/ . For the following maps, the data is downloaded as a CSV (comma-separated values) file, ...

Why is it important to practice defensive driving, and how can it benefit 

Defensive driving is the technique of driving cautiously and smoothly, minimising the chances of being involved in an accident and damaging your car or those of third parties. Car accidents mainly occur due to reckless and rash driving. A defensive driver ...

Mapping the St Albans Sinkhole

On 1st October, a large sinkhole opened up in St Albans, UK, cutting off an entire cul-de-sac of houses. New sinkholes are very common, but this one quickly became international news due to its photogenic proximity to houses. We think of sinkholes as appearing in places like Florida or the Yorkshire Dales. Why did one ...

NLTK on the Raspberry Pi

If you haven’t heard of it yet, the Raspberry Pi is a $25/$35 barebones computer intended to excite kids with programming and hardware projects. It is very much modeled on the British experience of home computing in the early 1980s and even has a “Model A” and a “Model B” in homage to the BBC ...

Sentence Segmentation: Handling multiple punctuation characters

Previously, I showed you how to segment words and sentences whilst also taking into account full stops (periods) and abbreviations. The problem with this implementation is that it is easily confused by contiguous punctuation characters. For example “).” is not recognized as the end of a sentence. This article shows you how to correct this.
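As a rough illustration of the idea, here is a minimal sketch that treats a run of terminators plus any surrounding closing brackets or quotes as a single sentence boundary. The function name `split_sentences` and the regular expression are my own assumptions for this example, not the code from the article, and the sketch deliberately ignores abbreviations.

```python
import re

# Closing brackets and quotes that may sit next to a sentence terminator.
# This character set is an assumption made for the sketch.
_CLOSERS = re.escape(")]\"'”’")

# A boundary is: optional closers, one or more terminators, optional closers,
# followed by whitespace or the end of the text.
_BOUNDARY = re.compile(
    "[" + _CLOSERS + "]*[.!?]+[" + _CLOSERS + "]*(?=\\s|$)"
)


def split_sentences(text):
    """Split text into sentences, keeping contiguous punctuation together."""
    sentences = []
    start = 0
    for match in _BOUNDARY.finditer(text):
        sentence = text[start:match.end()].strip()
        if sentence:
            sentences.append(sentence)
        start = match.end()
    # Keep any trailing text that has no terminator.
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences
```

With this pattern, a string such as "He left (at last). She stayed." is split after the ")." rather than being left as one sentence.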

Using BerkeleyDB to Create a Large N-gram Table

Previously, I showed you how to create N-gram frequency tables from large text datasets. Unfortunately, when used on very large datasets such as the English language Wikipedia and Gutenberg corpora, memory constraints restricted these scripts to unigrams. Here, I show you how to use the BerkeleyDB database to create N-gram tables of these large datasets.
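The general technique is to keep the counts in an on-disk key/value store instead of an in-memory dictionary. The sketch below uses Python's standard-library `dbm` module as a stand-in, because it needs no extra installation; it is not BerkeleyDB and not the code from the article. The function names `count_ngrams_to_disk` and `lookup` are hypothetical.

```python
import dbm


def count_ngrams_to_disk(tokens, n, db_path):
    """Count n-grams from a token list into an on-disk key/value store.

    dbm is used here as a stand-in for BerkeleyDB (an assumption of this
    sketch). Keys are space-joined n-grams; values are decimal counts.
    """
    with dbm.open(str(db_path), "c") as db:
        for i in range(len(tokens) - n + 1):
            key = " ".join(tokens[i:i + n]).encode("utf-8")
            current = int(db.get(key, b"0"))
            db[key] = str(current + 1).encode("ascii")


def lookup(db_path, ngram):
    """Return the stored count for an n-gram, or 0 if it was never seen."""
    with dbm.open(str(db_path), "r") as db:
        key = " ".join(ngram).encode("utf-8")
        return int(db.get(key, b"0"))
```

Updating one key per n-gram is slow but keeps memory use flat; for a real corpus you would probably batch counts in memory and flush them to the store periodically.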

Calculating N-Gram Frequency Tables

The Word Frequency Table scripts can be easily expanded to calculate N-Gram frequency tables. This post explains how.
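To give a flavour of the extension, here is a small sketch that counts N-grams by sliding a window of N tokens across a token list. The function name `ngram_counts` is an assumption for this example and the code is not taken from the post's scripts.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count every run of n consecutive tokens in a token list."""
    return Counter(
        tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )


# Example: bigram counts for a short token list.
bigrams = ngram_counts("the cat sat on the cat".split(), 2)
```

Setting `n` to 1 gives the ordinary word frequency table, which is why the unigram scripts extend so naturally.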

Calculating Word and N-Gram Statistics from a Wikipedia Corpus

As well as using the Gutenberg Corpus, it is possible to create a word frequency table for the English text of the Wikipedia encyclopedia.

Calculating Word Statistics from the Gutenberg Corpus

Following on from the previous article about scanning text files for word statistics, I shall extend this to use real, large corpora. First, I shall use this script to create statistics for the entire Gutenberg English language corpus. Next, I shall do the same with the entire English language Wikipedia.

Calculating Word Frequency Tables

Now that we can segment words and sentences, it is possible to produce word and tuple frequency tables. Here I show you how to create a word frequency table for a large collection of text files.
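As a minimal sketch of the approach, the code below walks a directory of text files and accumulates word counts one line at a time, so no file has to be held in memory whole. The function name `word_frequencies`, the `*.txt` pattern, and the simple lowercase word regex are all assumptions for this example; the article's own segmenter is more careful.

```python
import re
from collections import Counter
from pathlib import Path

# Very simple word pattern (an assumption): runs of letters and apostrophes.
_WORD = re.compile(r"[a-z']+")


def word_frequencies(directory, pattern="*.txt"):
    """Build a word frequency table from all matching files in a directory."""
    counts = Counter()
    for path in sorted(Path(directory).glob(pattern)):
        with path.open(encoding="utf-8", errors="replace") as handle:
            for line in handle:
                counts.update(_WORD.findall(line.lower()))
    return counts
```

Calling `word_frequencies("corpus").most_common(20)` would then list the twenty most frequent words, where `corpus` is a placeholder directory name.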