?>
</code>
+ | |||
+ | ===== Importing the data ===== | ||
I import the raw tcpdump logs (just a timestamp and mac address) into a simple mysql table, with a unique key on the (time, mac) pair so the same sighting can't be recorded twice:
<code sql>
create table wifi_data (seen_time datetime, mac varchar(17), unique (seen_time,mac));
</code>

These files are pretty large - around a month of wifi scanning comes to roughly 128.4 million lines of data (18.6GB). I run the following to cut the logs down to just pairs of the datetime (in YYYY-MM-DD HH:MM format) and the mac address:

<code bash>
php trim.php tcpdump.log > trimmed_tcpdump.log
</code>

This takes around 12 minutes and brings the data down to around 20 million lines. Then I import it directly into the mysql database using the mysql client:
<code sql>
load data infile 'trimmed_tcpdump.log' into table wifi_data;
</code>

If you have any trouble with this command, you might want to split the file into more manageable parts using ''split -l 1000000 trimmed_tcpdump.log'', which writes the chunks out as xaa, xab and so on.
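
Loading each of those chunks by hand gets tedious, so a small loop can feed them to mysql one at a time. Here's a hypothetical helper along those lines - the host, credentials and database name are placeholders, and it assumes the chunks sit in a directory the mysql server is allowed to read from (see ''secure_file_priv''):
<code php>
<?php
// Hypothetical chunk loader - the connection details below are placeholders.
$db = new mysqli('localhost', 'user', 'password', 'wifi');
if ($db->connect_error) die($db->connect_error . "\n");

// split(1) names its output xaa, xab, xac, ... by default
foreach (glob('x??') as $chunk) {
    $path = $db->real_escape_string(realpath($chunk));
    // Same import as above, one chunk at a time
    $db->query("load data infile '$path' into table wifi_data")
        or die($db->error . "\n");
    echo "loaded $chunk\n";
}
</code>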
+ | |||
==== trim.php ====
TBC
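
Until the full script is written up, here is a minimal sketch of what it needs to do. It assumes the capture was made with ''tcpdump -tttt'' (so every line starts with a full "YYYY-MM-DD HH:MM:SS" timestamp), that the first mac address on a line is the one we care about, and that the log is chronological - which lets duplicate (minute, mac) pairs be dropped with a small per-minute set, and would explain the drop from ~128 million raw lines to ~20 million:
<code php>
<?php
// Sketch of trim.php - see the assumptions above; not the final script.
$in = fopen($argv[1], 'r') or die("usage: php trim.php tcpdump.log\n");

$minute = '';   // the minute currently being scanned
$seen   = [];   // macs already emitted for that minute

while (($line = fgets($in)) !== false) {
    // Truncate the leading timestamp to minute resolution: YYYY-MM-DD HH:MM
    if (!preg_match('/^(\d{4}-\d{2}-\d{2} \d{2}:\d{2})/', $line, $t)) continue;
    // First mac address on the line
    if (!preg_match('/([0-9a-f]{2}(?::[0-9a-f]{2}){5})/i', $line, $m)) continue;

    $mac = strtolower($m[1]);
    if ($t[1] !== $minute) {    // new minute: reset the dedup set
        $minute = $t[1];
        $seen = [];
    }
    if (isset($seen[$mac])) continue;   // already seen this minute
    $seen[$mac] = true;

    // Tab-separated pairs match load data infile's default field terminator
    echo "$minute\t$mac\n";
}
fclose($in);
</code>
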
===== Analysing the data =====