?>
</code>
+ | |||
+ | ===== Importing the data ===== | ||
I import the raw tcpdump logs (just a timestamp and mac address) into a simple mysql table, with a unique key on the (time, mac) pair so the same sighting can't be recorded twice:
<code sql>
create table wifi_data (seen_time datetime, mac varchar(17), unique (seen_time,mac));
</code>

These files are pretty large - around a month of wifi scanning comes to roughly 128.4 million lines of data (18.6GB). I run the following to cut the logs down to just pairs of the datetime (in YYYY-MM-DD HH:MM format) and the mac address:

<code bash>
php trim.php tcpdump.log > trimmed_tcpdump.log
</code>

This takes around 12 minutes and brings the data down to around 20 million lines. Then I import it directly into the mysql database using the mysql client:
<code sql>
load data infile 'trimmed_tcpdump.log' into table wifi_data;
</code>

If you have any trouble with this command, you might want to split the file into more manageable parts using ''split -l 1000000 trimmed_tcpdump.log'', which writes the chunks out as xaa, xab and so on.
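
Loading each of those chunks by hand gets tedious, so a small loop can feed them to mysql one at a time. Here's a hypothetical helper along those lines - the host, credentials and database name are placeholders, and it assumes the chunks sit in a directory the mysql server is allowed to read from (see ''secure_file_priv''):
<code php>
<?php
// Hypothetical chunk loader - the connection details below are placeholders.
$db = new mysqli('localhost', 'user', 'password', 'wifi');
if ($db->connect_error) die($db->connect_error . "\n");

// split(1) names its output xaa, xab, xac, ... by default
foreach (glob('x??') as $chunk) {
    $path = $db->real_escape_string(realpath($chunk));
    // Same import as above, one chunk at a time
    $db->query("load data infile '$path' into table wifi_data")
        or die($db->error . "\n");
    echo "loaded $chunk\n";
}
</code>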
+ | |||
==== trim.php ====
TBC
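
Until the full script is written up, here is a minimal sketch of what it needs to do. It assumes the capture was made with ''tcpdump -tttt'' (so every line starts with a full "YYYY-MM-DD HH:MM:SS" timestamp), that the first mac address on a line is the one we care about, and that the log is chronological - which lets duplicate (minute, mac) pairs be dropped with a small per-minute set, and would explain the drop from ~128 million raw lines to ~20 million:
<code php>
<?php
// Sketch of trim.php - see the assumptions above; not the final script.
$in = fopen($argv[1], 'r') or die("usage: php trim.php tcpdump.log\n");

$minute = '';   // the minute currently being scanned
$seen   = [];   // macs already emitted for that minute

while (($line = fgets($in)) !== false) {
    // Truncate the leading timestamp to minute resolution: YYYY-MM-DD HH:MM
    if (!preg_match('/^(\d{4}-\d{2}-\d{2} \d{2}:\d{2})/', $line, $t)) continue;
    // First mac address on the line
    if (!preg_match('/([0-9a-f]{2}(?::[0-9a-f]{2}){5})/i', $line, $m)) continue;

    $mac = strtolower($m[1]);
    if ($t[1] !== $minute) {    // new minute: reset the dedup set
        $minute = $t[1];
        $seen = [];
    }
    if (isset($seen[$mac])) continue;   // already seen this minute
    $seen[$mac] = true;

    // Tab-separated pairs match load data infile's default field terminator
    echo "$minute\t$mac\n";
}
fclose($in);
</code>
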
===== Analysing the data =====