This shows you the differences between two versions of the page.
projects:wifi_scanner [2019/10/02 14:50] neil created |
projects:wifi_scanner [2020/01/06 23:42] neil |
||
---|---|---|---|
Line 14: | Line 14: | ||
==== channel_changer.php ==== | ==== channel_changer.php ==== | ||
<code php> | <code php> | ||
- | ?php | + | #!/usr/bin/php |
+ | <?php | ||
$channels = array( | $channels = array( | ||
1,2,3,4,5,6,7,8,9,10,11, 12,13,36,40,44,48,52,56,60,64,100,104, | 1,2,3,4,5,6,7,8,9,10,11, 12,13,36,40,44,48,52,56,60,64,100,104, | ||
Line 25: | Line 26: | ||
usleep(200000); | usleep(200000); | ||
} | } | ||
+ | } | ||
+ | |||
+ | ?> | ||
+ | </code> | ||
+ | |||
+ | ===== Importing the data ===== | ||
+ | The raw tcpdump logs are large and full of redundant information: around a month of wifi scanning produces around 128 million lines of data (18.6 GB). | ||
+ | |||
+ | I run the following command to reduce the logs to pairs of the datetime (in YYYY-MM-DD HH:MM, with the seconds stripped off) and the MAC address (see below for the PHP code): | ||
+ | |||
+ | <code bash> | ||
+ | php trim.php tcpdump.log > trimmed_tcpdump.log | ||
+ | </code> | ||
+ | |||
+ | On my laptop, this processes the log files at around 300k lines/second, so the whole log takes around 8 minutes. The resulting import file is reduced to approximately x million lines. | ||
+ | |||
+ | I created a simple mysql table to store the timestamp and mac address: | ||
+ | <code sql> | ||
+ | create table wifi_data (seen_time datetime, mac varchar(17), unique (seen_time,mac)); | ||
+ | </code> | ||
+ | |||
+ | Then I import this directly into the MySQL database using the mysql client: | ||
+ | <code sql> | ||
+ | load data infile 'trimmed_tcpdump.log' into table wifi_data; | ||
+ | </code> | ||
+ | |||
+ | If you have any trouble with this command, you might want to split the file into more manageable parts using ''split -l 1000000 trimmed_tcpdump.log'' | ||
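If you do split the file, each part needs its own import statement. The following is a minimal sketch rather than the method used here: it assumes the parts were written with a ''chunk_'' prefix, and the function name ''chunk_statements'' and the database name ''wifi'' are made up for illustration. It uses the ''local'' variant of the statement so the mysql client reads the files, and ''ignore'' so rows that collide with the unique key are skipped instead of aborting the import.

```shell
#!/bin/sh
# Print one LOAD DATA statement per chunk file matching the given prefix.
# The statements are only printed; pipe them to the mysql client to run them.
chunk_statements() {
    prefix="$1"
    for chunk in "$prefix"*; do
        # Skip the unexpanded pattern when no chunk files exist.
        [ -f "$chunk" ] || continue
        printf "load data local infile '%s' ignore into table wifi_data;\n" "$chunk"
    done
}

# Example usage (file prefix and database name are assumptions):
# split -l 1000000 trimmed_tcpdump.log chunk_
# chunk_statements chunk_ | mysql --local-infile=1 wifi
```

Printing the statements first makes it easy to check them before anything touches the database.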
+ | |||
+ | ==== trim.php ==== | ||
+ | <code php> | ||
+ | #!/usr/bin/php | ||
+ | <?php | ||
+ | if(empty($argv[1])) { | ||
+ | exit("Missing filename\n"); | ||
+ | } | ||
+ | $filename = $argv[1]; | ||
+ | $rawdata = array(); // unique (datetime, mac) pairs, keyed by datetime.mac | ||
+ | $handle = fopen($filename, "r"); | ||
+ | if ($handle) { | ||
+ | while (($line = fgets($handle)) !== false) { | ||
+ | $data = explode(" ", $line); | ||
+ | $datetime = date("Y-m-d H:i", strtotime($data[0]." ".substr($data[1],0, 8))); | ||
+ | // match six hex pairs, optionally separated by ":" or "-", followed by a space | ||
+ | preg_match_all("/(([a-fA-F0-9]{2}[:\-]?){6}) /", $line, $matches); | ||
+ | if(is_array($matches[0])) { | ||
+ | $macs = array_unique($matches[0]); | ||
+ | foreach($macs as $mac) { | ||
+ | $mac = trim($mac); | ||
+ | $rawdata[$datetime.$mac]['datetime'] = $datetime; | ||
+ | $rawdata[$datetime.$mac]['mac'] = $mac; | ||
+ | } | ||
+ | } | ||
+ | } | ||
+ | } else { | ||
+ | exit("Error opening file\n"); | ||
+ | } | ||
+ | |||
+ | foreach($rawdata as $datetime=>$val) { | ||
+ | echo $val['datetime']."\t".$val['mac']."\n"; | ||
} | } | ||
Line 32: | Line 90: | ||
===== Analysing the data ===== | ===== Analysing the data ===== | ||
I've made some graphs: | I've made some graphs: | ||
- | * [[https://starflyer.armchairscientist.co.uk/tmp/wifi3.php|General scan #2 - Sat July 6th 2019 8am-11:30am]] - All data grouped in unique MACs per 5 minute period | + | * [[https://starflyer.armchairscientist.co.uk/tmp/wifi3.php|General scan #2 - Sat July 6th 2019 8am-11:30am]] - All data grouped in unique MACs per 1 minute period |
- | * [[https://starflyer.armchairscientist.co.uk/tmp/wifi3.php|General scan #3 - Sat July 6th 2019 8am-11:30am]] - As above, known devices/equipment filtered - an example of identifying a group of passers (Orange walk outside my window at 10:40am) | + | * [[https://starflyer.armchairscientist.co.uk/tmp/wifi3.php|General scan #3 - Sat July 6th 2019 8am-11:30am]] - As above, with known devices/equipment (those seen in the preceding hours/days) filtered out - an example of identifying a group of passers-by (Orange walk outside my window at 10:40am) |
* [[https://starflyer.armchairscientist.co.uk/tmp/wifi4.php|General scan #4]] - Multiple days showing the weekend and spikes, at 8am and 5pm, of people passing by to and from work. | * [[https://starflyer.armchairscientist.co.uk/tmp/wifi4.php|General scan #4]] - Multiple days showing the weekend and spikes, at 8am and 5pm, of people passing by to and from work. | ||
- | I'm still working on analysing an entire uniterrupted month to get some general statistics on wifi use around my area. Updates and code to follow. | + | I'm still working on analysing an entire uninterrupted month to get some general statistics on wifi use around my area. Updates and code to follow. |
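The "unique MACs per 1 minute period" grouping used for the graphs can be approximated straight from the trimmed file, without the database. This is only a sketch under assumptions, not the code behind the graphs: the function name ''unique_macs_per_minute'' is made up, and the input is assumed to be the tab-separated ''datetime<TAB>mac'' output that trim.php prints.

```shell
#!/bin/sh
# Count distinct MAC addresses per minute in a tab-separated file of
# "YYYY-MM-DD HH:MM<TAB>mac" lines.
unique_macs_per_minute() {
    awk -F '\t' '
        # Count each (minute, mac) pair only the first time it is seen.
        !seen[$1 FS $2]++ { count[$1]++ }
        END { for (minute in count) print minute "\t" count[minute] }
    ' "$1" | sort
}

# Example usage (file name taken from the import step above):
# unique_macs_per_minute trimmed_tcpdump.log > macs_per_minute.tsv
```

The output is one line per minute with the minute and the number of distinct MAC addresses seen in it, sorted by time, which should be easy to feed into a plotting tool.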