projects:wifi_scanner

Differences

This shows you the differences between two versions of the page.

projects:wifi_scanner [2020/01/06 23:31]
neil
projects:wifi_scanner [2020/08/03 16:11] (current)
admin
Line 1: Line 1:
 ====== Wifi Scanner ======
 +**Goals:** This project has two goals. I wanted to detect who was in the flat at any given time (see my [[Dynamic photo frame]] project), and I also wanted to see whether I could detect spikes of activity nearby (such as a protest walking past the building).
  
-Basic steps.  Identify your wifi device (in my case wlp3s0), the enter monitor mode, use tcpdump to capture mac addresses and a short php script to switch between available channels:
 +==== Data ====
 +    * [[Projects/Wifi/Scan 1]]: First preliminary results from a full-week wifi scan of my local area (11th Sep 2019 to 19th Sep 2019)
 +    * [[Projects/Wifi/Scan 2]]: Full two-month scan from 1st Nov 2019 to 6th Jan 2020, looking for general daily activity
 +    * [[Projects/Wifi/Scan 3]]: COVID-19: are people self-isolating? Wifi activity for March 2020
 + 
 +==== Setting up the hardware ==== 
 +I have a dedicated mini PC for this.  It's sitting on the window ledge of my living room with a PCI-e 5G wifi card and multiple external antennas. 
 + 
 +**Basic steps:** Identify your wifi device (in my case wlp3s0), then put it into monitor mode, use tcpdump to capture mac addresses, and run a short php script to switch between the available channels:
 <code bash>
 sudo ip link set wlp3s0 down
Line 32: Line 41:
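The channel-switching script itself is not shown in this comparison. As a rough sketch only, the same idea can be written as a shell loop around ''iw dev <device> set channel <n>''. The function name ''hop_channels'', the ''DRY_RUN'' switch, the dwell time and the channel list are all assumptions for illustration; this is not the author's php script.

```shell
#!/bin/sh
# Hypothetical sketch: cycle a monitor-mode interface through a list of channels.
# hop_channels, DRY_RUN and the argument order are illustrative names only.

# Usage: hop_channels <iface> <dwell_seconds> <rounds> <channel>...
# The body runs in a subshell "( ... )" so its variables stay local.
hop_channels() (
    iface="$1"; dwell="$2"; rounds="$3"
    shift 3
    round=0
    while [ "$round" -lt "$rounds" ]; do
        for channel in "$@"; do
            if [ "${DRY_RUN:-0}" = "1" ]; then
                # Only print the command; handy for checking the channel list.
                echo "iw dev $iface set channel $channel"
            else
                sudo iw dev "$iface" set channel "$channel"
                sleep "$dwell"
            fi
        done
        round=$((round + 1))
    done
)
```

For example, ''hop_channels wlp3s0 2 1000 1 6 11'' would step through channels 1, 6 and 11 with a two-second dwell while tcpdump runs in another terminal. Which channels are valid depends on the card and regulatory domain.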
  
 ===== Importing the data =====
-I import the raw tcpdump logs (just a timestamp and mac address) into simple mysql table:
 +The raw tcpdump logs are pretty large and full of redundant information: around a month of wifi scanning produces around 128 million lines of data (18.6GB).
-<code sql> +
-create table wifi_data ​(seen_time datetime, mac varchar(17),​ unique (seen_time,​mac))+
-</​code>​+
  
-These files are pretty large - for around a month of wifi scanning data is around 128.4 million lines of data (18.6Gb).  I run the following code to simplify the logs to just pairs of the datetime (in YYYY-MM-DD HH:MM) and the mac address:
 +I run the following code to simplify the logs to just pairs of the datetime (in YYYY-MM-DD HH:MM, with the seconds stripped off) and the mac address (see below for the php code):
  
 <code bash>
Line 43: Line 49:
 </code>
  
-This takes around 12 minutes which reduces the number of lines of data to around 20 million.  Then I import this directly to the mysql database using the mysql client:
 +On my laptop, this processes the log files at around 300k lines/second, so the whole run takes around 8 minutes. The resulting import file is reduced to approximately 3.8 million lines.
 + 
 +I created a simple mysql table to store the timestamp and mac address: 
 +<code sql> 
 +create table wifi_data (seen_time datetime, mac varchar(17), unique (seen_time,mac));
 +</code>
 + 
 +Then I import this directly to the mysql database using the mysql client:
 <code sql>
 load data infile 'trimmed_tcpdump.log' into table wifi_data;
 </code>
  
-If you have any trouble with this command, you might want to split the file into more managble parts using ''split -l 1000000 trimmed_tcpdump.log''
 +Once all the data was imported, I added an index to the mac address column:
 +<code sql> 
 +alter table wifi_data add index idx_mac(mac);
 +</code>
 + 
 +If you have any trouble with the ''load data infile'' command, you might want to split the file into more manageable parts using ''split -l 1000000 trimmed_tcpdump.log''
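The split-then-import idea can be sketched as a small helper that splits the trimmed log and prints one ''load data infile'' statement per piece. This is a sketch under assumptions: the function name ''chunk_load_sql'', the ''chunk_'' prefix and the output directory are made up for illustration, and the mysql server must be allowed to read files from that directory.

```shell
#!/bin/sh
# Hypothetical helper: split the trimmed log and print one LOAD DATA statement per chunk.
# chunk_load_sql and the "chunk_" prefix are illustrative names only.

# Usage: chunk_load_sql <trimmed_log> <lines_per_chunk> <chunk_dir>
chunk_load_sql() (
    log="$1"; lines="$2"; dir="$3"
    mkdir -p "$dir" || exit 1
    # split names the pieces chunk_aa, chunk_ab, ... inside $dir
    split -l "$lines" "$log" "$dir/chunk_" || exit 1
    for chunk in "$dir"/chunk_*; do
        # The server reads the file itself, so give it an absolute path.
        abs="$(cd "$(dirname "$chunk")" && pwd)/$(basename "$chunk")"
        echo "load data infile '$abs' into table wifi_data;"
    done
)
```

The output can be reviewed first and then piped to the client, for example ''chunk_load_sql trimmed_tcpdump.log 1000000 chunks | mysql wifi'', where ''wifi'' is a placeholder database name.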
  
 ==== trim.php ====
-TBC
 +<code php>
 +#!/usr/bin/php
 +<?php
 +// Reduce a raw tcpdump log to unique "YYYY-MM-DD HH:MM<TAB>mac" pairs.
 +if (empty($argv[1])) {
 +    exit("Missing filename\n");
 +}
 +$filename = $argv[1];
 +$rawdata = array(); // keyed by datetime + mac, so duplicates collapse
 +$handle = fopen($filename, "r");
 +if ($handle) {
 +    while (($line = fgets($handle)) !== false) {
 +        $data = explode(" ", $line);
 +        // Parse the date plus HH:MM:SS; the "H:i" format then drops the seconds
 +        $datetime = date("Y-m-d H:i", strtotime($data[0]." ".substr($data[1], 0, 8)));
 +        // Collect every mac address on the line that is followed by a space
 +        preg_match_all("/(([a-fA-F0-9]{2}[:|\-]?){6}) /", $line, $matches);
 +        if (is_array($matches[0])) {
 +            $macs = array_unique($matches[0]);
 +            foreach ($macs as $mac) {
 +                $mac = trim($mac);
 +                $rawdata[$datetime.$mac]['datetime'] = $datetime;
 +                $rawdata[$datetime.$mac]['mac'] = $mac;
 +            }
 +        }
 +    }
 +    fclose($handle);
 +} else {
 +    exit("Error opening file\n");
 +}
 +
 +// One tab-separated line per unique (datetime, mac) pair
 +foreach ($rawdata as $val) {
 +    echo $val['datetime']."\t".$val['mac']."\n";
 +}
 +
 +?>
 +</code>
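Before importing, it can be worth checking that every line of the trimmed file really has the two tab-separated fields the script writes. The helper below is a minimal sketch: the name ''count_bad_lines'' is made up, and it assumes mac addresses separated by colons or hyphens, so an address written without separators would be counted as bad.

```shell
#!/bin/sh
# Hypothetical check: count lines that do not look like
# "YYYY-MM-DD HH:MM<TAB>mac". count_bad_lines is an illustrative name only.

# Usage: count_bad_lines <trimmed_log>
count_bad_lines() {
    awk -F '\t' '
        NF != 2 ||
        $1 !~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] [0-9][0-9]:[0-9][0-9]$/ ||
        $2 !~ /^([0-9A-Fa-f][0-9A-Fa-f][:-])+[0-9A-Fa-f][0-9A-Fa-f]$/ { bad++ }
        END { print bad + 0 }
    ' "$1"
}
```

Running ''count_bad_lines trimmed_tcpdump.log'' prints the number of lines that fail the check. The mac pattern is deliberately loose (it does not insist on exactly six groups), so treat a zero as a sanity check rather than a proof.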
  
 ===== Analysing the data =====
projects/wifi_scanner.1578353484.txt.gz · Last modified: 2020/01/06 23:31 by neil