
Wifi Scanner

Goals: I had two goals for this project. I wanted to detect who was in the flat at any given time (see my Dynamic photo frame project), and I also wanted to see if I could detect spikes of activity nearby (a protest walking past the building, for example).

Data

Setting up the hardware

I have a dedicated mini PC for this. It sits on the window ledge of my living room with a PCI-e dual-band (2.4/5 GHz) wifi card and multiple external antennas.

Basic steps: identify your wifi device (in my case wlp3s0), then enter monitor mode, use tcpdump to capture MAC addresses, and run a short PHP script to switch between the available channels:

sudo ip link set wlp3s0 down
sudo iw wlp3s0 set monitor control
sudo ip link set wlp3s0 up
sudo tcpdump -i wlp3s0 -e -ttttnn > tcpdump.log
sudo ./channel_changer.php

You can find out which $channels your device supports using iw list (it's the numbers in the 'Frequencies' section). There are around 14 channels on older 2.4 GHz-only g/n devices and many more on cards that also cover the 5 GHz band. I switch channel every 0.2 seconds (the same as Kismet's default channel hop time), which seems to work fine.

channel_changer.php

#!/usr/bin/php
<?php
// Channels supported by the card (from `iw list`): 2.4 GHz (1-13) and 5 GHz (36-165).
$channels = array(
  1,2,3,4,5,6,7,8,9,10,11,12,13,36,40,44,48,52,56,60,64,100,104,
  108,112,116,120,124,128,132,136,140,149,153,157,161,165);

// Hop through every channel forever, pausing 0.2 seconds on each one.
while(true) {
    foreach($channels as $channel) {
        echo "Setting channel: #$channel\n";
        exec("iw dev wlp3s0 set channel $channel");
        usleep(200000); // 200,000 microseconds = 0.2 seconds
    }
}
 
?>

Importing the data

The raw tcpdump logs are pretty large and full of redundant information - around a month of wifi scanning produces roughly 128 million lines of data (18.6 GB).

I run the following code to reduce the logs to just pairs of the datetime (as YYYY-MM-DD HH:MM - I strip off the seconds) and the MAC address (see below for the PHP code):

php trim.php tcpdump.log > trimmed_tcpdump.log

On my laptop this processes the log files at around 300k lines/second, so the full month's log takes around 8 minutes. The resulting import file is reduced to approximately 3.8 million lines.

I created a simple MySQL table to store the timestamp and MAC address:

CREATE TABLE wifi_data (seen_time datetime, mac VARCHAR(17), UNIQUE (seen_time,mac));

Then I import this directly into the MySQL database using the mysql client:

LOAD DATA INFILE 'trimmed_tcpdump.log' INTO TABLE wifi_data;

Once all the data was imported, I added an index to the MAC address column:

ALTER TABLE wifi_data ADD INDEX idx_mac(mac);

If you have any trouble with the LOAD DATA import, you might want to split the file into more manageable parts using split -l 1000000 trimmed_tcpdump.log

trim.php

#!/usr/bin/php
<?php
if(empty($argv[1])) {
    exit("Missing filename\n");
}
$filename = $argv[1];

$rawdata = array();
$handle = fopen($filename, "r");
if ($handle) {
    while (($line = fgets($handle)) !== false) {
        // tcpdump -tttt prints "YYYY-MM-DD HH:MM:SS.uuuuuu ..." - take the
        // date and the HH:MM:SS part, then format down to minute precision.
        $data = explode(" ", $line);
        $datetime = date("Y-m-d H:i", strtotime($data[0]." ".substr($data[1], 0, 8)));

        // Pull out every MAC address on the line (six hex pairs separated
        // by ':' or '-', followed by a space).
        preg_match_all("/(([a-fA-F0-9]{2}[:\-]?){6}) /", $line, $matches);
        if(!empty($matches[0])) {
            $macs = array_unique($matches[0]);
            foreach($macs as $mac) {
                // Key on datetime+mac so each pair is only output once.
                $mac = trim($mac);
                $rawdata[$datetime.$mac]['datetime'] = $datetime;
                $rawdata[$datetime.$mac]['mac'] = $mac;
            }
        }
    }
    fclose($handle);
} else {
    exit("Error opening file\n");
}

// Output tab-separated datetime/MAC pairs, ready for LOAD DATA INFILE.
foreach($rawdata as $val) {
    echo $val['datetime']."\t".$val['mac']."\n";
}

?>
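
With the MAC index in place, the flat-presence side of things (the photo frame goal above) boils down to a single query: has a given device been seen in the last few minutes? Here's a minimal sketch of that check - the script name, database name and credentials are placeholders, and the ten-minute window is just a guess at a sensible threshold:

presence_check.php

#!/usr/bin/php
<?php
// Sketch only: is a given MAC address currently in the flat?
// Prints "Present" if the device was seen in the last 10 minutes.
if(empty($argv[1])) {
    exit("Usage: presence_check.php <mac address>\n");
}
$mac = trim($argv[1]);

// Placeholder DSN and credentials - change to match your setup.
$db = new PDO("mysql:host=localhost;dbname=wifi", "wifi_user", "secret");
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// The idx_mac index keeps this lookup cheap even with millions of rows.
$stmt = $db->prepare("SELECT MAX(seen_time) FROM wifi_data
                      WHERE mac = ? AND seen_time > NOW() - INTERVAL 10 MINUTE");
$stmt->execute(array($mac));
$last_seen = $stmt->fetchColumn();

if($last_seen) {
    echo "Present (last seen $last_seen)\n";
} else {
    echo "Not seen in the last 10 minutes\n";
}
?>

Run it with the device's MAC address as the argument and it reports the last time that device was seen within the window.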

Analysing the data

I've made some graphs.

I'm still working on analysing an entire uninterrupted month to get some general statistics on wifi use around my area. Updates and code to follow.
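
In the meantime, the kind of query involved is straightforward: count distinct MAC addresses per time bucket and look for unusually busy periods. A rough sketch (a hypothetical script, again with placeholder database credentials) that prints an hourly device count you can paste straight into a spreadsheet or gnuplot:

activity_counts.php

#!/usr/bin/php
<?php
// Sketch only: hourly count of distinct devices seen, for spotting
// activity spikes. Database name and credentials are placeholders.
$db = new PDO("mysql:host=localhost;dbname=wifi", "wifi_user", "secret");
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$sql = "SELECT DATE_FORMAT(seen_time, '%Y-%m-%d %H:00') AS hour,
               COUNT(DISTINCT mac) AS devices
        FROM wifi_data
        GROUP BY hour
        ORDER BY hour";

// Tab-separated output: hour, number of distinct MAC addresses seen.
foreach($db->query($sql) as $row) {
    echo $row['hour']."\t".$row['devices']."\n";
}
?>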