projects:wifi_scanner [2020/01/06 23:42] neil
projects:wifi_scanner [2020/01/06 23:53] neil
===== Importing the data =====
The raw tcpdump logs are pretty large and full of redundant information - for around a month of wifi scanning it records around 128 million lines of data (18.6 GB).
I run the following code to simplify the logs to just pairs of the datetime (in YYYY-MM-DD HH:MM - I strip off the seconds) and the mac address (see below for the php code):
</code>
On my laptop, this processes the log files at around 300k lines/second, so the whole month takes around 8 minutes. The resulting import file is reduced to approximately 3.8 million lines.
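As a rough illustration of that trimming step (the actual script is in PHP, shown later on this page), a Python sketch might look like the following. The line shape and the ''SA:'' source-address field are assumptions on my part - tcpdump's output format depends on the options used:

<code python>
import re

# Assumed tcpdump monitor-mode line shape (varies with capture options):
#   "23:42:01.123456 ... SA:aa:bb:cc:dd:ee:ff ..."
MAC_RE = re.compile(r'SA:([0-9A-Fa-f]{2}(?::[0-9A-Fa-f]{2}){5})')

def trim_line(line, day):
    """Reduce one log line to 'YYYY-MM-DD HH:MM<tab>mac', dropping seconds."""
    m = MAC_RE.search(line)
    if m is None:
        return None                      # no source MAC on this line
    return f"{day} {line[:5]}\t{m.group(1).lower()}"

print(trim_line("23:42:01.123456 ... SA:AA:BB:CC:DD:EE:FF ...", "2020-01-06"))
</code>

Emitting tab-separated fields like this keeps the output directly loadable by MySQL's default ''load data infile'' settings.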
I created a simple mysql table to store the timestamp and mac address:
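(The create statement itself isn't shown in this hunk; a minimal schema consistent with that description might be the following - the column names are my assumption:)

<code sql>
-- hypothetical minimal schema: a datetime plus a mac address string
create table wifi_data (
    ts  datetime not null,
    mac char(17) not null
);
</code>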
<code sql>
load data infile 'trimmed_tcpdump.log' into table wifi_data;
</code>

Once I had imported all the data, I added an index to the mac address column:
<code sql>
alter table simple_data add index idx_mac(mac);
</code>
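With that index in place, per-device lookups stay quick even across millions of rows - for example (using the table loaded above; the mac value is a placeholder):

<code sql>
-- how many minute-slots was a given device seen in?
select count(*) from wifi_data where mac = 'aa:bb:cc:dd:ee:ff';
</code>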