Tl;dr
I wake up the fitness tracker once every 30s. Each time, it can receive 60 measurements from the Accelerometer sensor, and I don't have enough memory to log all of them.
In this blog post I will explore which data filters best represent a population, depending on what we are looking for.
For example, the mean and the variance summarize the data in a minimal size: instead of sending 30 data items, you send only 2 (the mean and the variance).
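A minimal sketch of how the mean and the variance could be computed without ever storing the samples, using Welford's running update. The class name `RunningStats` and the sample values are made up for illustration; on the tracker itself this would more likely be a few lines of C.

```python
class RunningStats:
    """Running mean and variance (Welford's method), one sample at a time."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the current mean

    def add(self, x):
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        # Population variance; 0.0 until at least one sample has been seen.
        return self._m2 / self.count if self.count else 0.0


stats = RunningStats()
for sample in [2, 4, 4, 4, 5, 5, 7, 9]:  # made-up accelerometer readings
    stats.add(sample)

summary = (stats.mean, stats.variance)  # the only two values to send
```

Only three numbers live in memory at any time (count, mean, and the running sum of squared deviations), whatever the number of measurements. For the eight made-up readings the summary comes out to a mean of 5 and a population variance of 4, up to floating-point rounding.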
So what are the properties we can keep in this data?
It depends on our main query:
- If we want to know whether a certain value exists, we use a Bloom Filter.
- If we are looking for outliers, we keep some of them.
- If we are searching for patterns, we can:
  - Differentiate the data (take the derivative) to find when it increases and when it decreases.
  - Cluster the data using a Similarity function, and report on the characteristics of the clusters.
  - Compress the data by removing redundant data.
  - Represent the data using a Model, like a Markov Chain.
- Other data may include: min, max, mean, variance, median, mad, Mod z-score, etc.
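To make the first item of the list concrete, here is a small Bloom filter sketch. It answers "have I seen this value?" with a fixed bit array: a "no" is always correct, a "yes" may occasionally be a false positive. The class name, the 256-bit size, the 3 hash positions, and the use of salted SHA-256 are all assumptions chosen for readability, not tuned for a real device.

```python
import hashlib


class BloomFilter:
    """Fixed-size membership filter: no false negatives, some false positives."""

    def __init__(self, size_bits=256, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + item).digest()
            yield int.from_bytes(digest[:4], "big") % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


seen = BloomFilter()
seen.add(b"face:4")                        # items are raw bytes
already_seen = seen.might_contain(b"face:4")
```

With these sizes the whole filter occupies 32 bytes, regardless of how many values are added; adding more values only raises the false-positive rate.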
My favorite ones are the Mod z-score for outlier detection, the mean, and features (I will write about those later).
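For reference, the modified z-score replaces the mean and standard deviation of the classic z-score with the median and the median absolute deviation (MAD), which makes it less sensitive to the outliers it is trying to find. The sketch below uses the commonly quoted 0.6745 scaling constant and 3.5 cutoff; the function names and the readings are made up, and returning zeros when the MAD is 0 is a simplification of mine.

```python
from statistics import median


def modified_z_scores(samples):
    """Modified z-score of each sample: 0.6745 * (x - median) / MAD."""
    med = median(samples)
    mad = median([abs(x - med) for x in samples])
    if mad == 0:
        # More than half the samples are identical: the score is undefined,
        # so this sketch simply reports no deviation.
        return [0.0 for _ in samples]
    return [0.6745 * (x - med) / mad for x in samples]


def outliers(samples, threshold=3.5):
    """Samples whose absolute modified z-score exceeds the threshold."""
    scores = modified_z_scores(samples)
    return [x for x, z in zip(samples, scores) if abs(z) > threshold]


readings = [10, 11, 10, 12, 11, 10, 50]  # made-up values with one spike
spikes = outliers(readings)
```

Here the median is 11 and the MAD is 1, so the spike at 50 scores far above the cutoff while every other reading stays below it.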
The utility
Compress the data. In IoT, the host (Main Unit) can get a lot of data from the sensors (e.g. Accelerometer, GPS, …) but has a small memory (16MB on average), so you have to optimize what you store:
- Using the Accelerometer data, I wanted to know which side of the fitness tracker was facing up. The raw data has 3 axes of two bytes each, so I ended up with a simple value of 3 bits to represent the 6 faces. Moreover, if the face doesn't change, don't report it.
- Using the
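The face-up reduction in the Accelerometer example above can be sketched as follows. The idea is that gravity dominates one axis when the device is at rest, so the axis with the largest magnitude and its sign identify the face. The numbering of the faces (0 to 5) and the names `face_code` and `FaceReporter` are assumptions for illustration; the mapping to the physical sides of a given tracker depends on how the sensor is mounted.

```python
def face_code(x, y, z):
    """Map a raw 3-axis sample to a face code in 0..5 (fits in 3 bits)."""
    axes = (x, y, z)
    # The axis carrying most of gravity tells which face points up or down.
    dominant = max(range(3), key=lambda i: abs(axes[i]))
    return dominant * 2 + (1 if axes[dominant] < 0 else 0)


class FaceReporter:
    """Report a face code only when it differs from the last one reported."""

    def __init__(self):
        self.last = None

    def update(self, x, y, z):
        code = face_code(x, y, z)
        if code == self.last:
            return None  # nothing new to log or send
        self.last = code
        return code


reporter = FaceReporter()
first = reporter.update(0, 0, 16384)     # z dominant and positive
repeat = reporter.update(10, -20, 16000)  # same face again
flipped = reporter.update(0, 0, -16384)   # z dominant and negative
```

Six bytes of raw data become a 3-bit code, and a device lying still produces no reports at all after the first one.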
Privacy by design. Using filters at the very beginning means you collect only what you need, and you don't expose the user's data to adversaries.
Real time decisions. Filters can give you simple data formats upon which you can make decisions in the IoT device itself. Here are some examples:
- When the Accelerometer detects motion, activate the GPS. This can be found as a hardware feature called WOM (Wake on Motion), implemented inside the Accelerometer itself using the DMP (Digital Motion Processor); the aim is to offload the host processor (consuming only 68uA instead of 10mA).
- When the GPS locations stay in the same place for a long time, deactivate the GPS.
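The two rules above amount to a small state machine, sketched below. Everything specific in it is an assumption: the class name `GpsPowerPolicy`, the 20 m radius and 300 s timeout, and the use of positions already projected to local metres (a real device would convert latitude and longitude first). It models only the decision logic, not any sensor or driver API.

```python
import math


class GpsPowerPolicy:
    """Turn GPS on when motion is detected, off after staying put too long."""

    def __init__(self, still_radius_m=20.0, still_timeout_s=300.0):
        self.still_radius_m = still_radius_m
        self.still_timeout_s = still_timeout_s
        self.gps_on = False
        self._anchor = None  # (x_m, y_m, t_s) where the current stay began

    def on_motion(self):
        # Called from the accelerometer's motion event.
        self.gps_on = True
        self._anchor = None

    def on_gps_fix(self, x_m, y_m, t_s):
        if not self.gps_on:
            return
        if self._anchor is None:
            self._anchor = (x_m, y_m, t_s)
            return
        ax, ay, at = self._anchor
        if math.hypot(x_m - ax, y_m - ay) > self.still_radius_m:
            self._anchor = (x_m, y_m, t_s)  # moved away: restart the timer
        elif t_s - at >= self.still_timeout_s:
            self.gps_on = False             # same place for too long
            self._anchor = None


policy = GpsPowerPolicy()
policy.on_motion()
policy.on_gps_fix(0.0, 0.0, 0.0)
policy.on_gps_fix(5.0, 5.0, 100.0)
still_on = policy.gps_on
policy.on_gps_fix(3.0, 4.0, 300.0)
now_off = not policy.gps_on
```

Because the policy only consumes two kinds of events, the host can stay asleep between them.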
Finally, here is my wish: all of this would be simpler with the ReactiveX library, as it has pre-built data reducers called Observable Operators (e.g. Debounce, Reduce, Distinct).
My only wish is to find a simple C implementation that works with Amazon FreeRTOS. It would simplify the design of all the operations (Observe => Map => Filter => Combine) and give us a normalized way to solve these problems and communicate about them.
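To give a feel for what such a chain of operators looks like, here is a rough imitation using plain Python generators. It is not the ReactiveX API and not a C or FreeRTOS implementation: the function names, the noise floor of 100, and the sample values are all hypothetical, and `distinct_until_changed` only mimics the idea of dropping consecutive repeats.

```python
from functools import reduce


def distinct_until_changed(source):
    """Drop a value when it equals the one emitted just before it."""
    previous = object()  # sentinel that equals no real sample
    for value in source:
        if value != previous:
            yield value
            previous = value


def pipeline(raw_samples, noise_floor=100):
    """Observe => Map => Filter => drop consecutive repeats, lazily."""
    magnitudes = (abs(s) for s in raw_samples)              # Map
    active = (m for m in magnitudes if m >= noise_floor)    # Filter
    return distinct_until_changed(active)


raw = [5, -120, 120, 300, -300, 40, 150]   # made-up sensor readings
events = list(pipeline(raw))
peak = reduce(max, events)                  # Reduce to a single value
```

Each stage pulls one sample at a time from the stage before it, so nothing is buffered, which is the property that makes this style attractive on a device with little memory.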