Thursday, June 7, 2018

digital communications - Extracting Binary Magnetic-Strip Card Data from raw WAV


I am faced with a tricky challenge: extracting binary data from an iPhone magnetic strip card reader. This is what the magnetisation on the card looks like:


enter image description here
Source


Here is the .WAV the iPhone receives when you swipe a card (don't get your hopes up too much, it is a bonus loyalty card ;)). That's three swipes by the way, at different speeds. This is the raw SInt16 dump for the swipe I am using.


Someone seems to have done it here but the actual data I capture isn't particularly easy to process.


The reading starts (and finishes) with an indeterminate number of zeros. Note that the wave only repeats after two zeros have been collected; together they represent N-S followed by S-N:


enter image description here


(note that each of the three lines represents me swiping a different card; the bottom card in this image is 15 years old, so the magnetic field is clearly severely degraded in some places, not visible in this shot)



These leading zeros will allow an algorithm to ascertain a clock tick.


The magnetic field reverses on each clock tick. Also for a binary 1, the magnetic field reverses exactly in the middle of a tick:


enter image description here
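To make the encoding rule concrete, here is a minimal sketch that generates an idealised waveform from a bit list: the level flips at every clock tick, and flips once more mid-cell for a 1. The function name `f2f_encode` and its parameters are hypothetical, and a real reading is of course not a clean square wave.

```python
def f2f_encode(bits, samples_per_bit=8, start_level=1):
    """Return a list of +1/-1 levels for an idealised swipe (illustrative only)."""
    if samples_per_bit % 2:
        raise ValueError("samples_per_bit must be even")
    half = samples_per_bit // 2
    level = start_level
    out = []
    for bit in bits:
        level = -level            # reversal at every clock tick (cell boundary)
        out.extend([level] * half)
        if bit:
            level = -level        # extra reversal in the middle of the cell marks a 1
        out.extend([level] * half)
    return out
```

With this rule a zero occupies one full-length interval between reversals, while a one occupies two half-length intervals.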


The sequence always starts with a 1101+0(parity bit) start sentinel. You can pick this out in all three readings in the above graph. It is indicated more clearly in the cosmodro article I linked at the top of the question.
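Once a bit list is available, the sentinel can be located and the characters unpacked. The sketch below assumes the usual 5-bit track 2 layout (four data bits sent least significant bit first, followed by an odd parity bit, with character = '0' + value); that layout is an assumption about this particular card, and the function names are hypothetical.

```python
START_SENTINEL = [1, 1, 0, 1, 0]   # ';' sent LSB first, plus its parity bit


def find_start(bits):
    """Return the index of the first start sentinel, or -1 if none is found."""
    for i in range(len(bits) - 4):
        if bits[i:i + 5] == START_SENTINEL:
            return i
    return -1


def decode_5bit(bits):
    """Decode 5-bit groups (beginning at the start sentinel) into characters."""
    chars = []
    for i in range(0, len(bits) - 4, 5):
        group = bits[i:i + 5]
        if sum(group) % 2 != 1:            # each group must hold an odd number of 1s
            raise ValueError(f"parity error in group at bit {i}")
        value = group[0] | group[1] << 1 | group[2] << 2 | group[3] << 3
        chars.append(chr(0x30 + value))
        if chars[-1] == "?":               # end sentinel: stop here
            break
    return "".join(chars)
```

Anything after the end sentinel (such as a checksum character) is ignored in this sketch.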


Here is an example of magnetic degradation (taken further along on the bottom card reading): enter image description here


I am trying to figure out a sensible way to convert this waveform into its corresponding binary sequence.


I have found one PDF that goes into some detail, but I can't figure out the algorithm they are using.


This PDF contains one interesting image: enter image description here


If I could extract the red and blue lines as per this diagram, I could use either one of them to extract the data, but I can't figure out the logic behind the construction.


So this is my question: How do I extract the binary sequence?



PS. Note that the swipe speed is not going to be constant. So once the clock has been determined, it needs to be constantly adjusted from one tick to the next.


PPS. Would autocorrelation catch pairs of ticks? (seeing as ticks will alternate N-S S-N ... )




EDIT (June '12): I needed a lot of help on this one, but I have finally completed a solid reader ( http://www.magstripedecoder.com/ ). Thanks to everyone who helped! I recommend the #musicdsp channel on IRC's EFnet for anyone dedicated enough to take on the challenge of getting to grips with the maths. It's really, really hard!



Answer



This was quite a challenge. I tried at least four approaches before cracking it. This is how I did it:


enter image description here


I start by smoothing the data (first reading) with a simple...


x_new = 0.9 * x_prev + 0.1 * x_in


... IIR filter. I do this in both directions (second reading). This gets rid of all the fuzzy noise; however, it creates discontinuities which come back with a vengeance in the derivatives.
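A minimal sketch of that smoothing step, assuming the samples are held in a plain Python list. The names `smooth` and `smooth_both_ways` and the `alpha` parameter are mine; `alpha=0.1` reproduces the 0.9 / 0.1 weights above.

```python
def smooth(samples, alpha=0.1):
    """One-pole low-pass: y[n] = (1 - alpha) * y[n-1] + alpha * x[n]."""
    out = []
    y = samples[0] if samples else 0.0   # seed the state with the first sample
    for x in samples:
        y = (1.0 - alpha) * y + alpha * x
        out.append(y)
    return out


def smooth_both_ways(samples, alpha=0.1):
    """Run the filter forwards, then backwards over the result."""
    forward = smooth(samples, alpha)
    backward = smooth(forward[::-1], alpha)
    return backward[::-1]
```

Running the second pass in reverse cancels the delay introduced by the first pass, so the smoothed features stay aligned with the raw reading.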


I then get all derivatives up to the fourth (third and fourth readings represent the third and fourth derivative), and create a new function:


g(x) = f'''(x)^2 + k*f''''(x)^2

Why? Because I noticed that by the time we get to the third derivative, what we have is effectively a sinusoid inside an envelope:


enter image description here


...and everyone knows from high school that:


sin^2 + cos^2=1 

enter image description here



and that sin and cos differentiate into each other:


d/dx sin(x) = cos(x),  d/dx cos(x) = -sin(x)


Hence the implied envelope can be recovered.


Why derivatives 3 and 4? Basically, each higher derivative purifies the signal. That which is sinusoidal remains sinusoidal (it just shifts phase by 90°, so sin->cos etc.), whereas that which isn't falls away.


I wanted to use 11 & 12 or something crazy, but the derivatives fall apart quite quickly. 4 is the highest I can get before things go haywire, and even then the little derivative lines you see in the picture are heavily smoothed.


This produces a wonderful little bump on every flux transition (fifth reading).
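The construction of g can be sketched with plain central differences. This is an illustration only: it omits the heavy smoothing of the derivatives mentioned above, and the names `diff` and `transition_energy` are hypothetical. The constant k compensates for the amplitude difference between consecutive derivatives, so its value depends on the dominant frequency in the reading.

```python
def diff(samples):
    """Central-difference derivative estimate with the same length as the input."""
    n = len(samples)
    if n < 3:
        return [0.0] * n
    out = [0.0] * n
    for i in range(1, n - 1):
        out[i] = (samples[i + 1] - samples[i - 1]) / 2.0
    out[0] = out[1]      # copy neighbours at the edges
    out[-1] = out[-2]
    return out


def transition_energy(samples, k=1.0):
    """g = f'''^2 + k * f''''^2, evaluated sample by sample."""
    d3 = samples
    for _ in range(3):
        d3 = diff(d3)
    d4 = diff(d3)
    return [a * a + k * b * b for a, b in zip(d3, d4)]
```

For a pure sinusoid and a suitably chosen k, the two squared terms add up to a constant, which is the sin^2 + cos^2 identity at work.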


Next I walk through the turning points, rejecting duds (sixth reading).


Finally I walk through the maxima (seventh reading), evaluating whether each skip is a half step or a whole step, and then reconstruct the binary.
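The last step might look roughly like the sketch below. It assumes the positions of the surviving maxima are already known and that the reading begins with a few zeros to seed the clock; the 0.75 threshold and the 50/50 clock update are arbitrary choices of mine, not values from my decoder.

```python
def decode_transitions(peaks, leading_zeros=4):
    """Turn sorted flux-transition positions into a list of bits.

    The first `leading_zeros` intervals are assumed to be zeros and are
    used to estimate the initial clock period.
    """
    intervals = [b - a for a, b in zip(peaks, peaks[1:])]
    if len(intervals) < leading_zeros:
        raise ValueError("not enough transitions to seed the clock")
    clock = sum(intervals[:leading_zeros]) / leading_zeros
    bits = [0] * leading_zeros
    i = leading_zeros
    while i < len(intervals):
        step = intervals[i]
        if step > 0.75 * clock:          # whole step: a zero
            bits.append(0)
            cell = step
            i += 1
        else:                            # half step: a one needs two of them
            if i + 1 >= len(intervals):
                break                    # dangling half step at the end
            bits.append(1)
            cell = step + intervals[i + 1]
            i += 2
        clock = 0.5 * clock + 0.5 * cell  # follow changes in swipe speed
    return bits
```

Updating the clock after every bit is what lets the decoder cope with a swipe that speeds up or slows down.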


Yay!


EDIT: It is now several months since I completed this project. The most difficult challenge is to construct some transform that isolates flux transitions; technically speaking, 'retrieving the amplitude envelope'. This is done by constructing the π/2 phase-shifted signal from the original (also known as the quadrature signal). Then E(t)^2 = S(t)^2 + Q(S(t))^2, where S is the signal, Q(S) is its quadrature counterpart and E is the envelope.



To get the quadrature signal, I simply did an FFT, and rotated each bin a quarter turn, then recombined the modified spectral components.
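As a rough illustration, here is a standard-library sketch using a naive O(n^2) DFT (a real implementation would use an FFT). Positive-frequency bins are rotated by -90°, negative-frequency bins by +90°, and the DC and Nyquist bins are zeroed. The function names are hypothetical.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive discrete Fourier transform, adequate for short illustrative inputs."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def quadrature(signal):
    """Return the signal with every spectral component shifted by a quarter turn."""
    n = len(signal)
    rotated = []
    for k, c in enumerate(dft(signal)):
        if k == 0 or 2 * k == n:
            rotated.append(0j)            # DC and Nyquist have no quadrature part
        elif 2 * k < n:
            rotated.append(-1j * c)       # positive frequencies: rotate by -90 degrees
        else:
            rotated.append(1j * c)        # negative frequencies: rotate by +90 degrees
    return [v.real for v in dft(rotated, inverse=True)]


def envelope_squared(signal):
    """E^2 = S^2 + Q(S)^2, sample by sample."""
    return [s * s + q * q for s, q in zip(signal, quadrature(signal))]
```

For a cosine that fits a whole number of cycles into the window, the quadrature signal is the matching sine and the squared envelope is flat.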


There is a lot of confusing and abused terminology in this field; keywords are 'analytic signal', 'Hilbert transform'... I've avoided using those keywords as different disciplines assign different meanings to them.


There is a much smarter way of achieving this amplitude envelope using digital filters, thus avoiding the Fourier transform. This allows the algorithm to run on very low powered microcontrollers.


This process produces a waveform that should have a unique bump over each flux transition.


Decoding this waveform into a binary sequence is still a nontrivial task. The complexity in this component is algorithmic rather than mathematical; the difficulty is comparable.


All in all this is an extremely difficult problem. It took me the best part of three months to achieve a high-performance algorithm. I will, in the fullness of time, document my approach and produce a publicly available decoder engine.

