[ntp:questions] Accuracy of audio tones via VOIP

unruh unruh at invalid.ca
Mon Jul 15 19:27:12 UTC 2013

On 2013-07-15, Robert Scott <no-one at notreal.invalid> wrote:
> On Mon, 15 Jul 2013 15:51:25 GMT, unruh <unruh at invalid.ca> wrote:
>>On 2013-07-15, Robert Scott <no-one at notreal.invalid> wrote:
>>> On Wed, 10 Jul 2013 17:57:40 GMT, unruh <unruh at invalid.ca> wrote:
>>>>But again, why does the OP not just measure it? I guess he is hoping
>>>>that someone already had done his work for him.
>>> There are several reasons why I have not "just measured it".  One is
>>> that I do not have a Skype subscription.  Another reason is that a
>>> single measurement on a single computer means nothing.  If the sound
>>> card in that computer happens to have a crystal oscillator that is set
>>> very close to its nominal frequency, the frequency reproduction error
>>> would be nearly zero, even if Skype did nothing to compensate for it.
>>Neither Skype nor anything else will compensate for the sound card
>>frequency. No one in that field cares about a few cents (100ths of a
>>semitone) difference
>>in frequencies.
> I agree that pitch accuracy itself is not important to Skype et al.
> But indirectly a mismatch between the originating audio sampling rate
> and the playback sampling rate results in an ever increasing buffer
> overflow or underflow that could only be corrected by dropping packets

And there are lots of silent packets that could be dropped in speech.
Or at times the sound is pretty garbled anyway. 

> or inserting extra packets.  Perhaps you are right and that is all
> they do.  I suppose if the packets are small enough that would not

That is why an experiment would be useful.

> interfere noticeably with speech.  But for my purposes it would result
> in occasional phase jumps in the recovered tones, which is just as
> problematic for me as an inaccurate playback sample rate.  I am

Sure is. Of course you could also compensate for those. 
By the way, there will be a rate difference between the clock on the
soundcard and the clock in the computer as well. It is the computer that
determines the frequency, while it is the soundcard that digitizes it.
So what do you do about that mismatch in your software?

> leaning heavily toward simply recommending that my users not use any
> VOIP connection to do their calibration.  That would be consistent
> with all the advice I have gotten here so far.
>>What in the world makes you think skype
>>corrects anything? They have a desperate time of it getting most of the
>>packets to you, never mind correcting for frequency errors in a
>>soundcard. (and how would they know that the soundcard had frequency
>>errors to correct?)
>>And what is the range of frequency errors in the soundcards?
> I described earlier the one possible method whereby they might know
> the soundcard playback rate error.  If their software running on my
> computer is able to monitor the long-term trend of the number of samples
> in their buffer, that trend would directly correlate with the audio
> playback sampling rate error.  The trouble is that the fluctuation in the
> number of samples buffered is, in the short run, more dependent on
> random internet latency than it is on playback rate error.  Only over
> very long time periods would the internet randomness be dominated by
> the playback rate error.  But most phone calls do not last long enough
> to establish any meaningful measurement, which is why I doubt that they
> use such a method.  However I am not so bold as to assume that just
> because I am not clever enough to figure out how VOIP might correct
> for playback rate errors, that there is no such method, thus my
> questions here.
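
To illustrate the buffer-trend idea described above, here is a minimal
sketch in Python. It is a toy model, not a description of what Skype or any
other VOIP client actually does. The function names, the one-reading-per-second
schedule, the starting buffer level and the independent Gaussian jitter are all
assumptions made for the illustration; real network jitter is correlated and
bursty.

```python
import random


def fit_slope(times, values):
    """Ordinary least-squares slope of values against times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    den = sum((t - mean_t) ** 2 for t in times)
    return num / den


def simulate_buffer(duration_s, nominal_sps, playback_ppm, jitter_ms, seed=1):
    """Hypothetical model: one buffer-level reading per second.

    A playback clock that runs fast by playback_ppm drains the buffer,
    so the level trends downward; network jitter adds noise on top.
    """
    rng = random.Random(seed)
    drift = -nominal_sps * playback_ppm * 1e-6  # samples per second
    times, levels = [], []
    for t in range(duration_s):
        jitter = rng.gauss(0.0, jitter_ms * 1e-3) * nominal_sps
        times.append(float(t))
        levels.append(10000.0 + drift * t + jitter)  # 10000 is an arbitrary start
    return times, levels


def estimate_playback_ppm(times, levels, nominal_sps):
    """Turn the fitted buffer trend back into a playback rate error in ppm."""
    return -fit_slope(times, levels) / nominal_sps * 1e6


# Compare a one-minute call with a one-hour call (assumed 50 ppm, 20 ms jitter).
for seconds in (60, 3600):
    t, lv = simulate_buffer(seconds, 22050, 50.0, 20.0)
    print(seconds, round(estimate_playback_ppm(t, lv, 22050), 1))
```

Under these assumptions the standard error of the fitted slope works out to
roughly 150 ppm for 60 readings and well under 1 ppm for 3600 readings, which
matches the point that a short call cannot establish a meaningful trend.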

I suspect that they simply drop packets (i.e. have underruns or overruns).
Speech is so redundant that that will not make much of a difference to
the intelligibility of the speech. 

Have you tested your sound cards with an "over the phone" tone to see
how accurate it is?

> As for your last question, I have measured sound boards that are 11
> cents off from their nominal playback rate (22200 sps instead of the
> nominal 22050 sps).  Most soundcards are less than 1 cent off.  But
> the standard among my competitors in the field of electronic aids to
> piano tuning is 0.02 cents.  So if I want to compete then my app must
> be able to calibrate to that accuracy too.
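
For reference, the units used in this thread (sample rates, cents, ppm) can
be converted with a few lines of Python. This is only a sketch of the standard
definition of a cent (1200 cents per octave); the function names are made up
for the example.

```python
import math


def ratio_to_cents(ratio):
    """Frequency ratio to cents (1200 cents per octave)."""
    return 1200.0 * math.log2(ratio)


def cents_to_ppm(cents):
    """Pitch offset in cents to fractional frequency error in ppm."""
    return (2.0 ** (cents / 1200.0) - 1.0) * 1e6


def ppm_to_cents(ppm):
    """Fractional frequency error in ppm to pitch offset in cents."""
    return 1200.0 * math.log2(1.0 + ppm * 1e-6)


# Sound board running at 22200 sps instead of the nominal 22050 sps
print(ratio_to_cents(22200 / 22050))

# The 0.02 cent target expressed in ppm
print(cents_to_ppm(0.02))

# A 6 ppm calibration expressed in cents
print(ppm_to_cents(6.0))
```

The first figure comes out at roughly 11.7 cents, in line with the "11 cents"
quoted above. A 0.02 cent target is about 11.6 ppm, and 6 ppm is about 0.01
cents.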

We have discussed this in the past. It is lunacy, and furthermore, I do
not believe that they can actually measure the frequency of the string
accurately enough, especially since the "harmonics" of a piano string
are out of tune anyway, so there is no periodicity in the sound from a
piano.  And each mode of a multi-string note is mistuned as well due to
coupling between the strings and the soundboard. 
I.e., their numbers are completely made up.

>>> I am the developer of a professional piano tuning app for
>>> smartphones.  My customers are professional piano tuners.  Although
>>> the sound sampling rate on most smartphones is very close to nominal,
>>> there is a need to calibrate each device on which the app runs.  The
>>> means that I offer in my app is to have the user call up the NIST
>>> tones over the telephone and let the app listen to the 500 Hz or 600
>>> Hz tones for 30 seconds.  The call is placed using a different phone
>>> and the sound is transferred acoustically by careful positioning of
>>> the microphone.  With this method I have been able to achieve a
>>> frequency calibration of 6 ppm.  This worked fine when everyone had
>>You know this how? You tested the frequency calibration with some
>>independent way? Or is this a theoretical estimate based on how you
>>detect the frequency?
> I tested the calibration against the audio of WWV received over
> shortwave radio during times when a strong signal was received with no
> fading.

That does not tell you what the frequency offset of the sound card wrt
the computer clock is. 
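
For scale, the 6 ppm figure quoted above for a 30 second measurement of a
500 Hz tone can be put in terms of accumulated phase. The sketch below is
plain arithmetic (phase drift in cycles = tone frequency x duration x
fractional error); the function name is made up for the example and it says
nothing about how the app actually estimates frequency.

```python
def phase_drift_cycles(tone_hz, duration_s, error_ppm):
    """Phase error, in cycles, accumulated over the measurement."""
    return tone_hz * duration_s * error_ppm * 1e-6


drift = phase_drift_cycles(500.0, 30.0, 6.0)
print(drift, "cycles")
print(drift * 360.0, "degrees")
```

That is about 0.09 cycles, or roughly 32 degrees, out of 15000 cycles. So a
single phase jump of a fraction of a cycle from a dropped or inserted packet
is comparable in size to the whole error budget.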

> Robert Scott
> Hopkins, MN
