What is the Optimal Sampling Rate for Audio? (It's All About the Aliasing)

37,374 views

Filmmaker IQ

2 years ago

Honestly, I don't actually answer the question in the video. I give some reasons why some engineers say it's 60 kHz, but that's about it. What I do cover is aliasing, spurred on by a very interesting question about being able to hear the difference between sine waves and square waves.
Sorry for the "click-baity" title, but there isn't a way to present this nerdy topic in a traditional way.
Here's the 7kHz demo I promised:
drive.google.com/file/d/1IoLJ...
Videos Mentioned:
D/A and A/D | Digital Show and Tell (Monty Montgomery @ xiph.org)
• D/A and A/D | Digital ...
Nyquist-Shannon; The Backbone of Digital Sound
• Nyquist-Shannon; The B...

Comments: 887
@robinmiller7958 2 years ago
I'm a musician who has played loud music live for over fifty years, and I hear the A and B examples 24/7. Thanks for identifying the frequency of my tinnitus!
@julesc8054 2 years ago
Lmao. My tinnitus is a little higher.
@alexatkin 2 years ago
Never been able to figure out where mine is as it doesn't seem to be a single tone.
@wado1942 2 years ago
7.3 kHz in my right ear 😢
@MrBitflipper 2 years ago
I have no problem letting the "Hz"/"KHz" misstep slide, as it's a common-enough slip-up. But sorry, I gotta call you out on repeatedly referring to a low-pass filter as a "limiter". Otherwise, this explanation - like all of your tutorials - strikes a comfortable balance between technical accuracy and accessibility to a broad audience. Well done! Btw, be forewarned: you just know somebody's going to link to this video in a Gearslutz post.
@henri.witteveen 2 years ago
I used to work in sonar engineering in which we used digital signal processing. To avoid aliasing the first 'processing' step was an analog filter which would cut off frequencies that could cause trouble because of this aliasing.
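(Editor's note: a minimal Python sketch of why that analog filter has to come *before* the sampler. Once a too-high frequency has been sampled, it is indistinguishable from its folded alias; the 30 kHz tone and 48 kHz rate here are illustrative choices, not values from the comment.)

```python
import numpy as np

fs = 48_000          # sample rate (Hz); Nyquist is 24 kHz
f_in = 30_000        # input tone ABOVE Nyquist, sampled with no anti-alias filter
n = np.arange(4096)
x = np.sin(2 * np.pi * f_in * n / fs)

# Find the strongest frequency actually present in the sampled signal
spectrum = np.abs(np.fft.rfft(x * np.hanning(len(x))))
f_peak = np.argmax(spectrum) * fs / len(x)

print(round(f_peak))  # 18000: the 30 kHz tone folded down to 48000 - 30000 Hz
```

No digital processing after the fact can tell this 18 kHz alias apart from a genuine 18 kHz tone, which is why the filter must act on the analog signal first.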
@yaakoubberrgio5271 2 years ago
Hello 👋 I need a course on signal processing. Can you help me? Thanks in advance.
@Lantertronics 2 years ago
@@yaakoubberrgio5271 I've put up the lectures from my ECE3084: Signals and Systems course at Georgia Tech: kzfaq.info/get/bejne/jNqDn9CV2M7Von0.html
@kensmith5694 2 years ago
A well-designed delta-sigma ADC does part of the filtering for you. A part like the AD7768 uses a modestly high-order integrator, so you get more than one pole of anti-alias filtering for free.
@Lantertronics 2 years ago
@@kensmith5694 I've never been able to fully wrap my head around delta-sigma converters. Like... I can sort of follow the math line by line, but I can't really develop an intuition for the "heart" of how they work.
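(Editor's note: for what it's worth, the core feedback loop of a *first-order* delta-sigma modulator is only a few lines. This is a hedged, illustrative Python sketch of that loop, not the AD7768's actual multi-order architecture: the integrator accumulates the input/output error, so the running average of the 1-bit stream is forced to track the input.)

```python
import numpy as np

def sigma_delta_1bit(x):
    """First-order delta-sigma modulator: integrate the error between
    input and the last 1-bit output, then quantize to +/-1. The feedback
    keeps the integrator bounded, which forces the average of the bit
    stream to track the input (noise is pushed to high frequencies)."""
    integ, y = 0.0, 0.0
    out = np.empty_like(x)
    for i, s in enumerate(x):
        integ += s - y                     # accumulate the error
        y = 1.0 if integ >= 0 else -1.0    # 1-bit quantizer
        out[i] = y
    return out

# A DC input of 0.3: the +/-1 stream spends 65% of the time at +1,
# so a lowpass (decimation) filter recovers 0.3
bits = sigma_delta_1bit(np.full(10_000, 0.3))
print(round(bits.mean(), 2))  # 0.3
```

The "heart" is just that the loop can only go wrong by a bounded amount (the integrator state), so over N samples the average output error shrinks like 1/N.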
@henri.witteveen 2 years ago
@@kensmith5694 When I mentioned working in sonar engineering I was talking about 1979 and 1980. We had to construct our own processing unit by using a 5 MHz 32 bit 'high speed' multiplier as the heart of our system.
@MrSpeakerCone 2 years ago
Engineer here. This is a good explanation and easily the best visualisation of aliasing I've seen. Nice!
@brayoungful 2 years ago
Wave scientist here (not audio). I agree this is an excellent demonstration of aliasing. However, I think this video reads primarily as an argument for *mastering* above 44.1 kHz, particularly if you're generating a lot of synthetic sounds, rather than for recording or playing back audio above 44.1 kHz. I wouldn't expect human voices or musical instruments to produce much power above the human range of hearing, so you're probably not going to get a lot of audible aliasing if you record at 44.1 kHz. And if that aliasing hasn't been baked into your digital audio file to begin with, you won't be hearing it. An exception I could imagine would be recording in a noisy environment where the "noise" isn't Gaussian, in which case perhaps you could get some beat-like pattern of "noise" in your audible range.

Edit: the other caveat is that if you have high-fidelity audio and you're playing it back at a lower sampling rate, it's anyone's guess how that downsampling/resampling algorithm works; it might introduce its own wonkiness. And if you're trying to drive speakers at frequencies beyond which they've been tested, you might get non-linear weirdness too.
@FilmmakerIQ 2 years ago
I'm really focusing on the capture side of things; my interest is in the "making" side. Some cymbals can shimmer up in the high range of human hearing. I've heard tell of some loss of cymbal "brightness" because recording at 44.1 kHz and compounding a bunch of low-pass filters causes that upper range to lose power, but that's theoretical to me. And I really never answer the question in the title, "What is the Optimal Sampling Rate?"... haha. My purpose was really to understand what the argument is, not necessarily to advocate for it. That's my impression from trying to work through Lavry's paper.
@brayoungful 2 years ago
@@FilmmakerIQ This is an interesting discussion and has me sitting here on a Saturday afternoon tinkering with MATLAB and Audacity... :-)

So what I just tried: I created a 7 kHz square wave sampled at 44.1 kHz. It looks like a typical square wave and sounds like your video. Then I generated a 7 kHz square wave in a 192 kHz track and applied a bunch of aggressive 22.05 kHz lowpass filters, so it has no frequency content above 22.05 kHz. Then I made another 7 kHz square wave in a 192 kHz track and just told Audacity to resample it to 44.1 kHz.

The two 7 kHz square waves generated in 192 kHz tracks, one lowpass-filtered and one resampled, sound like square waves, and sound almost the same. The 7 kHz square wave generated in the 44.1 kHz space has dozens of overtones and sounds totally different.

I don't have an easy way to generate a square sweep, but this would be an interesting experiment for you to try on a square-wave sweep to see what happens.
@FilmmakerIQ 2 years ago
So here's the deal with a 7 kHz square wave and any flavor of 48 kHz (96 kHz, 192 kHz): they will all sound the same! Remember in my video where I did the frequency analysis? There was the fundamental at 7 kHz, the third harmonic at 21 kHz, and then aliases separated by 2 kHz, starting at 1 kHz and going up. That works out because the reflections caused by the 24 kHz Nyquist limit cycle around and around on odd numbers. Doubling the sample rate to 96 or quadrupling it to 192 doesn't change the cycle; it just changes where the cycle ends and picks up again! And since a square wave is supposed to be an infinite series, it doesn't matter where the cycle picks up.

Change the fundamental frequency of that square wave to 7001 Hz and it won't cycle around like that, and you'll hear the difference between 48 and 192 kHz.
@DamjanB52 2 years ago
@@FilmmakerIQ "aliases separated by 2 kHz, starting at 1 kHz and going up. That works out because the reflections caused by the 24 kHz Nyquist limit cycle around and around on odd numbers" - sorry, I don't understand this: where does the 2 kHz separation come from? How do you get from 7×5, 7×7, 7×9, ... to all those frequencies?
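(Editor's note: the 2 kHz spacing can be checked by folding each odd harmonic of the 7 kHz square wave back below the Nyquist limit. A minimal Python sketch, using the 48 kHz sample rate from the reply being quoted:)

```python
fs = 48_000            # sample rate; Nyquist is 24 kHz
f0 = 7_000             # square-wave fundamental

aliases = []
for k in range(1, 22, 2):        # a square wave contains only odd harmonics
    f = k * f0                   # 7, 21, 35, 49, ... kHz
    r = f % fs
    alias = min(r, fs - r)       # fold the harmonic back below Nyquist
    aliases.append(alias)
    print(f"harmonic {k}: {f} Hz -> {alias} Hz")
```

Every folded harmonic lands on an odd multiple of 1 kHz (1, 3, 5, ... 23 kHz), so the aliases sit 2 kHz apart starting at 1 kHz, exactly as described. With a 7001 Hz fundamental the folded values no longer line up on this neat grid.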
@nickwallette6201 2 years ago
There's a lot that gets said about "YouTube compression" and how it affects audio. Generally, the degree to which it affects the sound of any given audio demo is nearly irrelevant. These days, few of us hear _anything_ that hasn't already passed through a perceptual audio encoder of some sort (MP3, AAC, Bluetooth audio codecs, Netflix / Hulu / YT, and so on), and nearly all of those codecs brick-wall filter the highest of the high frequencies to avoid wasting bandwidth on stuff only our pets will hear anyway.

The exception to this rule is the rare fabricated audio example like in this video, which uses signals you'll rarely encounter in a typical audio presentation of any sort. Yep, those are affected by compression, sure enough. But most of the time, when somebody compares a direct feed of a source audio file with one picked up through a lavalier microphone from sound played through a 3" cube smart speaker, and then says "you won't get the full impact of this because of YouTube audio compression," I just roll my eyes. Haha, I _think_ that 128 kbps Ogg stream can adequately capture the sonic differences you were trying to convey, don't you worry about that.
@laurenpinschannels 2 years ago
Don't underestimate the degree to which lossy compression might actually be doing a better job of preserving the signal than you think. For example, check out Dan Worrall's "WTF is Dither"; it's a long video and I don't remember exactly where, but somewhere in the middle he compares MP3 to 16-bit WAV in a situation where the MP3 *unequivocally beats the WAV* in terms of which one represents the data better. The WAV was more lossy than the MP3. That's because naively quantizing to 16-bit integers can introduce more noise than MP3 compression if your signal is simple enough.

It's all about what bitrate MP3 or Ogg needs to near-losslessly compress a given section, and Vorbis's transform and psychoacoustic model differ from MP3's, which is why Ogg Vorbis can handle certain kinds of phasing sounds much better than MP3. So as long as you're in a high enough quality mode that the compression noise is down around -100 dB, you'll probably be able to hear whatever -70 dB effect they're trying to show. It's only when you turn down to 240p and your compression noise is at -10 dB that we have a serious problem from audio compression. Now, video on the other hand... :D
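(Editor's note: the quantization-noise arithmetic behind this claim can be sketched in a few lines of Python. This is a rough illustration, not Dan Worrall's actual demo; the 997 Hz test tone and the TPDF dither shape are illustrative choices. Undithered 16-bit quantization of a full-scale sine yields roughly the textbook 6.02×16 + 1.76 ≈ 98 dB SNR, and the error is correlated with the signal; TPDF dither trades a few dB of SNR for an error that is benign white noise.)

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 48_000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 997 * t)             # full-scale test tone

def snr_after_16bit(x, dither=False):
    lsb = 1.0 / 2**15                       # one 16-bit step on a +/-1.0 scale
    d = 0.0
    if dither:
        # TPDF dither: sum of two uniform distributions, +/-1 LSB peak
        d = (rng.uniform(-0.5, 0.5, x.size) +
             rng.uniform(-0.5, 0.5, x.size)) * lsb
    q = np.round((x + d) / lsb) * lsb       # snap to the 16-bit grid
    err = q - x
    return 10 * np.log10(np.mean(x**2) / np.mean(err**2))

snr_plain = snr_after_16bit(x)
snr_dith = snr_after_16bit(x, dither=True)
print(round(snr_plain, 1))   # ~98 dB, but the error is signal-correlated distortion
print(round(snr_dith, 1))    # ~93 dB, but the error is uncorrelated white noise
```

The dithered version measures "worse" by about 5 dB yet sounds better on low-level material, which is the point the comment is making about naive truncation.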
@alexatkin 2 years ago
In my experience, watching a movie/TV show on Netflix versus on Blu-ray is usually a night-and-day difference. It's not so much that you obviously lose highs; you seem to lose dynamic range, and it sounds flat and dull. It's not always enough to spoil the experience, but sometimes it definitely is. Same with the picture quality.
@jhoughjr1 2 years ago
Disagree completely. YT audio is highly compressed, and I can tell the difference between songs on YT vs. Apple Music. No contest.
@alexatkin 2 years ago
@@jhoughjr1 Music videos, strangely, are often the worst offenders, whereas some YouTubers use music and it sounds fine. I'm very sensitive to lossy codecs too; I hated Bluetooth audio until LDAC and Samsung Scalable came along.
@erewrw1906 2 years ago
@@jhoughjr1 I don't know exactly what you're talking about, but I've heard YouTube uses the AAC codec. IMHO, for certain bass-heavy genres YouTube is miserable; bass just doesn't translate well on it. Guitars are okay, but I still prefer my MP3s. Apple Music also uses AAC, I've heard, but I found it a bit better; I don't know if it's a specialized AAC version they use. Other than that, I've seen a test video that compared waveforms to look for dynamic-range compression (audio-plugin compression), and nothing was found.
@yuan-jia 2 years ago
Hey John, this is great seeing you do some new technical and concise teaching videos. Your work is so helpful for anyone digging in a bit in the subjects you tackle, so thank you for that!
@stephenwong9723 2 years ago
What you explained comes down to this: either have a good recorder (A/D converter) that does a good job of filtering out signal above the Nyquist frequency, or record at a higher sampling rate (96 kHz, for example, is much more than enough) and then downsample to 44.1/48 kHz in post. In the digital domain you can do (very close to) exact calculation, and at the end save a few bytes on the final product without jeopardizing quality. As for those guys who insist on so-called hi-res files for PLAYBACK, they're just crazy; forget them!
@harrison00xXx 2 years ago
I mean, even cheap equipment like $100 DVD players in 2007 already had 192 kHz DACs, avoiding problems like this entirely. But for the final media, more than 44.1 kHz doesn't make much sense, since most released music is still 44.1 kHz/16-bit anyway. Even most (or all!) vinyl records are made from 44.1 kHz samples. Tidal even dares to upsample/"remaster" 44.1 kHz/16-bit originals to expand their "hi-res" collection...

Since every piece of hi-fi gear filters out everything above 20 kHz anyway, combined with internal 96 kHz+ processing (more like 384 kHz nowadays), no: 44.1 is just fine. More is acceptable for digitized vinyl records, or sure, why not. 44.1/48 vs. 96+ is like comparing 4K vs. 8K: it doesn't make practical sense, except maybe a bit under perfect circumstances... but hey, it's possible. That's why my AVR has 9 (or 11, I don't know) 384 kHz/32-bit (32-bit!!! WTF?!) DACs, by the numbers even better than my high-end stereo gear with "only" 192 kHz/24-bit Wolfson DACs.

Only in recording and mastering is more than 44 kHz needed, and those run at 96 kHz+ anyway, since it's possible. I don't get people who complain about "only CD quality"/44.1 kHz... damn! That's at least completely uncompressed, unlike the lossy MQA garbage, for example. In fact (as has been shown), CD quality is better and more accurate than MQA (which is another lossy compression format like MP3, but worse, and with high license fees, haha). Some of my friends are completely addicted to hi-res and/or Tidal/MQA just because they see a blue light or "96/192 kHz" on their receiver's screen, despite it sounding exactly the same as a 44.1 kHz CD with the same mastering. Damn, they use soundbars, garbage "hi-fi" gear, and BT headphones, and they dare to complain about "only" 44.1 kHz! I also prefer hi-res source material, but mostly because of the different masterings: less loudness, more dynamics, mastered for the "demanding" listener.
@arsenicjones9125 2 years ago
@@harrison00xXx I believe you are incorrect about vinyl masters. Mastering for vinyl is a separate master from the CD master. Professional mastering engineers want to work with the highest-quality mix, which means NOT 44.1 kHz/16-bit. And most likely the vinyl press wants to make its master from the highest-quality version available, at least for major-label artists. Independent artists, well, you know, get what they pay for and can't reasonably be used to make statements about what's used to make vinyl records.
@harrison00xXx 2 years ago
@@arsenicjones9125 Of course it's mastered differently for vinyl, but still, the samples used to make the "negative" are, for probably 99.9% of (non-quadraphonic) records, 44.1 kHz/16-bit, i.e., CD quality. That was my point. As if CD quality were "bad"... come on, that's the most accurate, lossless quality standard we ever got. Of course there's now "hi-res," but that's more voodoo/overkill...
@arsenicjones9125 2 years ago
@@harrison00xXx No, I'm afraid you're incorrect again. For major studio albums, they regularly record at high sample rates and then downsample to 48 kHz/24-bit to edit and mix. Some major studios do all their editing and mixing in 96 kHz/32-bit floating point. Then it gets downsampled again after mastering. Again, we can dismiss what independents do, because they don't work in any standardized format.

CD quality is not the most accurate, lossless standard available. 🤦‍♂️🤣 An original recording made in a 96 kHz/32-bit WAV file is a more accurate representation of the analog signal. If there are more samples with greater bit depth, it MUST be more accurate than a lower sample rate and bit depth. Just because you cannot discern a difference in every piece of music you hear doesn't mean there is no difference, or no difference that affects the experience. Just to be clear, I don't think CD quality is bad, just that it's not without flaws either. Upsampling won't increase fidelity in any way, but a recording sampled at a higher rate is higher fidelity.
@harrison00xXx 2 years ago
@@arsenicjones9125 So you have proof that the source material for cutting vinyl is more than 44.1 kHz? Sure, they edit and master at higher rates, but the end result is mostly sampled at 44.1 kHz/16-bit. This is probably changing slowly as hi-res reaches customers, but it's well known that 44.1 kHz was used for vinyl for decades at least.
@wngimageanddesign9546 2 years ago
In double-blind tests of Red Book 16-bit/44.1 kHz digital audio vs. hi-res 24-bit/96 kHz digital audio, played for average listeners, audiophiles, and hi-res audio 'experts'... none could accurately pick out the hi-res files. The average listeners had a 50/50 probability, while the audiophiles/experts scored even lower! As an EE and music lover, I've always stressed that the master recording is the great deciding factor in quality. Quality in, quality out. No amount of oversampling, upscaling, or bit rate will improve a crappy initial master source.
@noop9k 2 years ago
This is about extra noise introduced during processing of the audio, not really about the output format.
@JAmediaUK 2 years ago
The problem with the first group mentioned (44.1 vs. 48, etc.) reminded me of "complex problems have simple, easy-to-understand, wrong answers." The same is true for flat-earthers, young-earth creationists, etc. They have a very simple solution that seems to work because [the majority of] the people they are talking to don't understand the complexities. The problem Group 3, the audio engineers, have is that the majority don't understand the solution as presented mathematically and say "that is just your opinion!", treating it as no more important than their own opinion. You see a lot of this these days. It is great to have videos like this one that go far enough to explain the problem simply for the majority, without going off into deep (Group 3) audio-engineer geek-speak and MSc maths.
@FilmmakerIQ 2 years ago
That is really an insightful way to look at it.
@JAmediaUK 2 years ago
@@FilmmakerIQ Hi John, You call me "insightful" again and I will sue! :-)
@FilmmakerIQ 2 years ago
Need to put a low pass filter on that comment.
@TheTechnoPilot 2 years ago
This was FABULOUS as always John! Amazing description!
@taragwendolyn 2 years ago
Love the deliberate error 🥳 I also thought my hearing was failing during the sine sweep until you pointed out that YouTube hard-cuts at 16 kHz. I'm one of those weirdos in their 40s who can still hear when shopping malls have a mosquito device... or could, during the before times, at least. Haven't been to a mall in two years.
@timbeaton5045 2 years ago
@@MyRackley Hmm, sadly I know mine doesn't at 65, but then I've played in too many bands with overloud guitarists and, in one case, a drummer who overhit his cymbals all the time in the small room where we rehearsed. I still have a low level of tinnitus in my right ear, but luckily it's not really noticeable unless things are really quiet, and I guess I've become quite good (or at least my brain has!) at filtering it out of consciousness.
@RJasonKlein 1 year ago
Excellent video. You dealt with complex issues in an easy to understand and fun way - nice job, man.
@TheAnimeist 2 years ago
19:07 "I just want to cover some interesting notes." Clever... John, thanks for sending me down the rabbit hole. It took me five days to finish your video. Your instruction is always good because of the practical examples you provide. Your videos inspire conversations outside of YouTube and outside of filmmaking. Thanks for that too. Edit: sorry, wrong timestamp; could not find the original...
@jmitzenmacher5 2 years ago
So here's the thing: YouTube does support 48 kHz audio, and it does support frequencies higher than 16 kHz... sometimes. Every time you upload a video, the encoder creates about six different versions of the audio with different codecs, sample rates, bitrates, etc. On playback, it automatically chooses the audio based on your network, decoding capabilities, etc. Just because the audio was ruined in the download you checked doesn't mean it would have been ruined for all listeners. Really, it's YouTube's technical inconsistency you have to worry about (I think that might also be true for your video about cutting the video one frame early).

TL;DR: Your description of YouTube's capabilities wasn't strictly true, but you were still right to cater to the worst-case scenario. Very interesting video!
@DrakiniteOfficial 2 years ago
My electrical communications systems prof literally just covered the sampling theorem in class today, and by chance I saw this in my recommendations. This video is an EXCELLENT demonstration of aliasing. Thanks so much for making it. BTW: I can totally hear the difference between A and B on YT, but I can't tell the difference on the 7 kHz one. That could be my Bluetooth headphones; I'll edit this comment when I get home and try my corded headphones/speakers.
@kaneltube 2 years ago
Great video. Equally entertaining and informative as always!
@Lantertronics 2 years ago
I'm a professor of Electrical and Computer Engineering at Georgia Tech and have taught courses in signal processing for 20 years. Besides an excellent tutorial by Dan Worrall, this is the only video on the topic I've seen on YouTube that doesn't make me cringe. In fact, your video is superb. :)
@croolis 2 years ago
Excellent and interesting video. I'd like to add one term here: 'oversampling.' When digitizing an analog waveform, it is quite normal to have a relatively tame analog filter but run the sampling at a much higher frequency than the output requires; 8x or 16x oversampling is common. The next step is a digital filter operating on this high-frequency sampled signal, followed by downsampling to the required frequency, e.g., 44.1 kHz. The 10 kHz square wave has audible undertones because it was simply generated mathematically; there is no oversampling or anti-aliasing going on at all. If the signal had been filtered properly before being recorded, the 10 kHz square wave and the 10 kHz sine wave would, of course, sound exactly the same (since the next harmonic is not captured).
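(Editor's note: the oversample → digital lowpass → downsample pipeline described above can be sketched in pure NumPy. This is a toy illustration; the 8x factor, the windowed-sinc filter length, and the 10 kHz square wave are assumed values for the demo, not a production decimator design.)

```python
import numpy as np

fs_hi, q = 352_800, 8                     # 8x the 44.1 kHz target rate
N = fs_hi // 10                           # 0.1 s of signal
t = np.arange(N) / fs_hi
x = np.sign(np.sin(2 * np.pi * 10_000 * t))   # "analog" 10 kHz square wave

# Digital anti-alias filter: windowed-sinc lowpass with cutoff at the
# target Nyquist, i.e. 1/(2q) cycles per sample at the oversampled rate
taps = 257
n = np.arange(taps) - taps // 2
h = (1 / q) * np.sinc(n / q) * np.hamming(taps)
h /= h.sum()                              # unity gain at DC

y = np.convolve(x, h, mode="same")[::q]   # filter, then keep every 8th sample

spec = np.abs(np.fft.rfft(y * np.hanning(y.size)))
f_peak = np.argmax(spec) * (fs_hi // q) / y.size
print(round(f_peak))                      # 10000: only the fundamental survives
```

The 30 kHz third harmonic is removed by the digital filter before decimation, so after downsampling to 44.1 kHz the "square" wave is effectively a 10 kHz sine, which is the point of the comment.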
@MovieMongerHZ 2 years ago
So in depth. Thank you so much!
@wimdouwe 2 years ago
Nice video and thanks for the link to Monty Montgomery's explanation
@stucorbishley 2 years ago
This was amazing, a fantastic explanation! I've been curious about this for a long time.
@mhoover 2 years ago
As usual a very thorough and clear exposition.
@Wegetsignal 2 years ago
Very informative and clearly a ton of research went into this!
@ThisSteveGuy 2 years ago
As soon as you mentioned Monty, I knew that you got it right.
@butson89 2 years ago
Always the best videos!
@mikes9939 2 years ago
A truly great video about this complex subject, with an appropriate amount of humor concerning the state of commenting on YouTube these days. Thank you for your efforts; they are well appreciated.
@TheDingusBoy 2 years ago
Absolutely fantastic work, as usual. I'd love to see how this whole thing compares to analog sound, though. I've only ever worked digitally, but I've always been fascinated by the physical manifestation of sound and its analog recordings.
@davidasher22 2 years ago
OMG! So glad you mentioned the hard cutoff YouTube does at 16 kHz. I thought I was losing my hearing during those sine-wave sweeps.
@eddievhfan1984 2 years ago
An exceptional video, sir, especially for going the extra mile and looking into YouTube's own codec shenanigans with your own examples. I regret to say I didn't hear much difference in the 7 kHz files, but considering I'm getting older and adults lose top end in their hearing range over time, I'm not surprised. (I can barely hear CRT yoke noise anymore, which I definitely could as a kid.)

Aside from pure monophonic sound, I think higher sampling rates have a dedicated purpose when doing any kind of stereo/surround real-time recording, or any audio processing involving pitch/duration manipulation.

In the first case, human hearing becomes more sensitive to phase differences between the ears as frequency increases, and such differences in phase and arrival time contribute to our sense of the physical space the audio occurs in. (Worth noting that the Nyquist-Shannon sampling theorem assumes a linear, time-invariant process, where it doesn't matter how much or how little the signal is delayed from any arbitrary start point; human hearing, however, is definitely NOT a time-invariant process.) When dealing with sampled audio at high frequencies, the number of discrete phases a wave can take drops off considerably: assuming a wave at exactly half the sampling frequency, you can have it however loud you want (within the limits of bit depth), but you can only have two phases of the signal (0° and 180°). One octave down, you only have four available phases (0°, 90°, 180°, 270°), and so on. This might contribute to the sense of "sterility" and "coldness" associated with older digital recordings that didn't take this into account. So if you're mixing audio that relies heavily on original recordings of live, reverberant spaces (a drum kit distant-miked in a big room, an on-set XY pair, etc.), it's an advantage to record and mix at the highest sample rate you can afford, then downsample for mastering/publishing if needed. This way, you preserve as much detail as possible and give your audio the best shot at being considered realistic.

In the second case, having extra audio samples helps when you want to pitch audio up/down or time-compress/stretch. Since some of the algorithms for these techniques involve deleting arbitrary samples or otherwise bringing normally inaudible frequencies into human hearing range, having that extra information can be a benefit for cleaner processing, depending on your artistic intent.
@FilmmakerIQ 2 years ago
Yes, I haven't factored in pitch alterations.
@nickwallette6201 2 years ago
That's not entirely true, actually. The Xiph video mentioned here covers the waveform-phase topic as well. The reconstruction filter after the DAC is basically interpolating the discrete samples into a smooth, band-limited curve. Just sliding the sampled points around on the X/Y axes (if X is the sample index and Y is the word value, i.e., the amplitude of an individual sample) alters the resulting wave's phase.

Another way to think of this: imagine using a strobe light to capture an object moving in a circle. If the speed of the object rotating about the circumference were perfectly aligned with the flash frequency such that there are exactly two flashes per revolution, the object would appear in one spot, then in another spot 180 degrees from the first, repeating indefinitely. This is basically the Nyquist frequency. From that, you could construct a perfect circle, because you have the diameter. Now imagine altering the "phase" of that object so that the strobed captures place it at different points around the circumference: you can still construct a perfect circle. Same with audio samples; it doesn't matter if the phase changes. As the Xiph video says (I'm paraphrasing, because it has been a while since I watched it), there is one and only one solution to the waveform created by a series of samples, _provided that the input waveform and output waveform have both been band-limited to below the Nyquist frequency._
@eddievhfan1984 2 years ago
@@nickwallette6201 Well, yes, for any arbitrary signal you can still reconstruct it with sampling, but I was mostly thinking psychoacoustically, where delay and phase variations between the ears play such a big role in stereo sound. And one of the side effects of sampling is that you get phase constraints, like I described above. For example, with a signal at half the sampling frequency, how do you distinguish between a full-amplitude sine wave and a cosine at -3 dB, when both share the exact same sample representation (alternating between 0.707 and -0.707)? Since that phase information can spell the difference between a centered (in-phase) or diffuse (out-of-phase) stereo image, preserving phase and delay information is super important, and with finite sample intervals there are only so many phase states you can have at high frequencies. I also acknowledge, however, that band-limiting filters induce their own phase delays, which can have a significant effect on the perceived audio; hence one of the other advantages of a higher sample rate is relaxing the requirements on the band-limiting and reconstruction filters, minimizing their coloration of the audio.
@FilmmakerIQ 2 years ago
Delay is not an issue with the sample rate. Sample rate does not affect the precision of the timing of the wave in any respect.
@nickwallette6201 2 years ago
@@eddievhfan1984 With two samples per cycle, you can reconstruct a waveform with any phase you want. You could indeed have in-phase and anti-phase waveforms at 20 kHz with a 44.1 kHz sample rate. Try it: use an audio editor to create a 20 kHz sine, then invert the phase. Zoom in to the sample level and look at the waveform it draws; that's a representation of what the reconstruction filter does.

I think it would be an academic exercise though, as 1) who's going to be able to determine relative phase between channels at the theoretical threshold of human hearing?, and 2) that's in the knee of the low-pass filter curve, where any passive components on the output are going to affect the signal. A mismatch between the L and R channels would not be unlikely: high-end gear might match capacitors to 1% or so, but there's plenty of gear out there (even respectable gear) using electrolytics rated at ±20%. There's a lot of concern over perfection that is not at all practically relevant.
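(Editor's note: this point can be checked numerically. A small Python sketch, sampling a 20 kHz sine at 44.1 kHz with an arbitrary phase and recovering that phase exactly from the samples; the 1.234 rad value is an arbitrary choice for the demo. The true ambiguity arises only at exactly half the sample rate, which the sampling theorem excludes.)

```python
import numpy as np

fs, f = 44_100, 20_000
phi_true = 1.234                       # arbitrary phase, not a multiple of 90 degrees
n = np.arange(2048)
x = np.sin(2 * np.pi * f * n / fs + phi_true)

# x = cos(phi)*sin(theta) + sin(phi)*cos(theta), so a least-squares fit
# of sin/cos basis vectors recovers the phase from the samples alone
s = np.sin(2 * np.pi * f * n / fs)
c = np.cos(2 * np.pi * f * n / fs)
A, B = np.linalg.lstsq(np.column_stack([s, c]), x, rcond=None)[0]
phi_est = np.arctan2(B, A)

print(round(phi_est, 3))   # 1.234: the samples pin down any phase exactly
```

So even at 20 kHz, just below Nyquist for 44.1 kHz, phase is continuous and fully determined by the samples, as the comment above argues.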
@darrenlucas804 2 years ago
Well done, brilliantly explained
@wado1942 2 years ago
Another great video. One thing about your sine/square test: you can simulate what would happen in a real-world situation by generating your waves at a sample rate like 3,072 kHz (64 × 48 kHz) and converting to 48 kHz to listen. That's because all modern ADCs sample at at least 64×fs (often 128× or 256×), filter out everything above 20 kHz, then downsample to your capture rate.

Another experiment I ran a few years ago: I recorded a series of sweep tones on my blackface ADAT, which allows the sample rate to be continuously varied from about 40 kHz to 53 kHz. At 53 kHz, aliasing is *almost* eliminated, whereas at 40 kHz it's quite audible. Yes, those converters are out of date, but it's still a valuable learning tool.

That said, I'm a huge proponent of 96 kHz in digital mixers, where the ADCs run in low-latency mode. At 48 kHz, an unacceptable amount of aliasing is allowed through to keep latency through the mixer below, say, 1 ms (not a problem in analog mixers). At 96 kHz, the converters can run in low-latency mode with no audible aliasing. When I'm working in the box on material captured by dedicated recording devices (where latency is not an issue), 48 kHz is fine.
@toddhisattva 2 years ago
The Fourier transform tells you how loud each sine wave in your signal is: a spectrum, if you plot it. It can also tell you the phase, so all three parameters of a sine wave (frequency, amplitude, and phase) are covered. The inverse Fourier transform puts all those sine waves back together. In computers we use discrete Fourier transforms, usually via a fast implementation known as the FFT, for "Fast Fourier Transform" (which, BTW, is one of the top handful of algorithms in all of computer science).
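(Editor's note: a minimal NumPy illustration of the three parameters the transform recovers; the test frequencies, amplitudes, and phases are arbitrary choices for the demo.)

```python
import numpy as np

fs, N = 1024, 1024                 # 1 second of signal, 1 Hz per FFT bin
t = np.arange(N) / fs
# Two sine waves with known amplitude, frequency, and phase
x = (0.8 * np.sin(2 * np.pi * 50 * t + 0.5) +
     0.3 * np.sin(2 * np.pi * 200 * t - 1.0))

X = np.fft.rfft(x)
amp = 2 * np.abs(X) / N            # single-sided amplitude per bin
phase = np.angle(X)                # phase per bin (cosine convention)

print(round(amp[50], 3), round(amp[200], 3))   # 0.8 0.3
# NumPy's FFT phase is cosine-referenced: a sine at phase p reads p - pi/2
print(round(phase[50] + np.pi / 2, 3))          # 0.5

# The inverse transform reassembles the original signal exactly
assert np.allclose(np.fft.irfft(X, N), x)
```

Both components land exactly on FFT bins here, which is why the amplitudes and phases read back perfectly; off-bin frequencies spread across neighboring bins (spectral leakage) and need windowing.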
@FilmmakerIQ
@FilmmakerIQ 2 жыл бұрын
Yes but the how gets way more complicated
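To make the "how" a little more concrete, here is a minimal numpy sketch (the 440 Hz tone, 0.5 amplitude, and 0.3 rad phase are arbitrary example values) showing an FFT recovering all three parameters of a sine wave:

```python
import numpy as np

fs = 48_000                       # sample rate (Hz)
t = np.arange(fs) / fs            # one second of time
# a 440 Hz sine, amplitude 0.5, phase 0.3 rad
x = 0.5 * np.sin(2 * np.pi * 440 * t + 0.3)

spectrum = np.fft.rfft(x)
freqs = np.fft.rfftfreq(len(x), d=1/fs)

peak = np.argmax(np.abs(spectrum))          # bin with the most energy
amplitude = 2 * np.abs(spectrum[peak]) / len(x)
phase = np.angle(spectrum[peak])            # sin(wt+0.3) == cos(wt + 0.3 - pi/2)

print(freqs[peak])          # 440.0
print(round(amplitude, 3))  # 0.5
```

The inverse transform (`np.fft.irfft`) puts those sine waves back together, which is exactly the reconstruction step discussed in the video.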
@aaronsmith4746
@aaronsmith4746 2 жыл бұрын
Very informative, thanks John. Reminds me of my electrical engineering classes back in school :)
@geoffstrickler
@geoffstrickler 2 жыл бұрын
When you first brought up harmonics and square waves, I thought about posting a correction cause it sounded like you were about to make a big mistake by ignoring band-limiting filtering, but I watched the rest of the video…and you handled it all. Well done, including your edit post-YouTube processing. Yes, I did hear a tiny difference between your 5.2kHz sine wave and the 5.2/15.6kHz additive square wave synthesis. I do have exceptionally good high frequency hearing for a 55yr old. However, it's also important to note that music is never a pure sine wave, nor a square wave, so you would never hear even the tiny differences I heard (barely noticeable even to excellent hearing, and only because it was a pure note of extended duration) in an actual piece of music. The important part, as others have pointed out, is that your waveform must have an appropriate low pass filter applied. That could be a 20kHz analog filter with sampling at 48kHz or higher, or a 20-24kHz filter before 57.6kHz, or 20-25kHz before 60kHz, or a 20-35kHz analog filter and sampling at 88.2kHz or higher. And it's always good to lower the noise floor by recording at 20 or 24 bit depth. Do all your editing and mixing at something above 48kHz and above 20-bit depth, then master for 44.1/48 at 16/18/20 bit. Sure, you can master for 24-bit depth, but no one will actually be able to tell the difference.
@Goodmanperson55
@Goodmanperson55 2 жыл бұрын
4:50 a tiny bit of correction on this part. If you actually activate the "stats for nerds" option, you would see that YouTube actually uses a much newer audio compression format called Opus, developed by the same Xiph foundation that Monty himself works for. And what's interesting about this audio codec is that the developers have decided to restrict the sampling frequency to 48 kHz (44.1 kHz sources get upsampled upon conversion, hi-res sources get downsampled, and 48 kHz sources are essentially a no-op and pass through). The reason for this is exactly the same reason you mentioned a few seconds ago: the math is just easier that way. You will only get 44.1 kHz if, for whatever reason, your device requests YouTube to fall back on the old AAC or Vorbis codecs for compatibility reasons, which will almost never happen, especially if you're watching from a web browser or using an Android phone. But considering that Opus is still a lossy format, it's still gonna cut off any frequency above 20 kHz anyways.
@davewestner
@davewestner Жыл бұрын
Thanks man....really useful info, but the main reason I wanted to leave a comment is that I really dig your set! Looks cool!
@AndrewAliferis
@AndrewAliferis 2 жыл бұрын
Great job. Thank you for this interesting info.
@dodgingrain3695
@dodgingrain3695 2 жыл бұрын
As a mixing engineer for over a decade I'm glad to see you got this right. I'm also glad that at over 50 years old I can still hear the difference between waves A and B. And for the vast majority of people listening to audio on crappy playback systems it doesn't matter one bit.
@LocalAitch
@LocalAitch 2 жыл бұрын
You switched it up between A and B lmao. Interestingly, the frequency of the harmonic you used is really close to NTSC horizontal refresh rate (15734Hz), which a CRT’s flyback makes audible as it deflects the electron gun left to right and back. I’m 41 and so far I’ve always been able to hear 15kHz flyback
@FilmmakerIQ
@FilmmakerIQ 2 жыл бұрын
Yep
@GoodOlKuro
@GoodOlKuro 2 жыл бұрын
So that's why you can hear this high pitch noise from CRT TVs?
@sivalley
@sivalley 2 жыл бұрын
39 and oh gods do I NOT miss working on TVs and that wretched noise. I can only imagine how horrific that noise must be to cats and dogs. We practically used to torture our pets with those damnable things.
@jhoughjr1
@jhoughjr1 2 жыл бұрын
yep. as a kid I could hear if a TV was on even if the screen was dark.
@ClosestNearUtopia
@ClosestNearUtopia 2 жыл бұрын
I remember as a kid freaking want to smash all school tv’s what a trash they let us watch in the first place and then the fucking beep, will hear even now I think, I did run out the classroom sometimes and told the teacher to blast herself with this earpiercing beep! She was like: what beep!? Bitch.. the older the crt, the more chance you may use it to deflect vermin out of your garden..
@MiddleMalcolm
@MiddleMalcolm 2 жыл бұрын
Glad to see you dug in a little more to check out the difference between the theoretical "ideal", and what actually works in practice. There are still, of course, many other variables, but the answer to "which sample rate?" is always "it depends". Jumping back to the last video, my comment was only that I found it interesting that the original concept sample rate being 60K was almost a happy accident of ending up with that ideal range suggested by folks like Dan Lavry. It would likely have radically changed the course of digital audio development as we all know it.
@peregreena9046
@peregreena9046 2 жыл бұрын
I remember some article in an audiophile magazine about a study in the early days of CDs. A recording company recorded a classic orchestra on both reel-to-reel tape and a PCM processor. When played back to an audience, there was no clear line between the media. Depending on the piece played, the majority preferred one or the other. The conclusion at this point was that each recording added some specific artifacts to the music, which might benefit one piece, but not the other. After this, they went to analogue and digital mastered vinyl records and high end tape cassettes on one hand, CD on the other. All of the same performance. Oddly enough, here the lines were defined more clearly. The digital camp voted for the CD, the analogue camp for vinyl and cassette. Then one of the technicians had an idea: They went back to the master recordings, but added noise from a blank vinyl record or a blank tape. The result was that everyone voted for their favoured medium. Vinyl enthusiasts picked up on the clicking noise from the blank record, the tape guys picked up the tape noise. So either consciously or subconsciously, they confirmed their bias. I wish I could find that study online, maybe someone reading this can help? Different sample rates, compression methods and bitrates affect music recordings. The artifacts become part of the music and some will prefer the sound of one type over another. A lot of it also depends on how much care has been taken during production, from recording to mastering to compression of the publishing file. The audible difference between low and high sample rate might be minuscule, but because more care has been taken to produce the high end recording, the result sounds better. Now throw in confirmation bias, and everyone will say they are right because ...
@4i20
@4i20 2 жыл бұрын
great content, thank you 💚
@shiraga0516
@shiraga0516 2 жыл бұрын
It’s a great video! Many thanks.
@overheardatthepub1238
@overheardatthepub1238 2 жыл бұрын
Crazy technical and interesting. I learned more bout audio encoding than I ever knew. And I learned how little I know.
@cjc363636
@cjc363636 2 жыл бұрын
This is so cool. As a former TV audio mixer, this just rocs. And, by the way, the square wave sweep reminded me of some unknown 60s era Saul Bass movie credit animation.
@FlamingChickenG
@FlamingChickenG 2 жыл бұрын
I think it is interesting how many people rag on CD quality. CDs sound pretty good and I think most people have a colored memory of it. It is the same thing that Techmoan talks about in his video about cassettes: most people were not listening on quality equipment, and I know for my generation we mostly used CDs that we burned, which had mp3s that are lower quality than CD audio. Spotify only recently got "CD quality" audio but people don't complain about its quality.
@lamecasuelas2
@lamecasuelas2 2 жыл бұрын
CD's rule baby!
@Carlos-M
@Carlos-M 2 жыл бұрын
My earliest memories from the early 90's regarding CDs is that, a) they sounded really, really good, and b) my mom will get REALLY mad if we play with her discs (they were expensive)! My dad had a Panasonic component stereo setup, nothing high-end or audiophile grade but it was half-decent at least. He had some Type-II cassettes too which sounded really good on that player. By the mid to late 90's CDs were starting to replace cassettes as the on-the-go medium for portable players, boomboxes, and car audio, which tended to sound bad to start with, but no matter how good your system is all of these are frankly crappy listening environments. Whereas vinyl was never a portable medium so even now if you had a vinyl player you'd probably have it in a dedicated listening room at the very least.
@peteblazar5515
@peteblazar5515 2 жыл бұрын
1st Harmonic with 3 times fundamental frequency? Where is harmonic with 2 times frequency?
@Carlos-M
@Carlos-M 2 жыл бұрын
@@peteblazar5515 the components of a square wave are the sum of infinite _odd_ harmonics. So the first harmonic is 3x the fundamental frequency, the next is 5x, and then 7x, etc.
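The odd-harmonic sum described above is easy to verify numerically. A small numpy sketch (the 440 Hz fundamental is an arbitrary choice, and the harmonic count is capped so the top harmonic stays under Nyquist, which ties into the video's aliasing point):

```python
import numpy as np

fs = 48_000
f0 = 440                          # arbitrary fundamental (Hz)
t = np.arange(fs) / fs

def square_approx(n_harmonics):
    """Sum the first n odd harmonics (1x, 3x, 5x, ...) of the square wave's Fourier series."""
    x = np.zeros_like(t)
    for i in range(n_harmonics):
        k = 2 * i + 1
        x += (4 / np.pi) * np.sin(2 * np.pi * k * f0 * t) / k
    return x

ideal = np.sign(np.sin(2 * np.pi * f0 * t))   # the "perfect" square wave

def rms_error(n):
    return np.sqrt(np.mean((square_approx(n) - ideal) ** 2))

# 27 odd harmonics keeps the highest one (53 * 440 = 23,320 Hz) under Nyquist
for n in (1, 5, 27):
    print(n, round(rms_error(n), 3))
```

Each added odd harmonic brings the sum closer to the square shape; the residual never vanishes entirely because the remaining harmonics live above the band limit.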
@negirno
@negirno 2 жыл бұрын
I wouldn't rag on mp3s either. Unless the bitrate is really low or it's encoded with an old encoder I just can't tell the difference.
@GaryFerrao
@GaryFerrao 2 жыл бұрын
Wow this video is so interesting!~ I was (thinking) sure it's not just twice the frequency because if i downsampled some audio file to just 22.1 kHz (after checking that the treble was well below 10kHz) to save space on my CDs, it just didn't sound right, almost like sandpaper trebles. Well, now i know, thanks to your helpful explanations. Harmonics do affect the timbre of the sound, even though we can't hear them directly.
@proletaire6442
@proletaire6442 2 жыл бұрын
One of the best videos here
@35milesoflead
@35milesoflead 2 жыл бұрын
Nice video. There's an interesting tidbit that I have noticed with the whole 44.1 vs 48 thing - you need to be consistent even though it doesn't matter. If you play back a 44.1 file inside a 48 project (or vice versa) you get pitch drift phenomena. This is why consistency is key even though sampling rate doesn't matter. The real key is "mastering for your platform" as it were. Understanding the playback limitations of YouTube and making sure you sort your audio for playback. Tis redundant to do all your audio at 88/-8dB if YouTube is going to downsample to 44.1 / -15dB.
@frenchcreekvalley
@frenchcreekvalley 2 жыл бұрын
Thank you. I learned a lot.
@hkgerrard
@hkgerrard 2 жыл бұрын
Great explanation
@squidcaps4308
@squidcaps4308 2 жыл бұрын
Project and storage sample rate at 48k, with each processing stage using oversampling, has been proven to be optimal. You have to increase the project sample rate to 384kHz to get the same result. The trick is in the oversampling: allowing for a wider bandwidth while processing reduces artifacts, and then filtering the unnecessary frequencies out keeps it cleaner. 48k is not enough for some signal processing, while it is plenty for other. A gain change can be done in 48k, but compressing - anything that modifies the phase or time domain in any way - has to be oversampled to decrease overall aliasing. The strangest thing is that despite having an additional filtering stage at each processing block (for ex, each plugin in a project) and converting back and forth, it is less CPU intensive. Higher sample rates by far most of the time run "empty" signal - the entire bandwidth is processed at each stage - while oversampling is not needed for linear operations. This is not a very well known thing, which is a bit odd in my opinion. You can test this at any point: devise anti-aliasing stress tests and compare a 192k project rate to the same processing done in a 48k base with oversampling. The latter has fewer artifacts.
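The claim above can be illustrated with a rough numpy sketch. Hard clipping stands in here for any nonlinear process, and the 7 kHz tone, 0.9/0.5 levels, and 4x oversampling factor are arbitrary example values: distortion harmonics that exceed Nyquist fold back into the audible band when you process at the base rate, but mostly stay out of band when you oversample, filter, and decimate:

```python
import numpy as np

fs = 48_000
f0 = 7_000                        # test tone (Hz)
n = fs                            # one second of samples
t = np.arange(n) / fs
x = 0.9 * np.sin(2 * np.pi * f0 * t)

def clip(sig):
    """A memoryless nonlinearity (hard clipper) that generates odd harmonics."""
    return np.clip(sig, -0.5, 0.5)

# 1) Clip directly at 48 kHz: the 35 kHz harmonic folds back to 13 kHz.
direct = clip(x)

# 2) Clip at 4x oversampling, then band-limit to 24 kHz and decimate.
up = 4
X = np.fft.rfft(x)
Xp = np.concatenate([X, np.zeros(n * up // 2 + 1 - len(X))])
x_os = np.fft.irfft(Xp, n=n * up) * up      # band-limited 192 kHz version
C = np.fft.rfft(clip(x_os))
down = np.fft.irfft(C[: n // 2 + 1], n=n) / up   # keep only 0..24 kHz

# Energy at 13 kHz (an alias, not a true harmonic of 7 kHz):
alias_direct = np.abs(np.fft.rfft(direct))[13_000]
alias_os = np.abs(np.fft.rfft(down))[13_000]
print(alias_direct, alias_os)     # the oversampled path is far cleaner
```

The oversampled path isn't perfectly alias-free either (very high harmonics still fold within the oversampled band), but the offenders are far weaker, which is the point being made above.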
@Ruben.Pueyo.Bernini
@Ruben.Pueyo.Bernini Жыл бұрын
from Argentina I say THANKS! TU CONTENIDO ES BRILLANTE!
@k7iq
@k7iq 2 жыл бұрын
GREAT video and explanation ! I won't mention the 439 vs. 493 😃 BTW, I used to work with Dan. Very smart guy. Nice guy too :)
@DEtchells
@DEtchells 2 жыл бұрын
Excellent, excellent video! It does an excellent job of cutting through the woo-woo and uninformed opinions out there. You even corrected a misconception I had of aliasing, namely that the aliased frequency "wrapped around" to the low end of the spectrum vs being "folded back"! That is, I thought that a 25 KHz signal sampled at 48 KHz would appear as a 1 KHz one, not 23 KHz. (I was going to correct you, but I couldn't explain the Audition display, so went and looked it up. Duh…) Thanks for correcting a misconception I've held ever since my Signals and Systems class in undergrad :-) One note and a question for you and/or the audience though: I had understood that one of the problems with CD-grade audio wasn't the potential for aliasing so much as it was the "brick wall" low pass filter you had to use to allow 20 KHz to get through but cut out anything beyond 22.05 entirely. AFAIK, filters with such abrupt frequency cutoffs mess with signal phase well down into the audible range. Is this the case? My knowledge of such things dates back a good 35 years, so it's possible that modern technology has found a way around the problem. (This would of course be a further argument in favor of a 60kHz sampling frequency: you could use a less-abrupt filter that wouldn't impact phase relationships in the passband.) ==> So my question: With current technology, can you get a flat passband, sharp cutoff and linear phase all at the same time (in the analog domain)? Thanks again for the fantastic video!
@FilmmakerIQ
@FilmmakerIQ 2 жыл бұрын
Yeah I got that misconception as well... But there's another issue that shows up as low frequency "beating" when you start to get close to the Nyquist limit but still under. I don't know what it's called but it's worth looking into. The brick wall filter was definitely the issue; I think it's particularly prominent in a mastering scenario... After you add up all the little tiny decreases in high frequency in all the audio stages plus all the filter chains... it becomes substantial enough to notice.
@RobertHancock1
@RobertHancock1 2 жыл бұрын
In the old days that was an issue. These days ADCs are sampling much faster internally and downsampling on the output, so the brick wall filter requirements are much less steep.
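The fold-back discussed in this thread is easy to see numerically. A tiny numpy sketch (the 25 kHz tone and 48 kHz rate are chosen to match the example above):

```python
import numpy as np

fs = 48_000                       # sample rate (Hz)
f_in = 25_000                     # tone above Nyquist (fs/2 = 24 kHz)
t = np.arange(fs) / fs            # one second of samples

x = np.sin(2 * np.pi * f_in * t)  # "sampling" with no anti-alias filter

freqs = np.fft.rfftfreq(len(x), d=1/fs)
alias = freqs[np.argmax(np.abs(np.fft.rfft(x)))]
print(alias)   # 23000.0 -- folded back around fs/2, not wrapped around to 1 kHz
```

The sampled sequence of a 25 kHz sine at 48 kHz is numerically identical to that of an (inverted) 23 kHz sine, which is why the spectrum peak lands at fs − f_in.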
@71GA
@71GA 2 жыл бұрын
Really nice video.
@peto348
@peto348 2 жыл бұрын
Any video that links to the Monty's video is good video 👍
@LeutnantJoker
@LeutnantJoker 2 жыл бұрын
I once studied electrical engineering with a bit of signal processing but then went into energy (the big kilo volts stuff) and finally computer science. And in computer graphics I was right back in the Fourier transform again, because yep... it's exactly the same thing in computer graphics. And while all the theory has been ages ago so I really need a refresher myself, I find this discussion everywhere: this debate of higher sampling rate, completely ignoring aliasing, is going on in graphics just as well. Just look at all the "graphics mods" for games that upload huge textures for absolutely everything and then change the engine settings so bigger textures are being sampled for small objects, then wonder why performance goes down the toilet while aliasing artifacts appear and make things look worse instead of better. It's almost as if game and engine developers know about these engineering principles and optimize for them. Like... as if they know what they're doing :D Same goes for mesh level of detail too btw. Rendering a triangulated mesh is nothing but sampling. The sampling rate is your screen resolution. If you make an insanely detailed mesh that will show up small on your screen, you'll get mesh aliasing which will also look crap. People always think smaller textures, mipmaps, and LODs are only used for performance, and that if my PC is kick-ass, I should always load everything at the biggest size (bigger/more is better), completely ignoring signal processing principles and aliasing.
@FilmmakerIQ
@FilmmakerIQ 2 жыл бұрын
Fascinating analogy
@LeutnantJoker
@LeutnantJoker 2 жыл бұрын
@@FilmmakerIQ at the end of the day it's all about sampling at a limited frequency. Doesn't matter what the data is
@tommccaff
@tommccaff 2 жыл бұрын
Thank you for this excellent explanation. I am an audio engineer for a living; for many years I used a digital mixing console (a Panasonic Ramsa WR-DA7) which can operate at both 44.1k and 48k. I was always able to hear the difference between the two even when only recording voiceover, which I've done a lot of. I also have read Lavry's work in the past, when he previously insisted that there was no difference whatsoever between the two sampling rates and no need to ever use above 44.1k, and knew something had to be wrong. I also have used high sample rates, particularly 96k, and agree that they require a LOT of processing power, which translates into a lower track count and fewer native plugins that can be used, which makes those high rates inconvenient at best, at least for now. Coincidentally, it always seemed to me that the best compromise between computing power and the audio problems I was hearing would be a sample rate of 64kHz (since in computing we like to use powers of 2 as factors, mostly because it's easy to clock-divide by 2 or 4, etc.). It's interesting that Lavry's proposed sample rate of 60k is very close to my own thoughts, and personally I'm glad to see that he has come around from his prior position that 44.1k was just fine. I also knew that when using wave generation software just like you illustrated in Adobe Audition, when generating a 16K sine wave at a 48k sampling rate, the result is a wave with only three data points per cycle: one at zero, one near the peak, and one near the trough - which is of course a 16K TRIANGLE wave, not a sine wave, albeit a somewhat oblique one. Yes, those overtones are outside the range of hearing, and yet you could hear that something was wrong - it definitely was not a sine wave that was playing back.
Aliasing is exactly the problem - there was no anti-aliasing applied to the data generated by Audition or any other similar program, or any anti-aliasing generated by the WR-DA7 that was outputting it and that the computer was digitally connected to - and there still isn't today on most high-end professional equipment. So there's just no question that the VAST majority of digital playback equipment out there simply applies no anti-aliasing filtering at all and never did. To my trained ear, this has been quite annoying indeed. I also remember the very early days of CDs, and the first CD player I bought, a Sony. I didn't like it, because the top end sounded "brittle", which was a common complaint in those days. And in fact it wasn't until CD players introduced "oversampling" that the problem went away - basically moving the aliasing frequencies so they are all hypersonic, by extrapolating and outputting one or three "samples between the samples" caused later generation CD players to sound significantly better. The bottom line is that Nyquist really doesn't handle the concept of aliasing very well, as you aptly point out. And what is needed, particularly for audio production, is a sampling rate that allows all of the alias frequencies to be moved above the 20kHz threshold of hearing. Computing power is a temporary problem, so I have a feeling that in the not too distant future all professional audio production will be done at 96k, even though we don't really need it to be quite that high. Thank you for what I believe settles this issue hopefully for good.
@FilmmakerIQ
@FilmmakerIQ 2 жыл бұрын
Sorry but three sample points do not produce a sawtooth wave, it produces a sine wave. You don't connect the dots with straight lines, you draw a sine wave through the dots. A sawtooth wave has integer harmonics; it would need to be constructed with many sine waves, which would probably be above Nyquist if the wave is only 3 samples wide. Lastly, I don't think you understand why Lavry suggests 60. He stated in the paper that 44.1 is, if not perfect, close to perfect.
@tommccaff
@tommccaff 2 жыл бұрын
@@FilmmakerIQ I think you misunderstood what I said - "triangle", not "sawtooth". And I wasn't referring to an actual triangle wave, I was only referring to the shape created by the three points if you connect them, which isn't exactly what's going to happen in the DAC anyway, because DACs don't transition from one point to the next in any smooth way, they simply jump to the next value. The bottom line is that for a 16kHz sine wave, only three data points are created, and only three data points are going to be output by a DAC. The DAC itself is not going to "draw a sine wave through the dots". It's just going to output stairsteps at three data points and that's it (unless of course we're talking about oversampling, which would instead use spline interpolation or some similar approach to approximate where the additional samples would be. But to my knowledge no production hardware - such as Pro Tools or UAD Apollo etc. - utilizes oversampling on output). For example, if you create a 16kHz 24-bit sine wave at -3.0db, each cycle will have exactly three points - one at zero, one at -4.2 db above zero (sample value 5,143,049) and one at -4.2 db below zero (sample value -5,143,049). The DAC isn't going to transition smoothly between those points, it's simply going to output a zero for 20.83 microseconds, followed by a sample value of 5,143,049 for 20.83 µs, and then a sample value of -5,143,049 for 20.83 µs. If DACs did indeed "draw a sine wave through the dots", then aliasing wouldn't be a problem, because the DAC itself would be reacting perfectly to the INTENTION of the data - just as analog tape used to do. But the problem is of course, as with many things computer-related, DACs simply don't do that. They just output a voltage corresponding to a number for a specified number of microseconds as dictated by the sampling rate. It is of course this behavior that causes the alias frequencies to result, as you have very correctly and articulately described.
As for Lavry's 60, correct me if I'm wrong, but my understanding is that the advantage here is twofold: 1) it pushes the vast majority of alias frequencies into the supersonic range, making them a non-problem, and 2) it provides more headroom for creating anti-aliasing filters, should a playback hardware developer choose to do so, which sadly, very few ever seem to. My point was merely to essentially agree with Lavry, but I'm suggesting that when taking into account the fact that digital hardware designers prefer to do things in powers of 2, that a better choice for "optimal sampling rate" should be 64kHz specifically. Personally, I wish hardware developers provided that option in addition to 48k and 96k because that's what I would use for production instead of 48k or 96k. It would be quite a good compromise.
@FilmmakerIQ
@FilmmakerIQ 2 жыл бұрын
That's completely incorrect. Yes, the DAC does draw a sine wave because it's converting it back to analog. The speaker's cone is a physical object and it moves through space with inertia; it can't just jump to each sample point and hold for the next one. So if you produced three samples you will not get a triangle, you will get a sine wave. Watch Monty's video in my description. Samples are not stair steps, they define the points of a sinusoidal wave. This is the key to the Fourier transform and the Nyquist theorem. Aliasing has nothing to do with stair steps (because there aren't any stair steps). Aliasing is the result of frequencies that are higher than half the sampling frequency. Your understanding of Lavry's 60 is incorrect as well. It doesn't push alias frequencies into the ultrasonic... you don't push alias frequencies... it provides enough headroom for anti-aliasing filters to work without affecting the audible range. Lastly, clock speed has zip to do with binary. 64kHz is meaningless because time is an arbitrary construct. Look at the history of computing: you will not see clock speeds correlating with binary numbers... because that's simply not how it works... Also 64kHz isn't a binary number. The closest is 2^16, which is 65.536kHz.
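The "draw a sine wave through the dots" point can be checked directly: take a 16 kHz tone at 48 kHz (only 3 samples per cycle) and do band-limited interpolation by zero-padding the spectrum, which is what an ideal reconstruction filter does between the sample points. A numpy sketch (the block length, 16x factor, and 0.4 rad phase offset are arbitrary choices):

```python
import numpy as np

fs = 48_000
f = 16_000                        # 16 kHz tone: only 3 samples per cycle
n = 48                            # a short block (exactly 16 cycles)
t = np.arange(n) / fs
x = np.sin(2 * np.pi * f * t + 0.4)   # phase offset so samples miss the zero crossings

# Band-limited upsampling by 16x: zero-pad the spectrum, which is what an
# ideal reconstruction / anti-imaging filter does between the sample points.
up = 16
X = np.fft.rfft(x)
X_padded = np.concatenate([X, np.zeros(n * up // 2 + 1 - len(X))])
y = np.fft.irfft(X_padded, n=n * up) * up

# The reconstructed curve is the original sine, not a triangle or stair-step:
t_up = np.arange(n * up) / (fs * up)
err = np.max(np.abs(y - np.sin(2 * np.pi * f * t_up + 0.4)))
print(err)   # on the order of machine precision
```

Three samples per cycle of a band-limited signal pin down one and only one sine wave, which is why the interpolated curve matches it to machine precision.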
@SamichHunter
@SamichHunter 2 жыл бұрын
Great video. LOVE the Monty video. It is awesome in its clarity. 13:10 "If we had an infinite sampling rate ..." Isn't an infinite sampling rate called 'Analog'? :oP Kinda defeats the purpose of digital, which can be MUCH smaller in storage size. 16:50 I could pick out that the frequency was transitioning from sine wave to square wave, but the tone was indistinguishable to my ears. (yes I listened to your linked video) Thank you for the time and effort to produce this video. It is appreciated!
@paulsmith554
@paulsmith554 2 жыл бұрын
Same, I can't tell the difference between those last 2 waves but I can hear the transition. Almost sounds like there's a short crossfade
@FilmmakerIQ
@FilmmakerIQ 2 жыл бұрын
There is a short cross fade. I couldn't get the waves to cut exactly at the same amplitude so it was either a crossfade or a pop on the switch and I chose the cross fade.
@DripDripDrip69
@DripDripDrip69 2 жыл бұрын
Analog is not "infinite sample": your tape has a frequency range based on how fast you run it, normal copper wires will struggle with RF frequencies, even the air has a frequency range because it's made of individual molecules. Nature is more "digital" than "analog" in the sense that energy comes in discrete packets because of quantum mechanics.
@GaryFerrao
@GaryFerrao 2 жыл бұрын
Thank you for also talking about and checking the audio uploaded to YT. Years ago, some science documentary on NatGeo or Discovery was being broadcast on TV, explaining how adults can't hear above 16kHz and to test it out with your friendly adult (or parent) nearby. To my shock, i myself couldn't hear the 16kHz wave they were "playing". Not wanting to age so quickly (and good thing i had a computer as well), i generated a 16 kHz sine wave, and i was _so relieved_ to know that i could hear it lol. And sadly the TV didn't have a comment section like here to complain. Rant: Then, wanting to check "how old i was", i tried with higher frequencies, and found out that i couldn't hear more than 18 kHz. Still not wanting to age so quickly, i was sure something was amiss. Then i found out. My speaker system itself had a frequency response range from 18 Hz to 18 kHz. argh lol. I bought better speakers with response up to 20 kHz and sure enough, i could hear it. This just makes me wonder. Do we really "age" out of this frequency or do we just "waste it away" because we don't use it any more? I still practise hearing 18 kHz (with good speakers/earphones) every now and then. And i also have a saved file on my phone to test out earbuds before i buy them, so i don't end up losing my hearing range. P.S: i couldn't hear a 20 kHz sine wave. I don't know if it's my limitation or the speaker's. Until i can get a volunteer who can blind test, i'll still be searching. (i'm not sure earbuds/speakers produce enough power anyway at the 20 kHz frequency, to use resonance on other objects.)
@FilmmakerIQ
@FilmmakerIQ 2 жыл бұрын
It has to do with the hairs in the cochlea of the ear. The ones responsible for the highest frequencies are in the smallest part of the cochlea (they have to vibrate the fastest). As we age, the cochlea becomes more rigid and inflexible to those high frequencies, and that's why we lose the high range.
@GaryFerrao
@GaryFerrao 2 жыл бұрын
@@FilmmakerIQ oh my…
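For anyone who wants to repeat this hearing test, here is a small stdlib-only Python sketch that writes a 16 kHz test tone to a WAV file (the filename, level, and duration are just example choices, and mind the playback volume with high-frequency tones):

```python
import math
import wave
import array

fs = 48_000       # sample rate (Hz)
f = 16_000        # test tone frequency (Hz)
seconds = 2
amp = 0.3         # keep it quiet; high-frequency test tones can be unpleasant

# 16-bit signed samples of a sine wave
samples = array.array('h', (
    int(amp * 32767 * math.sin(2 * math.pi * f * i / fs))
    for i in range(fs * seconds)))

with wave.open('tone_16k.wav', 'wb') as w:
    w.setnchannels(1)      # mono
    w.setsampwidth(2)      # 16-bit
    w.setframerate(fs)
    w.writeframes(samples.tobytes())
```

Change `f` to probe different frequencies; just remember that the playback chain (speakers, resampling) limits what actually reaches your ears, as the comment above found out.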
@adrianstephens56
@adrianstephens56 2 жыл бұрын
Another engineer here. In my laziness I was edging into camp 2. Thank you for showing me the error of my ways, and reminding me of what I knew 40 years ago. Nyquist's sampling theorem is correct, and it assumes a perfectly band-limited signal. You band-limit a wider bandwidth signal using a low-pass (anti-aliasing) filter. Precision analogue filters can be expensive and difficult to create. Further, if you have a sharp transition in the filter, you introduce artefacts which are visible on transients in the signal, and might be audible, although I really don't know. To allow an easy-to-implement gentle roll-off filter without attenuating your wanted signal in the passband, you need a lot of headroom. BTW, to me this is all theoretical. As somebody of retirement age, with loud tinnitus, a 20 KHz sampling rate would be just fine.
@wdavem
@wdavem 2 жыл бұрын
This is great!!
@4Nanook
@4Nanook 2 жыл бұрын
I'm glad someone GETS IT, regarding aliasing. I've had this argument with so many tone-deaf wanna be engineers that do not understand why percussion sampled at 44 Khz sounds like so much white noise but sampled at 192 Khz sounds like percussion instruments.
@nitram419
@nitram419 2 жыл бұрын
>> percussion sampled at 44 Khz sounds like so much white noise but sampled at 192 Khz sounds like percussion instruments...
@napalmhardcore
@napalmhardcore 2 жыл бұрын
I'm so happy I watched this video because a while back I watched a video which contained a sweep up to 20kHz and noticed that the sound cut off abruptly at 16kHz. I was unsure whether the culprit was YouTube, some other link in my audio chain or if the limit of human hearing is experienced as a hard limit (intuitively, this didn't seem right). I really need to have my hearing properly tested. I'm 38 now and I was "still in the game" comfortably up to 16kHz and I can definitely hear below 20Hz (I think it was somewhere around 16-18Hz when I stopped experiencing it as sound when I tested a while back). My mother told me that when I had my hearing tested as a kid by my school, they said my hearing was above average and that I could hear tones most couldn't. The funny thing is, the reason my hearing was being tested was because they thought I was deaf. My brother used to throw tantrums at home and I learned to "tune out" sounds I found annoying. Turns out I found the teachers annoying too.
@ABaumstumpf
@ABaumstumpf 2 жыл бұрын
It is a similar story with image resolution, where people claim that a 4K TV is way better than their old 1080p TV - but the difference was not really due to resolution but size. You need a rather large screen at a close distance for any visual difference between 1080p and 4K, and now with 8K.... you need like a 60" monitor at 1m distance for there to be any visual difference. 44 kHz 16 bit is enough for humans - for us that can be called "perfect". There has not been a single human who has ever been shown to be able to accurately hear anything above 21kHz. For the bit depth - kinda debatable, as without noise-shaping, dithering or anything like that this is "only" ~96 dB SNR - so from the faintest sound perceivable (you'd need to be literally dead to not have the sound of blood flowing through your veins) up to sound levels that cause permanent hearing damage with just half an hour of exposure per day. You could literally have an audio track with the drop of a needle and being on a busy road - and both things would be fully captured. Doing ANYTHING but listening to the audio is a different beast. Just imagine taking a photo with a resolution just high enough that it looks perfect to you (doesn't even matter what actual size/resolution) - ok. Now take the same image and stretch it to say 5 times the size - oh, it suddenly is no longer perfect. When you want to manipulate any data, be it image, sound, or anything else - you end up introducing distortions and losing some precision, so you'd better make sure that the initial data you got is way more than you actually want to deliver at the end, and do all your manipulations with as much USEFUL data as possible. With audio that often means capturing >20 bits of depth at 96 kHz - which allows you to squeeze and stretch the sound a lot before any unwanted distortions become audible.
Useful as in like this video is showing the problem of aliasing.You do NOT want that in your data so you better just use >96kHz during manipulation and then filter all the high-frequency stuff out before it ends up getting folded into the audible range. Cause once it is there you are not getting ridd of that anymore.
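The ~96 dB figure quoted above follows from the standard rule of thumb for the quantization SNR of an N-bit full-scale sine (about 6.02 dB per bit, plus 1.76 dB); a quick illustrative sketch:

```python
# Quantization SNR of an N-bit full-scale sine: 6.02*N + 1.76 dB (standard rule of thumb)
def snr_db(bits):
    return 6.02 * bits + 1.76

print(round(snr_db(16), 1))  # 98.1 dB -- often rounded down to "~96 dB" (6 dB per bit)
print(round(snr_db(24), 1))  # 146.2 dB
```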
@benjamindover4337
@benjamindover4337 2 жыл бұрын
Good stuff
@TurboBaldur
@TurboBaldur 2 жыл бұрын
Another thing to consider is that at exactly the Nyquist limit, the signal contains no information whatsoever on the phase of the signal, so if you had a 90 degree phase shift between the left and right channel (or multiple channels in a multi track recording), that information would not register correctly in the audio samples. This may not be so important when listening to the audio as our hearing is not so sensitive to the phase of such short wavelengths, but if you start to do addition of the channels or other signal processing where the different channels interact, the same signals oversampled vs sampled at the Nyquist limit can produce a different sounding result, even after the result has been downsampled back to the Nyquist limit.
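The phase loss at exactly the Nyquist limit is easy to see numerically; a minimal sketch with illustrative numbers (48 kHz sampling, not anything from the video):

```python
import math

fs = 48_000
f = fs / 2  # exactly the Nyquist frequency

# A cosine at the Nyquist frequency is sampled as +1, -1, +1, -1 ...
cos_samples = [math.cos(2 * math.pi * f * n / fs) for n in range(4)]
# The same tone shifted by 90 degrees (a sine) is sampled at its zero crossings:
sin_samples = [math.sin(2 * math.pi * f * n / fs) for n in range(4)]

print(cos_samples)                              # alternates between +1.0 and -1.0
print(all(abs(s) < 1e-9 for s in sin_samples))  # True: the shifted tone vanishes entirely
```

So at exactly fs/2 a 90-degree phase shift is the difference between full amplitude and nothing at all, which is why phase information only comes back as you sample above the limit.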
@ABaumstumpf
@ABaumstumpf 2 жыл бұрын
Nyquist will accurately reproduce the sound. If you THEN add extra modifications on top of that, it in no way implies anything about Nyquist not being 100% correct.
@TurboBaldur
@TurboBaldur 2 жыл бұрын
@@ABaumstumpf Nyquist is correct about the absolute minimum sampling rate, but there are benefits in oversampling.
@ABaumstumpf
@ABaumstumpf 2 жыл бұрын
@@TurboBaldur Yes, of course, but that in no way has any effect on what us humans can actually hear, and there 44.1kHz 16 bit is enough. If the mastering of the audio is done poorly, that is not the fault of the medium, nor does it make Nyquist any less correct.
@TurboBaldur
@TurboBaldur 2 жыл бұрын
@@ABaumstumpf exactly, if the sampling is being done for playback to a human only then 44.1k is fine. But if you plan to edit the audio it makes sense to get more samples, even if the final export is to 44.1k
@peetiegonzalez1845
@peetiegonzalez1845 2 жыл бұрын
This is a great point, and I believe it may be why many digital recordings made in the early 90s sound "flat" compared to late-generation analog recordings. Too many engineers just relied blindly on the digital technology without thinking of consequences like this. Nowadays of course studios work with much higher sample rates and bit depths for processing and mastering before producing the 44.1kHz or 48kHz files for release.
@MichaelReznoR
@MichaelReznoR 2 жыл бұрын
Thank you for this video. I want to listen to it probably many times to understand more of what is happening. I also loved the Monty Montgomery video! It was so neatly presented, even I with no audio degree could understand it. So to sum it up, if I understood it correctly:
✅ Audio engineers use high sample rates (96 kHz+) for recording to avoid aliasing (and therefore avoid any unwanted weird sounds)?
✅ For consumer audio playback (music, games, movies etc.) nothing more than 44.1 kHz is even needed for any human being? As it is a waste of system resources for no benefit.
@tiarkrezar
@tiarkrezar 2 жыл бұрын
So, after you showed the example at 4:40, my first thought was, "well, what if you instead choose a frequency that exactly divides the sampling rate?". So I opened up audacity, made sure both my audio device and the project were set to 48KHz, and tried generating a 12KHz tone - in that case, a square wave sounds just like a sine, but slightly louder. It's easy to make sense of it if you think about it in terms of generated samples - you just get two high ones followed by two low ones, and that pattern repeats *exactly* at a rate of 12KHz. If you choose a frequency that doesn't cleanly divide your sampling rate, you have to resort to an approximation - some runs of high/low samples will be longer, some shorter, so that over a longer period, they average out to the frequency that you're trying to achieve. But in that case, you're essentially creating a longer pattern of samples that takes more time before it repeats, which creates a bunch of other spurious (aliased) frequencies in your signal. I think the real takeaway here is that mathematically ideal square waves are awkward and don't work out that great in reality. Sines are way nicer.
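That two-high/two-low pattern can be reproduced in a few lines; a sketch of sampling an ideal square wave whose frequency exactly divides the sample rate (illustrative, not Audacity's actual internals):

```python
def square(phase):
    """Ideal square wave: +1 for the first half of each cycle, -1 for the second."""
    return 1 if (phase % 1.0) < 0.5 else -1

fs, f = 48_000, 12_000  # f divides fs exactly (fs / 4)
samples = [square(f * n / fs) for n in range(8)]
print(samples)  # [1, 1, -1, -1, 1, 1, -1, -1] -- the pattern repeats exactly at 12 kHz
```

Pick an f that doesn't divide fs and the runs of high/low samples come out uneven, which is where the spurious (aliased) content comes from.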
@FilmmakerIQ
@FilmmakerIQ 2 жыл бұрын
You chose a special case: a square wave with a frequency of the sample rate divided by four! There are two ways to think about that: either as the mathematical sum you described, or as a visual graph. Only one sinusoidal wave can fit the given samples... Instead of the samples defining the top of the square wave, they define each side of the crest and trough of a sine wave with greater amplitude!
@GodmanchesterGoblin
@GodmanchesterGoblin 2 жыл бұрын
Fun fact... People that lived with older TVs with noisy line-output transformers may have developed notches in their hearing at 15734Hz (NTSC) or 15625Hz (PAL), although if they are that old they may not now hear much above 12kHz or so anyway (that's me at 63). I remembered this when you picked 5.2 and 15.6kHz for the demonstration. I also wondered how hard that 16kHz wall is that YouTube applies, and would probably have gone with 5 and 15kHz or even 4 and 12kHz. If interested, it's also instructive to construct square waves visually using a graphing calculator to help with understanding how each odd harmonic improves the squareness of the waveform, although I guess Audition can do that as well. Great video, too, by the way.
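The graphing-calculator exercise can also be sketched in a few lines: summing odd harmonics of the square wave's Fourier series shows the waveform squaring up (and hints at the Gibbs overshoot near the edges):

```python
import math

def square_approx(t, f, n_harmonics):
    # Fourier series of a square wave: (4/pi) * sum of sin(2*pi*k*f*t)/k over odd k
    return (4 / math.pi) * sum(
        math.sin(2 * math.pi * k * f * t) / k
        for k in range(1, 2 * n_harmonics, 2))

# Sample mid-crest (t = 0.25 of a 1 Hz cycle): more harmonics hug the flat top of 1.0
print(round(square_approx(0.25, 1, 1), 3))   # 1.273 -- a lone sine overshoots (4/pi)
print(round(square_approx(0.25, 1, 50), 3))  # 0.994 -- 50 odd harmonics sit near the flat top
```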
@PatrickPoet
@PatrickPoet 2 жыл бұрын
John, this is the worst explanation of the connection between aperture, circle of confusion, and infinite focusing I've ever seen!
@FilmmakerIQ
@FilmmakerIQ 2 жыл бұрын
I agree.
@Frisenette
@Frisenette 2 жыл бұрын
These concepts are connected however.
@wngimageanddesign9546
@wngimageanddesign9546 2 жыл бұрын
LOL!
@KK-pq6lu
@KK-pq6lu 2 жыл бұрын
Hey John, I've been doing digital signal processing since 1980 - 41 years - including spatial digital signals. Nyquist can be grasped by knowing one concept: that sampling at the Nyquist frequency there is no phase information. Phase information is restored as the sample rate is increased above Nyquist.

To differentiate a square wave from a sine wave, both still have to be faithfully reproduced, including the phase information. At 10 kHz, a 44.1kHz sample rate only produces about 4 samples per sine wave, partially preserving the phase of the signal. Since a square wave is made up of more than one frequency, the phase information becomes important, as it affects the sound, not just the amplitude of the sound.

44.1 kHz works because most of what we listen to is under 8kHz. If you want to preserve phase up to 15kHz, you really should sample above 60kHz. And if you are listening to stereo, you really want to preserve even more phase information, so it makes even more sense to go 60kHz or higher. Even though to me 44.1 kHz seems fine enough.

I always wanted to make a spatial audio standard that recorded phase information as well as sampling information - a transformation rather than sledgehammer sampling. This has been done commercially outside the audio industry for over 35 years.
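The samples-per-cycle arithmetic behind this comment is just the ratio of sample rate to tone frequency; a quick illustrative calculation:

```python
fs = 44_100  # CD sample rate

# Samples captured per cycle of a tone at frequency f: simply fs / f
for f in (1_000, 8_000, 10_000, 15_000):
    print(f"{f} Hz -> {fs / f:.2f} samples per cycle")
# At 10 kHz that's 4.41 samples per cycle; at the 22.05 kHz Nyquist limit it's exactly 2.
```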
@trulahn
@trulahn 2 жыл бұрын
You are totally ignoring the sound reproduction equipment's role in this. Sure, at 10 kHz a 44.1 kHz sample rate only produces about 4 samples per cycle. So? The signal the DAC recreates from those samples and sends to the vibrating membrane or paper cone of your headphones or speakers is plenty. 60 kHz may be useful during mastering of the original, but at the consumer level we don't benefit from it, with proper noise shaping and anti-aliasing applied.
@collin4555
@collin4555 2 жыл бұрын
Oh, that makes perfect sense, it's kind of frustrating that there's still so much confusion when the explanation is pretty graspable.
@xtrct7303
@xtrct7303 2 жыл бұрын
Signal engineer here. You also just made a brilliant explanation of the Gibbs phenomenon in less than a minute too!
@samihawasli7408
@samihawasli7408 2 жыл бұрын
Hello, apologies for being slightly picky. Electrical engineer here; I work with high speed data converters, >1Gbit type work. You nailed pretty much everything, but I'd like to make 1 tiny (but SUPER important) note. BTW, this mistake is even in some EE textbooks: the Nyquist theorem doesn't say you should sample at twice the maximum frequency in your signal, but rather twice the maximum bandwidth of your signal. This ensures your entire signal falls within the first Nyquist zone, allowing your anti-aliasing filter to cut out all unwanted signals. The reason square waves never work well: their bandwidth is technically infinite. Again, don't want to take anything away from your video, great work!

Edit: I wrote that first paragraph trying not to get too technical, but I feel it leaves a bit to be desired. I don't do anything audio, and I just realized the maximum frequency in an audio signal usually describes the maximum bandwidth (please correct me if this assumption is wrong), making the first paragraph a distinction without a difference to audio folks. However, it is important to make that distinction between maximum frequency and bandwidth because it allows for the use of undersampling to still faithfully reproduce your signal. In my world, I often want to digitize signals in the GHz range, but if I know my signal's bandwidth, I can sample at a much much MUCH lower frequency and filter my output to any higher Nyquist zone. In this case I am using the aliased signals to faithfully represent the initial signal. This technique requires pretty fancy filtering, and knowing the center frequency and bandwidth of your incoming signal. Often times we don't have that info.
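The zone-folding arithmetic behind both aliasing and this undersampling trick can be sketched with a small hypothetical helper (the function name and numbers are illustrative):

```python
def alias_frequency(f, fs):
    """Apparent frequency (in the first Nyquist zone) of a tone f sampled at fs."""
    f = f % fs                           # sampling can't tell f apart from f +/- k*fs
    return f if f <= fs / 2 else fs - f  # the upper half of each zone folds back down

print(alias_frequency(7_000, 44_100))   # 7000  -- an in-band tone is untouched
print(alias_frequency(25_000, 44_100))  # 19100 -- 25 kHz folds down into the audible band
print(alias_frequency(60_000, 44_100))  # 15900 -- 60 kHz wraps past fs, then folds
```

Anti-aliasing filters exist to stop the second and third cases from happening by accident; bandpass undersampling exploits the same folding on purpose.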
@shootinbruin3614
@shootinbruin3614 2 жыл бұрын
When you said you loved Monty's video, I didn't realize you loved it so much you'd make your own (also very informative) video adding to the topic! Makes me glad I shared the links! In regards to gaining a perceptible increase in audio quality, I personally believe that data is better spent increasing the bit depth of the digital recording. Doing so would improve the noise floor, but even this would only make a real difference in the highest end headphone setups or a dedicated speaker room (then again, the people who own these things are generally the ones debating this topic to begin with, right?)
@FilmmakerIQ
@FilmmakerIQ 2 жыл бұрын
I'm firmly in the increase bit depth camp as well - but from my perspective, it's just buying insurance. I shoot a lot of stuff where I can't really monitor the audio - I'm just capturing everything - and with 24 bits have a LOT more room for error in the volume.
@shootinbruin3614
@shootinbruin3614 2 жыл бұрын
@@FilmmakerIQ The nice thing is that relative processing and storage cost is constantly going down, and there's always that consumer who's willing to pay for "the best." Who knows, maybe in our lifetimes 480kHz 64 bit will become mainstream haha
@nickwallette6201
@nickwallette6201 2 жыл бұрын
As was said here, it's cheap insurance, so why not. But, TBH, even good 16-bit converters are already near, at, or better than the noise floor of the analog signal chains on either side. (Especially when you start digging into the technical details of dithering.) Even if you think about the absolute top-shelf reproduction chain some crazy audiophile may have, with a home mortgage poured into their Class A amplifiers and directional speaker cables held up by non-resonant cable guides.... The studio was still combining a dozen tracks together (combining each of their noise floors) through a sound board with a gazillion passive components and a ton of make-up gain after the summing stage, sourcing each of those channels from pre-amps, EQs, and compressors built in the 1960s for that warm analog sound.... Did ANY of that equipment have a -120dB noise floor? God no. When combined, do you think there's any chance that a good CD player's DAC is the bottleneck? :-) About the only thing 24-bit (or higher) DACs can do is handle those digitally-generated fade-outs with a little more accuracy. Again, in the recording chain, there is actually incentive to use higher bit depths: To provide margin for error. In most editing suites, all source material will be converted to 64-bit floating point values on-the-fly anyway, and only re-quantized to integer samples for playback or bouncing to the master files. But still...
@shootinbruin3614
@shootinbruin3614 2 жыл бұрын
@@nickwallette6201 I didn't even realize directional cables were a thing. How does that work?
@nickwallette6201
@nickwallette6201 2 жыл бұрын
@@shootinbruin3614 Your guess is as good as mine. Probably about as well as using hospital grade AC outlets, or coloring the edge of CDs with Sharpie to prevent light refraction.
@kernelpickle
@kernelpickle 2 жыл бұрын
I've been recording and mixing for years, and the only time sample rate matters is on the recording. 24-bit audio @ 192kHz is indistinguishable from analog tape, and if you can record your audio at that sample rate, that will give you the option to master it for any format you want, with the least amount of degradation to the sound.

For folks that understand how film and video work, it's similar to shooting video in 4K when you plan on making 1080p content, or in 8K when you're planning to release something in 4K: even though you never plan to release anything at that higher resolution, it gives you more options for cropping the footage and doing other stuff that you wouldn't be able to if you shot video at the intended output resolution of the finished product. Applying high, low or bandpass filtering to audio is essentially the same as cropping an image, and the more detail you have to crop, the better it's going to look or sound. Just think about an image: if it's the size of the file you want the final output to be, and you decide to trim off the edges to reframe the photo, and then increase the image size so it matches the output resolution you started with, you're gonna be looking at something larger, far less detailed and blurrier than you would have if the image had started out at a much higher resolution.

I will be the first to admit my recordings are all at 44.1kHz or 48kHz, but that's because I couldn't afford the hardware (or it didn't exist when I made the recordings), so the end results that I got with those mixes never sounded as clear or crisp as the stuff you hear that's been stamped with the official "Mastered for iTunes" label.

Another interesting topic I think would build on this lesson is the process of dithering when mastering audio. Some folks might be surprised to find out that the best sounding digital masters deliberately introduce white noise into the file as part of the mastering process, especially when downsampling from something like 192kHz audio to 44.1kHz.
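The dithering mentioned here is just adding a tiny bit of noise before requantizing, so the rounding error becomes benign hiss instead of correlated distortion; a minimal sketch of TPDF (triangular) dither at 16 bit (the function name and details are illustrative, not any particular DAW's implementation):

```python
import random

def quantize_16bit(x, dither=True):
    """Quantize a float sample in [-1.0, 1.0] to 16-bit, optionally with TPDF
    dither (sum of two uniforms, about +/-1 LSB) to decorrelate rounding error."""
    d = (random.random() - random.random()) if dither else 0.0
    q = round(x * 32767 + d)
    return max(-32768, min(32767, q))  # clamp to the 16-bit integer range

# Without dither, a tiny signal quantizes to a constant (pure distortion);
# with dither, it becomes a noisy stream whose average tracks the true value.
print({quantize_16bit(1e-5, dither=False) for _ in range(1000)})  # {0}
```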
@FilmmakerIQ
@FilmmakerIQ 2 жыл бұрын
Okay, I've gotta do a video on this because that analogy is completely wrong. Also, analog tape has worse specs than 16-bit 44.1.
@kernelpickle
@kernelpickle 2 жыл бұрын
@@FilmmakerIQ analog tape has way more dynamic range and headroom than 16-bit audio at 44.1KHz. That's why everyone was still recording to analog tape, well after the CD, DAT and other forms of digital audio were invented. Believe me, they didn't do it because it was easier or saved money. Maintaining an analog recording studio with massive tape reels was an expensive and fiddly endeavor, so anyone running a studio back in the day would've jumped on the latest technology if it would've simplified that process. It wasn't until everyone eventually converted to digital recordings in the 2000's, when sample rates and quality of studio gear were high enough to record 24-bit audio, at sample rates well above 44.1KHz. You don't have to like my analogy, because it's not exactly perfect, but people know more about editing photos and videos these days than they do about audio--and they just need something they can wrap their heads around, to know why people choose to record at higher sample rates than what we hear as the finished product. However, my explanation and analogy are not wrong--let alone completely wrong. I not only studied digital audio in college, I also worked in radio, and even helped teach a class on digital audio production. The professor wasn't the most skilled at recording and editing, because he came up in the analog era, and just used the computer like a tape deck and did everything old school. So, I helped him teach students one-on-one, how to actually use a DAW in one of the studios, so that they could record their assignments. I still record, mix and produce music for myself and others in my spare time, so I might not be a KZfaqr but I know what I'm talking about, and I'm not sure you know what I'm talking about, because if you did, you wouldn't call me "wrong" and use that as the catalyst for making a video to correct me. 
I have no idea what your credentials are or what your experience in this field is, but I got the impression that you're someone who has some technical understanding, and just learned all of this shit in the process of making your video, and you really don't have more than a decade of actual knowledge. It's funny, because this video was actually lacking some pretty basic information about the topic. You didn't even explain why someone would want to record anything at 44.1KHz, when there are much higher sample rates. You brought up using 48KHz as the sample rate, but didn't explain where that comes from. I think your viewers are even more ignorant than you on the subject, and might not know that CDs happen to use 16-bit @ 44.1KHz, and that DVD audio uses 48KHz. For anyone else reading this that actually cares to learn something, CDs compromised on the sound quality, because they couldn't make players that played back compressed audio without making them super expensive, and that was the highest quality sound they could use and still fit an entire symphony onto a single disc. (Audiophiles are historically fans of classical music, and when you're launching a new music format that's only going to be affordable to the wealthy and/or those with "discerning taste", you kinda want to make sure you can cater to them a bit. It was a huge selling point for anyone sick of flipping albums to hear the second half of the performance, and I'm sure that without the support of those snooty weirdos, CDs might never have taken off.) DVDs used 48KHz because it was the base sample rate used by DAT, which was one of the original digital recording formats, and because it was what people were using in studios, it got adopted by MPEG-2, DVD and digital broadcast formats. It only sounds slightly better, and it's almost imperceptible if someone uses proper dithering when creating the final audio file. 
It was simply a matter of compatibility with existing pro-audio equipment, which also supported higher sample rates like 96KHz. Good studios would record at the higher sample rate, and then downsample their work for the finished product. DVD-A used 24-bit audio @ 48KHz, because they were purely an audio experience, so they could use up more of the space on the disc for higher quality sound. Newer formats like BD (and the now-dead HD DVD) used 96KHz, again, because of the larger amount of space available. Which is still really good sounding, but it's still only half the sample rate of the highest quality digital recordings, which is 24-bit @ 192KHz. There may eventually come a time when there's equipment that can capture audio at a higher sample rate, but even the obnoxious audiophile community that would typically support anything that's higher quality, just for the sake of it being measurably better (even if it wasn't perceptibly better), hasn't been pushing for anything higher. Turns out, even they can't tell the difference between 24-bit audio @ 192KHz and a super clean analog recording, from a well maintained deck with Dolby noise reduction. If you don't overdrive the tape, or have it distort in the upper frequencies, and you play it back on equipment that doesn't have any ground hum, it sounds fucking amazing--and so does 24-bit audio @ 192KHz, which I guarantee you've never heard in your life. Unless you're in a legit recording studio with high end gear to hear the difference, you can't tell. You can absolutely hear the difference between analog tape and the much lower quality audio used by CDs, because the dynamic range is reduced to 96 dB (which is a non-trivial 48 dB less than 24-bit audio) and more importantly, it's less than the 110 dB range of analog tape when recorded using a Dolby SR noise reduction system. 
32-bit audio hasn't really taken off, because 24-bit audio is already overkill with a wide dynamic range of 144 dB, which is already higher than the theoretical dynamic range of human hearing, which taps out at 140 dB--so 192 dB is just needlessly wasting storage space. That said, 16-bit audio with proper noise shaped dithering can have a perceived dynamic range of 120 dB, but again pure analog tape also has an effectively infinite sample rate, so that combined with the actually greater dynamic range makes it sound better than CD audio. Honestly, I'm not even sure what the point of your video even was, because KZfaq isn't the platform capable of even showing the subtle differences between audio using sample rates of 44.1KHz and 48KHz, especially when KZfaq already filters out everything over 15KHz. You may not be able to hear sounds over 15KHz, but I still can, and at this point if your hearing is already damaged enough to the point you can't even hear a sine wave between 15-20KHz, then you're clearly not the guy who should even care, because those sounds aren't for you, and I would agree that you shouldn't invest in anything better than CD audio, because it's completely lost on you. For those of us that actually understand digital audio, and have fully functional ears that can hear everything from 20Hz to 20KHz, there's plenty of reasons to record or listen to music that's using a higher sample rate and bit depth than CD audio. Of course, that's just a simplified explanation of some of the vast amounts of information your video was lacking, because I didn't even discuss the bit rate of digital audio (mostly because we were discussing uncompressed digital audio, and it's only when compressing audio files that bit rate becomes an issue, because that's where the sound quality gets drastically reduced.) But hey, you're just a guy who doesn't really have a background in this stuff, so I don't expect you to talk shop on the fine points of all this. 
Those of us who work with this stuff for real actually need to know how our recording medium works, and we have to know how audio works, so that when we're mixing it for your consumption, it sounds right--so we don't expect laypeople to know how the Fletcher-Munson curve affects our hearing during the process of recording and mixing, or on playback over a sound system of any kind. So, while the title of your video isn't wrong--the work you showed to get to the right answer is, because nobody in the history of the music and recording industry, or tangentially film and television, ever said 44.1KHz was optimal. The reason it's not optimal is because the low pass filter is still attenuating frequencies within the audible range. So when Harry Nyquist figured all this shit out, he was merely pointing out the bare minimum that audio had to be sampled at to reproduce the full range of human hearing. He wasn't wrong; it's just that there's no perfect low pass filter that exists, capable of attenuating frequencies outside the range of human hearing without attenuating audible signals. So, even with the best possible filter, you're still going to cut things off well above what we can hear, just to make sure nothing gets cut. In the real world, I typically don't allow my mixes to contain very much above 15KHz, because as you've noted, it's not supported by YouTube, and most people won't hear that stuff anyway. However, I do allow reverb to contain as much high end content or "air" as we call it in the business, because those are the subtle things your ears will detect and miss if it's unnaturally chopped. It's like bad lighting in a poorly edited photo, or CGI--you have to be an expert to know what you're looking for to see it, but we instinctively know when those subtleties are lost and it will seem wrong or fake. Anyway, good luck with your channel. 
Hopefully you spend some time learning and doing some research before you go off and make something that's going to confuse or misinform your viewers.
@FilmmakerIQ
@FilmmakerIQ 2 жыл бұрын
I'm not reading this novel especially when you start with a completely false statement that tape has more dynamic range... There's no point when you're so off base from the start.
@kernelpickle
@kernelpickle 2 жыл бұрын
@@FilmmakerIQ Maybe if you read what I wrote, you'd actually learn something, smart guy. Feel free to look it up. Analog tape recorded with Dolby SR noise reduction, which was the standard in professional studios, had a dynamic range of 110dB, while 16-bit digital audio has a dynamic range of 96dB. I'm not talking about cassette tapes here bud, I'm talking about 1/2-inch tape used in professional studios to make multi-track recordings. So, please just STOP with your nonsense, because you don't know what the hell you're even talking about. You looked some things up on Wikipedia, and think that you're a professional because you make YouTube videos. How many professional studios have you been in that actually had 1/2-inch tape machines? I guarantee you've never even seen a 1/2-inch tape in your life, let alone heard one played back over the studio monitors in a real studio. Clearly, you seem to fancy yourself a "Filmmaker" and not a recording engineer, or producer--so why don't you go make your silly little videos about lenses, or light meters, because you don't know shit about digital audio or recording.
@mateuszkubala1800
@mateuszkubala1800 2 жыл бұрын
I got some new knowledge from this video.
@nacholibre9929
@nacholibre9929 2 жыл бұрын
At last I understood what aliasing is, thanks
@mattstegner
@mattstegner 2 жыл бұрын
The 16k cutoff is probably the encoding setting YouTube picked for the codec, not some hard filter they applied. Most perceptual encoders (AAC, MP3) will throw away high frequency content. I mean, it probably wasn't a nefarious decision by YouTube.
@FilmmakerIQ
@FilmmakerIQ 2 жыл бұрын
Of course it wasn't nefarious... But it was one annoying obstacle in trying to demonstrate this concept. And then it's only on SOME of the streams, not all...
@MadMatty72
@MadMatty72 2 жыл бұрын
Great vid - good on ya.
@jgurtz
@jgurtz 2 жыл бұрын
o gawd, mini flashbacks to numerical analysis and signals and systems classes!
@michaelkreitzer1369
@michaelkreitzer1369 2 жыл бұрын
This was great! Thank you. It seems that ultimately this matters for producing, but not at all for listening. By the time it gets to me to listen to, those high frequencies should have been long filtered out. However, I wonder how many PC sound systems (Windows, ALSA, OpenAL, etc.) bother to apply a low-pass to signals they downsample in order to avoid aliasing?
@Photovintageguy
@Photovintageguy 2 жыл бұрын
Sound people that slow down sounds for sound effects etc, say they need more room like 192k. It's kinda like slowing down 120fps to 25fps in video.
@Lantertronics
@Lantertronics 2 жыл бұрын
I've heard that too -- but unless they have special scientific microphones designed to capture frequencies above human hearing, I'm not sure it matters.
@Photovintageguy
@Photovintageguy 2 жыл бұрын
I don't think it's about frequency width. It's about stretching the entire recording. When you stretch it, it makes everything thinner. Like pulling a rubber band. The signal would get less resolution. Fewer data points. Through the entire range of frequencies.
@Photovintageguy
@Photovintageguy 2 жыл бұрын
This guy talks about pitch and time stretching uses 96k recording. Sound effects for movies. kzfaq.info/get/bejne/adVdhshp1rfIdJ8.html
@Photovintageguy
@Photovintageguy 2 жыл бұрын
Reasons for 96k recording. Time and pitch stretching. kzfaq.info/get/bejne/adVdhshp1rfIdJ8.html
@FilmmakerIQ
@FilmmakerIQ 2 жыл бұрын
Problem is, the rubber band analogy doesn't work because Nyquist does not work that way. Using 24kHz, the audio would be EXACTLY the same as 48kHz in every respect BUT only up to 12kHz. So it's not that you have more data points - that doesn't matter when the audio is sent back to analog in the speakers. I suspect the reason 96kHz would be used for slowed-down effects is the same reason I discussed in the video: headroom. With 96kHz there's about an octave and change you can maneuver around in without running into a Nyquist limit that dips into the perceivable range.
@mfeif
@mfeif 2 жыл бұрын
Great video, thanks! FWIW (and that's not much), at 5:00 you say that YouTube resamples everything to 44.1. But actually, YouTube uses the Opus codec for the audio channels of videos, and that format is locked to 48. I think a few older vids might also have ogg or m4a which may be in 44.1, but "most" are sent in 48. It's certainly not substantive for the point you're making, more just trivia. Thanks!
@FilmmakerIQ
@FilmmakerIQ 2 жыл бұрын
AAC is used for Apple devices which is locked to 44.1. It also happens to be what they use for the download file option in YT's creator studio.
@mfeif
@mfeif 2 жыл бұрын
Aha. Interesting. Using youtube-dl, here are the streams available for your video (limited to audio): 249 webm audio only tiny 52k , webm_dash container, opus @ 52k (48000Hz), 7.13MiB 250 webm audio only tiny 61k , webm_dash container, opus @ 61k (48000Hz), 8.37MiB 251 webm audio only tiny 108k , webm_dash container, opus @108k (48000Hz), 14.81MiB 140 m4a audio only tiny 129k , m4a_dash container, mp4a.40.2@129k (44100Hz), 17.71MiB I'm on a mac here (but not an iOS device); in Firefox, the youtube web app uses stream #251 (as visible in the "stats for nerds" right-click; in Safari it uses #140, so you are indeed correct! Again, thanks for the excellent video.
@HansBaier
@HansBaier 2 жыл бұрын
It's the other way round. The Gibbs phenomenon shows up when there is NO aliasing. It is the result of running through the anti-aliasing filter in a DAC. The anti-aliasing filter rolls off all frequencies above 20kHz, and the result is the squiggles around the edges. A perfect square wave has an infinite number of overtones, and when you cut those off with the anti-aliasing filter, the result is a band-limited square wave, which exhibits the Gibbs phenomenon.
@FilmmakerIQ
@FilmmakerIQ 2 жыл бұрын
Yeah I over stated the Gibbs part
@nathan43082
@nathan43082 2 жыл бұрын
As someone who has repeatedly defended digital audio, including debunking false claims, I've been posting that Monty video for years. Great stuff that. Dan Lavry's White Paper has also been quite informative. I own a Lavry AD11 as well as a DA10 and record at 24/96 kHz most of the time for my songs, a handful of which you can find on Soundcloud. You can almost make out the AD11 under the desk behind my guitars in this video: kzfaq.info/get/bejne/ft1-gcerzrDGkqs.html.
@GuyXVIII
@GuyXVIII 2 жыл бұрын
When the film guy gives a better explanation of sound stuff than actual sound guys... Also, sound was recorded with distortion in the analog era, and now we crave that distortion, so why should we go against THIS form of aliasing distortion? Don't bother. Life is short. Record in 44.1k. Great video :) Cheers!
@simongunkel7457
@simongunkel7457 2 жыл бұрын
Well, there are music styles that use aliasing as a stylistic element. But one thing dynamic distortion has going for it is that the frequency content it adds is harmonically related to the input, and even with intermodulation (where there's a risk of losing this feature) you have some intervals where it remains so - hence the popularity of power chords, where the added frequency is an octave below the root note. So I doubt it will become something with popular appeal, and your reasoning reminds me of the old "we went from triads to tetrachords and now romanticism is regularly using quintachords and sextachords, so obviously the next big thing will be 12-tone music".
@etmax1 (2 years ago)
So here's the thing: anything to do with digital sampling MUST always be done with analogue filters. On the recording side, your filter MUST completely eliminate 100% of signals above the Nyquist frequency. On the playback side, 100% of frequencies above the Nyquist frequency must not exist in the file, AND the DAC output should be filtered to remove edge artefacts (remember, the output is stepped samples) so that no frequencies exist that were not in the original recording. To do ANYTHING else is to create distortions that significantly detract from the quality.

Now the next "thing", where reality meets desire: a filter with the ideal characteristic, passing 100% of the signal below the Nyquist frequency and 0% above it, can only be approximated, BUT the approximation causes significant phase shifts and amplitude ripple between frequencies, which are audible, making high frequencies sound smeared or pitchy. This was realised fairly early on, so "smarts" were added to CD decoders/DACs that create a whole bunch of in-between samples, producing a pre-filter output at effectively 2, 4, 8 or even 16 times (perhaps more) the Nyquist frequency, which can therefore be filtered by a much less aggressive analogue filter to restore the analogue signal. As this is only half the equation, at the recording studio you would record at several times the Nyquist frequency, use a much less aggressive analogue filter, and then digitally downsample to the 44.1 kHz CD rate. Doing this, you end up with a digital recording on CD that has no audible aliasing and no phase delays or amplitude ripple from the analogue filter, and I'd wager nobody (even golden ears) would ever detect any difference between that and something sampled at a higher rate, unless the higher sample rate recorded higher frequencies.

I say this because there is some evidence that some people can "feel" the envelope of recordings above their normal hearing range. This is the science and the engineering of it all.
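The point about gentler analogue filters at higher rates comes down to how much room the filter has between the top of the audio band and Nyquist. A rough sketch, assuming a 20 kHz passband (the sample rates are illustrative):

```python
import math

def transition_octaves(passband_hz, fs_hz):
    """Width of the anti-aliasing filter's transition band, in octaves."""
    return math.log2((fs_hz / 2) / passband_hz)

for fs in (44100, 96000, 176400):
    print(fs, round(transition_octaves(20000, fs), 2))
# 44.1 kHz leaves ~0.14 octave (brutally steep); 176.4 kHz leaves ~2.14 octaves.
```

A filter that must reach full attenuation within a seventh of an octave is far harder to build cleanly than one with two octaves to work with, which is exactly why oversampling converters move the steep filtering into the digital domain.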
@FilmmakerIQ (2 years ago)
I would add that you don't want those higher frequencies even if you could feel them. There's nothing musical about those frequencies
@etmax1 (2 years ago)
@@FilmmakerIQ While I would agree with that statement, it is based on taste/emotion and as such can be argued against by those of a different opinion. The rest, however, is based on science/engineering.
@FilmmakerIQ (2 years ago)
No, it isn't based on taste or emotion; those frequencies are quite painful to listen to. Someone else commented that the experience of trying to hear a 21 kHz tone was like shoving needles into your ear.
@etmax1 (2 years ago)
@@FilmmakerIQ :-) I find valve amplifiers sound woeful, yet a number of people with deep pockets swear by them. This is that same sort of thing. Agreeing with me that it brings nothing to the occasion does not stop someone somewhere thinking it sounds great, just like I like bird's eye chillies and my wife doesn't.
@TheJediJoker (2 years ago)
You skipped an important point: the steeper a filter, the greater the phase shift introduced to the signal. You can get around this using different types of filters, but those introduce other temporal artifacts (such as pre-ringing with linear-phase filters). And crucially, just as an anti-aliasing filter is needed at the analog input to a digital system, a reconstruction filter is needed at the analog output from a digital system. Therefore, the primary advantage of higher sampling rates in audio is that one can use less steep anti-aliasing and reconstruction filters starting at higher frequencies well outside the audible range, but well below the Nyquist frequency, all while generating fewer artifacts within the audible range.
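The steepness-vs-phase trade-off can be sketched with the simplest case, a cascade of identical one-pole low-pass sections (a toy model, not any particular converter's filter): each added pole steepens the rolloff by 6 dB/octave but also contributes up to 90 degrees of phase lag.

```python
import math

def cascade_phase_deg(f_hz, fc_hz, order):
    """Phase lag of an order-N cascade of identical one-pole low-pass filters."""
    return -order * math.degrees(math.atan(f_hz / fc_hz))

# At 10 kHz, with the corner at 20 kHz: steeper means far more in-band phase lag.
print(round(cascade_phase_deg(10000, 20000, 1), 1))  # -26.6
print(round(cascade_phase_deg(10000, 20000, 8), 1))  # -212.5
```

An eighth-order filter steep enough to protect a 44.1 kHz Nyquist limit is already well past a full cycle of lag at 10 kHz, which is the in-band smearing the comment is pointing at.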
@TheBohrokMan (2 years ago)
That square wave spectrum analysis at 8:00 is so cool, I’ve never seen a visualization of aliasing like that! Thanks for putting out a high quality explanation of this stuff, love it! Oh and I have a similar experience to another commenter. In grad school, we often add filters when sampling analog data to cut out high frequencies that would be aliased.
@ajmhobby (2 years ago)
Great video John! Digital sample rates are confusing but thanks to this video I understand them a little bit more. But what about DSD (Direct Stream Digital)?
@pokepress (2 years ago)
I do know that MP3 compression cuts out at 16 kHz because of the way the standard was designed. Also, I think some devices start to roll off frequencies in the last octave or so, so even if you have speakers and ears that can reproduce and perceive those frequencies, your hardware may be reducing their amplitude.
@Liam3072 (2 years ago)
Not all MP3 encoders cut at 16 kHz, though. The LAME encoder does not beyond a certain bitrate. And anyway, KZfaq does not use MP3 compression; it uses either AAC or Opus.
@leonardhindmarsh2352 (2 years ago)
48 kHz is enough for playback. Usually the LP filter's transition sits at 45 to 55% of the output sample rate, which gives almost non-existent phase errors and ripple within the maximum audible range. 96 kHz can provide benefits when it comes to pitch-shifting high-frequency information and lower latency. Softer filters can also be used at 96 kHz, with possibly less ringing and fewer phase shifts, but it is rare for different filters to be applied at different sampling frequencies; in addition, higher sampling frequencies often introduce extra component instability.

Basically all DACs and ADCs use delta-sigma modulation with multiple bits (often 2-6 bits). This involves a sampling frequency of several MHz, but they utilize another, more effective type of modulation for the purpose: a pulse density/width derived from the analogue input is digitized with 1 bit into a bitstream, and differential circuits continuously compare it against the analogue input signal, producing high-frequency pulses designed to add or remove energy in certain frequency bands. In this way the distortion energy is pushed up into higher frequency bands and reduced in lower ones, and the process continues until the noise is satisfactorily reduced within the desired band. This is done in several steps by several circuits, divided by amplitude for more effective noise shaping while maintaining stability. After that come demodulation and decimation, taking several 1-bit PDM bitstreams (divided by amplitude) down to one 24-bit PCM stream, with digital filters applied and downsampling.
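The core of the delta-sigma idea in that comment (integrate the error between the input and the fed-back output, then quantize) fits in a few lines. This is a first-order, 1-bit toy, far simpler than the multi-bit, multi-stage modulators described:

```python
def delta_sigma_1bit(samples):
    """First-order delta-sigma modulator: accumulate the error, output +/-1."""
    integrator, bits = 0.0, []
    for x in samples:
        bit = 1.0 if integrator >= 0 else -1.0  # 1-bit quantizer
        integrator += x - bit                   # feed the error back
        bits.append(bit)
    return bits

# A DC input of 0.25 produces a bitstream whose average equals the input.
bits = delta_sigma_1bit([0.25] * 4000)
print(sum(bits) / len(bits))  # 0.25
```

The feedback loop forces the bitstream's local density of +1s to track the input, while the quantization error gets shaped toward high frequencies, where the decimation filter later removes it.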
@a1guitarmaker (2 years ago)
At one point you said the words "four ninety-three" while the number on screen said "439". I was not expecting to hear the difference between 440 and 439!
@FilmmakerIQ (2 years ago)
Yeah, I have dyslexia
@andreasxfjd4141 (2 years ago)
I've noticed many times how much better a song sounds on iTunes than on KZfaq (through a DAC and headphones)
@johnjacquard863 (2 years ago)
Nice video, thanks.
@muizzsiddique (2 years ago)
For me the 16 kHz limit is fine because my hearing ends sooner than that :( Definitely wasn't the case 10-15 years ago.
@johnjacquard863 (2 years ago)
The issue has more to do with the fundamental frequencies of the instruments we use and the way we construct music. Only drums and the transients of vocal sibilance really occupy the high frequencies.
@johnjacquard863 (2 years ago)
We don't need to hear anything above 10 kHz (except harmonics)
@1bit (1 year ago)
Great video and demonstration. I can clearly hear the difference between A and B when they're side by side, but I suspect I wouldn't be able to discern the two if they were separated by 15 seconds of silence, or when comparing a more real-world scenario with the organic complexities that our perception sort of smooths out when not listening to a sterile demonstration of sine vs square. Maybe worth noting I'm by no means young (39) yet still heard the difference, though I'm likely an outlier, as my hearing presently scores within the range of teenagers, and I remember frequently being bothered by high-frequency sounds that nobody else seemed to hear back when I actually was a teen. 20+ years of working shows and touring with bands took care of that curse 😂
@FilmmakerIQ (1 year ago)
You can hear the difference, but did you hear the fake-out I pulled? ;)