Drawing waveforms with AVAssetReader

I'm reading a song from the iPod library using its asset URL (called audioUrl in the code). I can play it in many ways, I can cut it, and I can do some processing with it, but I really don't understand what to do with this CMSampleBufferRef to get the data for drawing a waveform! I need information about peak values. How can I get it this way (or maybe another way)?

    AVAssetTrack * songTrack = [audioUrl.tracks objectAtIndex:0];
    AVAssetReaderTrackOutput * output = [[AVAssetReaderTrackOutput alloc] initWithTrack:songTrack outputSettings:nil];
    [reader addOutput:output];
    [output release];

    NSMutableData * fullSongData = [[NSMutableData alloc] init];
    [reader startReading];

    while (reader.status == AVAssetReaderStatusReading){

        AVAssetReaderTrackOutput * trackOutput =
            (AVAssetReaderTrackOutput *)[reader.outputs objectAtIndex:0];

        CMSampleBufferRef sampleBufferRef = [trackOutput copyNextSampleBuffer];

        if (sampleBufferRef){/* what I gonna do with this? */}
    }

Please help me!

I was looking for something similar and decided to "roll my own". I realize this is an old post, but in case anyone else is looking for this, here is my solution. It is relatively quick and dirty, and it normalizes the image to "full scale". The images it creates are "wide", i.e. you need to put them in a UIScrollView or otherwise manage the display.

This is based on some of the answers given to this question.

Sample output

[image: sample waveform]

EDIT: I have added a logarithmic version of the averaging and rendering methods; see the end of this post for the alternate version and comparison outputs. I personally prefer the original linear version, but decided to post it in case someone can improve on the algorithm used.

You'll need these imports:

    #import <MediaPlayer/MediaPlayer.h>
    #import <AVFoundation/AVFoundation.h>

First, a generic rendering method that takes a pointer to averaged sample data and returns a UIImage. Note that these samples are not playable audio samples.

    -(UIImage *) audioImageGraph:(SInt16 *) samples
                    normalizeMax:(SInt16) normalizeMax
                     sampleCount:(NSInteger) sampleCount
                    channelCount:(NSInteger) channelCount
                     imageHeight:(float) imageHeight {

        // One pixel column per averaged sample.
        CGSize imageSize = CGSizeMake(sampleCount, imageHeight);
        UIGraphicsBeginImageContext(imageSize);
        CGContextRef context = UIGraphicsGetCurrentContext();

        CGContextSetFillColorWithColor(context, [UIColor blackColor].CGColor);
        CGContextSetAlpha(context,1.0);
        CGRect rect;
        rect.size = imageSize;
        rect.origin.x = 0;
        rect.origin.y = 0;

        CGColorRef leftcolor = [[UIColor whiteColor] CGColor];
        CGColorRef rightcolor = [[UIColor redColor] CGColor];

        // Black background.
        CGContextFillRect(context, rect);

        CGContextSetLineWidth(context, 1.0);

        // Each channel gets its own horizontal band; these are the band centres.
        float halfGraphHeight = (imageHeight / 2) / (float) channelCount ;
        float centerLeft = halfGraphHeight;
        float centerRight = (halfGraphHeight*3) ;
        // Note: a sample equal to normalizeMax maps to a full band height,
        // i.e. twice halfGraphHeight.
        float sampleAdjustmentFactor = (imageHeight / (float) channelCount) / (float) normalizeMax;

        for (NSInteger intSample = 0 ; intSample < sampleCount ; intSample ++ ) {
            // Vertical line, symmetric about the centre of the left band.
            SInt16 left = *samples++;
            float pixels = (float) left;
            pixels *= sampleAdjustmentFactor;
            CGContextMoveToPoint(context, intSample, centerLeft-pixels);
            CGContextAddLineToPoint(context, intSample, centerLeft+pixels);
            CGContextSetStrokeColorWithColor(context, leftcolor);
            CGContextStrokePath(context);

            if (channelCount==2) {
                // Samples are interleaved: the right channel follows the left.
                SInt16 right = *samples++;
                float pixels = (float) right;
                pixels *= sampleAdjustmentFactor;
                CGContextMoveToPoint(context, intSample, centerRight - pixels);
                CGContextAddLineToPoint(context, intSample, centerRight + pixels);
                CGContextSetStrokeColorWithColor(context, rightcolor);
                CGContextStrokePath(context);
            }
        }

        // Create new image
        UIImage *newImage = UIGraphicsGetImageFromCurrentImageContext();

        // Tidy up
        UIGraphicsEndImageContext();

        return newImage;
    }
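To make the scaling step easier to follow in isolation, here is a minimal plain C sketch of how one averaged sample becomes a bar length. The helper name bar_half_height is hypothetical and not part of the code above. It also makes one deliberate change, borrowed from the logarithmic version further down: it divides by 2, so that a sample equal to normalize_max reaches exactly the edge of its channel band instead of extending past it.

```c
#include <stdint.h>

/* Half-height in pixels of the bar drawn for one averaged sample.
 * Hypothetical helper for illustration only. Unlike the linear renderer,
 * it divides by 2 (as the logarithmic variant does) so that a sample
 * equal to normalize_max ends exactly at the edge of its band. */
static float bar_half_height(int16_t sample, int16_t normalize_max,
                             float image_height, int channel_count)
{
    if (normalize_max <= 0 || channel_count <= 0) {
        return 0.0f;  /* silent input or bad arguments: draw nothing */
    }
    float band_height = image_height / (float)channel_count;
    float factor = band_height / (float)normalize_max / 2.0f;
    float pixels = (float)sample * factor;
    return pixels < 0.0f ? -pixels : pixels;  /* the bar is symmetric */
}
```

With a 100-pixel-high stereo image, each band is 50 pixels high, so a sample at normalize_max yields a half-height of 25 pixels on either side of the band centre.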

Next, a method that takes an AVURLAsset and returns PNG image data.

    - (NSData *) renderPNGAudioPictogramForAsset:(AVURLAsset *)songAsset {

        NSError * error = nil;
        AVAssetReader * reader = [[AVAssetReader alloc] initWithAsset:songAsset error:&error];
        AVAssetTrack * songTrack = [songAsset.tracks objectAtIndex:0];

        // Ask the reader to decode to 16-bit, little-endian, interleaved integer PCM.
        NSDictionary* outputSettingsDict = [[NSDictionary alloc] initWithObjectsAndKeys:
            [NSNumber numberWithInt:kAudioFormatLinearPCM],AVFormatIDKey,
            // [NSNumber numberWithInt:44100.0],AVSampleRateKey, /*Not Supported*/
            // [NSNumber numberWithInt: 2],AVNumberOfChannelsKey, /*Not Supported*/
            [NSNumber numberWithInt:16],AVLinearPCMBitDepthKey,
            [NSNumber numberWithBool:NO],AVLinearPCMIsBigEndianKey,
            [NSNumber numberWithBool:NO],AVLinearPCMIsFloatKey,
            [NSNumber numberWithBool:NO],AVLinearPCMIsNonInterleaved,
            nil];

        AVAssetReaderTrackOutput* output = [[AVAssetReaderTrackOutput alloc] initWithTrack:songTrack outputSettings:outputSettingsDict];
        [outputSettingsDict release];
        [reader addOutput:output];
        [output release];

        // Take the sample rate and channel count from the track's own format.
        UInt32 sampleRate = 0, channelCount = 0;

        NSArray* formatDesc = songTrack.formatDescriptions;
        for(unsigned int i = 0; i < [formatDesc count]; ++i) {
            CMAudioFormatDescriptionRef item = (CMAudioFormatDescriptionRef)[formatDesc objectAtIndex:i];
            const AudioStreamBasicDescription* fmtDesc = CMAudioFormatDescriptionGetStreamBasicDescription (item);
            if(fmtDesc ) {
                sampleRate = fmtDesc->mSampleRate;
                channelCount = fmtDesc->mChannelsPerFrame;
                // NSLog(@"channels:%u, bytes/packet: %u, sampleRate %f",fmtDesc->mChannelsPerFrame, fmtDesc->mBytesPerPacket,fmtDesc->mSampleRate);
            }
        }

        if (channelCount == 0) {
            // No usable format description; nothing to render.
            [reader release];
            return nil;
        }

        UInt32 bytesPerSample = 2 * channelCount;
        SInt16 normalizeMax = 0;

        NSMutableData * fullSongData = [[NSMutableData alloc] init];
        [reader startReading];

        UInt64 totalBytes = 0;
        SInt64 totalLeft = 0;
        SInt64 totalRight = 0;
        NSInteger sampleTally = 0;

        // Roughly 50 pixel columns per second of audio.
        NSInteger samplesPerPixel = sampleRate / 50;

        while (reader.status == AVAssetReaderStatusReading){

            AVAssetReaderTrackOutput * trackOutput = (AVAssetReaderTrackOutput *)[reader.outputs objectAtIndex:0];
            CMSampleBufferRef sampleBufferRef = [trackOutput copyNextSampleBuffer];

            if (sampleBufferRef){
                CMBlockBufferRef blockBufferRef = CMSampleBufferGetDataBuffer(sampleBufferRef);

                size_t length = CMBlockBufferGetDataLength(blockBufferRef);
                totalBytes += length;

                NSAutoreleasePool *wader = [[NSAutoreleasePool alloc] init];

                // Copy the decoded PCM out of the block buffer.
                NSMutableData * data = [NSMutableData dataWithLength:length];
                CMBlockBufferCopyDataBytes(blockBufferRef, 0, length, data.mutableBytes);

                SInt16 * samples = (SInt16 *) data.mutableBytes;
                int sampleCount = length / bytesPerSample;
                for (int i = 0; i < sampleCount ; i ++) {

                    SInt16 left = *samples++;
                    totalLeft += left;

                    SInt16 right;
                    if (channelCount==2) {
                        right = *samples++;
                        totalRight += right;
                    }

                    sampleTally++;

                    // Enough samples for one pixel column: store the average
                    // and track the largest magnitude for normalization.
                    if (sampleTally > samplesPerPixel) {

                        left = totalLeft / sampleTally;

                        SInt16 fix = abs(left);
                        if (fix > normalizeMax) {
                            normalizeMax = fix;
                        }

                        [fullSongData appendBytes:&left length:sizeof(left)];

                        if (channelCount==2) {
                            right = totalRight / sampleTally;

                            SInt16 fix = abs(right);
                            if (fix > normalizeMax) {
                                normalizeMax = fix;
                            }

                            [fullSongData appendBytes:&right length:sizeof(right)];
                        }

                        totalLeft = 0;
                        totalRight = 0;
                        sampleTally = 0;
                    }
                }

                [wader drain];

                CMSampleBufferInvalidate(sampleBufferRef);
                CFRelease(sampleBufferRef);
            }
        }

        // finalData stays nil if the reader failed or never started.
        NSData * finalData = nil;

        if (reader.status == AVAssetReaderStatusCompleted){

            NSLog(@"rendering output graphics using normalizeMax %d",normalizeMax);

            // Pass the real channel count so mono tracks are laid out correctly.
            UIImage *test = [self audioImageGraph:(SInt16 *) fullSongData.bytes
                                     normalizeMax:normalizeMax
                                      sampleCount:fullSongData.length / (sizeof(SInt16) * channelCount)
                                     channelCount:channelCount
                                      imageHeight:100];

            finalData = imageToData(test);
        }

        [fullSongData release];
        [reader release];

        return finalData;
    }
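The averaging loop is the core of the method above, so here it is again as a minimal plain C sketch for a single channel, free of the AVFoundation plumbing. The function name average_columns and its parameters are hypothetical. One small difference from the answer's loop is assumed on purpose: each column averages exactly samples_per_pixel samples (the loop above collects one more before flushing). As above, a trailing partial bucket is discarded.

```c
#include <stddef.h>
#include <stdint.h>

/* Average consecutive runs of samples_per_pixel mono samples into one
 * value per pixel column. Hypothetical helper for illustration only.
 * Returns the number of columns written and stores the largest absolute
 * average in *normalize_max (used later to scale the drawing). */
static size_t average_columns(const int16_t *samples, size_t sample_count,
                              size_t samples_per_pixel,
                              int16_t *columns, size_t max_columns,
                              int16_t *normalize_max)
{
    size_t written = 0;
    size_t tally = 0;
    int64_t total = 0;
    int peak = 0;

    *normalize_max = 0;
    if (samples_per_pixel == 0) {
        return 0;
    }

    for (size_t i = 0; i < sample_count && written < max_columns; i++) {
        total += samples[i];
        tally++;
        if (tally == samples_per_pixel) {
            int average = (int)(total / (int64_t)tally);
            int magnitude = average < 0 ? -average : average;
            if (magnitude > peak) {
                peak = magnitude;
            }
            columns[written++] = (int16_t)average;
            total = 0;
            tally = 0;
        }
    }

    /* -32768 has no positive int16_t counterpart, so clamp. */
    *normalize_max = (int16_t)(peak > INT16_MAX ? INT16_MAX : peak);
    return written;
}
```

Because the samples are signed, positive and negative values within a bucket partly cancel, which is why the resulting columns are not playable audio and why the output is normalized against the largest average rather than against 32767.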

Advanced option: finally, if you want to be able to play the audio using AVAudioPlayer, you'll need to cache it in your app's cache folder. Since I was doing that anyway, I decided to cache the image data as well, and wrapped the whole thing in a UIImage category. You need to include this open source offering (TSLibraryImport in the code below) to extract the audio, and some code from here (the NSThread MCSM_ block helpers) to handle some background threading functionality.

First, some defines and a few generic class methods for handling path names etc.

    //#define imgExt @"jpg"
    //#define imageToData(x) UIImageJPEGRepresentation(x,4)

    #define imgExt @"png"
    #define imageToData(x) UIImagePNGRepresentation(x)

    // Folder inside the app's caches directory that holds imported audio and pictograms.
    + (NSString *) assetCacheFolder {
        NSArray *assetFolderRoot = NSSearchPathForDirectoriesInDomains(NSCachesDirectory, NSUserDomainMask, YES);
        return [NSString stringWithFormat:@"%@/audio", [assetFolderRoot objectAtIndex:0]];
    }

    // Path of the cached waveform image for a media item, keyed by its persistent ID.
    + (NSString *) cachedAudioPictogramPathForMPMediaItem:(MPMediaItem*) item {
        NSString *assetFolder = [[self class] assetCacheFolder];
        NSNumber * libraryId = [item valueForProperty:MPMediaItemPropertyPersistentID];
        NSString *assetPictogramFilename = [NSString stringWithFormat:@"asset_%@.%@",libraryId,imgExt];
        return [NSString stringWithFormat:@"%@/%@", assetFolder, assetPictogramFilename];
    }

    // Path of the cached audio file for a media item; keeps the original file extension.
    + (NSString *) cachedAudioFilepathForMPMediaItem:(MPMediaItem*) item {
        NSString *assetFolder = [[self class] assetCacheFolder];

        NSURL * assetURL = [item valueForProperty:MPMediaItemPropertyAssetURL];
        NSNumber * libraryId = [item valueForProperty:MPMediaItemPropertyPersistentID];

        NSString *assetFileExt = [[[assetURL path] lastPathComponent] pathExtension];
        NSString *assetFilename = [NSString stringWithFormat:@"asset_%@.%@",libraryId,assetFileExt];
        return [NSString stringWithFormat:@"%@/%@", assetFolder, assetFilename];
    }

    + (NSURL *) cachedAudioURLForMPMediaItem:(MPMediaItem*) item {
        NSString *assetFilepath = [[self class] cachedAudioFilepathForMPMediaItem:item];
        return [NSURL fileURLWithPath:assetFilepath];
    }

Now the init method that does "the business":

    - (id) initWithMPMediaItem:(MPMediaItem*) item
               completionBlock:(void (^)(UIImage* delayedImagePreparation))completionBlock {

        NSFileManager *fman = [NSFileManager defaultManager];
        NSString *assetPictogramFilepath = [[self class] cachedAudioPictogramPathForMPMediaItem:item];

        // Case 1: the pictogram is already cached, so load it synchronously.
        if ([fman fileExistsAtPath:assetPictogramFilepath]) {

            NSLog(@"Returning cached waveform pictogram: %@",[assetPictogramFilepath lastPathComponent]);

            self = [self initWithContentsOfFile:assetPictogramFilepath];
            return self;
        }

        NSString *assetFilepath = [[self class] cachedAudioFilepathForMPMediaItem:item];

        NSURL *assetFileURL = [NSURL fileURLWithPath:assetFilepath];

        if ([fman fileExistsAtPath:assetFilepath]) {

            // Case 2: the audio is cached but the pictogram is not.
            // Render it in the background and deliver it via completionBlock.
            NSLog(@"scanning cached audio data to create UIImage file: %@",[assetFilepath lastPathComponent]);

            [assetFileURL retain];
            [assetPictogramFilepath retain];

            [NSThread MCSM_performBlockInBackground: ^{

                AVURLAsset *asset = [[AVURLAsset alloc] initWithURL:assetFileURL options:nil];
                NSData *waveFormData = [self renderPNGAudioPictogramForAsset:asset];
                [asset release];

                [waveFormData writeToFile:assetPictogramFilepath atomically:YES];

                [assetFileURL release];
                [assetPictogramFilepath release];

                if (completionBlock) {

                    [waveFormData retain];
                    [NSThread MCSM_performBlockOnMainThread:^{

                        UIImage *result = [UIImage imageWithData:waveFormData];

                        NSLog(@"returning rendered pictogram on main thread (%lu bytes %@ data in UIImage %0.0fx %0.0f pixels)",(unsigned long) waveFormData.length,[imgExt uppercaseString],result.size.width,result.size.height);

                        completionBlock(result);

                        [waveFormData release];
                    }];
                }
            }];

            return nil;

        } else {

            // Case 3: nothing is cached yet. Import the audio from the iPod
            // library first, then render the pictogram.
            NSString *assetFolder = [[self class] assetCacheFolder];

            [fman createDirectoryAtPath:assetFolder withIntermediateDirectories:YES attributes:nil error:nil];

            NSLog(@"Preparing to import audio asset data %@",[assetFilepath lastPathComponent]);

            [assetPictogramFilepath retain];
            [assetFileURL retain];

            TSLibraryImport* import = [[TSLibraryImport alloc] init];
            NSURL * assetURL = [item valueForProperty:MPMediaItemPropertyAssetURL];

            [import importAsset:assetURL toURL:assetFileURL completionBlock:^(TSLibraryImport* import) {
                // check the status and error properties of TSLibraryImport

                if (import.error) {

                    NSLog (@"audio data import failed:%@",import.error);

                } else {
                    NSLog (@"Creating waveform pictogram file: %@", [assetPictogramFilepath lastPathComponent]);
                    AVURLAsset *asset = [[AVURLAsset alloc] initWithURL:assetFileURL options:nil];
                    NSData *waveFormData = [self renderPNGAudioPictogramForAsset:asset];
                    [asset release];

                    [waveFormData writeToFile:assetPictogramFilepath atomically:YES];

                    if (completionBlock) {
                        [waveFormData retain];
                        [NSThread MCSM_performBlockOnMainThread:^{

                            UIImage *result = [UIImage imageWithData:waveFormData];
                            NSLog(@"returning rendered pictogram on main thread (%lu bytes %@ data in UIImage %0.0fx %0.0f pixels)",(unsigned long) waveFormData.length,[imgExt uppercaseString],result.size.width,result.size.height);

                            completionBlock(result);

                            [waveFormData release];
                        }];
                    }
                }

                [assetPictogramFilepath release];
                [assetFileURL release];

            } ];

            return nil;
        }
    }

An example of invoking this:

    -(void) importMediaItem {

        MPMediaItem* item = [self mediaItem];

        // since we will be needing this for playback, save the url to the cached audio.
        [url release];
        url = [[UIImage cachedAudioURLForMPMediaItem:item] retain];

        [waveFormImage release];

        // Returns the image immediately if it is cached; otherwise returns nil
        // and calls the completion block later, once the image has been rendered.
        waveFormImage = [[UIImage alloc ] initWithMPMediaItem:item completionBlock:^(UIImage* delayedImagePreparation){

            waveFormImage = [delayedImagePreparation retain];
            [self displayWaveFormImage];
        }];

        if (waveFormImage) {
            [waveFormImage retain];
            [self displayWaveFormImage];
        }
    }

Logarithmic version of the averaging and rendering methods

    #define absX(x) (x<0?0-x:x)
    #define minMaxX(x,mn,mx) (x<=mn?mn:(x>=mx?mx:x))
    #define noiseFloor (-90.0)
    #define decibel(amplitude) (20.0 * log10(absX(amplitude)/32767.0))

    -(UIImage *) audioImageLogGraph:(Float32 *) samples
                       normalizeMax:(Float32) normalizeMax
                        sampleCount:(NSInteger) sampleCount
                       channelCount:(NSInteger) channelCount
                        imageHeight:(float) imageHeight {

        CGSize imageSize = CGSizeMake(sampleCount, imageHeight);
        UIGraphicsBeginImageContext(imageSize);
        CGContextRef context = UIGraphicsGetCurrentContext();

        CGContextSetFillColorWithColor(context, [UIColor blackColor].CGColor);
        CGContextSetAlpha(context,1.0);
        CGRect rect;
        rect.size = imageSize;
        rect.origin.x = 0;
        rect.origin.y = 0;

        CGColorRef leftcolor = [[UIColor whiteColor] CGColor];
        CGColorRef rightcolor = [[UIColor redColor] CGColor];

        CGContextFillRect(context, rect);

        CGContextSetLineWidth(context, 1.0);

        float halfGraphHeight = (imageHeight / 2) / (float) channelCount ;
        float centerLeft = halfGraphHeight;
        float centerRight = (halfGraphHeight*3) ;

        // Maps the range [noiseFloor, normalizeMax] dB onto [0, halfGraphHeight] pixels.
        float sampleAdjustmentFactor = (imageHeight / (float) channelCount) / (normalizeMax - noiseFloor) / 2;

        for (NSInteger intSample = 0 ; intSample < sampleCount ; intSample ++ ) {
            Float32 left = *samples++;
            float pixels = (left - noiseFloor) * sampleAdjustmentFactor;
            CGContextMoveToPoint(context, intSample, centerLeft-pixels);
            CGContextAddLineToPoint(context, intSample, centerLeft+pixels);
            CGContextSetStrokeColorWithColor(context, leftcolor);
            CGContextStrokePath(context);

            if (channelCount==2) {
                Float32 right = *samples++;
                float pixels = (right - noiseFloor) * sampleAdjustmentFactor;
                CGContextMoveToPoint(context, intSample, centerRight - pixels);
                CGContextAddLineToPoint(context, intSample, centerRight + pixels);
                CGContextSetStrokeColorWithColor(context, rightcolor);
                CGContextStrokePath(context);
            }
        }

        // Create new image
        UIImage *newImage = UIGraphicsGetImageFromCurrentImageContext();

        // Tidy up
        UIGraphicsEndImageContext();

        return newImage;
    }

    - (NSData *) renderPNGAudioPictogramLogForAsset:(AVURLAsset *)songAsset {

        NSError * error = nil;
        AVAssetReader * reader = [[AVAssetReader alloc] initWithAsset:songAsset error:&error];
        AVAssetTrack * songTrack = [songAsset.tracks objectAtIndex:0];

        NSDictionary* outputSettingsDict = [[NSDictionary alloc] initWithObjectsAndKeys:
            [NSNumber numberWithInt:kAudioFormatLinearPCM],AVFormatIDKey,
            // [NSNumber numberWithInt:44100.0],AVSampleRateKey, /*Not Supported*/
            // [NSNumber numberWithInt: 2],AVNumberOfChannelsKey, /*Not Supported*/
            [NSNumber numberWithInt:16],AVLinearPCMBitDepthKey,
            [NSNumber numberWithBool:NO],AVLinearPCMIsBigEndianKey,
            [NSNumber numberWithBool:NO],AVLinearPCMIsFloatKey,
            [NSNumber numberWithBool:NO],AVLinearPCMIsNonInterleaved,
            nil];

        AVAssetReaderTrackOutput* output = [[AVAssetReaderTrackOutput alloc] initWithTrack:songTrack outputSettings:outputSettingsDict];
        [outputSettingsDict release];
        [reader addOutput:output];
        [output release];

        UInt32 sampleRate = 0, channelCount = 0;

        NSArray* formatDesc = songTrack.formatDescriptions;
        for(unsigned int i = 0; i < [formatDesc count]; ++i) {
            CMAudioFormatDescriptionRef item = (CMAudioFormatDescriptionRef)[formatDesc objectAtIndex:i];
            const AudioStreamBasicDescription* fmtDesc = CMAudioFormatDescriptionGetStreamBasicDescription (item);
            if(fmtDesc ) {
                sampleRate = fmtDesc->mSampleRate;
                channelCount = fmtDesc->mChannelsPerFrame;
                // NSLog(@"channels:%u, bytes/packet: %u, sampleRate %f",fmtDesc->mChannelsPerFrame, fmtDesc->mBytesPerPacket,fmtDesc->mSampleRate);
            }
        }

        if (channelCount == 0) {
            // No usable format description; nothing to render.
            [reader release];
            return nil;
        }

        UInt32 bytesPerSample = 2 * channelCount;
        Float32 normalizeMax = noiseFloor;
        NSLog(@"normalizeMax = %f",normalizeMax);

        NSMutableData * fullSongData = [[NSMutableData alloc] init];
        [reader startReading];

        UInt64 totalBytes = 0;
        Float64 totalLeft = 0;
        Float64 totalRight = 0;
        Float32 sampleTally = 0;

        NSInteger samplesPerPixel = sampleRate / 50;

        while (reader.status == AVAssetReaderStatusReading){

            AVAssetReaderTrackOutput * trackOutput = (AVAssetReaderTrackOutput *)[reader.outputs objectAtIndex:0];
            CMSampleBufferRef sampleBufferRef = [trackOutput copyNextSampleBuffer];

            if (sampleBufferRef){
                CMBlockBufferRef blockBufferRef = CMSampleBufferGetDataBuffer(sampleBufferRef);

                size_t length = CMBlockBufferGetDataLength(blockBufferRef);
                totalBytes += length;

                NSAutoreleasePool *wader = [[NSAutoreleasePool alloc] init];

                NSMutableData * data = [NSMutableData dataWithLength:length];
                CMBlockBufferCopyDataBytes(blockBufferRef, 0, length, data.mutableBytes);

                SInt16 * samples = (SInt16 *) data.mutableBytes;
                int sampleCount = length / bytesPerSample;
                for (int i = 0; i < sampleCount ; i ++) {

                    // Convert each sample to decibels, clamped to [noiseFloor, 0],
                    // before accumulating it.
                    Float32 left = (Float32) *samples++;
                    left = decibel(left);
                    left = minMaxX(left,noiseFloor,0);
                    totalLeft += left;

                    Float32 right;
                    if (channelCount==2) {
                        right = (Float32) *samples++;
                        right = decibel(right);
                        right = minMaxX(right,noiseFloor,0);
                        totalRight += right;
                    }

                    sampleTally++;

                    if (sampleTally > samplesPerPixel) {

                        left = totalLeft / sampleTally;
                        if (left > normalizeMax) {
                            normalizeMax = left;
                        }

                        // NSLog(@"left average = %f, normalizeMax = %f",left,normalizeMax);

                        [fullSongData appendBytes:&left length:sizeof(left)];

                        if (channelCount==2) {
                            right = totalRight / sampleTally;

                            if (right > normalizeMax) {
                                normalizeMax = right;
                            }

                            [fullSongData appendBytes:&right length:sizeof(right)];
                        }

                        totalLeft = 0;
                        totalRight = 0;
                        sampleTally = 0;
                    }
                }

                [wader drain];

                CMSampleBufferInvalidate(sampleBufferRef);
                CFRelease(sampleBufferRef);
            }
        }

        NSData * finalData = nil;

        if (reader.status == AVAssetReaderStatusFailed || reader.status == AVAssetReaderStatusUnknown){
            // Something went wrong. Handle it.
        }

        if (reader.status == AVAssetReaderStatusCompleted){
            // You're done. It worked.

            NSLog(@"rendering output graphics using normalizeMax %f",normalizeMax);

            // Pass the real channel count so mono tracks are laid out correctly.
            UIImage *test = [self audioImageLogGraph:(Float32 *) fullSongData.bytes
                                        normalizeMax:normalizeMax
                                         sampleCount:fullSongData.length / (sizeof(Float32) * channelCount)
                                        channelCount:channelCount
                                         imageHeight:100];

            finalData = imageToData(test);
        }

        [fullSongData release];
        [reader release];

        return finalData;
    }
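As a worked illustration of what the decibel and minMaxX macros do to individual samples, and of how an averaged level becomes a bar length, here are the formulas with a few rounded values. The symbols are my own shorthand and not names from the code: a is a raw 16-bit sample, L(a) its clamped level, L-bar the average level of one bucket, L_max the value of normalizeMax, H the image height and C the channel count.

```latex
\[
L(a) = \min\!\Bigl(0,\ \max\Bigl(-90,\ 20\log_{10}\frac{|a|}{32767}\Bigr)\Bigr)
\]
\[
L(32767) = 0\ \text{dB}, \qquad
L(16384) \approx -6.02\ \text{dB}, \qquad
L(1) \approx -90.31 \to -90\ \text{dB}, \qquad
L(0) = -\infty \to -90\ \text{dB}
\]
\[
\text{pixels} = \bigl(\bar{L} - (-90)\bigr)\cdot
\frac{H / C}{2\,\bigl(L_{\max} - (-90)\bigr)}
\]
```

The clamp is what keeps silent samples usable, since the logarithm of zero is minus infinity. A bucket whose average equals L_max gets H / (2C) pixels, which is exactly halfGraphHeight. Note that the last expression divides by zero when L_max stays at -90, i.e. for a completely silent track.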

Comparison outputs

Linear
[image: Linear plot for the start of "Warm It Up" by Acme Swing Company]

Logarithmic
[image: Logarithmic plot for the start of "Warm It Up" by Acme Swing Company]

You should be able to get a buffer of audio from your sampleBufferRef and then iterate over those values to build your waveform:

    CMBlockBufferRef buffer = CMSampleBufferGetDataBuffer( sampleBufferRef );
    CMItemCount numSamplesInBuffer = CMSampleBufferGetNumSamples(sampleBufferRef);
    AudioBufferList audioBufferList;

    // On return, buffer points at a retained block buffer that backs audioBufferList.
    CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer( sampleBufferRef,
                                                             NULL,
                                                             &audioBufferList,
                                                             sizeof(audioBufferList),
                                                             NULL,
                                                             NULL,
                                                             kCMSampleBufferFlag_AudioBufferList_Assure16ByteAlignment,
                                                             &buffer );

    // this copies your audio out to a temp buffer but you should be able to iterate through this buffer instead
    // (SInt32 assumes 32-bit samples; match the element type to the reader's output format)
    SInt32* readBuffer = (SInt32 *)malloc(numSamplesInBuffer * sizeof(SInt32));
    memcpy( readBuffer, audioBufferList.mBuffers[0].mData, numSamplesInBuffer*sizeof(SInt32));

    // ... iterate over readBuffer here ...

    free(readBuffer);
    CFRelease(buffer); // balance the retain from the call above
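Since the question asks specifically for peak values, here is a minimal plain C sketch of the iteration step, taking the largest magnitude per pixel column instead of an average. It is not from either answer: the function name peak_columns and its parameters are hypothetical, and it assumes the buffer has already been decoded to interleaved signed 16-bit PCM (which is what the first answer's output settings request; with nil output settings the data may be in a different format).

```c
#include <stddef.h>
#include <stdint.h>

/* For every run of frames_per_pixel frames, store the largest absolute
 * sample value found across all channels. Hypothetical helper, assuming
 * interleaved signed 16-bit PCM input. Returns the number of peaks
 * written; a trailing partial run is discarded. */
static size_t peak_columns(const int16_t *interleaved, size_t frame_count,
                           size_t channel_count, size_t frames_per_pixel,
                           int16_t *peaks, size_t max_columns)
{
    size_t written = 0;
    size_t tally = 0;
    int peak = 0;

    if (channel_count == 0 || frames_per_pixel == 0) {
        return 0;
    }

    for (size_t frame = 0; frame < frame_count && written < max_columns; frame++) {
        for (size_t ch = 0; ch < channel_count; ch++) {
            int value = interleaved[frame * channel_count + ch];
            int magnitude = value < 0 ? -value : value;
            if (magnitude > peak) {
                peak = magnitude;
            }
        }
        if (++tally == frames_per_pixel) {
            /* -32768 has no positive int16_t counterpart, so clamp. */
            peaks[written++] = (int16_t)(peak > INT16_MAX ? INT16_MAX : peak);
            peak = 0;
            tally = 0;
        }
    }
    return written;
}
```

Unlike averaging signed samples, peaks never cancel out, so loud passages stay visibly loud; the trade-off is that a single click dominates its whole column.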