Understanding the VideoToolBox Encoder

VTCompressionSession: to encode the frames captured by the camera, we use VideoToolBox's encoder.
The encoder's job is to turn uncompressed CVPixelBuffer data into compressed data, delivered as a CMSampleBuffer.
Uncompressed data is carried as a CVPixelBuffer and compressed data as a CMBlockBuffer; both are wrapped inside a CMSampleBuffer container.
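A quick way to tell the two apart in code: Core Media exposes the uncompressed pixels and the compressed bytes through different accessors. A minimal sketch, where sampleBuffer stands for whichever CMSampleBuffer you are inspecting:

#import <CoreMedia/CoreMedia.h>

// Uncompressed sample (e.g. straight from the camera): the pixels are
// available as a CVImageBuffer (a CVPixelBuffer underneath)
CVImageBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer); // non-NULL for raw frames

// Compressed sample (the encoder's output): the encoded bytes live in a CMBlockBuffer
CMBlockBufferRef blockBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);  // non-NULL for encoded frames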
[Figure: the difference between compressed and uncompressed data]

The encoder's role is as follows:

[Figure: the role of the encoder]

The encoder's workflow is as follows:

[Figure: the encoder workflow]

1. Create the encoder

[Figure: the data that must be provided when creating the encoder]

The function:

// The OSStatus return value reports whether creation succeeded
OSStatus VTCompressionSessionCreate(
    CFAllocatorRef allocator,                    // 1. allocator; pass NULL for the default
    int32_t width,                               // 2. frame width in pixels
    int32_t height,                              // 3. frame height in pixels
    CMVideoCodecType codecType,                  // 4. codec type; H.264 is kCMVideoCodecType_H264
    CFDictionaryRef encoderSpecification,        // 5. encoder choice; NULL lets VideoToolBox pick one
    CFDictionaryRef sourceImageBufferAttributes, // 6. pixel buffer pool attributes; NULL lets VideoToolBox create the pool
    CFAllocatorRef compressedDataAllocator,      // 7. compressed data allocator; NULL for the default
    VTCompressionOutputCallback outputCallback,  // 8. output callback for frames submitted via VTCompressionSessionEncodeFrame
    void *outputCallbackRefCon,                  // 9. reference value passed to the callback
    VTCompressionSessionRef _Nullable *compressionSessionOut ); // 10. receives the created session

From the CMSampleBuffer container delivered by the camera, get the uncompressed CVPixelBuffer (a CVImageBufferRef) it carries, read its width and height, and create a session that will encode those frames into a kCMVideoCodecType_H264 elementary stream:
CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
size_t width = CVPixelBufferGetWidth(imageBuffer);
size_t height = CVPixelBufferGetHeight(imageBuffer);
VTCompressionSessionRef session;
OSStatus ret = VTCompressionSessionCreate(NULL,
                                          (int32_t)width,
                                          (int32_t)height,
                                          kCMVideoCodecType_H264,
                                          NULL, NULL, NULL,
                                          OutputCallback,
                                          (__bridge void *)self, // delivered to OutputCallback as outputCallbackRefCon
                                          &session);

2. Set the encoder properties
VTSessionSetProperty(session, kVTCompressionPropertyKey_RealTime, kCFBooleanTrue);              // real-time encoding
VTSessionSetProperty(session, kVTCompressionPropertyKey_ProfileLevel, kVTProfileLevel_H264_High_AutoLevel);
VTSessionSetProperty(session, kVTCompressionPropertyKey_AllowFrameReordering, kCFBooleanFalse); // do not produce B-frames
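Bitrate and keyframe interval are usually configured at this point as well. A minimal sketch; the numeric values are illustrative assumptions, not requirements:

// Target average bitrate, in bits per second (assumed example value)
int32_t bitRate = 1920 * 1080 * 4;
CFNumberRef bitRateRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberSInt32Type, &bitRate);
VTSessionSetProperty(session, kVTCompressionPropertyKey_AverageBitRate, bitRateRef);
CFRelease(bitRateRef);

// Emit a keyframe at least every 30 frames (assumed GOP size)
int32_t maxKeyFrameInterval = 30;
CFNumberRef keyFrameIntervalRef = CFNumberCreate(kCFAllocatorDefault, kCFNumberSInt32Type, &maxKeyFrameInterval);
VTSessionSetProperty(session, kVTCompressionPropertyKey_MaxKeyFrameInterval, keyFrameIntervalRef);
CFRelease(keyFrameIntervalRef);

// Optional: let the encoder allocate its resources before the first frame arrives
VTCompressionSessionPrepareToEncodeFrames(session);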

3. Start encoding

Call VTCompressionSessionEncodeFrame to submit each frame; the encoded data is handled in the output callback.
/*!
    @function    VTCompressionSessionEncodeFrame
    @abstract    Call this function to present frames to the compression session.
                 Encoded frames may or may not be output before the function returns.
    @discussion  The client should not modify the pixel data after making this call.
                 The session and/or encoder will retain the image buffer as long as necessary.
    @param  session
        The compression session.
    @param  imageBuffer
        A CVImageBuffer containing a video frame to be compressed.
        Must have a nonzero reference count.
    @param  presentationTimeStamp
        The presentation timestamp for this frame, to be attached to the sample buffer.
        Each presentation timestamp passed to a session must be greater than the previous one.
    @param  duration
        The presentation duration for this frame, to be attached to the sample buffer.
        If you do not have duration information, pass kCMTimeInvalid.
    @param  frameProperties
        Contains key/value pairs specifying additional properties for encoding this frame.
        Note that some session properties may also be changed between frames.
        Such changes have effect on subsequently encoded frames.
    @param  sourceFrameRefcon
        Your reference value for the frame, which will be passed to the output callback function.
    @param  infoFlagsOut
        Points to a VTEncodeInfoFlags to receive information about the encode operation.
        The kVTEncodeInfo_Asynchronous bit may be set if the encode is (or was) running asynchronously.
        The kVTEncodeInfo_FrameDropped bit may be set if the frame was dropped (synchronously).
        Pass NULL if you do not want to receive this information.
*/
VT_EXPORT OSStatus
VTCompressionSessionEncodeFrame(
    CM_NONNULL VTCompressionSessionRef session,
    CM_NONNULL CVImageBufferRef imageBuffer,
    CMTime presentationTimeStamp,
    CMTime duration, // may be kCMTimeInvalid
    CM_NULLABLE CFDictionaryRef frameProperties,
    void * CM_NULLABLE sourceFrameRefcon,
    VTEncodeInfoFlags * CM_NULLABLE infoFlagsOut )
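The frameProperties dictionary carries per-frame options. For example, a keyframe can be forced on demand via the kVTEncodeFrameOptionKey_ForceKeyFrame key. A sketch, where session, imageBuffer, and presentationTimeStamp are the same values you would pass for a normal frame:

// Force this frame to be encoded as a keyframe
// (useful e.g. when a new viewer joins a live stream)
NSDictionary *frameProperties = @{ (__bridge NSString *)kVTEncodeFrameOptionKey_ForceKeyFrame : @YES };
VTCompressionSessionEncodeFrame(session, imageBuffer, presentationTimeStamp, kCMTimeInvalid,
                                (__bridge CFDictionaryRef)frameProperties, NULL, NULL);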

PTS, i.e. presentationTimeStamp: every timestamp passed to the session must be greater than that of the previous frame; after encoding it is attached to the output CMSampleBuffer. The duration, by contrast, may be left unset by passing kCMTimeInvalid.
// Timestamp of the frame being encoded, in milliseconds
int64_t currentTimeMills = (int64_t)(CFAbsoluteTimeGetCurrent() * 1000);
if (self.encodingTimeMills == -1) {
    self.encodingTimeMills = currentTimeMills; // remember when encoding started
}
int64_t encodingDuration = currentTimeMills - self.encodingTimeMills;
CMTime presentationTimeStamp = CMTimeMake(encodingDuration, 1000); // milliseconds since the first frame
VTCompressionSessionEncodeFrame(session, imageBuffer, presentationTimeStamp, kCMTimeInvalid, NULL, NULL, NULL);
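Alternatively, since AVFoundation already stamps each captured sample buffer, those timestamps can simply be reused. A minimal sketch:

// Reuse the capture timestamps instead of computing wall-clock durations
CMTime pts = CMSampleBufferGetPresentationTimeStamp(sampleBuffer);
CMTime duration = CMSampleBufferGetDuration(sampleBuffer); // may be kCMTimeInvalid
VTCompressionSessionEncodeFrame(session, imageBuffer, pts, duration, NULL, NULL, NULL);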

Handling the encode callback: encoding produces a compressed CMSampleBuffer, which is delivered to the output callback:
void OutputCallback(void *outputCallbackRefCon,
                    void *sourceFrameRefCon,
                    OSStatus status,
                    VTEncodeInfoFlags infoFlags,
                    CMSampleBufferRef sampleBuffer)

The callback must process this compressed CMSampleBuffer.

The data inside the compressed CMSampleBuffer is in MPEG-4 (AVCC) form and must be converted to elementary stream (Annex B) form before it can be used.

There are generally two consumers for the output: a network stream, and a file on disk. Both take elementary stream data, so the callback has to perform the conversion.

Take writing the stream to a raw H.264 file as an example.
A CMSampleBuffer holds exactly one frame, but one frame may contain several NAL units (NALUs), so the callback must loop over them and copy each NALU into an NSData. In elementary stream form, a NALU consists of a start code, a type byte, and the payload.

When extracting NALUs, note that the data inside the CMSampleBuffer is in MPEG-4 (AVCC) form: each NALU is preceded by a 4-byte header holding the NALU's length (big-endian), and that header has to be replaced with a start code, as illustrated below.
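To make the two layouts concrete (p here is a hypothetical pointer to the start of one AVCC-framed NALU):

// AVCC, as VideoToolBox delivers it:
//   [4-byte big-endian length][NALU] [4-byte big-endian length][NALU] ...
// Annex B elementary stream, as files and most protocols expect:
//   [00 00 00 01][NALU] [00 00 00 01][NALU] ...
//
// p[0..3] is the length header; p[4] is the NALU's first byte, whose
// low 5 bits carry the type: e.g. 5 = IDR slice (keyframe), 7 = SPS, 8 = PPS
uint8_t nalType = p[4] & 0x1F;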
In addition, for a keyframe (I-frame) the SPS and PPS must also be extracted from the sample buffer's format description and emitted as NSData:
void encodeOutputDataCallback(void * CM_NULLABLE outputCallbackRefCon,
                              void * CM_NULLABLE sourceFrameRefCon,
                              OSStatus status,
                              VTEncodeInfoFlags infoFlags,
                              CM_NULLABLE CMSampleBufferRef sampleBuffer)
{
    if (noErr != status || nil == sampleBuffer) {
        NSLog(@"VEVideoEncoder::encodeOutputCallback Error : %d!", (int)status);
        return;
    }
    if (nil == outputCallbackRefCon) {
        return;
    }
    if (!CMSampleBufferDataIsReady(sampleBuffer)) {
        return;
    }
    if (infoFlags & kVTEncodeInfo_FrameDropped) {
        NSLog(@"VEVideoEncoder::H264 encode dropped frame.");
        return;
    }

    VEVideoEncoder *encoder = (__bridge VEVideoEncoder *)outputCallbackRefCon;

    // Annex B start code: 00 00 00 01
    const char header[] = "\x00\x00\x00\x01";
    size_t headerLen = (sizeof header) - 1;
    NSData *headerData = [NSData dataWithBytes:header length:headerLen];

    // Determine whether this is a keyframe
    bool isKeyFrame = !CFDictionaryContainsKey((CFDictionaryRef)CFArrayGetValueAtIndex(CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, true), 0), (const void *)kCMSampleAttachmentKey_NotSync);
    if (isKeyFrame) {
        NSLog(@"VEVideoEncoder::encoded a keyframe");
        CMFormatDescriptionRef formatDescriptionRef = CMSampleBufferGetFormatDescription(sampleBuffer);

        // A keyframe must be preceded by its SPS and PPS
        size_t sParameterSetSize, sParameterSetCount;
        const uint8_t *sParameterSet;
        OSStatus spsStatus = CMVideoFormatDescriptionGetH264ParameterSetAtIndex(formatDescriptionRef, 0, &sParameterSet, &sParameterSetSize, &sParameterSetCount, 0);

        size_t pParameterSetSize, pParameterSetCount;
        const uint8_t *pParameterSet;
        OSStatus ppsStatus = CMVideoFormatDescriptionGetH264ParameterSetAtIndex(formatDescriptionRef, 1, &pParameterSet, &pParameterSetSize, &pParameterSetCount, 0);

        if (noErr == spsStatus && noErr == ppsStatus) {
            NSData *sps = [NSData dataWithBytes:sParameterSet length:sParameterSetSize];
            NSData *pps = [NSData dataWithBytes:pParameterSet length:pParameterSetSize];

            NSMutableData *spsData = [NSMutableData data];
            [spsData appendData:headerData];
            [spsData appendData:sps];
            if ([encoder.delegate respondsToSelector:@selector(videoEncodeOutputDataCallback:isKeyFrame:)]) {
                [encoder.delegate videoEncodeOutputDataCallback:spsData isKeyFrame:isKeyFrame];
            }

            NSMutableData *ppsData = [NSMutableData data];
            [ppsData appendData:headerData];
            [ppsData appendData:pps];
            if ([encoder.delegate respondsToSelector:@selector(videoEncodeOutputDataCallback:isKeyFrame:)]) {
                [encoder.delegate videoEncodeOutputDataCallback:ppsData isKeyFrame:isKeyFrame];
            }
        }
    }

    CMBlockBufferRef blockBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
    size_t length, totalLength;
    char *dataPointer;
    status = CMBlockBufferGetDataPointer(blockBuffer, 0, &length, &totalLength, &dataPointer);
    if (noErr != status) {
        NSLog(@"VEVideoEncoder::CMBlockBufferGetDataPointer Error : %d!", (int)status);
        return;
    }

    size_t bufferOffset = 0;
    static const int avcHeaderLength = 4;
    while (bufferOffset < totalLength - avcHeaderLength) {
        // Read the length of this NAL unit
        uint32_t nalUnitLength = 0;
        memcpy(&nalUnitLength, dataPointer + bufferOffset, avcHeaderLength);

        // The length is stored big-endian (network byte order); convert to host byte order
        nalUnitLength = CFSwapInt32BigToHost(nalUnitLength);

        NSData *frameData = [[NSData alloc] initWithBytes:(dataPointer + bufferOffset + avcHeaderLength) length:nalUnitLength];

        // Replace the length header with the Annex B start code
        NSMutableData *outputFrameData = [NSMutableData data];
        [outputFrameData appendData:headerData];
        [outputFrameData appendData:frameData];

        bufferOffset += avcHeaderLength + nalUnitLength;

        if ([encoder.delegate respondsToSelector:@selector(videoEncodeOutputDataCallback:isKeyFrame:)]) {
            [encoder.delegate videoEncodeOutputDataCallback:outputFrameData isKeyFrame:isKeyFrame];
        }
    }
}
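On the consuming side, a delegate that appends each Annex B chunk to a raw .h264 file might look like this. A minimal sketch: VEFileWriter and its fileHandle property are hypothetical, but the delegate method matches the callback above:

@interface VEFileWriter : NSObject
// Assumed to be an NSFileHandle already opened for writing
@property (nonatomic, strong) NSFileHandle *fileHandle;
@end

@implementation VEFileWriter
// Called from the encode callback with one start-code-prefixed NALU at a time
- (void)videoEncodeOutputDataCallback:(NSData *)data isKeyFrame:(BOOL)isKeyFrame
{
    // SPS and PPS are delivered before every keyframe, so the resulting
    // stream is decodable from any keyframe onward
    [self.fileHandle writeData:data];
}
@end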

4. Finish encoding

Call VTCompressionSessionCompleteFrames to force the encoder to finish any frames still in flight; passing kCMTimeInvalid as completeUntilPresentationTimeStamp completes all pending frames.
VTCompressionSessionCompleteFrames(session, completeUntilPresentationTimeStamp);

5. Destroy the encoder
if (session) {
    VTCompressionSessionInvalidate(session);
    CFRelease(session);
}
