
Audio and Video Basics: RTP Series (14): Parsing the H.264 RTP Payload Structures in the FFmpeg Source Code

article 2026/3/20 8:37:33

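I. Introduction

As described in "Audio and Video Basics: RTP Series (10): How the FFmpeg Source Code Parses the RTP Header", the first half of FFmpeg's rtp_parse_packet_internal function parses the RTP header of an RTP packet. Once the header has been parsed, rtp_parse_packet_internal calls the callback that the function pointer parse_packet points to, and that callback parses the RTP payload according to its payload type: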

static int rtp_parse_packet_internal(RTPDemuxContext *s, AVPacket *pkt,
                                     const uint8_t *buf, int len)
{
    //...
    if (s->handler && s->handler->parse_packet) {
        rv = s->handler->parse_packet(s->ic, s->dynamic_protocol_context,
                                      s->st, pkt, &timestamp, buf, len, seq,
                                      flags);
    }
    //...
}

For example, for an H.264 payload, parse_packet points to h264_handle_packet, which parses the H.264 payload; for an AAC payload, parse_packet points to aac_parse_packet, which parses the AAC payload.
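The dispatch through a function pointer can be illustrated outside FFmpeg. The following is a minimal sketch under stated assumptions: DemoHandler, demo_parse_fn, demo_dispatch and the two callbacks are hypothetical names with a much smaller signature than the real parse_packet, and the return values are only tags that show which callback ran.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: a much-reduced stand-in for parse_packet. */
typedef int (*demo_parse_fn)(const uint8_t *buf, int len);

/* Hypothetical handler record: one handler per payload format. */
typedef struct DemoHandler {
    const char   *name;
    demo_parse_fn parse_packet;   /* may be NULL */
} DemoHandler;

/* The return values are tags that only identify which callback ran. */
static int demo_parse_h264(const uint8_t *buf, int len)
{
    (void)buf; (void)len;
    return 1;
}

static int demo_parse_aac(const uint8_t *buf, int len)
{
    (void)buf; (void)len;
    return 2;
}

static const DemoHandler demo_h264_handler = { "H264", demo_parse_h264 };
static const DemoHandler demo_aac_handler  = { "AAC",  demo_parse_aac  };

/* Same guard as in rtp_parse_packet_internal: call the callback only
 * when both the handler and its parse_packet pointer are set. */
static int demo_dispatch(const DemoHandler *h, const uint8_t *buf, int len)
{
    if (h && h->parse_packet)
        return h->parse_packet(buf, len);
    return 0;   /* assumption of this sketch: 0 means "no callback ran" */
}
```

Calling demo_dispatch(&demo_h264_handler, payload, len) runs the H.264 callback, while the same call with &demo_aac_handler runs the AAC one; the caller never needs to know which payload format it is dealing with.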

II. Definition of the h264_handle_packet Function

h264_handle_packet is defined in the source file libavformat/rtpdec_h264.c of the FFmpeg source tree (this article uses FFmpeg 7.0.1):

// return 0 on packet, no more left, 1 on packet, 1 on partial packet
static int h264_handle_packet(AVFormatContext *ctx, PayloadContext *data,
                              AVStream *st, AVPacket *pkt, uint32_t *timestamp,
                              const uint8_t *buf, int len, uint16_t seq,
                              int flags)
{
    uint8_t nal;
    uint8_t type;
    int result = 0;

    if (!len) {
        av_log(ctx, AV_LOG_ERROR, "Empty H.264 RTP packet\n");
        return AVERROR_INVALIDDATA;
    }
    nal  = buf[0];
    type = nal & 0x1f;

    /* Simplify the case (these are all the NAL types used internally by
     * the H.264 codec). */
    if (type >= 1 && type <= 23)
        type = 1;
    switch (type) {
    case 0:                    // undefined, but pass them through
    case 1:
        if ((result = av_new_packet(pkt, len + sizeof(start_sequence))) < 0)
            return result;
        memcpy(pkt->data, start_sequence, sizeof(start_sequence));
        memcpy(pkt->data + sizeof(start_sequence), buf, len);
        COUNT_NAL_TYPE(data, nal);
        break;

    case 24:                   // STAP-A (one packet, multiple nals)
        // consume the STAP-A NAL
        buf++;
        len--;
        result = ff_h264_handle_aggregated_packet(ctx, data, pkt, buf, len, 0,
                                                  NAL_COUNTERS, NAL_MASK);
        break;

    case 25:                   // STAP-B
    case 26:                   // MTAP-16
    case 27:                   // MTAP-24
    case 29:                   // FU-B
        avpriv_report_missing_feature(ctx, "RTP H.264 NAL unit type %d", type);
        result = AVERROR_PATCHWELCOME;
        break;

    case 28:                   // FU-A (fragmented nal)
        result = h264_handle_packet_fu_a(ctx, data, pkt, buf, len,
                                         NAL_COUNTERS, NAL_MASK);
        break;

    case 30:                   // undefined
    case 31:                   // undefined
    default:
        av_log(ctx, AV_LOG_ERROR, "Undefined type (%d)\n", type);
        result = AVERROR_INVALIDDATA;
        break;
    }

    pkt->stream_index = st->index;

    return result;
}

This function parses an H.264 RTP payload. An H.264 RTP payload has one of three payload structures: Single NAL Unit Packet, Aggregation Packet (STAP-A, STAP-B, MTAP16, MTAP24), or Fragmentation Unit (FU-A, FU-B). h264_handle_packet is the single entry point that handles all of them.

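Parameter ctx: input. Used only for logging; it can be ignored here.

Parameter data: input. Points to a PayloadContext (payload context) variable.

Parameter st: input. Points to an AVStream variable that holds the information about this video stream.

Parameter pkt: output. After h264_handle_packet runs, pkt holds the data extracted from the payload of this RTP packet.

Parameter timestamp: input. The variable it points to holds the timestamp from the RTP header of this RTP packet.

Parameter buf: input. Points to the first byte of the RTP payload of this RTP packet, that is, the RTP payload header.

Parameter len: input. The size of the RTP payload of this RTP packet, in bytes.

Parameter seq: input. The sequence number from the RTP header of this RTP packet.

Parameter flags: input. Indicates whether the marker field in the RTP header of this RTP packet is 1.

Return value: a negative number on failure (AVERROR_INVALIDDATA for an empty packet or an undefined type, AVERROR_PATCHWELCOME for a payload structure that FFmpeg does not support, or the error returned by a failed allocation); a non-negative number on success.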

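III. How h264_handle_packet Works Internally

(1) Parsing the RTP payload header

h264_handle_packet first checks whether len (the size of the RTP payload of this RTP packet) is 0. If it is, the H.264 RTP packet is empty, so the function logs the error "Empty H.264 RTP packet" and returns AVERROR_INVALIDDATA: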

    uint8_t nal;
    uint8_t type;
    int result = 0;

    if (!len) {
        av_log(ctx, AV_LOG_ERROR, "Empty H.264 RTP packet\n");
        return AVERROR_INVALIDDATA;
    }

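As described in "Audio and Video Basics: RTP Series (12): The NAL Unit Type in RTP When RTP Carries H.264", when the RTP payload is H.264, its first byte is the RTP payload header, and that header has the same layout as a NALU Header (forbidden_zero_bit, nal_ref_idc, nal_unit_type). h264_handle_packet reads the nal_unit_type out of the RTP payload header with the statements below and stores it in type. All values from 1 to 23 are then collapsed to 1, because they are handled identically: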

    nal  = buf[0];
    type = nal & 0x1f;

    /* Simplify the case (these are all the NAL types used internally by
     * the H.264 codec). */
    if (type >= 1 && type <= 23)
        type = 1;

h264_handle_packet then branches on the nal_unit_type value to handle each payload structure.
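The bit layout of the one-byte payload header can be checked with a small stand-alone program. This is a minimal sketch, assuming nothing from FFmpeg: DemoNalHeader, demo_parse_nal_header and demo_dispatch_type are hypothetical names introduced only for illustration.

```c
#include <stdint.h>

/* The three fields of the one-byte NALU Header / RTP payload header. */
typedef struct DemoNalHeader {
    uint8_t forbidden_zero_bit;  /* 1 bit,  most significant */
    uint8_t nal_ref_idc;         /* 2 bits */
    uint8_t nal_unit_type;       /* 5 bits, least significant */
} DemoNalHeader;

static DemoNalHeader demo_parse_nal_header(uint8_t first_byte)
{
    DemoNalHeader h;

    h.forbidden_zero_bit = (first_byte >> 7) & 0x01;
    h.nal_ref_idc        = (first_byte >> 5) & 0x03;
    h.nal_unit_type      = first_byte & 0x1f;   /* same mask as "nal & 0x1f" */
    return h;
}

/* Mirrors the simplification in h264_handle_packet: every type in
 * 1..23 is treated as one case, all other values are kept as they are. */
static uint8_t demo_dispatch_type(uint8_t nal_unit_type)
{
    if (nal_unit_type >= 1 && nal_unit_type <= 23)
        return 1;
    return nal_unit_type;
}
```

For instance, the byte 0x67 (binary 0110 0111) yields nal_ref_idc 3 and nal_unit_type 7, which the collapse step maps to 1, while 0x7c (binary 0111 1100) yields nal_unit_type 28 and is left unchanged.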

(2) Parsing a Single NAL Unit Packet

When nal_unit_type is between 1 and 23 inclusive, the payload structure is a Single NAL Unit Packet and the payload of this RTP packet contains exactly one NALU. h264_handle_packet handles it with the following block:

    case 0:                    // undefined, but pass them through
    case 1:
        if ((result = av_new_packet(pkt, len + sizeof(start_sequence))) < 0)
            return result;
        memcpy(pkt->data, start_sequence, sizeof(start_sequence));
        memcpy(pkt->data + sizeof(start_sequence), buf, len);
        COUNT_NAL_TYPE(data, nal);
        break;

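The block above first calls av_new_packet to allocate memory for pkt->data (for the usage of av_new_packet, see "FFmpeg Source Code: Analysis of packet_alloc, av_new_packet, av_shrink_packet and av_grow_packet"). It then writes the four-byte start code 00 00 00 01 into pkt->data, followed by the whole payload of this RTP packet.

The array start_sequence holds the bytes 00 00 00 01, the start code that precedes a NALU in an H.264 byte stream: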

static const uint8_t start_sequence[] = { 0, 0, 0, 1 };
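The copy performed for a Single NAL Unit Packet can be tried outside FFmpeg. The following is a minimal sketch, assuming a plain malloc-backed buffer in place of AVPacket; demo_single_nal_to_annexb and single_start_code are hypothetical names that do not exist in FFmpeg.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

static const uint8_t single_start_code[] = { 0, 0, 0, 1 };

/* Returns a malloc'ed buffer holding "start code + payload", or NULL on
 * bad arguments or allocation failure. The caller frees the buffer. */
static uint8_t *demo_single_nal_to_annexb(const uint8_t *payload, size_t len,
                                          size_t *out_len)
{
    uint8_t *out;

    if (!payload || !out_len || len == 0)
        return NULL;

    out = malloc(sizeof(single_start_code) + len);
    if (!out)
        return NULL;

    /* Same two copies as in the "case 1" branch shown above. */
    memcpy(out, single_start_code, sizeof(single_start_code));
    memcpy(out + sizeof(single_start_code), payload, len);

    *out_len = sizeof(single_start_code) + len;
    return out;
}
```

A three-byte payload therefore becomes a seven-byte buffer whose first four bytes are the start code and whose fifth byte is the NALU Header taken unchanged from the RTP payload.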

(3) Parsing STAP-A

When nal_unit_type is 24, the payload structure is STAP-A and the payload of this RTP packet may contain several NALUs. h264_handle_packet handles STAP-A with the following block:

    case 24:                   // STAP-A (one packet, multiple nals)
        // consume the STAP-A NAL
        buf++;
        len--;
        result = ff_h264_handle_aggregated_packet(ctx, data, pkt, buf, len, 0,
                                                  NAL_COUNTERS, NAL_MASK);
        break;

The block above first executes "buf++; len--;" so that buf points to the data after the RTP payload header and len equals the size, in bytes, of the RTP payload without the RTP payload header. It then calls ff_h264_handle_aggregated_packet to process the STAP-A.

ff_h264_handle_aggregated_packet is defined in libavformat/rtpdec_h264.c:

int ff_h264_handle_aggregated_packet(AVFormatContext *ctx, PayloadContext *data, AVPacket *pkt,
                                     const uint8_t *buf, int len,
                                     int skip_between, int *nal_counters,
                                     int nal_mask)
{
    int pass         = 0;
    int total_length = 0;
    uint8_t *dst     = NULL;
    int ret;

    // first we are going to figure out the total size
    for (pass = 0; pass < 2; pass++) {
        const uint8_t *src = buf;
        int src_len        = len;

        while (src_len > 2) {
            uint16_t nal_size = AV_RB16(src);

            // consume the length of the aggregate
            src     += 2;
            src_len -= 2;

            if (nal_size <= src_len) {
                if (pass == 0) {
                    // counting
                    total_length += sizeof(start_sequence) + nal_size;
                } else {
                    // copying
                    memcpy(dst, start_sequence, sizeof(start_sequence));
                    dst += sizeof(start_sequence);
                    memcpy(dst, src, nal_size);
                    if (nal_counters)
                        nal_counters[(*src) & nal_mask]++;
                    dst += nal_size;
                }
            } else {
                av_log(ctx, AV_LOG_ERROR,
                       "nal size exceeds length: %d %d\n", nal_size, src_len);
                return AVERROR_INVALIDDATA;
            }

            // eat what we handled
            src     += nal_size + skip_between;
            src_len -= nal_size + skip_between;
        }

        if (pass == 0) {
            /* now we know the total size of the packet (with the
             * start sequences added) */
            if ((ret = av_new_packet(pkt, total_length)) < 0)
                return ret;
            dst = pkt->data;
        }
    }

    return 0;
}

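As described in "Audio and Video Basics: RTP Series (12): The Video Payload Structures When RTP Carries H.264", when the payload structure (that is, the RTP payload) is STAP-A:

RTP packet = RTP header + STAP-A

One STAP-A = RTP payload header (here the STAP-A NAL HDR) + one or more single-time aggregation units

One single-time aggregation unit = NAL unit size (always 2 bytes) + NAL unit (including the NALU Header)

ff_h264_handle_aggregated_packet first uses the AV_RB16 macro to read the NAL unit size of a single-time aggregation unit as a 16-bit big-endian value and stores it in nal_size (for the usage of AV_RB16, see "FFmpeg Source Code: Analysis of the AV_RB32, AV_RB16 and AV_RB8 Macros"):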

 uint16_t nal_size = AV_RB16(src);

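The function loops over the aggregation units twice. The first pass (pass == 0) only adds up total_length, the size of every NALU plus one start code each. When that pass ends, av_new_packet allocates a buffer of exactly that size and dst is set to pkt->data, so the second pass has somewhere to copy to: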

        if (pass == 0) {
            /* now we know the total size of the packet (with the
             * start sequences added) */
            if ((ret = av_new_packet(pkt, total_length)) < 0)
                return ret;
            dst = pkt->data;
        }

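If the NAL unit size exceeds the size of the remaining RTP payload, the packet is malformed: the function logs "nal size exceeds length: %d %d" and returns AVERROR_INVALIDDATA. Otherwise it uses the NAL unit size to extract each NALU from the payload of this RTP packet, puts the four-byte start code 00 00 00 01 in front of each NALU, and stores the result in dst (that is, pkt->data):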

            if (nal_size <= src_len) {
                if (pass == 0) {
                    // counting
                    total_length += sizeof(start_sequence) + nal_size;
                } else {
                    // copying
                    memcpy(dst, start_sequence, sizeof(start_sequence));
                    dst += sizeof(start_sequence);
                    memcpy(dst, src, nal_size);
                    if (nal_counters)
                        nal_counters[(*src) & nal_mask]++;
                    dst += nal_size;
                }
            } else {
                av_log(ctx, AV_LOG_ERROR,
                       "nal size exceeds length: %d %d\n", nal_size, src_len);
                return AVERROR_INVALIDDATA;
            }

After ff_h264_handle_aggregated_packet returns, the buffer that pkt->data points to holds every NALU from the payload of this RTP packet (possibly several NALUs, each preceded by the start code 00 00 00 01).
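The count-then-copy approach can be reproduced without FFmpeg. The following is a minimal sketch under stated assumptions: it uses malloc instead of av_new_packet, drops the skip_between and nal_counters parameters, and treats an input without any aggregation unit as an error. demo_stap_a_to_annexb, demo_read_be16 and stap_start_code are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

static const uint8_t stap_start_code[] = { 0, 0, 0, 1 };

/* Reads a 16-bit big-endian value, as AV_RB16 does for the NAL unit size. */
static uint16_t demo_read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* buf must point just past the STAP-A NAL HDR.
 * Returns 0 on success (caller frees *out), -1 on error. */
static int demo_stap_a_to_annexb(const uint8_t *buf, size_t len,
                                 uint8_t **out, size_t *out_len)
{
    size_t total = 0;
    uint8_t *dst = NULL;
    int pass;

    *out     = NULL;
    *out_len = 0;

    for (pass = 0; pass < 2; pass++) {
        const uint8_t *src = buf;
        size_t src_len     = len;

        while (src_len > 2) {
            size_t nal_size = demo_read_be16(src);

            src     += 2;
            src_len -= 2;
            if (nal_size > src_len)
                return -1;      /* can only happen in pass 0, before malloc */

            if (pass == 0) {    /* counting */
                total += sizeof(stap_start_code) + nal_size;
            } else {            /* copying */
                memcpy(dst, stap_start_code, sizeof(stap_start_code));
                dst += sizeof(stap_start_code);
                memcpy(dst, src, nal_size);
                dst += nal_size;
            }
            src     += nal_size;
            src_len -= nal_size;
        }

        if (pass == 0) {
            if (total == 0)
                return -1;      /* sketch-only choice: nothing to output */
            *out = malloc(total);
            if (!*out)
                return -1;
            dst = *out;
        }
    }

    *out_len = total;
    return 0;
}
```

Given the nine input bytes 00 02 67 42 00 03 68 ce 38, which encode a 2-byte NALU and a 3-byte NALU, the function produces 13 bytes: 00 00 00 01 67 42 followed by 00 00 00 01 68 ce 38.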

(4) Parsing FU-A

When nal_unit_type is 28, the payload structure is FU-A, which means one NALU may have been split across several RTP packets. h264_handle_packet handles FU-A with the following block:

    case 28:                   // FU-A (fragmented nal)
        result = h264_handle_packet_fu_a(ctx, data, pkt, buf, len,
                                         NAL_COUNTERS, NAL_MASK);
        break;

h264_handle_packet_fu_a is defined in libavformat/rtpdec_h264.c:

static int h264_handle_packet_fu_a(AVFormatContext *ctx, PayloadContext *data, AVPacket *pkt,
                                   const uint8_t *buf, int len,
                                   int *nal_counters, int nal_mask)
{
    uint8_t fu_indicator, fu_header, start_bit, nal_type, nal;

    if (len < 3) {
        av_log(ctx, AV_LOG_ERROR, "Too short data for FU-A H.264 RTP packet\n");
        return AVERROR_INVALIDDATA;
    }

    fu_indicator = buf[0];
    fu_header    = buf[1];
    start_bit    = fu_header >> 7;
    nal_type     = fu_header & 0x1f;
    nal          = fu_indicator & 0xe0 | nal_type;

    // skip the fu_indicator and fu_header
    buf += 2;
    len -= 2;

    if (start_bit && nal_counters)
        nal_counters[nal_type & nal_mask]++;
    return ff_h264_handle_frag_packet(pkt, buf, len, start_bit, &nal, 1);
}

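As described in "Audio and Video Basics: RTP Series (12): The Video Payload Structures When RTP Carries H.264", an FU-A consists of an 8-bit fragmentation unit indicator (FU indicator, also called FU identifier, which is in fact the RTP payload header), an 8-bit fragmentation unit header (FU header), and a fragmentation unit payload (FU payload, also called the H264 NAL Unit Payload).

h264_handle_packet_fu_a uses the code below to read the FU indicator into fu_indicator and the FU header into fu_header. It reads the S bit (start bit) of the FU header into start_bit and the Type field of the FU header (the payload type of the NAL unit) into nal_type. The variable nal combines the top three bits of the FU indicator (forbidden_zero_bit and nal_ref_idc) with nal_type, so it effectively holds the NALU Header of the original, unfragmented NALU: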

    fu_indicator = buf[0];
    fu_header    = buf[1];
    start_bit    = fu_header >> 7;
    nal_type     = fu_header & 0x1f;
    nal          = fu_indicator & 0xe0 | nal_type;
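The bit operations above are easy to verify with concrete bytes. This is a minimal sketch, assuming a stand-alone setting without FFmpeg; DemoFuA and demo_parse_fu_a are hypothetical names, and the parentheses around the AND are added only for readability (the precedence is the same as in the original expression).

```c
#include <stddef.h>
#include <stdint.h>

typedef struct DemoFuA {
    uint8_t start_bit;   /* S bit of the FU header */
    uint8_t nal_type;    /* Type field of the FU header */
    uint8_t nal_header;  /* rebuilt NALU Header of the original NALU */
} DemoFuA;

/* buf points to the FU indicator. Returns 0 on success, -1 when the
 * payload is shorter than FU indicator + FU header + one payload byte. */
static int demo_parse_fu_a(const uint8_t *buf, size_t len, DemoFuA *out)
{
    uint8_t fu_indicator, fu_header;

    if (!buf || !out || len < 3)
        return -1;

    fu_indicator = buf[0];
    fu_header    = buf[1];

    out->start_bit  = fu_header >> 7;
    out->nal_type   = fu_header & 0x1f;
    /* Top 3 bits (F + NRI) from the FU indicator, low 5 bits from the
     * FU header's Type field. */
    out->nal_header = (uint8_t)((fu_indicator & 0xe0) | out->nal_type);
    return 0;
}
```

With the FU indicator 0x7c and the FU header 0x85, the start bit is 1, the type is 5, and the rebuilt NALU Header is 0x65. A later fragment of the same NALU with the FU header 0x45 gives the same NALU Header but a start bit of 0.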

Next, buf is advanced past the FU indicator and the FU header so that it points to the FU payload of the FU-A, and len becomes the size of that FU payload in bytes:

    // skip the fu_indicator and fu_header
    buf += 2;
    len -= 2;

h264_handle_packet_fu_a then calls ff_h264_handle_frag_packet to process the FU payload of this FU-A:

    return ff_h264_handle_frag_packet(pkt, buf, len, start_bit, &nal, 1);

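ff_h264_handle_frag_packet is defined in libavformat/rtpdec_h264.c. As the code shows, after it runs, pkt->data holds the FU payload of this FU-A, that is, the part of the NALU carried by this RTP packet. Only for the first fragment (start_bit set) are the start code 00 00 00 01 and the rebuilt NALU Header written in front of the FU payload: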

int ff_h264_handle_frag_packet(AVPacket *pkt, const uint8_t *buf, int len,
                               int start_bit, const uint8_t *nal_header,
                               int nal_header_len)
{
    int ret;
    int tot_len = len;
    int pos = 0;

    if (start_bit)
        tot_len += sizeof(start_sequence) + nal_header_len;
    if ((ret = av_new_packet(pkt, tot_len)) < 0)
        return ret;
    if (start_bit) {
        memcpy(pkt->data + pos, start_sequence, sizeof(start_sequence));
        pos += sizeof(start_sequence);
        memcpy(pkt->data + pos, nal_header, nal_header_len);
        pos += nal_header_len;
    }
    memcpy(pkt->data + pos, buf, len);
    return 0;
}
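To see how the per-fragment output adds up to one complete NALU, the fragments can be appended to a single growing buffer. This is a minimal sketch under stated assumptions: DemoFragBuf and demo_append_fragment are hypothetical, the buffer is managed with realloc rather than AVPacket, and the code only illustrates the byte layout; it is not a description of how FFmpeg itself joins the fragments.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical accumulator; initialise it to { NULL, 0 }. */
typedef struct DemoFragBuf {
    uint8_t *data;
    size_t   size;
} DemoFragBuf;

/* Appends one FU payload. For the first fragment (start_bit set), the
 * start code and the rebuilt one-byte NALU Header are written first.
 * Returns 0 on success, -1 on error (acc is left unchanged). */
static int demo_append_fragment(DemoFragBuf *acc, const uint8_t *frag,
                                size_t len, int start_bit, uint8_t nal_header)
{
    static const uint8_t start_code[] = { 0, 0, 0, 1 };
    size_t extra = start_bit ? sizeof(start_code) + 1 : 0;
    uint8_t *p;

    if (!acc || (!frag && len > 0))
        return -1;
    if (extra + len == 0)
        return 0;                       /* nothing to append */

    p = realloc(acc->data, acc->size + extra + len);
    if (!p)
        return -1;
    acc->data = p;
    p += acc->size;                     /* write position: old end */

    if (start_bit) {
        memcpy(p, start_code, sizeof(start_code));
        p += sizeof(start_code);
        *p++ = nal_header;
    }
    if (len > 0)
        memcpy(p, frag, len);

    acc->size += extra + len;
    return 0;
}
```

Appending a first fragment 88 84 with NALU Header 0x65 and then a later fragment 21 leaves the eight bytes 00 00 00 01 65 88 84 21 in the buffer, which is the original NALU in start-code-prefixed form. The caller releases acc.data with free when done.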

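(5) Parsing the other payload structures

When nal_unit_type is 25, 26, 27 or 29, the payload structure is STAP-B, MTAP-16, MTAP-24 or FU-B respectively. FFmpeg (as of version 7.0.1) does not support these payload structures, so h264_handle_packet calls avpriv_report_missing_feature with the format string "RTP H.264 NAL unit type %d" to report the missing feature and returns AVERROR_PATCHWELCOME: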

    case 25:                   // STAP-B
    case 26:                   // MTAP-16
    case 27:                   // MTAP-24
    case 29:                   // FU-B
        avpriv_report_missing_feature(ctx, "RTP H.264 NAL unit type %d", type);
        result = AVERROR_PATCHWELCOME;
        break;
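The outcome of the whole switch statement can be summarised as a three-way classification of the payload header byte. This is a minimal sketch that restates the switch shown in this article for FFmpeg 7.0.1; DemoSupport and demo_support_status are hypothetical names, and the enum values merely stand in for the corresponding FFmpeg results.

```c
#include <stdint.h>

typedef enum DemoSupport {
    DEMO_HANDLED,        /* payload is turned into packet data */
    DEMO_NOT_SUPPORTED,  /* corresponds to AVERROR_PATCHWELCOME */
    DEMO_INVALID         /* corresponds to AVERROR_INVALIDDATA */
} DemoSupport;

static DemoSupport demo_support_status(uint8_t payload_header)
{
    uint8_t type = payload_header & 0x1f;

    /* 0 is passed through, 1..23 are Single NAL Unit Packets,
     * 24 is STAP-A, 28 is FU-A. */
    if (type <= 24 || type == 28)
        return DEMO_HANDLED;

    /* STAP-B, MTAP-16, MTAP-24, FU-B */
    if (type == 25 || type == 26 || type == 27 || type == 29)
        return DEMO_NOT_SUPPORTED;

    return DEMO_INVALID;    /* 30 and 31 are undefined */
}
```

A check of this kind could be used, for example, to count how many packets of a captured stream fall into the unsupported group before feeding the stream to FFmpeg.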

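IV. Summary

1. In the FFmpeg source code, h264_handle_packet is the single place where all H.264 RTP payload structures are parsed.

2. As of version 7.0.1, FFmpeg does not support parsing the STAP-B, MTAP-16, MTAP-24 and FU-B payload structures, so parsing an RTP stream that contains them fails with AVERROR_PATCHWELCOME for those packets. To add support, modify the FFmpeg source and add code for the corresponding payload structure inside h264_handle_packet.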