
(1): Converting wav files to txt files

news 2024/5/15 10:36:37

A disclaimer first: my experience is limited, so the code here contains no error handling. Consider it version 1;

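Step 1: understand the wav format:

I. Overview
    WAVE is one of the audio file formats used in multimedia, and it is
built on the RIFF standard. RIFF stands for Resource Interchange File
Format; the first four bytes of every WAVE file are 'RIFF'.
    A WAVE file is made up of several chunks. In order of appearance in
the file they are: RIFF WAVE Chunk, Format Chunk, Fact Chunk (optional),
Data Chunk. See the figure below: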

------------------------------------------------
|             RIFF WAVE Chunk                  |
|             ID = 'RIFF'                      |
|             RiffType = 'WAVE'                |
------------------------------------------------
|             Format Chunk                     |
|             ID = 'fmt '                      |
------------------------------------------------
|             Fact Chunk(optional)             |
|             ID = 'fact'                      |
------------------------------------------------
|             Data Chunk                       |
|             ID = 'data'                      |
------------------------------------------------
            Figure 1   Chunks contained in a wav file

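    All chunks except the Fact Chunk are mandatory. Each chunk begins
with its own 4-byte ID, which identifies it. The ID is immediately
followed by the chunk size (the number of bytes that remain once the ID
and Size fields themselves are excluded), stored in 4 bytes with the
low-order byte first. The individual chunks are described below.
PS:
    All numeric values are little-endian: the first byte holds the
least significant bits and the last byte the most significant bits.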
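
The ID/Size layout described above can be sketched in a few lines. This is only an illustration working on a byte buffer that is already in memory; `read_le32`, `ChunkHeader` and `parse_chunk_header` are hypothetical names that do not appear in the code later in this article.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: assemble a 32-bit value from 4 little-endian bytes.
inline std::uint32_t read_le32(const unsigned char* p)
{
    assert(p != nullptr);
    return static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Hypothetical type: the 8 bytes found at the start of every chunk.
struct ChunkHeader
{
    std::string   id;    // 4 ASCII characters, e.g. "RIFF", "fmt ", "data"
    std::uint32_t size;  // payload size, not counting these 8 bytes
};

// Parse the chunk header that starts at buf[offset].
// The caller must guarantee that at least 8 bytes are available there.
inline ChunkHeader parse_chunk_header(const unsigned char* buf, std::size_t offset)
{
    assert(buf != nullptr);
    ChunkHeader h;
    h.id.assign(reinterpret_cast<const char*>(buf + offset), 4);
    h.size = read_le32(buf + offset + 4);
    return h;
}
```

Building the value byte by byte, rather than copying raw memory, keeps the result independent of the byte order of the machine running the code.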

II. Details
RIFF WAVE Chunk
    =================================
    |       |  Bytes  |  Content    |
    =================================
    | ID    | 4 Bytes |   'RIFF'    |
    ---------------------------------
    | Size  | 4 Bytes |             |
    ---------------------------------
    | Type  | 4 Bytes |   'WAVE'    |
    ---------------------------------
            Figure 2 RIFF WAVE Chunk

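    The chunk is identified by 'RIFF', followed by the Size field. This
Size is the size of the whole wav file minus the bytes taken up by ID
and Size, i.e. Size = FileLen - 8. Next comes the Type field, 'WAVE',
which marks the file as a wav file.
    The structure is defined as follows: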
struct RIFF_HEADER
{
char szRiffID[4]; // 'R','I','F','F'
DWORD dwRiffSize;
char szRiffFormat[4]; // 'W','A','V','E'
};

Format Chunk
    =========================================================================
    |                | Bytes   | Content                                    |
    =========================================================================
    | ID             | 4 Bytes | 'fmt '                                     |
    -------------------------------------------------------------------------
    | Size           | 4 Bytes | 16 or 18; 18 means extra info is appended  |
    ------------------------------------------------------------------------- ----
    | FormatTag      | 2 Bytes | Encoding format, usually 0x0001 (PCM)      |     |
    -------------------------------------------------------------------------     |
    | Channels       | 2 Bytes | Number of channels: 1 = mono, 2 = stereo   |     |
    -------------------------------------------------------------------------     |
    | SamplesPerSec  | 4 Bytes | Sampling rate (samples per second)         |     |
    -------------------------------------------------------------------------     |
    | AvgBytesPerSec | 4 Bytes | Bytes per second                           |     |===> WAVE_FORMAT
    -------------------------------------------------------------------------     |
    | BlockAlign     | 2 Bytes | Block alignment (bytes per sample frame)   |     |
    -------------------------------------------------------------------------     |
    | BitsPerSample  | 2 Bytes | Bits per sample                            |     |
    -------------------------------------------------------------------------     |
    |                | 2 Bytes | Extra info (optional, indicated by Size)   |     |
    ------------------------------------------------------------------------- ----
                            Figure 3 Format Chunk

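    The chunk is identified by 'fmt '. Size is normally 16, in which
case there is no extra info at the end; if it is 18, 2 bytes of extra
info are appended. These 2 extra bytes mostly appear in wav files
produced by certain software.
    The structure is defined as follows: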
struct WAVE_FORMAT
{
WORD wFormatTag;
WORD wChannels;
DWORD dwSamplesPerSec;
DWORD dwAvgBytesPerSec;
WORD wBlockAlign;
WORD wBitsPerSample;
};
struct FMT_BLOCK
{
char szFmtID[4]; // 'f','m','t',' '
DWORD dwFmtSize;
WAVE_FORMAT wavFormat;
};
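
For uncompressed PCM (FormatTag 0x0001) the Format Chunk fields are not independent: BlockAlign follows from the channel count and the bits per sample, and AvgBytesPerSec follows from the sampling rate and BlockAlign. A small sketch of that arithmetic, useful as a sanity check on a parsed header; `PcmFormat` and both functions are hypothetical names, and the relations are assumed to hold only for PCM data whose sample width is a whole number of bytes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirror of the WAVE_FORMAT fields that drive the arithmetic.
struct PcmFormat
{
    std::uint16_t channels;
    std::uint32_t samplesPerSec;
    std::uint16_t bitsPerSample;
};

// Bytes occupied by one sample frame (one sample for every channel).
inline std::uint16_t expected_block_align(const PcmFormat& f)
{
    assert(f.bitsPerSample % 8 == 0);  // whole bytes per sample only
    return static_cast<std::uint16_t>(f.channels * (f.bitsPerSample / 8));
}

// Bytes consumed by one second of audio.
inline std::uint32_t expected_avg_bytes_per_sec(const PcmFormat& f)
{
    return f.samplesPerSec * expected_block_align(f);
}
```

For example, 16-bit stereo at 44100 Hz gives a BlockAlign of 4 and 176400 bytes per second.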

Fact Chunk
    =================================
    |       |  Bytes  |  Content    |
    =================================
    | ID    | 4 Bytes |   'fact'    |
    ---------------------------------
    | Size  | 4 Bytes | value is 4  |
    ---------------------------------
    | data  | 4 Bytes |             |
    ---------------------------------
            Figure 4 Fact Chunk

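    The Fact Chunk is optional. It is generally present when the wav
file was produced by conversion in certain software.
    The structure is defined as follows: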
struct FACT_BLOCK
{
char szFactID[4]; // 'f','a','c','t'
DWORD dwFactSize;
};
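
Because the Fact Chunk is optional and the Format Chunk may be 16 or 18 bytes long, the 'data' chunk does not always start at the same offset. One way to cope is to walk the chunk list using each chunk's Size field. The sketch below assumes the whole file has already been read into memory; `find_data_offset` is a hypothetical helper and is not part of the code later in this article.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Returns the offset of the first sample byte (just past the 'data' chunk's
// ID and Size fields), or 0 if no 'data' chunk is found.
inline std::size_t find_data_offset(const std::vector<unsigned char>& file)
{
    assert(file.size() >= 12);  // 'RIFF' + Size + 'WAVE' must be present
    std::size_t pos = 12;       // first chunk after the RIFF WAVE header
    while (pos + 8 <= file.size()) {
        const std::uint32_t size =
              static_cast<std::uint32_t>(file[pos + 4])
            | (static_cast<std::uint32_t>(file[pos + 5]) << 8)
            | (static_cast<std::uint32_t>(file[pos + 6]) << 16)
            | (static_cast<std::uint32_t>(file[pos + 7]) << 24);
        if (std::memcmp(&file[pos], "data", 4) == 0)
            return pos + 8;
        if (size > file.size() - pos - 8)
            break;                      // declared size runs past the end
        pos += 8 + size + (size & 1u);  // RIFF pads chunks to an even length
    }
    return 0;
}
```

With a 16-byte Format Chunk and no Fact Chunk this returns 44; with an 18-byte Format Chunk followed by a Fact Chunk it returns 58.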

Data Chunk
    =================================
    |       |  Bytes  |  Content    |
    =================================
    | ID    | 4 Bytes |   'data'    |
    ---------------------------------
    | Size  | 4 Bytes |             |
    ---------------------------------
    | data  |         |             |
    ---------------------------------
             Figure 5 Data Chunk

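    The Data Chunk is where the actual wav data is stored, identified
by 'data'. Next comes the size of the data, followed immediately by the
wav data itself. Depending on the number of channels and the bits per
sample given in the Format Chunk, the bytes of the wav data are laid
out in one of the following ways: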
    ---------------------------------------------------------------------
    | Mono      |  Sample 1   |  Sample 2   |  Sample 3   |  Sample 4   |
    |           |--------------------------------------------------------
    | 8-bit     |  Channel 0  |  Channel 0  |  Channel 0  |  Channel 0  |
    ---------------------------------------------------------------------
    | Stereo    |         Sample 1          |         Sample 2          |
    |           |--------------------------------------------------------
    | 8-bit     |  Ch 0 (L)   |  Ch 1 (R)   |  Ch 0 (L)   |  Ch 1 (R)   |
    ---------------------------------------------------------------------
    |           |         Sample 1          |         Sample 2          |
    | Mono      |--------------------------------------------------------
    | 16-bit    |  Channel 0  |  Channel 0  |  Channel 0  |  Channel 0  |
    |           | (low byte)  | (high byte) | (low byte)  | (high byte) |
    ---------------------------------------------------------------------
    |           |                       Sample 1                        |
    | Stereo    |--------------------------------------------------------
    | 16-bit    |  Ch 0 (L)   |  Ch 0 (L)   |  Ch 1 (R)   |  Ch 1 (R)   |
    |           | (low byte)  | (high byte) | (low byte)  | (high byte) |
    ---------------------------------------------------------------------
                    Figure 6 Byte layout of the wav data

    The Data Chunk header structure is defined as follows:
    struct DATA_BLOCK
{
char szDataID[4]; // 'd','a','t','a'
DWORD dwDataSize;
};
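
To make the interleaving in Figure 6 concrete, here is a sketch that splits 16-bit data into one list of samples per channel. It assumes the raw bytes of the Data Chunk are already in a vector; `deinterleave16` is a hypothetical helper, not something defined elsewhere in this article.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Split interleaved 16-bit little-endian PCM bytes into one vector of
// samples per channel. A trailing partial frame, if any, is ignored.
inline std::vector<std::vector<std::int16_t> >
deinterleave16(const std::vector<unsigned char>& data, std::size_t channels)
{
    assert(channels > 0);
    const std::size_t frameBytes = channels * 2;  // 2 bytes per sample
    const std::size_t frames = data.size() / frameBytes;
    std::vector<std::vector<std::int16_t> > out(channels);
    for (std::size_t f = 0; f < frames; ++f) {
        for (std::size_t c = 0; c < channels; ++c) {
            const std::size_t i = f * frameBytes + c * 2;
            // low byte first, then high byte
            const int raw = data[i] | (data[i + 1] << 8);  // 0..65535
            // values of 0x8000 and above represent negative samples
            const int value = raw >= 0x8000 ? raw - 0x10000 : raw;
            out[c].push_back(static_cast<std::int16_t>(value));
        }
    }
    return out;
}
```

With `channels` set to 1 the function simply decodes a mono stream; with 2, index 0 holds the left channel and index 1 the right channel.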

That completes the description of the wav format. The form we use most often is 16-bit mono audio, for which:

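the first 44 bytes are header information, which we can use directly (12 bytes of RIFF WAVE Chunk + 24 bytes of Format Chunk with Size 16 + 8 bytes of Data Chunk header, with no Fact Chunk);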

Another point to note: the wav file is opened in binary mode, and the 16-bit samples in it are stored in two's complement;
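
As an illustration of the two's-complement point, the sketch below decodes one 16-bit sample from its two bytes using plain integer arithmetic. `decode_s16le` is a hypothetical name chosen for this example.

```cpp
#include <cassert>

// Decode a 16-bit two's-complement sample stored low byte first.
inline int decode_s16le(unsigned char low, unsigned char high)
{
    const int raw = (static_cast<int>(high) << 8) | low;  // 0..65535
    assert(raw >= 0 && raw <= 0xFFFF);
    // In two's complement, patterns with the top bit set stand for raw - 65536.
    return raw >= 0x8000 ? raw - 0x10000 : raw;
}
```

The bytes 0xFF 0x7F decode to 32767, 0xFF 0xFF to -1, and 0x00 0x80 to -32768, the most negative value.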

1. First define the structures described above. When working with the file we operate on the stream buffer directly;

#include<cstddef>
#include<cstdint>
#include<cstring>//memcpy
#include<fstream>
#include<string>
#include<sstream>
//data types encountered in wav files
typedef std::uint32_t DWORD;//4 bytes (unsigned long is 8 bytes on some platforms)
typedef std::uint16_t WORD;//2 bytes
//-----------------------------------------------------------------------------------------------------------
//opens an input file
inline std::ifstream& gl_open_file(std::ifstream& in, const std::string& filename,std::ios::openmode mode=std::ios::in)
{
    in.close();
    in.clear();
    in.open(filename.c_str(),mode);
    return in;
}
//opens an output file
inline std::ofstream& gl_open_file(std::ofstream& out, const std::string& filename,std::ios::openmode mode=std::ios::out|std::ios::app)
{
    out.close();
    out.clear();
    out.open(filename.c_str(),mode);
    return out;
}
//-------------------------------------------------------------------------------------------------------------
//converts a two-byte two's-complement value into a signed integer
inline int bytes_complement(unsigned char lowbyte,unsigned char highbyte)
{
    int num=(highbyte<<8)|lowbyte;//0..65535
    return num&0x8000 ? num-0x10000 : num;//0x8000 correctly maps to -32768
}
//--------------------------------------------------------------------------------------------------------------
//header information of the wave data format:
/* a wav file consists of RIFF WAVE Chunk, Format Chunk, Fact Chunk, Data Chunk */
struct Riff_Header
{
public:
    char riff_id[5];  // 'R','I','F','F'
    DWORD riff_size;
    char riff_format[5]; // 'W','A','V','E'
    std::size_t  object_size;//number of bytes this object's data occupies in the file
    //constructs a Riff_Header object from a C string
    Riff_Header(char* buf);
    //constructs the object directly from a file stream
    Riff_Header(std::ifstream& in);
    Riff_Header(){}//default constructor
};
struct Format_Chunk
{
    char format_id[5];
    DWORD format_size;
    WORD format_tag;//encoding format
    WORD channels;//number of channels
    DWORD samplesPerSec;//sampling rate
    DWORD avgBytesPerSec;//bytes per second (byte rate)
    WORD blockAlign;//bytes per sample frame
    WORD bitsPerSample;//bits per sample
    std::size_t object_size;//number of bytes this object's data occupies in the file
    //constructs a Format_Chunk object from a C string
    Format_Chunk(char* buf);
    //constructs the object from a file stream
    Format_Chunk(std::ifstream& in);
    Format_Chunk(){}
};
struct Data_Block
{
    char data_id[5];// 'd','a','t','a'
    DWORD data_size;
    std::size_t object_size;
    Data_Block(char* buf);
    Data_Block(std::ifstream& in);
    Data_Block(){}
};
//the header data of a wav file: everything that describes the file, excluding the samples
class Wave_Header
{
public:
    Riff_Header riff;
    Format_Chunk   fmt;
    Data_Block data;
    std::size_t   object_size;
public:
    Wave_Header(char* buf);
    Wave_Header(std::ifstream& in);
    Wave_Header(){}
};
//------------------------------------------------basic header classes done----------------------------------------//

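Each of the classes above declares a constructor that takes a char pointer. I have not defined it, for two reasons: first, that version of the constructor is not used here; second, its implementation is very simple but tedious. So it is left unimplemented;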

2. Overload the input and output operators:

/*************overloaded input and output operators for each structure**************************************/
//-------------------------------------------------------------------------------------
//input from a wav file; defined here as more general function templates, unformatted input
template<typename charT,typename traits>
std::basic_istream<charT,traits>& operator>>(std::basic_istream<charT,traits>&  strm,Riff_Header& rf)
{
    std::streambuf* strm_buffer=strm.rdbuf();//rdbuf() returns a pointer to the stream buffer
    strm_buffer->sgetn(rf.riff_id,4);
    strm_buffer->sgetn(rf.riff_format,4);//riff_format doubles as a scratch buffer for reading riff_size
    memcpy(&rf.riff_size,rf.riff_format,4);//assumes a little-endian host
    strm_buffer->sgetn(rf.riff_format,4);
    //finishing touches: turn ID and format into C strings so they are easy to output
    rf.riff_id[4]='\0';
    rf.riff_format[4]='\0';
    rf.object_size=12;
    return strm;
}
//formatted output
template<typename charT,typename traits>
std::basic_ostream<charT,traits>& operator<<(std::basic_ostream<charT,traits>&  strm,Riff_Header& rf)
{
    std::basic_ostringstream<charT,traits> s;
    s.copyfmt(strm);
    s.width(0);
    s<<" RIFF WAVE Chunk :"<<'\n'
     <<"ID:"<<rf.riff_id<<'\n'
     <<"TYPE:"<<rf.riff_format<<'\n';
    strm<<s.str();
    return strm;
}
//------------------------------------------------------------------------------------------------------------
template<typename charT,typename traits>
std::basic_istream<charT,traits>& operator>>(std::basic_istream<charT,traits>&  strm,Format_Chunk& fc)
{
    std::streambuf* strm_buffer=strm.rdbuf();
    char tmp_buf[4];//tiny buffer
    strm_buffer->sgetn(fc.format_id,4);
    fc.format_id[4]='\0';
    strm_buffer->sgetn(tmp_buf,4);
    memcpy(&fc.format_size,tmp_buf,4);
    strm_buffer->sgetn(tmp_buf,4);
    memcpy(&fc.format_tag,tmp_buf,2);
    memcpy(&fc.channels,tmp_buf+2,2);
    strm_buffer->sgetn(tmp_buf,4);
    memcpy(&fc.samplesPerSec,tmp_buf,4);
    strm_buffer->sgetn(tmp_buf,4);
    memcpy(&fc.avgBytesPerSec,tmp_buf,4);
    strm_buffer->sgetn(tmp_buf,4);
    memcpy(&fc.blockAlign,tmp_buf,2);
    memcpy(&fc.bitsPerSample,tmp_buf+2,2);
    if(fc.format_size==16)
    {//no extra info
        fc.object_size=24;
    }
    else
    {
        strm_buffer->sgetn(tmp_buf,2);//consume the two extra bytes
        fc.object_size=26;
    }
    return strm;
}
//formatted output
template<typename charT,typename traits>
std::basic_ostream<charT,traits>& operator<<(std::basic_ostream<charT,traits>&  strm,Format_Chunk& fc)
{
    std::basic_ostringstream<charT,traits> s;
    s.copyfmt(strm);
    s.width(0);
    s<<"FORMAT CHUNK:"<<'\n'
     <<"FORMAT ID:"<<fc.format_id<<'\n'
     <<"FormatTag: "<<fc.format_tag<<'\n'
     <<"Channels: "<<fc.channels<<'\n'
     <<"SamplesPerSec:"<<fc.samplesPerSec<<'\n'
     <<"AvgBytesPerSec: "<<fc.avgBytesPerSec<<'\n'
     <<"BlockAlign: "<<fc.blockAlign<<'\n'
     <<"BitsPerSample (bits per sample):"<<fc.bitsPerSample<<'\n';
    strm<<s.str();
    return strm;
}
//-----------------------------------------------------------------------------------------------------------
template<typename charT,typename traits>
std::basic_istream<charT,traits>& operator>>(std::basic_istream<charT,traits>&  strm,Data_Block& db)
{
    std::streambuf* strm_buffer=strm.rdbuf();
    char tmp_buf[4];//tiny buffer
    strm_buffer->sgetn(db.data_id,4);
    db.data_id[4]='\0';
    strm_buffer->sgetn(tmp_buf,4);
    memcpy(&db.data_size,tmp_buf,4);
    db.object_size=8;
    return strm;
}
//formatted output
template<typename charT,typename traits>
std::basic_ostream<charT,traits>& operator<<(std::basic_ostream<charT,traits>&  strm,Data_Block& db)
{
    std::basic_ostringstream<charT,traits> s;
    s.copyfmt(strm);
    s.width(0);
    s<<"DATA BLOCK:"<<'\n'
     <<"DATA ID:"<<db.data_id<<'\n'
     <<"DATA SIZE:"<<db.data_size<<'\n';
    strm<<s.str();
    return strm;
}
//--------------------------------------------------------------------------------------------------------------
template<typename charT,typename traits>
std::basic_istream<charT,traits>& operator>>(std::basic_istream<charT,traits>&  strm,Wave_Header& wh)
{
    strm>>wh.riff>>wh.fmt>>wh.data;//assumes there is no Fact Chunk between fmt and data
    wh.object_size=wh.riff.object_size+wh.fmt.object_size+wh.data.object_size;
    return strm;
}
template<typename charT,typename traits>
std::basic_ostream<charT,traits>& operator<<(std::basic_ostream<charT,traits>&  strm,Wave_Header& wh)
{
    std::basic_ostringstream<charT,traits> s;
    s.copyfmt(strm);
    s.width(0);
    s<<wh.riff<<wh.fmt<<wh.data;
    strm<<s.str();
    return strm;
}
//----------------------------------------------------------------------------------------------------------

The implementation of the output operators follows the book 《C++标准程序库》 (The C++ Standard Library).

3:

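struct Out_Place
{
    std::string place_id;//"vector","file","array"....
    virtual void put(int elem){};
    virtual ~Out_Place(){}
};
//an object representing a complete wav file; each object is associated with one wav file
class Wave_Total
{
public:
    Wave_Header* header;//header information
    Out_Place* out_place;//where the converted content is written: could be an array, a file, or any kind of container
    Wave_Total():header(0),out_place(0){}
    //constructs the object for the wav file with the given name. When the third parameter is true the header is parsed; otherwise the data is read directly
    Wave_Total(std::string& filename,Out_Place* op,bool need_header=true);
    Wave_Total(char* buf);//each object is associated with an array buf
};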

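A note on the design: when I got to this point it occurred to me that the converted data does not have to go to a file. It could be kept in memory, sent to a GUI to draw the waveform directly, or processed on the spot by some function (an FFT, for example). Hence the Out_Place class, which represents the output target: place_id identifies where the output goes, and put defines how each value is written;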
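
To illustrate the idea of interchangeable output targets, here is a sketch of a second target that keeps the samples in memory. `Out_Place` is repeated so the block compiles on its own; `out_to_vector` and `feed` are hypothetical names that are not part of the article's code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Repeated from the article so that this sketch is self-contained.
struct Out_Place
{
    std::string place_id;
    virtual void put(int) {}
    virtual ~Out_Place() {}
};

// Hypothetical target: collects every sample in a vector.
struct out_to_vector : public Out_Place
{
    std::vector<int> samples;
    out_to_vector() { place_id = "vector"; }
    void put(int elem) { samples.push_back(elem); }
};

// Hypothetical driver standing in for the read loop: it only knows the
// Out_Place interface, so any target can be plugged in.
inline void feed(Out_Place& sink, const int* values, std::size_t n)
{
    assert(values != nullptr || n == 0);
    for (std::size_t i = 0; i < n; ++i)
        sink.put(values[i]);
}
```

The reading code never needs to know which concrete target it is writing to, which is the point of routing every sample through `put`.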

4. Constructors:

inline Riff_Header::Riff_Header(std::ifstream& in)
{
    in>>*this;
}
inline Format_Chunk::Format_Chunk(std::ifstream& in)
{
    in>>*this;
}
inline Data_Block::Data_Block(std::ifstream& in)
{
    in>>*this;
}
inline Wave_Header::Wave_Header(std::ifstream& in)
{
    in>>*this;
}

5. Since my task is to write the output to a file, I need an Out_Place associated with a file:

Wave_Total::Wave_Total(std::string& filename,Out_Place* op,bool need_head)
{
    header=0;
    out_place=op;
    std::ifstream in;
    gl_open_file(in,filename,std::ios::binary|std::ios::in);
    if(need_head)
    {//the header information is wanted
        header=new Wave_Header;//never freed in this version
        in>>*header;
    }
    else
    {//skip the header, assuming the usual wav layout where it is 44 bytes
        in.seekg(44,std::ios::beg);
    }
    //main read loop: everything up to the end of the file is treated as 16-bit samples
    std::streambuf* in_buffer=in.rdbuf();
    char tmp_buf[2];//tiny buffer
    while(in_buffer->sgetn(tmp_buf,2)==2)//stop at end of file, or on a trailing odd byte
        out_place->put(bytes_complement(tmp_buf[0],tmp_buf[1]));
    in.close();
}
//this class represents output to a file; the file name is stored in the object
struct out_to_file:public Out_Place
{
    std::ofstream strm;
    std::string outfilename;
    out_to_file(std::string& filename);
    out_to_file();
    void put(int elem);
    ~out_to_file(){strm.close();}
};
out_to_file::out_to_file(std::string& filename)
{
    place_id="file";
    outfilename=filename;
    gl_open_file(strm,filename);//default mode appends to an existing file
}
//put controls how output is written: the format and the value
void out_to_file::put(int elem)
{
    strm.precision(4);
    strm.setf(std::ios::fixed,std::ios::floatfield);
    float tmp;
    if(elem<0)
    {
        tmp=(float)elem/32768;
    }
    else
    {
        tmp=(float)elem/32767;
    }
    strm<<tmp<<'\n';
}
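
The scaling inside `put` maps the 16-bit range onto [-1, 1]: negative samples are divided by 32768 and non-negative ones by 32767, so both extremes land exactly on the ends of the interval. The same rule as a standalone sketch; `normalize_s16` is a hypothetical name used only here.

```cpp
#include <cassert>

// Scale a signed 16-bit sample to the range [-1.0, 1.0], using the same
// asymmetric divisors as out_to_file::put.
inline float normalize_s16(int sample)
{
    assert(sample >= -32768 && sample <= 32767);
    return sample < 0 ? static_cast<float>(sample) / 32768.0f
                      : static_cast<float>(sample) / 32767.0f;
}
```

A side effect of using two divisors is that the mapping is slightly non-linear around zero; dividing everything by 32768 is a common alternative when that matters.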

6. The main part is now done, but we need to process files in batches. The batch operations are defined below:

#include<io.h>
//function pointer taking a file name
typedef void(*PFN)(std::string& filename);
//processes every file in a single folder with pfn
bool transfer(std::string& filename, std::string& postfix , PFN pfn );
//processes every file under a folder (nested folders included) with pfn
void dfsFolder(std::string& folderpath,std::string& postfix,PFN pfn);

/*
struct _finddata_t
{
    unsigned attrib;     //file attributes
    time_t time_create;  //creation time
    time_t time_access;  //last access time
    time_t time_write;   //last modification time
    _fsize_t size;  //file size in bytes
    char  name[260]; //file name
};
*/
#include"read.h"
#include<cstdlib>//exit
#include<cstring>//strcmp
#include<iostream>
//processes every file in a single folder
bool transfer(std::string& directoryname ,std::string& postfix ,PFN pfn)
{
    _finddata_t fileInfo;
    //finds the first entry matching the given pattern; returns a handle on success, -1 otherwise
    intptr_t handle = _findfirst(directoryname.c_str(), &fileInfo);
    std::string pathname=directoryname.substr(0,directoryname.size()-postfix.size()-1);
    if (handle == -1)
    {
        std::cerr << "failed to transfer files" << std::endl;
        return false;
    }
    do
    {
        std::string filepath=pathname+std::string(fileInfo.name);
        pfn(filepath);
    } while (_findnext(handle, &fileInfo) == 0);//_findnext finds the next entry matching the pattern given to _findfirst; returns 0 on success, -1 otherwise
    _findclose(handle);
    return true;
}
//a directory that itself contains several folders
void dfsFolder(std::string& folderPath,std::string& postfix,PFN pfn)
{
    _finddata_t FileInfo;
    std::string strfind = folderPath + "\\*";
    intptr_t Handle = _findfirst(strfind.c_str(), &FileInfo);
    if (Handle == -1)
    {
        std::cerr << "can not match the folder path" << std::endl;
        exit(-1);
    }
    do
    {
        //is this entry a subdirectory?
        if (FileInfo.attrib & _A_SUBDIR)
        {
            //this check matters: skip "." and ".." to avoid infinite recursion
            if( (strcmp(FileInfo.name,".") != 0 ) &&(strcmp(FileInfo.name,"..") != 0))
            {
                std::string newPath = folderPath + "\\" + FileInfo.name;
                dfsFolder(newPath,postfix,pfn);
            }
        }
        else
        {
            //only process files whose name ends with postfix
            std::string name(FileInfo.name);
            if(name.size()>=postfix.size() && name.compare(name.size()-postfix.size(),postfix.size(),postfix)==0)
            {
                std::string filepath=folderPath+"\\"+name;
                pfn(filepath);
            }
        }
    }while (_findnext(Handle, &FileInfo) == 0);
    _findclose(Handle);
}
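
The traversal above relies on `_findfirst` and `_findnext` from `<io.h>`, which belong to the Microsoft C runtime. As a portable alternative (not the approach this article uses), the same recursive walk can be sketched with C++17 `std::filesystem`. `has_suffix` and `collect_files` are hypothetical names; the sketch returns the matching paths instead of calling a function pointer.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <vector>

// True if name ends with postfix (case-sensitive).
inline bool has_suffix(const std::string& name, const std::string& postfix)
{
    return name.size() >= postfix.size()
        && name.compare(name.size() - postfix.size(), postfix.size(), postfix) == 0;
}

// Collect the paths of all regular files under folder (nested folders
// included) whose file name ends with postfix.
inline std::vector<std::string>
collect_files(const std::string& folder, const std::string& postfix)
{
    assert(!postfix.empty());
    namespace fs = std::filesystem;
    std::vector<std::string> result;
    if (!fs::is_directory(folder))
        return result;  // nothing to walk
    for (const fs::directory_entry& entry : fs::recursive_directory_iterator(folder)) {
        if (entry.is_regular_file()
            && has_suffix(entry.path().filename().string(), postfix))
            result.push_back(entry.path().string());
    }
    return result;
}
```

Each returned path could then be handed to the conversion function in a plain loop. The iterator skips "." and ".." by itself, so the explicit check needed with `_findfirst` disappears.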

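Finally, we define the function that is passed as the function pointer and called for every file. What we need it to do is convert a wav file to txt and save the result;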

//takes a file name and converts the file into a txt file of the same name
void wav_to_txt(std::string& filename)
{
    //the txt file is written next to the wav file, with the same base name
    std::string outfilename=filename.substr(0,filename.size()-3);//strip "wav"
    outfilename=outfilename+"txt";
    out_to_file out(outfilename);
    Wave_Total wt(filename,&out);
}
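
Stripping the last three characters works as long as every input name really ends in "wav". A slightly more defensive way to derive the output name is sketched below; `txt_name_for` is a hypothetical helper and is not used by the code in this article.

```cpp
#include <cassert>
#include <string>

// Replace the extension of a file name with ".txt". If the name has no
// extension, ".txt" is appended instead.
inline std::string txt_name_for(const std::string& wavName)
{
    assert(!wavName.empty());
    const std::string::size_type dot = wavName.rfind('.');
    const std::string::size_type sep = wavName.find_last_of("/\\");
    // A dot only marks an extension if it comes after the last path separator.
    if (dot == std::string::npos || (sep != std::string::npos && dot < sep))
        return wavName + ".txt";
    return wavName.substr(0, dot) + ".txt";
}
```

This handles upper-case extensions such as ".WAV" and folder names that contain a dot.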

Using the functions:

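//simple version
int main()
{
    std::string folder("E:\\wav2");
    std::string postfix(".wav");
    dfsFolder(folder,postfix,wav_to_txt);
}

//more elaborate version (an alternative to the main above; needs <iostream> and <cstdio>)
int main()
{
    std::cout<<"Enter f to convert a single file, or d to convert a directory: "<<std::endl;
    char c;
    std::cin>>c;
    std::string postfix(".wav");
    switch(c)
    {
    case 'd':
        {
            std::cout<<"Please enter your directory:  ";
            std::string directoryname;
            std::cin>>directoryname;
            dfsFolder(directoryname,postfix,wav_to_txt);
            break;
        }
    case 'f':
        {
            std::cout<<"Please enter your file name:  ";
            std::string filename;
            std::cin>>filename;
            wav_to_txt(filename);
            break;
        }
    default:
        std::cout<<"invalid input!"<<std::endl;
        break;
    }
    std::cout<<"success!";
    char c2=getchar();
}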

OK, the task is complete.

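For readers who are not familiar with two's complement, Baidu Wenku has reference material on the sign-magnitude, ones' complement and two's complement binary representations.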


http://www.taodudu.cc/news/show-3389041.html