Preface
A while back, one of our company's projects needed spoken announcements for pushed content. I was asked to do a technical investigation and had a quick look at related articles and material, but halfway through a higher-priority requirement came in and the work was shelved. Things quieted down these past few days, so I picked it back up and arrived at a workable solution.
The investigation showed that this requirement relies on the Notification Service Extension. There are articles online covering it, but most describe approaches that no longer work, or only sketch a rough idea without a concrete implementation, so I'm writing up and sharing how I actually implemented it.
Notification Service Extension 
Notification Service Extension is available only on iOS 10 and later. To use it to modify notification content, the push payload must include the mutable-content field set to 1; the system then hands the push to the extension for processing before displaying it.
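For reference, a payload along these lines would trigger the extension (a sketch only; voiceType is a custom key that this post's later examples read, not an APNs field):

```json
{
  "aps": {
    "alert": {
      "title": "Payment notice",
      "body": "Received 100 yuan"
    },
    "sound": "default",
    "mutable-content": 1
  },
  "voiceType": "payment"
}
```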
After you add the extension target in Xcode, a new target and a matching folder are generated in the project; our code goes in NotificationService.m.
NotificationService.m contains two methods. We modify the notification content in this one:
```objc
- (void)didReceiveNotificationRequest:(UNNotificationRequest *)request
                   withContentHandler:(void (^)(UNNotificationContent *contentToDeliver))contentHandler;
```
And this one is for last-resort handling before the extension is terminated:
```objc
- (void)serviceExtensionTimeWillExpire;
```
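If you do nothing here, the system shows the original push unmodified once time runs out. The Xcode template's implementation simply delivers the best attempt prepared so far, which is a sensible default:

```objc
- (void)serviceExtensionTimeWillExpire {
    // Called shortly before the extension is killed by the system.
    // Deliver whatever modified content we managed to build in time;
    // otherwise the notification is shown as originally pushed.
    self.contentHandler(self.bestAttemptContent);
}
```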
With that, the Notification Service Extension is in place.
Exploring playback
System speech synthesis
The system ships with a text-to-speech API, so after receiving a push we can pass in the text and speak it directly:
```objc
// requires: #import <AVFoundation/AVFoundation.h>
- (void)didReceiveNotificationRequest:(UNNotificationRequest *)request
                   withContentHandler:(void (^)(UNNotificationContent *_Nonnull))contentHandler {
    self.contentHandler = contentHandler;
    self.bestAttemptContent = [request.content mutableCopy];

    self.bestAttemptContent.title = [NSString stringWithFormat:@"%@ [modified]", self.bestAttemptContent.title];

    // Speak the push body with the system TTS engine
    NSString *content = self.bestAttemptContent.userInfo[@"aps"][@"alert"][@"body"];
    AVSpeechUtterance *utterance = [AVSpeechUtterance speechUtteranceWithString:content];
    AVSpeechSynthesisVoice *voice = [AVSpeechSynthesisVoice voiceWithLanguage:@"zh-CN"];
    utterance.voice = voice;
    AVSpeechSynthesizer *synth = [[AVSpeechSynthesizer alloc] init];
    [synth speakUtterance:utterance];

    self.contentHandler(self.bestAttemptContent);
}
```
Most articles online take this route too, but in practice it's rarely usable now, for two reasons:

- Since iOS 12.1 the system restricts audio playback inside extensions, so this approach only works before iOS 12.1.
- The voice sounds stiff, and polyphonic characters and English letters are often misread in a Chinese context; the letter E, for instance, comes out as "额" (the same code even produced different pronunciations on different devices, and I never found out why). Third-party services just sound better >_<!
 
Bundled local audio
The next thought: if we can't synthesize speech, what about playing local audio?
```objc
- (void)didReceiveNotificationRequest:(UNNotificationRequest *)request
                   withContentHandler:(void (^)(UNNotificationContent *_Nonnull))contentHandler {
    self.contentHandler = contentHandler;
    self.bestAttemptContent = [request.content mutableCopy];

    self.bestAttemptContent.title = [NSString stringWithFormat:@"%@ [modified]", self.bestAttemptContent.title];

    // Pick a bundled audio file based on a custom field in the payload
    NSString *voiceType = self.bestAttemptContent.userInfo[@"voiceType"];
    UNNotificationSound *sound = [UNNotificationSound soundNamed:[NSString stringWithFormat:@"%@.mp3", voiceType]];
    self.bestAttemptContent.sound = sound;
    self.contentHandler(self.bestAttemptContent);
}
```
The extension class has a bestAttemptContent property of type UNMutableNotificationContent; it's the object we mutate when changing the push content, and playing local audio is simply a matter of setting its sound property. At this point, though, the product manager weighed in: this was too rigid, playing only fixed clips wasn't flexible enough 😂. Nothing for it but to keep digging.
So the next idea: split the announcement up, bundle a few clips locally, splice them according to the push content, then set sound to the result and play it. I grabbed two short clips, bundled them into the project, and tested:
```objc
- (void)audioMergeClick {
    NSString *audioPath1 = [[NSBundle mainBundle] pathForResource:@"1" ofType:@"mp3"];
    NSString *audioPath2 = [[NSBundle mainBundle] pathForResource:@"2" ofType:@"mp3"];
    AVURLAsset *audioAsset1 = [AVURLAsset assetWithURL:[NSURL fileURLWithPath:audioPath1]];
    AVURLAsset *audioAsset2 = [AVURLAsset assetWithURL:[NSURL fileURLWithPath:audioPath2]];

    AVMutableComposition *composition = [AVMutableComposition composition];

    AVMutableCompositionTrack *audioTrack1 = [composition addMutableTrackWithMediaType:AVMediaTypeAudio preferredTrackID:0];
    AVMutableCompositionTrack *audioTrack2 = [composition addMutableTrackWithMediaType:AVMediaTypeAudio preferredTrackID:0];

    AVAssetTrack *audioAssetTrack1 = [[audioAsset1 tracksWithMediaType:AVMediaTypeAudio] firstObject];
    AVAssetTrack *audioAssetTrack2 = [[audioAsset2 tracksWithMediaType:AVMediaTypeAudio] firstObject];

    // Insert the second clip right after the first one
    [audioTrack1 insertTimeRange:CMTimeRangeMake(kCMTimeZero, audioAsset1.duration) ofTrack:audioAssetTrack1 atTime:kCMTimeZero error:nil];
    [audioTrack2 insertTimeRange:CMTimeRangeMake(kCMTimeZero, audioAsset2.duration) ofTrack:audioAssetTrack2 atTime:audioAsset1.duration error:nil];

    AVAssetExportSession *session = [[AVAssetExportSession alloc] initWithAsset:composition presetName:AVAssetExportPresetAppleM4A];

    NSString *outPutFilePath = [[self.filePath stringByDeletingLastPathComponent] stringByAppendingPathComponent:@"test.m4a"];

    if ([[NSFileManager defaultManager] fileExistsAtPath:outPutFilePath]) {
        [[NSFileManager defaultManager] removeItemAtPath:outPutFilePath error:nil];
    }

    NSLog(@"---%@", [session supportedFileTypes]);
    session.outputURL = [NSURL fileURLWithPath:outPutFilePath]; // must match the path logged on success below
    session.outputFileType = AVFileTypeAppleM4A;
    session.shouldOptimizeForNetworkUse = YES;

    [session exportAsynchronouslyWithCompletionHandler:^{
        if (session.status == AVAssetExportSessionStatusCompleted) {
            NSLog(@"Merge succeeded----%@", outPutFilePath);
            UNNotificationSound *sound = [UNNotificationSound soundNamed:@"test.m4a"];
            self.bestAttemptContent.sound = sound;
            self.contentHandler(self.bestAttemptContent);
        } else {
            self.contentHandler(self.bestAttemptContent);
        }
    }];
}
```
And then... the push still played the default sound 😳. Time to hunt for the cause. At first I suspected the file format, so I converted the m4a to mp3 (code omitted); still no luck. Eventually I found an article (thanks!) explaining that sound can't load local audio from arbitrary paths; lookup follows a priority order:
- the main app container's Library/Sounds folder
- the Library/Sounds folder in an App Groups shared container
- the main bundle
 
Going down that list, I started with the first option and printed every directory under the app sandbox's Library. Library now contains a Sounds directory; I created it myself, it wasn't there before. Then I changed the merged file's save path to match and ran it again!
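A minimal sketch of that step, assuming the standard sandbox paths (error handling omitted):

```objc
// App container: <sandbox>/Library
NSString *libraryPath = [NSSearchPathForDirectoriesInDomains(NSLibraryDirectory,
                                                             NSUserDomainMask, YES) firstObject];
NSString *soundsPath = [libraryPath stringByAppendingPathComponent:@"Sounds"];

// Create Library/Sounds, since a fresh sandbox does not have it
if (![[NSFileManager defaultManager] fileExistsAtPath:soundsPath]) {
    [[NSFileManager defaultManager] createDirectoryAtPath:soundsPath
                              withIntermediateDirectories:YES
                                               attributes:nil
                                                    error:nil];
}

// Print the contents of Library to verify the directory is there
NSLog(@"%@", [[NSFileManager defaultManager] contentsOfDirectoryAtPath:libraryPath error:nil]);
```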
 
Hmm. Still the system default sound. I couldn't track down the cause quickly, so I moved straight on to the second priority.
Because of the sandbox, an iOS app can normally access only its own directories; App Groups is Apple's mechanism for sharing resources among apps under the same developer account, available since iOS 8. We use enterprise signing, and thanks to company org structure and permissions it took a full day just to get one simple App Group configured 😭. I'll skip the creation and configuration steps here; there is plenty of material on that online.
Finally, I changed the export path of the audio file above to Library/Sounds inside the App Group container:
```objc
[session exportAsynchronouslyWithCompletionHandler:^{
    if (session.status == AVAssetExportSessionStatusCompleted) {
        NSLog(@"Merge succeeded----%@", outPutFilePath);
        UNNotificationSound *sound = [UNNotificationSound soundNamed:@"test.m4a"];
        self.bestAttemptContent.sound = sound;
        self.contentHandler(self.bestAttemptContent);
    } else {
        self.contentHandler(self.bestAttemptContent);
    }
}];
```
Remember to check that the folder exists; here is the App Groups part in brief:
```objc
NSURL *groupURL = [[NSFileManager defaultManager] containerURLForSecurityApplicationGroupIdentifier:kGroupDefaultSuiteName];
NSURL *sounds = [groupURL URLByAppendingPathComponent:@"/Library/Sounds/" isDirectory:YES];
// contentsOfDirectoryAtPath: returns nil if the directory does not exist yet
if (![[NSFileManager defaultManager] contentsOfDirectoryAtPath:sounds.path error:nil]) {
    [[NSFileManager defaultManager] createDirectoryAtPath:sounds.path withIntermediateDirectories:YES attributes:nil error:nil];
}
```
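Continuing from that snippet, the export session from earlier just needs its output pointed into the shared directory (a sketch; `sounds` is the directory URL built above):

```objc
// Export the merged audio into the App Group's Library/Sounds
NSURL *outputURL = [sounds URLByAppendingPathComponent:@"test.m4a" isDirectory:NO];
if ([[NSFileManager defaultManager] fileExistsAtPath:outputURL.path]) {
    [[NSFileManager defaultManager] removeItemAtPath:outputURL.path error:nil];
}
session.outputURL = outputURL;
session.outputFileType = AVFileTypeAppleM4A;
```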
At last: send a push, and the announcement plays. The product manager allowed that this would do but still wasn't thrilled, so the exploration continued.
The final solution
First, a summary of why the earlier approaches fell short:
- Direct text-to-speech playback: the system restricts playback in extensions from iOS 12.1 on.
- Fixed audio: too rigid; the product manager wasn't satisfied.
- Splitting fixed clips and splicing them for playback: same objection as above.
 
Given that, could we combine the strengths of the approaches above: convert the text to speech, save the resulting audio file locally, and then play that file? A pointer from another expert (thanks!) plus some experimentation confirmed that this works ☺️.
I went through the AVSpeechSynthesizer documentation and found no method for exporting to an audio file (maybe I missed it; if you find one, tell me), so I turned to third-party SDKs. Both Baidu's and iFlytek's offline synthesis can produce an audio file. I ended up with iFlytek, because Baidu only offers two device IDs for testing while iFlytek offers ten (how generous).
See the iFlytek documentation for details; I won't rehash registration and key application here, but if you plan to use a third-party service, read the docs first!!!
1. Initialize the offline synthesis engine
```objc
[IFlySetting setLogFile:LVL_ALL];
[IFlySetting showLogcat:YES];
NSString *initString = [[NSString alloc] initWithFormat:@"appid=%@", @"your appid"];
[IFlySpeechUtility createUtility:initString];
```
2. Set the parameters
```objc
_iFlySpeechSynthesizer = [IFlySpeechSynthesizer sharedInstance];
_iFlySpeechSynthesizer.delegate = self;
[[IFlySpeechUtility getUtility] setParameter:@"tts" forKey:[IFlyResourceUtil ENGINE_START]];

// Use the local (offline) engine
[_iFlySpeechSynthesizer setParameter:[IFlySpeechConstant TYPE_LOCAL] forKey:[IFlySpeechConstant ENGINE_TYPE]];

// Voice to synthesize with
[_iFlySpeechSynthesizer setParameter:@"xiaoyan" forKey:[IFlySpeechConstant VOICE_NAME]];

// Offline engine resources bundled with the app
NSString *resPath = [[NSBundle mainBundle] pathForResource:@"common" ofType:@"jet"];
NSString *resPath1 = [[NSBundle mainBundle] pathForResource:@"xiaoyan" ofType:@"jet"];
NSString *vcnResPath = [[NSString alloc] initWithFormat:@"%@;%@", resPath, resPath1];

[_iFlySpeechSynthesizer setParameter:vcnResPath forKey:@"tts_res_path"];
[_iFlySpeechSynthesizer synthesize:content toUri:[self pcmPath]];
```
The -(void)synthesize:(NSString *)text toUri:(NSString *)uri method is what performs the offline synthesis and saves the audio locally; its two parameters are the text to speak and the path where the audio file should be stored.
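The pcmPath helper isn't shown here; a hypothetical version that keeps the PCM next to where the final mp3 will live might look like this (kGroupDefaultSuiteName is the project's App Group identifier constant used elsewhere in this post):

```objc
// Hypothetical helper: destination for the offline engine's PCM output
- (NSString *)pcmPath {
    NSURL *groupURL = [[NSFileManager defaultManager]
        containerURLForSecurityApplicationGroupIdentifier:kGroupDefaultSuiteName];
    return [groupURL URLByAppendingPathComponent:@"Library/Sounds/voice.pcm" isDirectory:NO].path;
}
```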
3. Fetch the local audio
One catch: the offline-synthesized audio comes out as PCM; this is true of Baidu as well as iFlytek. PCM can't be handed to sound directly, so it has to be converted to mp3 before playback. Here is a PCM-to-mp3 routine; it depends on the lame library:
```objc
- (BOOL)convertPcm:(NSString *)pcmPath toMp3:(NSString *)mp3Path {
    @try {
        FILE *fpcm = fopen([pcmPath cStringUsingEncoding:NSASCIIStringEncoding], "rb");
        if (fpcm == NULL) {
            return NO;
        }
        FILE *fmp3 = fopen([mp3Path cStringUsingEncoding:NSASCIIStringEncoding], "wb");

        int channelCount = 1; // the offline engine outputs mono PCM
        lame_t lame = lame_init();
        lame_set_in_samplerate(lame, 16000); // 16 kHz PCM input
        lame_set_num_channels(lame, channelCount);
        lame_set_VBR(lame, vbr_default);
        lame_set_quality(lame, 2);
        lame_init_params(lame);

        const int PCM_SIZE = 8192;
        const int MP3_SIZE = 8192;
        short int pcm_buffer[PCM_SIZE * channelCount];
        unsigned char mp3_buffer[MP3_SIZE];
        int read;
        int write;

        do {
            read = (int)fread(pcm_buffer, channelCount * sizeof(short int), PCM_SIZE, fpcm);
            if (read == 0) {
                // Input exhausted: flush the encoder's remaining frames
                write = lame_encode_flush(lame, mp3_buffer, MP3_SIZE);
            } else {
                if (channelCount == 1) {
                    write = lame_encode_buffer(lame, pcm_buffer, NULL, read, mp3_buffer, MP3_SIZE);
                } else {
                    write = lame_encode_buffer_interleaved(lame, pcm_buffer, read, mp3_buffer, MP3_SIZE);
                }
            }
            fwrite(mp3_buffer, write, 1, fmp3);
        } while (read != 0);

        lame_mp3_tags_fid(lame, fmp3);
        lame_close(lame);
        fclose(fmp3);
        fclose(fpcm);
        return YES;
    } @catch (NSException *exception) {
        NSLog(@"catch exception, %@", exception);
        return NO;
    }
}
```
Finally, do the conversion and the playback in the synthesis-completed callback:
```objc
- (void)onCompleted:(IFlySpeechError *)error {
    if (error.errorCode == 0) {
        NSURL *groupURL = [[NSFileManager defaultManager] containerURLForSecurityApplicationGroupIdentifier:kGroupDefaultSuiteName];
        NSURL *sounds = [groupURL URLByAppendingPathComponent:@"/Library/Sounds/" isDirectory:YES];
        if (![[NSFileManager defaultManager] contentsOfDirectoryAtPath:sounds.path error:nil]) {
            [[NSFileManager defaultManager] createDirectoryAtPath:sounds.path withIntermediateDirectories:YES attributes:nil error:nil];
        }
        NSURL *mp3Path = [groupURL URLByAppendingPathComponent:@"Library/Sounds/voice.mp3" isDirectory:NO];
        BOOL result = [self convertPcm:[self pcmPath] toMp3:mp3Path.path];
        if (result) {
            if (@available(iOS 12.1, *)) {
                // iOS 12.1+: let the notification itself play the synthesized file
                UNNotificationSound *sound = [UNNotificationSound soundNamed:@"voice.mp3"];
                self.bestAttemptContent.sound = sound;
                self.contentHandler(self.bestAttemptContent);
            } else {
                // iOS 10-12: the extension can't read App Group sounds, play directly
                _player = [[AVAudioPlayer alloc] initWithContentsOfURL:mp3Path error:nil];
                [_player play];
                self.contentHandler(self.bestAttemptContent);
            }
        } else {
            self.contentHandler(self.bestAttemptContent);
        }
    } else {
        self.contentHandler(self.bestAttemptContent);
    }
}
```
The OS-version branch is still there because real-device testing showed that on iOS 11, assigning the synthesized audio to sound still played the default tone. It turns out others have hit the same problem: on iOS 10 through iOS 12 the notification extension cannot read audio files from the App Groups container. I only had an iOS 15 test device at first, so I didn't notice; in the end, lower versions fall back to playing the synthesized audio with AVAudioPlayer.
Implementation summary
- Create a Notification Service Extension to make final modifications to the push.
- Add an App Group.
- Use offline synthesis to turn the text to be announced into an audio file and store it under /Library/Sounds in the App Group container.
- Set bestAttemptContent's sound to the stored audio file.
 
Getting this working took some time, but looking back afterwards, it really is just a few steps 😂😂😂
Caveats
- The extension runs as a separate process, so the offline synthesis engine must be started inside the extension.
- After the extension starts you have only about 30 s to work with; on timeout the default sound plays.
- The push payload must include the mutable-content field set to 1.
 
References
iOS小技能:消息推送扩展的使用 
iOS13微信收款到账语音提醒开发总结