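Browsing the Hassbian (瀚思彼岸) forum lately, I noticed there are quite a few scenarios where TTS comes in handy. I happen to own a Sony SRS-X77 speaker (bought on brand faith), so I tried hooking it up and hit a problem: HA discovers it automatically but cannot play anything through it. Further testing showed that Kodi, with its DLNA service enabled, works fine when called from HA, and that Windows Media Player can drive the X77 fine as well. So began a long round of tinkering: reading the plugin code, learning the DLNA protocol, and comparing packet captures. In the end it is more or less solved, though not perfectly.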
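1. Preparation
Tips
The dlna plugin offers a "type text, play speech" feature; it relies on the tts plugin to do so.
2. Process
2.1 Fix state detection failure
- Modify {[HA config directory]}/custom_components/dlna.py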
# Minor changes; each change is marked with a comment
# ... omitted ...
def register_device(self, location_url):
    # ... omitted ...
    try:
        SCPDURLpath = info.find(
            "./device/serviceList/service/[serviceType='{}']/SCPDURL".format(
                UPNP_DEFAULT_SERVICE_TYPE
            )
        ).text  # added ".text"
    except Exception:
        # find() returns None when the element is missing, so .text fails
        SCPDURLpath = ''
    try:
        # Per the X77's description XML, the original "eventSubURLpath"
        # in the path below was changed to "eventSubURL"
        eventSubURLpath = info.find(
            "./device/serviceList/service/[serviceType='{}']/eventSubURL".format(
                UPNP_DEFAULT_SERVICE_TYPE
            )
        ).text  # added ".text"
    except Exception:
        eventSubURLpath = ''
    # ... omitted ...
- {[HA config directory]}/custom_components/media_player/dlna.py
# Minor changes; each change is marked with a comment
# ... omitted ...
def soapRequest(self, location, service, fnName, fnParams):
    bodyString = '<?xml version="1.0"?>'
    # Changed: the s:encodingStyle attribute was moved up from <s:Body> to <s:Envelope>
    bodyString += '<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/" s:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">'
    bodyString += ' <s:Body>'
    bodyString += ' <u:' + fnName + ' xmlns:u="' + service + '">'
    bodyString += ' ' + fnParams
    bodyString += ' </u:' + fnName + '>'
    bodyString += ' </s:Body>'
    bodyString += '</s:Envelope>'
    headers = {
        'Content-Type': 'text/xml; charset="utf-8"',
        'Cache-Control': 'no-cache',
        'Connection': 'Close',
        'Pragma': 'no-cache',
        'Accept': 'text/xml',
        'User-Agent': 'UPnP/1.0',
        'SOAPAction': '"' + service + '#' + fnName + '"'
    }
    try:
        res = requests.post(location, data=bodyString.encode('utf-8'), headers=headers, timeout=10)
        res.encoding = 'utf-8'
    except Exception as e:
        _LOGGER.error("send Request Error:{}".format(e))
        return None
    return res
# ... omitted ...
2.2 Fix TTS failing to play
- {[HA config directory]}/custom_components/media_player/dlna.py
# ... omitted ...
def play_media(self, media_type, media_id, **kwargs):
    """Send play_media command."""
    self._device.Stop()  # added: call Stop first
    # added: media_data metadata sent along with the media URI
    media_data = (
        '<DIDL-Lite xmlns="urn:schemas-upnp-org:metadata-1-0/DIDL-Lite/" xmlns:upnp="urn:schemas-upnp-org:metadata-1-0/upnp/" xmlns:dc="http://purl.org/dc/elements/1.1/">'
        '<item id="f-0" parentID="0" restricted="0">'
        '<dc:title>Audio</dc:title>'
        '<dc:creator>Anonymous</dc:creator>'
        '<upnp:class>object.item.audioItem.musicTrack</upnp:class>'
        '<res duration="0:00:00.500" bitrate="2000" bitsPerSample="16" sampleFrequency="16000" nrAudioChannels="1" protocolInfo="http-get:*:audio/mpeg:DLNA.ORG_PN=MP3;DLNA.ORG_OP=01;DLNA.ORG_FLAGS=01500000000000000000000000000000">{}</res>'
        '</item>'
        '</DIDL-Lite>'
    )
    # added: only start playback if setting the media URI succeeded
    if self._device.SetAVTransportURI(media_id, media_data.format(media_id))['status'] == 200:
        self._device.Play()
# ... omitted ...
Parameters
The parameters were taken from a packet capture of Windows Media Player calling the X77's DLNA service.
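One detail worth keeping in mind when reproducing such a capture: in a SetAVTransportURI action the DIDL-Lite metadata travels as a string argument (CurrentURIMetaData), so it has to be XML-escaped before it is placed inside the SOAP body. The sketch below only illustrates that step. The helper name build_set_uri_params, the shortened DIDL template, and the example URL are made up for illustration; whether the dlna plugin already escapes the metadata internally is not something this post confirms.

```python
from xml.sax.saxutils import escape

# Shortened, hypothetical DIDL-Lite template (fewer <res> attributes than the capture).
DIDL_TEMPLATE = (
    '<DIDL-Lite xmlns="urn:schemas-upnp-org:metadata-1-0/DIDL-Lite/" '
    'xmlns:upnp="urn:schemas-upnp-org:metadata-1-0/upnp/" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/">'
    '<item id="f-0" parentID="0" restricted="0">'
    '<dc:title>Audio</dc:title>'
    '<upnp:class>object.item.audioItem.musicTrack</upnp:class>'
    '<res protocolInfo="http-get:*:audio/mpeg:DLNA.ORG_PN=MP3">{url}</res>'
    '</item></DIDL-Lite>'
)


def build_set_uri_params(media_url):
    """Return the argument XML for a SetAVTransportURI action (hypothetical helper)."""
    # The URL is escaped once because it sits inside the DIDL-Lite XML ...
    didl = DIDL_TEMPLATE.format(url=escape(media_url))
    # ... and the whole DIDL document is escaped again because it is carried
    # as plain text inside <CurrentURIMetaData>.
    return (
        '<InstanceID>0</InstanceID>'
        '<CurrentURI>{uri}</CurrentURI>'
        '<CurrentURIMetaData>{meta}</CurrentURIMetaData>'
    ).format(uri=escape(media_url), meta=escape(didl))


if __name__ == "__main__":
    print(build_set_uri_params("http://ha.local:8123/api/tts_proxy/demo.mp3"))
```

The returned string is what would be passed as fnParams to a function like soapRequest, with fnName set to SetAVTransportURI.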
2.3 Fix incomplete TTS playback (it cut off at the last two characters)
- {[HA install directory]}/components/http/view.py
# Minor changes; each change is marked with a comment
class HomeAssistantView(object):
    # ... omitted ...
    def register(self, router):
        """Register the view with a router."""
        assert self.url is not None, 'No url set for view'
        urls = [self.url] + self.extra_urls
        for method in ('get', 'post', 'delete', 'put', 'head'):  # added 'head'
            handler = getattr(self, method, None)
            if not handler:
                continue
            handler = request_handler_factory(self, handler)
            for url in urls:
                router.add_route(method, url, handler)
    # ... omitted ...
- {[HA install directory]}/components/tts/__init__.py
# Added a head method; GET now also answers Range (resumable) requests.
# Note: this code uses os and dt_util, so the module needs
# "import os" and "import homeassistant.util.dt as dt_util" if they are not already imported.
class TextToSpeechView(HomeAssistantView):
    """TTS view to serve a speech audio."""
    requires_auth = False
    url = '/api/tts_proxy/{filename}'
    name = 'api:tts:speech'

    def __init__(self, tts):
        """Initialize a tts view."""
        self.tts = tts

    def get(self, request, filename):
        """Start a get request."""
        try:
            content, data = yield from self.tts.async_read_tts(filename)
        except HomeAssistantError as err:
            _LOGGER.error("Error on load tts: %s", err)
            return web.Response(status=404)
        # Old: return web.Response(body=data, content_type=content)
        # Everything from here on is new
        headers = {
            'Accept-Ranges': 'bytes',
            'TransferMode.DLNA.ORG': 'Streaming',
            'ContentFeatures.DLNA.ORG': 'DLNA.ORG_PN=MP3;DLNA.ORG_OP=01;DLNA.ORG_FLAGS=01500000000000000000000000000000'
        }
        # Workaround as written: abort GET requests from the X77 that carry no Range header
        if request.headers.get('X-AV-Physical-Unit-Info') == 'pa="SRS-X77"' and request.headers.get('Range') is None:
            exit()
        return web.FileResponse(headers=headers, path=os.path.join(self.tts.cache_dir, filename))

    def head(self, request, filename):
        """Start a head request."""
        try:
            content, data = yield from self.tts.async_read_tts(filename)
        except HomeAssistantError as err:
            _LOGGER.error("Error on load tts: %s", err)
            return web.Response(status=404)
        GMT_FORMAT = '%a, %d %b %Y %H:%M:%S GMT'
        headers = {
            'Accept-Ranges': 'bytes',
            'TransferMode.DLNA.ORG': 'Streaming',
            'Last-Modified': dt_util.utcnow().strftime(GMT_FORMAT),
            'ContentFeatures.DLNA.ORG': 'DLNA.ORG_PN=MP3;DLNA.ORG_OP=01;DLNA.ORG_FLAGS=01500000000000000000000000000000'
        }
        return web.Response(headers=headers, body=data, content_type=content)
Tips
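The packet capture showed that the X77 fetches the TTS audio file with Range (resumable-download) GET requests, which is why the response was switched to web.FileResponse.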
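For readers unfamiliar with Range requests, the following is a minimal sketch of what the server has to work out when a client sends a header such as "Range: bytes=500-": which byte span to return and which headers go with the 206 response. As far as I know, web.FileResponse does this internally, so none of this needs to be added to HA; the function names parse_range and partial_headers are invented for illustration, and only the single-range form is handled.

```python
def parse_range(header, size):
    """Return (start, end), inclusive, for a single 'bytes=' range, or None.

    Handles 'bytes=A-B', 'bytes=A-' and the suffix form 'bytes=-N'.
    Multi-range headers and malformed values yield None.
    """
    if not header or not header.startswith("bytes=") or "," in header:
        return None
    first, sep, last = header[len("bytes="):].partition("-")
    if not sep:
        return None
    try:
        if first == "":
            # Suffix form: the last N bytes of the file.
            length = int(last)
            if length <= 0:
                return None
            start, end = max(size - length, 0), size - 1
        else:
            start = int(first)
            end = int(last) if last else size - 1
    except ValueError:
        return None
    end = min(end, size - 1)  # clamp to the end of the file
    if start < 0 or start > end:
        return None
    return start, end


def partial_headers(start, end, size):
    """Headers that accompany a 206 Partial Content response."""
    return {
        "Accept-Ranges": "bytes",
        "Content-Range": "bytes {}-{}/{}".format(start, end, size),
        "Content-Length": str(end - start + 1),
    }


if __name__ == "__main__":
    span = parse_range("bytes=500-", 1000)
    print(span, partial_headers(span[0], span[1], 1000))
```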
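3. Summary
- When the X77 requests the TTS audio, it first sends a HEAD request; if HA does not handle it, playback is cut short. Range (resumable) requests are implemented as well, although in testing the result seemed to be the same without them.
- One problem remains: text that is too short still produces no sound. Up to five digits fails; more than five digits plays normally. Kodi handles the same clips fine, and Windows Media Player pushing directly to the X77 also plays them fine, yet imitating both the HA-to-Kodi exchange and the Media Player push gave the same failure, so I can only file it under bugs. One idea is to pad short clips with silence to lengthen them, but short clips should be a rare use case, so I have shelved it for now. It would mean spending more time on a pile of obscure topics, and there is still plenty of HA itself I have not learned yet.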
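The shelved silence-padding idea could look roughly like the sketch below. It is only an illustration under strong assumptions: it works on uncompressed 16-bit PCM WAV data using Python's standard wave module, whereas the TTS files in this setup are MP3, which would need decoding and re-encoding with an external tool. The function name pad_wav_with_silence is made up, and whether padding actually makes the X77 play short clips has not been tested.

```python
import io
import wave


def pad_wav_with_silence(wav_bytes, seconds):
    """Append `seconds` of silence to a PCM WAV file given as bytes.

    Assumes 16-bit (or wider) signed PCM, where silence is all-zero bytes;
    8-bit WAV is unsigned and would need 0x80 instead.
    """
    with wave.open(io.BytesIO(wav_bytes), "rb") as src:
        params = src.getparams()
        frames = src.readframes(src.getnframes())

    # One frame holds one sample per channel.
    frame_size = params.nchannels * params.sampwidth
    silence = b"\x00" * (int(params.framerate * seconds) * frame_size)

    out = io.BytesIO()
    with wave.open(out, "wb") as dst:
        dst.setparams(params)
        # The wave module corrects the frame count in the header on close.
        dst.writeframes(frames + silence)
    return out.getvalue()


if __name__ == "__main__":
    # Build a 0.1 s mono 16 kHz clip in memory and pad it by 0.5 s.
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(16000)
        w.writeframes(b"\x01\x00" * 1600)
    padded = pad_wav_with_silence(buf.getvalue(), 0.5)
    with wave.open(io.BytesIO(padded), "rb") as r:
        print(r.getnframes() / r.getframerate(), "seconds")
```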