宇树科技 文档中心
Source: https://support.unitree.com/home/en/G1_developer/VuiClient_Service
img[alt="1"]{ width:500px; } table th:first-of-type { width: 180pt; text-align: center; }
Introduction
The audio and lighting-related hardware facilities embedded in the upper body of G1 enable the robot to achieve voice interaction capabilities,
including:
- RGB light strip - 256 colors
- Speaker - 8Ω 3W (5W peak)
- Four - microphone array. microphone spacing of 20mm (quasi - linear, silicon microphones)
Component Location
Secondary Development
Software service version requirement:
Vui_Service >= 2.0.3.8, Vui Module>= 2.0.0.3. If the built - in service version is low, please contact technical support to upgrade to the correct version.
The audio and lighting - related interfaces currently mainly provide the following capabilities. Users can use the following combinations to develop their own voice interaction programs.
- ASR (Automatic Speech Recognition), default non - streaming model, local offline. Can include azimuth, emotion, speaker role and other information.
- TTS (Text - to - Speech), local offline. Pronunciation role is selectable, currently only supports Chinese.
- Audio stream control
- Volume control
- Light strip color control
AudioClient Class
The AudioClient class can implement functions such as text - to - speech, audio control/playback, and lighting control.
Interface List:
| Function Name | TtsMaker |
|---|---|
| Function Prototype | int32_t TtsMaker(const std::string& text, int32_t speaker_id) |
| Function Overview | Text - to - speech conversion |
| Parameters | text: Text speaker_id: Role ID |
| Return Value | Returns 0 if the call is successful, otherwise returns the relevant error code. |
| Remarks | speaker_id 0 for Chinese roles and 1 for English roles. Mixed Chinese and English modes are not supported. |
| speaker_id 0 | 你好。我是宇树科技的机器人。例程启动成功 |
| speaker_id 1 | Hello. I'm a robot from Unitree Robotics. The example has started successfully. |
| Function Name | GetVolume |
|---|---|
| Function Prototype | int32_t GetVolume(uint8_t &volume) |
| Function Overview | Get the system volume |
| Parameters | volume: Volume level (0 - 100) |
| Return Value | Returns 0 if the call is successful, otherwise returns the relevant error code. |
| Remarks |
| Function Name | SetVolume |
|---|---|
| Function Prototype | int32_t SetVolume(uint8_t volume) |
| Function Overview | Set the system volume |
| Parameters | volume: Volume level (0 - 100) |
| Return Value | Returns 0 if the call is successful, otherwise returns the relevant error code. |
| Remarks |
| Function Name | LedControl |
|---|---|
| Function Prototype | int32_t LedControl(uint8_t R, uint8_t G, uint8_t B) |
| Function Overview | Light strip control |
| Parameters | R: Red (0 - 255) G: Green (0 - 255) B: Blue (0 - 255) |
| Return Value | Returns 0 if the call is successful, otherwise returns the relevant error code. |
| Remarks | The interval between calls to this interface must be greater than 200ms. |
| Function Name | PlayStream |
|---|---|
| Function Prototype | int32_t PlayStream(std::string app_name, std::string stream_id, std::vector |
| Function Overview | Audio stream playback |
| Parameters | app_name: Application name stream_id: Identification ID, the same ID means continuous playback from cache, different IDs mean interrupting the current playback pcm_data: PCM format, sampling rate 16K, single - channel, 16 - bit |
| Return Value | Returns 0 if the call is successful, otherwise returns the relevant error code. |
| Remarks | Please pay attention to the audio format. |
| Function Name | PlayStop |
|---|---|
| Function Prototype | int32_t PlayStop(std::string app_name) |
| Function Overview | Stop playback |
| Parameters | app_name: Application name |
| Return Value | Returns 0 if the call is successful, otherwise returns the relevant error code. |
| Remarks |
ASR Messages / Audio Play State
When the robot's microphone is turned on (switch to the wake - up mode in the APP or remote control), the built - in microphone + ASR module will recognize the human voice in the environment.
Subscribe to the topic rt/audio_msg (class type: std_msgs::msg::dds_::String_) to obtain the recognition information provided by the built-in offline ASR module.
{
"index": 1,
"timestamp":29319303490
"text": "Hello",
"angle": 90,
"speaker_id": 0,
"sense": "unknown",
"confidence": 0.95,
"language": "en - US",
"is_final": true
}
| Parameter Name | Parameter Type | Meaning |
|---|---|---|
| index | Integer | Unique message sequence number |
| timestamp | Integer | Timestamp |
| text | String | Speech recognition result |
| angle | Integer | Azimuth angle 0 - 180 |
| speaker_id | Integer | Speaker recognition result |
| sense | String | Emotion recognition result |
| confidence | Float | Confidence level |
| language | String | Language type |
| is_final | Integer | End flag (used in the streaming recognition mode, non - streaming by default) |
Play State
{
"play_state": 1
}
| Parameter Name | Parameter Type | Meaning |
|---|---|---|
| play_state | Integer | 0:play stop 1:start play |
#include <fstream>
#include <iostream>
#include <thread>
#include <unitree/common/time/time_tool.hpp>
#include <unitree/idl/ros2/String_.hpp>
#include <unitree/robot/channel/channel_subscriber.hpp>
#include <unitree/robot/g1/audio/g1_audio_client.hpp>
#include "wav.hpp"
#define AUDIO_FILE_PATH "../example/g1/audio/test.wav"
#define AUDIO_SUBSCRIBE_TOPIC "rt/audio_msg"
#define GROUP_IP "239.168.123.161"
#define PORT 5555
#define WAV_SECOND 5 // record seconds
#define WAV_LEN (16000 * 2 * WAV_SECOND)
int sock;
void asr_handler(const void *msg) {
std_msgs::msg::dds_::String_ *resMsg = (std_msgs::msg::dds_::String_ *)msg;
std::cout << "Topic:\"rt/audio_msg\" recv: " << resMsg->data() << std::endl;
}
std::string get_local_ip_for_multicast() {
struct ifaddrs *ifaddr, *ifa;
char host[NI_MAXHOST];
std::string result = "";
getifaddrs(&ifaddr);
for (ifa = ifaddr; ifa != nullptr; ifa = ifa->ifa_next) {
if (!ifa->ifa_addr || ifa->ifa_addr->sa_family != AF_INET) continue;
getnameinfo(ifa->ifa_addr, sizeof(struct sockaddr_in), host, NI_MAXHOST, NULL, 0, NI_NUMERICHOST);
std::string ip(host);
if (ip.find("192.168.123.") == 0) {
result = ip;
break;
}
}
freeifaddrs(ifaddr);
return result;
}
void thread_mic(void) {
sock = socket(AF_INET, SOCK_DGRAM, 0);
sockaddr_in local_addr{};
local_addr.sin_family = AF_INET;
local_addr.sin_port = htons(PORT);
local_addr.sin_addr.s_addr = INADDR_ANY;
bind(sock, (sockaddr *)&local_addr, sizeof(local_addr));
ip_mreq mreq{};
inet_pton(AF_INET, GROUP_IP, &mreq.imr_multiaddr);
std::string local_ip = get_local_ip_for_multicast();
std::cout << "local ip: "<<local_ip << std::endl;
mreq.imr_interface.s_addr = inet_addr(local_ip.c_str());
setsockopt(sock, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq));
int total_bytes = 0;
std::vector<int16_t> pcm_data;
pcm_data.reserve(WAV_LEN / 2);
std::cout << "start record!" << std::endl;
while (total_bytes < WAV_LEN) {
char buffer[2048];
ssize_t len = recvfrom(sock, buffer, sizeof(buffer), 0, nullptr, nullptr);
if (len > 0) {
size_t sample_count = len / 2;
const int16_t *samples = reinterpret_cast<const int16_t *>(buffer);
pcm_data.insert(pcm_data.end(), samples, samples + sample_count);
total_bytes += len;
}
}
WriteWave("record.wav", 16000, pcm_data.data(), pcm_data.size(), 1);
std::cout << "record finish! save to record.wav " << std::endl;
}
int main(int argc, char const *argv[]) {
if (argc < 2) {
std::cout << "Usage: audio_client_example [NetWorkInterface(eth0)]"
<< std::endl;
exit(0);
}
int32_t ret;
/*
* Initilaize ChannelFactory
*/
unitree::robot::ChannelFactory::Instance()->Init(0, argv[1]);
unitree::robot::g1::AudioClient client;
client.Init();
client.SetTimeout(10.0f);
/*ASR message Example*/
unitree::robot::ChannelSubscriber<std_msgs::msg::dds_::String_> subscriber(
AUDIO_SUBSCRIBE_TOPIC);
subscriber.InitChannel(asr_handler);
/*Volume Example*/
uint8_t volume;
ret = client.GetVolume(volume);
std::cout << "GetVolume API ret:" << ret
<< " volume = " << std::to_string(volume) << std::endl;
ret = client.SetVolume(100);
std::cout << "SetVolume to 100% , API ret:" << ret << std::endl;
/*TTS Example*/
ret = client.TtsMaker("你好。我是宇树科技的机器人。例程启动成功",
0); // Auto play
std::cout << "TtsMaker API ret:" << ret << std::endl;
unitree::common::Sleep(5);
ret = client.TtsMaker(
"Hello. I'm a robot from Unitree Robotics. The example has started "
"successfully. ",
1); // Engilsh TTS
std::cout << "TtsMaker API ret:" << ret << std::endl;
unitree::common::Sleep(8);
/*Audio Play Example*/
int32_t sample_rate = -1;
int8_t num_channels = 0;
bool filestate = false;
std::vector<uint8_t> pcm =
ReadWave(AUDIO_FILE_PATH, &sample_rate, &num_channels, &filestate);
std::cout << "wav file sample_rate = " << sample_rate
<< " num_channels = " << std::to_string(num_channels)
<< " filestate =" << filestate << std::endl;
if (filestate && sample_rate == 16000 && num_channels == 1) {
client.PlayStream(
"example", std::to_string(unitree::common::GetCurrentTimeMillisecond()),
pcm);
std::cout << "start play stream" << std::endl;
unitree::common::Sleep(3);
std::cout << "stop play stream" << std::endl;
ret = client.PlayStop("example");
} else {
std::cout << "audio file format error, please check!" << std::endl;
}
/*LED Control Example*/
client.LedControl(0, 255, 0);
unitree::common::Sleep(1);
client.LedControl(0, 0, 0);
unitree::common::Sleep(1);
client.LedControl(0, 0, 255);
std::cout << "AudioClient api test finish , asr start..." << std::endl;
std::thread mic_t(thread_mic);
while (1) {
sleep(1); // wait for asr message
}
mic_t.join();
return 0;
}