Using the Flutter speech-recognition plugin deepgram_speech_to_text
Features
Speech-to-Text | Status | Methods
---|---|---
From File | ✅ | `listen.file()`, `listen.path()`
From URL | ✅ | `listen.url()`
From Bytes | ✅ | `listen.bytes()`
From Audio Stream | ✅ | `listen.live()`, `listen.liveListener()`

Text-to-Speech | Status | Methods
---|---|---
From Text | ✅ | `speak.text()`
From Text Stream | ✅ | `speak.live()`, `speak.liveSpeaker()`

Agent Interaction | Status | Methods
---|---|---
Agent Interaction | 🚧 | `agent.live()`

PRs are welcome for all work-in-progress 🚧 features.
Getting Started
All you need is a Deepgram API key. You can get a free one by signing up on Deepgram.
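To follow the examples below, add the package to your `pubspec.yaml`. The version numbers here are placeholders (check pub.dev for the current releases); the streaming snippets additionally use the `record` package for microphone capture:

```yaml
dependencies:
  deepgram_speech_to_text: ^x.y.z # placeholder; use the latest version
  record: ^x.y.z                  # placeholder; used for microphone capture
```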
Usage
Initialize Client
First, create the client with optional parameters:

```dart
String apiKey = 'your_api_key';

// Optional parameters can be passed in the client's baseQueryParams
// or per request in each method's queryParams.
Deepgram deepgram = Deepgram(apiKey, baseQueryParams: {
  'model': 'nova-2-general',
  'detect_language': true,
  'filler_words': false,
  'punctuation': true,
  // More options here: https://developers.deepgram.com/reference/listen-file
});
```
Speech to Text
Call the methods under the appropriate `listen` subclass:

```dart
// From a file
DeepgramListenResult resFile = await deepgram.listen.file(File('audio.wav'));

// From a URL
DeepgramListenResult resUrl = await deepgram.listen.url('https://somewhere/audio.wav');

// From raw bytes
DeepgramListenResult resBytes = await deepgram.listen.bytes(List.from([1, 2, 3, 4, 5]));

// Streaming from an audio stream (e.g., the microphone, via the record package)
Stream<List<int>> micStream = await AudioRecorder().startStream(const RecordConfig(
  encoder: AudioEncoder.pcm16bits,
  sampleRate: 16000,
  numChannels: 1,
));

final streamParams = {
  'language': 'en',
  // For raw streams, encoding and sample_rate are required and must
  // match the audio you actually send.
  'encoding': 'linear16',
  'sample_rate': 16000,
};
Deepgram deepgramStreaming = Deepgram(apiKey, baseQueryParams: streamParams);

// Automatically managed stream
Stream<DeepgramListenResult> streamAuto = deepgramStreaming.listen.live(micStream);

// Manually managed stream
DeepgramLiveListener listener = deepgramStreaming.listen.liveListener(micStream);
listener.stream.listen((res) {
  print(res.transcript);
});
listener.start();

// Lifecycle controls, shown together here; call them as needed:
listener.pause();
listener.resume();
listener.close();
```
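A mismatched `encoding`/`sample_rate` pair silently produces empty transcripts, so it can help to sanity-check the stream's throughput against what the parameters imply. A small plain-Dart sketch (no package dependencies; the function name is ours, not part of the package):

```dart
// For linear16 (16-bit) PCM, one second of audio occupies
// sampleRate * channels * 2 bytes. If the byte rate you measure on the
// mic stream diverges from this, the query params don't match the audio.
int expectedPcm16ByteRate(int sampleRate, int channels) =>
    sampleRate * channels * 2;

void main() {
  print(expectedPcm16ByteRate(16000, 1)); // 32000 bytes/s for the config above
}
```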
Text to Speech
Call the methods under the appropriate `speak` subclass:

```dart
// From text
DeepgramSpeakResult resText = await deepgram.speak.text('Hello world');

// Streaming from a text stream
Stream<String> textStream = Stream.fromIterable(['Hello', 'World']);
Stream<DeepgramSpeakResult> streamTTS = deepgram.speak.live(textStream);

// Manually managed stream
DeepgramLiveSpeaker speaker = deepgram.speak.liveSpeaker(textStream);
speaker.stream.listen((res) {
  print(res); // use the audio data as needed
});
speaker.start();

// Lifecycle controls, shown together here; call them as needed:
speaker.flush();
speaker.clear();
speaker.close();
```
Debugging Common Errors
- Ensure your API key is valid and has enough credits.
- Check your parameters; some models do not support certain options (e.g., the whisper model does not support live streaming).
- If you get empty transcripts (metadata only), verify that the encoding and sample_rate parameters match the audio you are sending.
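On that last point: for WAV files you can read the declared sample rate straight from the RIFF header and compare it with the `sample_rate` you pass to Deepgram. A minimal plain-Dart sketch (it assumes a canonical 44-byte header; real files can carry extra chunks before the `data` chunk):

```dart
import 'dart:typed_data';

// Reads the sample rate from a canonical 44-byte WAV (RIFF) header:
// bytes 24..27 hold the sample rate as a little-endian uint32.
int wavSampleRate(Uint8List bytes) =>
    ByteData.sublistView(bytes, 24, 28).getUint32(0, Endian.little);

void main() {
  // Minimal fake header: only bytes 24..27 matter for this check.
  final header = Uint8List(44);
  ByteData.sublistView(header).setUint32(24, 16000, Endian.little);
  print(wavSampleRate(header)); // 16000
}
```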
Additional Information & Support
This package was created to fulfill specific needs since there were no Dart SDKs available for Deepgram. Contributions and feature requests are welcome on GitHub. If you find this project useful, consider supporting it here.
Complete Demo Example
Below is a complete demo example showcasing how to use the deepgram_speech_to_text
plugin within a Flutter application:
```dart
import 'package:flutter/material.dart';
import 'package:deepgram_speech_to_text/deepgram_speech_to_text.dart';
import 'package:record/record.dart';

void main() => runApp(MyApp());

class MyApp extends StatelessWidget {
  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      home: Scaffold(
        appBar: AppBar(title: Text('Deepgram Speech To Text Demo')),
        body: Center(child: SpeechToTextDemo()),
      ),
    );
  }
}

class SpeechToTextDemo extends StatefulWidget {
  @override
  _SpeechToTextDemoState createState() => _SpeechToTextDemoState();
}

class _SpeechToTextDemoState extends State<SpeechToTextDemo> {
  final String apiKey = "YOUR_API_KEY";
  bool _isRecording = false;
  bool _isTranscribing = false;
  String _transcription = "";
  late Deepgram deepgram;

  @override
  void initState() {
    super.initState();
    deepgram = Deepgram(apiKey, baseQueryParams: {
      'model': 'nova-2-general',
      'detect_language': true,
      'filler_words': false,
      'punctuation': true,
    });
  }

  Future<void> startRecording() async {
    final recorder = AudioRecorder();
    if (!await recorder.hasPermission()) return;

    // Stream raw 16-bit PCM so it matches the encoding/sample_rate
    // query params sent to Deepgram below.
    final micStream = await recorder.startStream(const RecordConfig(
      encoder: AudioEncoder.pcm16bits,
      sampleRate: 16000,
      numChannels: 1,
    ));
    setState(() {
      _isRecording = true;
    });

    final streamParams = {
      'language': 'en',
      'encoding': 'linear16',
      'sample_rate': 16000,
    };
    final deepgramStreaming = Deepgram(apiKey, baseQueryParams: streamParams);

    final listener = deepgramStreaming.listen.liveListener(micStream);
    listener.stream.listen((res) {
      setState(() {
        _transcription = res.transcript ?? '';
      });
    });
    listener.start();
    setState(() {
      _isTranscribing = true;
    });

    // Record for a fixed five seconds, then tear everything down.
    await Future.delayed(Duration(seconds: 5));
    await recorder.stop();
    listener.close();
    setState(() {
      _isTranscribing = false;
      _isRecording = false;
    });
  }

  @override
  Widget build(BuildContext context) {
    return Column(
      mainAxisAlignment: MainAxisAlignment.center,
      children: <Widget>[
        ElevatedButton(
          onPressed: !_isRecording && !_isTranscribing ? startRecording : null,
          child: Text(_isRecording ? 'Recording...' : 'Start Recording'),
        ),
        SizedBox(height: 20),
        Text(_transcription),
      ],
    );
  }
}
```
In this demo, we’ve integrated the deepgram_speech_to_text
plugin into a simple Flutter app that records audio from the device’s microphone and transcribes it using Deepgram’s speech-to-text service. The transcription result is then displayed on the screen. Adjust the API key and any other configurations as necessary for your environment.
More hands-on tutorials for the deepgram_speech_to_text Flutter plugin are available at https://www.itying.com/category-92-b0.html
Sure — below is an example of using the deepgram_speech_to_text plugin for speech recognition in Flutter. The plugin lets you send audio data to Deepgram's API and get the transcript back.

First, make sure you have added the deepgram_speech_to_text dependency to your pubspec.yaml:

```yaml
dependencies:
  flutter:
    sdk: flutter
  deepgram_speech_to_text: ^latest_version # replace with the latest version number
```

Then run `flutter pub get` to install the dependency.

Next, create a Flutter app and wire up the deepgram_speech_to_text plugin. The following complete example shows how to use it for speech recognition:
```dart
import 'dart:io';

import 'package:flutter/material.dart';
import 'package:deepgram_speech_to_text/deepgram_speech_to_text.dart';
import 'package:record/record.dart';

void main() {
  runApp(MyApp());
}

class MyApp extends StatelessWidget {
  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      home: Scaffold(
        appBar: AppBar(
          title: Text('Deepgram Speech to Text Example'),
        ),
        body: Center(
          child: SpeechRecognitionButton(),
        ),
      ),
    );
  }
}

class SpeechRecognitionButton extends StatefulWidget {
  @override
  _SpeechRecognitionButtonState createState() => _SpeechRecognitionButtonState();
}

class _SpeechRecognitionButtonState extends State<SpeechRecognitionButton> {
  // Configure the Deepgram API key (replace with your real key).
  final Deepgram _deepgram = Deepgram('YOUR_DEEPGRAM_API_KEY');
  final AudioRecorder _recorder = AudioRecorder();
  String _transcription = '';

  @override
  Widget build(BuildContext context) {
    return ElevatedButton(
      onPressed: () async {
        // Start speech recognition
        await _startSpeechRecognition();
      },
      child: Text('Start Speech Recognition'),
    );
  }

  Future<void> _startSpeechRecognition() async {
    // Record a short audio clip and send it to the Deepgram API.
    try {
      if (!await _recorder.hasPermission()) return;

      final path = '${Directory.systemTemp.path}/clip.wav';
      await _recorder.start(const RecordConfig(encoder: AudioEncoder.wav),
          path: path);
      await Future.delayed(const Duration(seconds: 5)); // record for 5 seconds
      await _recorder.stop();

      final result = await _deepgram.listen.file(File(path));
      setState(() {
        _transcription = result.transcript ?? '';
      });
      ScaffoldMessenger.of(context).showSnackBar(
        SnackBar(
          content: Text('Transcription: $_transcription'),
        ),
      );
    } catch (e) {
      print('Error: $e');
      ScaffoldMessenger.of(context).showSnackBar(
        SnackBar(
          content: Text('Error during speech recognition'),
        ),
      );
    }
  }
}
```
Notes:

- API key: make sure you have obtained an API key from Deepgram and replaced 'YOUR_DEEPGRAM_API_KEY' in the code.
- Permissions: on both Android and iOS you need microphone permission. On Android this is declared in AndroidManifest.xml; on iOS you must add a microphone usage description to Info.plist.
- Error handling: the example includes only basic error handling; extend it to match your actual needs.
iOS microphone permission (Info.plist)

Add the following to the ios/Runner/Info.plist file:

```xml
<key>NSMicrophoneUsageDescription</key>
<string>This app needs access to your microphone to perform speech recognition.</string>
```

Android microphone permission (AndroidManifest.xml)

The recording plugin usually declares the required Android permission for you, but if you run into permission issues, check that AndroidManifest.xml contains:

```xml
<uses-permission android:name="android.permission.RECORD_AUDIO" />
```

This example demonstrates basic speech recognition with the deepgram_speech_to_text plugin. Depending on your requirements, you may want to customize and extend it further.