Flutter Gemma

Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models.


Bring the power of Google’s lightweight Gemma language models directly to your Flutter applications. With Flutter Gemma, you can seamlessly incorporate advanced AI capabilities into your iOS and Android apps, all without relying on external servers.



Features

  • Local Execution: Run Gemma models directly on user devices for enhanced privacy and offline functionality.
  • Platform Support: Runs on iOS and Android; Web is supported for GPU backend models only.
  • Ease of Use: Simple interface for integrating Gemma models into your Flutter projects.


Installation

  1. Add flutter_gemma to the dependencies section of your pubspec.yaml:

      dependencies:
        flutter_gemma: latest_version

  2. Run flutter pub get to install.


Download Model

Obtain a pre-trained Gemma model (recommended: 2b or 2b-it) from Kaggle.

Optionally, fine-tune a model for your specific use case.

Platform Specific Setup


iOS

  1. Enable file sharing in Info.plist by setting the UIFileSharingEnabled key to true.

  2. Change the linking type of pods to static: in the Podfile, replace use_frameworks! with use_frameworks! :linkage => :static.


Android

If you want to run the model on the GPU, add OpenCL support in AndroidManifest.xml. If you plan to use only the CPU, you can skip this step.

Add the following to AndroidManifest.xml, just above the closing </application> tag:

<uses-native-library android:name="libOpenCL-car.so" android:required="false"/>
<uses-native-library android:name="libOpenCL-pixel.so" android:required="false"/>


Web

  1. Web currently works only with GPU backend models; CPU backend models are not yet supported by MediaPipe.

  2. Add dependencies to index.html file in the web folder:

    <script type="module">
    import { FilesetResolver, LlmInference } from 'https://cdn.jsdelivr.net/npm/@mediapipe/tasks-genai';
    window.FilesetResolver = FilesetResolver;
    window.LlmInference = LlmInference;
    </script>

Prepare Model

Place the model in your app's assets or host it on a network location, such as Firebase.

ATTENTION! You do not need to load the model every time the application starts. Once loaded, the model is stored in the device's file system, so loading only needs to happen once. Please review the example application carefully. Call loadAssetModel and loadNetworkModel only when you need to copy the model to the device.
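
One way to make sure the copy happens only once is to guard it with a marker file. The sketch below is a minimal illustration and not part of flutter_gemma: loadModelOnce and the marker file are hypothetical names, and the copyModel callback stands in for a call such as loadAssetModel or loadNetworkModel. It relies on dart:io, so it does not apply to web builds.

```dart
import 'dart:io';

/// Runs [copyModel] only when [marker] does not exist yet, then creates
/// the marker so that later launches skip the copy.
/// Returns true if the model was copied during this call.
/// (Hypothetical helper; not a flutter_gemma API.)
Future<bool> loadModelOnce(
  File marker,
  Future<void> Function() copyModel,
) async {
  if (await marker.exists()) {
    return false; // The model was copied on an earlier launch.
  }
  await copyModel(); // If this throws, no marker is written.
  await marker.create(recursive: true);
  return true;
}
```

A call might pass a marker path inside a writable directory your app controls, with a callback such as () => FlutterGemmaPlugin.instance.loadNetworkModel(url: 'https://example.com/model.bin').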


Loading Models from Assets (available only in debug mode)

Don’t forget to add your model to the assets section of pubspec.yaml.

  1. Loading from assets:

    await FlutterGemmaPlugin.instance.loadAssetModel(fullPath: 'model.bin');
  2. Loading from assets with progress status:

    FlutterGemmaPlugin.instance.loadAssetModelWithProgress(fullPath: 'model.bin').listen(
      (progress) {
        print('Loading progress: $progress%');
      },
      onDone: () {
        print('Model loading complete.');
      },
      onError: (error) {
        print('Error loading model: $error');
      },
    );

Loading Models from Network

For web usage, you will also need to enable CORS (Cross-Origin Resource Sharing) for your network resource. To enable CORS in Firebase, you can follow the guide in the Firebase documentation: Setting up CORS.

  1. Loading from the network:

    await FlutterGemmaPlugin.instance.loadNetworkModel(url: 'https://example.com/model.bin');
  2. Loading from the network with progress status:

    FlutterGemmaPlugin.instance.loadNetworkModelWithProgress(url: 'https://example.com/model.bin').listen(
      (progress) {
        print('Loading progress: $progress%');
      },
      onDone: () {
        print('Model loading complete.');
      },
      onError: (error) {
        print('Error loading model: $error');
      },
    );
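
Because listen returns immediately, code that must run only after the model is on the device needs to wait for the progress stream to finish. The following is a minimal sketch, assuming the stream emits numeric percentages and closes when loading completes; waitForLoad is a hypothetical helper name, not a plugin API.

```dart
/// Forwards each progress value to [onProgress] and completes when the
/// stream closes. Errors emitted by the stream are rethrown to the caller.
/// (Hypothetical helper; not a flutter_gemma API.)
Future<void> waitForLoad(
  Stream<num> progress,
  void Function(num percent) onProgress,
) async {
  await for (final percent in progress) {
    onProgress(percent);
  }
}
```

The stream returned by loadNetworkModelWithProgress or loadAssetModelWithProgress could be passed as the first argument, with the call wrapped in try/catch to handle loading errors.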


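Initialize

import 'package:flutter/material.dart';
import 'package:flutter_gemma/flutter_gemma.dart';

void main() async {
  WidgetsFlutterBinding.ensureInitialized();  // required before using plugins ahead of runApp

  await FlutterGemmaPlugin.instance.init(
    maxTokens: 512,  // maxTokens is optional, by default the value is 1024
    temperature: 1.0,  // temperature is optional, by default the value is 1.0
    topK: 1,  // topK is optional, by default the value is 1
    randomSeed: 1,  // randomSeed is optional, by default the value is 1
  );

  runApp(const MyApp());
}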

Generate Response

final flutterGemma = FlutterGemmaPlugin.instance;
String response = await flutterGemma.getResponse(prompt: 'Tell me something interesting');

Generate Response as a Stream

final flutterGemma = FlutterGemmaPlugin.instance;
flutterGemma.getAsyncResponse(prompt: 'Tell me something interesting').listen((String? token) => print(token));
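
If you want the complete text as well as the incremental tokens, the stream can be accumulated into a single string. The helper below is a sketch with a hypothetical name (collectTokens); it assumes that null events carry no text and can be skipped.

```dart
/// Concatenates the non-null tokens of [tokens] into one string,
/// optionally reporting each token to [onToken] as it arrives.
/// (Hypothetical helper; not a flutter_gemma API.)
Future<String> collectTokens(
  Stream<String?> tokens, {
  void Function(String token)? onToken,
}) async {
  final buffer = StringBuffer();
  await for (final token in tokens) {
    if (token == null) continue; // Assumption: null carries no text.
    onToken?.call(token);
    buffer.write(token);
  }
  return buffer.toString();
}
```

The stream returned by getAsyncResponse could be passed in directly, with onToken updating the UI while the returned future yields the final text.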

Generate Chat Response

This method works properly only for instruction-tuned models.

final flutterGemma = FlutterGemmaPlugin.instance;
final messages = <Message>[];
messages.add(Message(text: 'Who are you?', isUser: true));
String response = await flutterGemma.getChatResponse(messages: messages);
messages.add(Message(text: response));
messages.add(Message(text: 'Really?', isUser: true));
response = await flutterGemma.getChatResponse(messages: messages);

Generate Chat Response as a Stream

This method works properly only for instruction-tuned models.

final flutterGemma = FlutterGemmaPlugin.instance;
final messages = <Message>[];
messages.add(Message(text: 'Who are you?', isUser: true));
flutterGemma.getAsyncChatResponse(messages: messages).listen((String? token) => print(token));

A complete example can be found in the example folder.

Important Considerations

  • Larger models (like 7b and 7b-it) may be too resource-intensive for on-device use.

Coming Soon

  • LoRA (Low-Rank Adaptation) support

Example Code

Here is a complete example of a Flutter app that uses flutter_gemma to generate responses and handle chat interactions.

import 'package:flutter/material.dart';
import 'package:flutter_gemma_example/chat_screen.dart';

void main() {
  runApp(const ChatApp());
}

class ChatApp extends StatelessWidget {
  const ChatApp({super.key});

  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      title: 'Flutter Gemma Example',
      darkTheme: ThemeData(
        brightness: Brightness.dark,
        textTheme: const TextTheme(
          bodyLarge: TextStyle(color: Colors.white),
          bodyMedium: TextStyle(color: Colors.white),
        ),
      ),
      themeMode: ThemeMode.dark,
      home: const SafeArea(child: ChatScreen()),
    );
  }
}

Chat Screen Example

import 'package:flutter/material.dart';
import 'package:flutter_gemma/flutter_gemma.dart';

// Message is the plugin's message type (see Generate Chat Response).
// A local class with the same name would shadow it and fail to type-check
// when passed to getChatResponse, so none is declared here.

class ChatScreen extends StatefulWidget {
  const ChatScreen({super.key});

  @override
  State<ChatScreen> createState() => _ChatScreenState();
}

class _ChatScreenState extends State<ChatScreen> {
  final TextEditingController _textController = TextEditingController();
  final List<Message> _messages = []; // Newest message first.
  final FlutterGemmaPlugin _flutterGemma = FlutterGemmaPlugin.instance;

  @override
  void initState() {
    super.initState();
    _initializeGemma();
  }

  @override
  void dispose() {
    _textController.dispose();
    super.dispose();
  }

  Future<void> _initializeGemma() async {
    await _flutterGemma.init(
      maxTokens: 512,
      temperature: 1.0,
      topK: 1,
      randomSeed: 1,
    );
    await _flutterGemma.loadAssetModel(fullPath: 'model.bin');
  }

  void _handleSubmitted(String text) {
    if (text.trim().isEmpty) return;
    _textController.clear();
    Message message = Message(text: text, isUser: true);
    setState(() {
      _messages.insert(0, message);
    });
    _generateResponse(message);
  }

  Future<void> _generateResponse(Message message) async {
    // Only the latest user message is sent to the model.
    String response = await _flutterGemma.getChatResponse(messages: [message]);
    if (!mounted) return; // The widget may have been disposed while waiting.
    Message botMessage = Message(text: response, isUser: false);
    setState(() {
      _messages.insert(0, botMessage);
    });
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(
        title: const Text('Flutter Gemma Chat'),
      ),
      body: Column(
        children: <Widget>[
          Flexible(
            child: ListView.builder(
              reverse: true,
              padding: const EdgeInsets.all(8.0),
              itemCount: _messages.length,
              itemBuilder: (context, index) {
                return _buildMessage(_messages[index], index == _messages.length - 1);
              },
            ),
          ),
          const Divider(height: 1.0),
          Container(
            decoration: BoxDecoration(color: Theme.of(context).cardColor),
            child: _buildTextComposer(),
          ),
        ],
      ),
    );
  }

  Widget _buildMessage(Message message, bool isLast) {
    return Padding(
      key: Key('message_${message.text}'),
      padding: EdgeInsets.only(right: 16.0, left: 16.0, bottom: isLast ? 20.0 : 0.0),
      child: Row(
        crossAxisAlignment: CrossAxisAlignment.start,
        children: <Widget>[
          if (!message.isUser)
            Container(
              margin: const EdgeInsets.only(right: 16.0),
              child: const CircleAvatar(child: Text('G')),
            ),
          Expanded(
            child: Column(
              crossAxisAlignment: message.isUser ? CrossAxisAlignment.end : CrossAxisAlignment.start,
              children: <Widget>[
                Text(
                  message.text,
                  style: TextStyle(fontSize: 16.0, color: message.isUser ? Colors.blue : Colors.black),
                ),
              ],
            ),
          ),
          if (message.isUser)
            Container(
              margin: const EdgeInsets.only(left: 16.0),
              child: const CircleAvatar(child: Text('U')),
            ),
        ],
      ),
    );
  }

  Widget _buildTextComposer() {
    return IconTheme(
      data: IconThemeData(color: Theme.of(context).colorScheme.secondary),
      child: Container(
        margin: const EdgeInsets.symmetric(horizontal: 8.0),
        child: Row(
          children: <Widget>[
            Flexible(
              child: TextField(
                controller: _textController,
                onSubmitted: _handleSubmitted,
                decoration: const InputDecoration.collapsed(hintText: 'Send a message'),
              ),
            ),
            IconButton(
              icon: const Icon(Icons.send),
              onPressed: () => _handleSubmitted(_textController.text),
            ),
          ],
        ),
      ),
    );
  }
}

This example demonstrates how to set up a simple chat interface using flutter_gemma to generate responses based on user input. The ChatScreen widget handles user input, displays messages, and generates responses using the Gemma model.

