Voice Activation - Waking Jarvis with "Hey Jarvis"

AI Hunter · Yesterday at 02:49:54

Today we are throwing the keyboard away. From now on you can just lean back in your chair and say: "Hey Jarvis, give me today's status report!".

The workflow is as follows:

Wake Word: The computer listens continuously in the background. When it detects the keyword "Jarvis", it reacts with a "ting" (in the script below the sound is optional; by default it only prints a message).
STT (Speech to Text): Records the command that follows and converts it to text.
Processing: Sends the text to the Backend (Docker) for processing.
TTS (Text to Speech): Receives the reply and reads it aloud in a synthesized voice (this post uses Microsoft's edge-tts).
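
The last three steps can be sketched as one small function that runs each time the wake word fires. This is only an illustration of the flow, not the client itself: handle_wake_event and its three callable parameters are hypothetical names, and the real microphone, backend, and voice are replaced by whatever functions you pass in.

```python
from typing import Callable, Optional


def handle_wake_event(
    listen: Callable[[], Optional[str]],
    think: Callable[[str], str],
    speak: Callable[[str], None],
) -> Optional[str]:
    """Run STT -> Processing -> TTS once, after the wake word was detected."""
    command = listen()        # STT: returns None when nothing was understood
    if not command:
        return None           # nothing to process, go back to waiting
    reply = think(command)    # Processing: ask the backend
    speak(reply)              # TTS: read the reply aloud
    return reply
```

Passing plain lambdas (for example `listen=lambda: "status report"`) lets you exercise the flow without a microphone or a running backend.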


1. Prepare the libraries (on the host machine)

Because this client runs on the host machine rather than inside Docker, you need Python and the libraries below installed there.
Open a terminal (CMD/PowerShell) on your computer:

Code:

pip install pvporcupine pyaudio SpeechRecognition edge-tts pygame requests

Explanation:

pvporcupine: A highly sensitive wake-word engine (from Picovoice) that detects the keyword "Jarvis".
pyaudio: Reads raw audio from the microphone.
SpeechRecognition: Converts speech to text.
edge-tts: Microsoft Edge's Vietnamese voices, which sound very natural (free).
pygame: Plays the MP3 audio file.
requests: Sends HTTP requests to the backend.

*Note: If installing pyaudio fails (common on Windows), search Google for "install pyaudio windows pipwin" to fix it.
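
Before moving on, it can help to confirm that everything actually installed. The sketch below checks each module without importing it; missing_packages and REQUIRED are hypothetical names used only for this example. Note that the pip name and the import name differ for some packages (SpeechRecognition is imported as speech_recognition, edge-tts as edge_tts).

```python
import importlib.util

# pip package name -> module name used in `import`
REQUIRED = {
    "pvporcupine": "pvporcupine",
    "pyaudio": "pyaudio",
    "SpeechRecognition": "speech_recognition",
    "edge-tts": "edge_tts",
    "pygame": "pygame",
    "requests": "requests",
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose module cannot be found in this environment."""
    return [
        pip_name
        for pip_name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]
```

Calling `missing_packages()` returns an empty list when everything is in place; any names it returns are the ones to pass to pip install again.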

2. Get a Porcupine key (free)

To use the "Jarvis" keyword, you need to sign up for a free account:

Go to: https://console.picovoice.ai/
Create an account.
Copy the AccessKey string shown on the home page.
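
The script in the next section pastes the key directly into the source file. If you would rather keep it out of the file (for example, before pushing the project to Git), one option is to read it from an environment variable. This is a sketch under that assumption: load_access_key is a hypothetical helper, and the variable name PICOVOICE_ACCESS_KEY is simply chosen to match the constant used later.

```python
import os


def load_access_key(env=None, name="PICOVOICE_ACCESS_KEY"):
    """Read the Picovoice AccessKey from an environment variable."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable first.")
    return key
```

With this in place, the configuration line could become `PICOVOICE_ACCESS_KEY = load_access_key()`.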

3. Write the Voice Client (Jarvis's "body")

Create a new file named jarvis_voice.py in the project root (next to docker-compose.yml) and paste in the following code:

Python:

import struct
import pyaudio
import pvporcupine
import speech_recognition as sr
import requests
import os
import asyncio
import edge_tts
import pygame

# --- CONFIGURATION ---
PICOVOICE_ACCESS_KEY = "PASTE_YOUR_KEY_HERE"
BACKEND_URL = "http://localhost:8000/chat"
API_SECRET = "sieumatkhau123456"  # Must match the value in the docker-compose file from part 24

# --- SPEAK (TTS) ---
async def speak(text):
    print(f"🗣️ Jarvis: {text}")
    communicate = edge_tts.Communicate(text, "vi-VN-HoaiMyNeural")  # Female voice, Southern accent
    await communicate.save("reply.mp3")

    # Play the audio and block until playback finishes
    pygame.mixer.init()
    pygame.mixer.music.load("reply.mp3")
    pygame.mixer.music.play()
    clock = pygame.time.Clock()
    while pygame.mixer.music.get_busy():
        clock.tick(10)

    # Release the file, then delete the temporary MP3
    pygame.mixer.quit()
    if os.path.exists("reply.mp3"):
        os.remove("reply.mp3")

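# --- CALL THE BACKEND ---
# The returned strings are spoken by the Vietnamese voice, so they stay in Vietnamese.
def chat_with_brain(text):
    headers = {
        "Authorization": f"Bearer {API_SECRET}",
        "Content-Type": "application/json"
    }
    try:
        # The timeout keeps the voice loop from hanging forever if the backend stalls
        # (60 seconds is an arbitrary choice; adjust it to your model's speed)
        response = requests.post(BACKEND_URL, json={"message": text}, headers=headers, timeout=60)
        if response.status_code == 200:
            # Fallback: "Sorry, I didn't catch that."
            return response.json().get("answer", "Xin lỗi, tôi không nghe rõ.")
        else:
            return f"Lỗi Server: {response.status_code}"  # "Server error: ..."
    except (requests.RequestException, ValueError):
        # Network failure, timeout, or a response body that is not JSON
        return "Không kết nối được với não bộ."  # "Could not connect to the brain."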

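# --- LISTEN FOR A COMMAND (STT) ---
def listen_command():
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("🎤 Đang nghe lệnh...")  # "Listening for a command..."
        # Optionally play a soft 'ting' here to signal activation
        try:
            audio = r.listen(source, timeout=5, phrase_time_limit=10)
            text = r.recognize_google(audio, language="vi-VN")
            print(f"👤 Bạn: {text}")  # "You: ..."
            return text
        except sr.WaitTimeoutError:
            print("❌ Không nghe thấy gì.")  # "Heard nothing."
            return None
        except sr.UnknownValueError:
            print("❌ Không hiểu giọng nói.")  # "Could not understand the speech."
            return None
        except sr.RequestError as e:
            # The online speech service was unreachable or rejected the request
            print(f"❌ Lỗi dịch vụ STT: {e}")  # "STT service error: ..."
            return None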

# --- MAIN LOOP (WAKE WORD) ---
def main():
    # Initialize Porcupine with the built-in 'jarvis' keyword
    porcupine = pvporcupine.create(
        access_key=PICOVOICE_ACCESS_KEY,
        keywords=["jarvis"]
    )

    pa = pyaudio.PyAudio()
    audio_stream = pa.open(
        rate=porcupine.sample_rate,
        channels=1,
        format=pyaudio.paInt16,
        input=True,
        frames_per_buffer=porcupine.frame_length
    )

    # "Jarvis Voice Client started! Say 'Jarvis'..."
    print("🚀 Jarvis Voice Client đã khởi động! Hãy nói 'Jarvis'...")

    try:
        while True:
            # The stream is not read while we listen and speak, so its buffer can
            # overflow; exception_on_overflow=False drops that audio instead of raising
            pcm = audio_stream.read(porcupine.frame_length, exception_on_overflow=False)
            # Convert the raw bytes into a tuple of 16-bit samples
            pcm = struct.unpack_from("h" * porcupine.frame_length, pcm)

            # Check whether the wake word was spoken
            keyword_index = porcupine.process(pcm)

            if keyword_index >= 0:
                print("⚡ Đã phát hiện 'JARVIS'!")  # "'JARVIS' detected!"

                # 1. Listen for the command
                command = listen_command()

                if command:
                    # 2. Send it to the brain (backend)
                    reply = chat_with_brain(command)

                    # 3. Speak the reply
                    asyncio.run(speak(reply))

    except KeyboardInterrupt:
        print("Đang tắt...")  # "Shutting down..."
    finally:
        # Release the microphone and the wake-word engine
        audio_stream.close()
        pa.terminate()
        porcupine.delete()

if __name__ == "__main__":
    main()
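
The workflow mentions a "ting" on activation, but listen_command() only leaves a comment where that sound would go. One possible way to fill it in, as a rough sketch: generate a short beep once with Python's standard wave module, then play it with pygame, which the client already depends on. The names write_chime, play_chime, and ting.wav are hypothetical, and the tone parameters are arbitrary choices.

```python
import math
import struct
import wave


def write_chime(path="ting.wav", freq_hz=880.0, duration_ms=150, rate=16000, volume=0.4):
    """Write a short mono 16-bit sine beep with a linear fade-out to `path`."""
    n_frames = rate * duration_ms // 1000
    samples = []
    for i in range(n_frames):
        fade = 1.0 - i / n_frames  # fade out so the beep does not end with a click
        value = volume * fade * math.sin(2 * math.pi * freq_hz * i / rate)
        samples.append(int(value * 32767))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(rate)
        wav.writeframes(struct.pack(f"<{n_frames}h", *samples))
    return path


def play_chime(path="ting.wav"):
    """Play the beep and wait until it has finished."""
    import pygame  # imported here so write_chime() works without pygame installed

    pygame.mixer.init()
    pygame.mixer.Sound(path).play()
    while pygame.mixer.get_busy():
        pygame.time.wait(20)
```

To use it, call write_chime() once at startup and play_chime() in place of the optional comment in listen_command(), or right after the wake word is detected in main().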

4. Experience the Iron Man feeling

Step 1: Make sure the Docker backend is running.

Code:

docker-compose up -d
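
If you want to confirm the backend answers before involving the microphone, a small text-only check can help. The sketch below sends the same headers and JSON payload as chat_with_brain(), but uses the standard library's urllib so it runs without any extra packages. build_chat_request and ask_backend are hypothetical helper names; the URL, secret, and "answer" field are taken from the client script above.

```python
import json
import urllib.request

BACKEND_URL = "http://localhost:8000/chat"
API_SECRET = "sieumatkhau123456"  # same secret as in jarvis_voice.py


def build_chat_request(message, url=BACKEND_URL, secret=API_SECRET):
    """Build the POST request the voice client would send for `message`."""
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {secret}",
            "Content-Type": "application/json",
        },
    )


def ask_backend(message, timeout=60):
    """Send `message` to the backend and return its 'answer' field (or None)."""
    with urllib.request.urlopen(build_chat_request(message), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8")).get("answer")
```

Running `print(ask_backend("Xin chào"))` in a Python shell should print a reply if the container is up; an error at this point means the problem is in the backend or the secret rather than in the audio setup.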

Step 2: Run the Python script on the host machine.

Code:

python jarvis_voice.py

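Step 3: Try it out.

Stay silent and the machine does nothing.
Say loudly: "Jarvis!"
The terminal shows: ⚡ Đã phát hiện 'JARVIS'! ("'JARVIS' detected!") followed by 🎤 Đang nghe lệnh... ("Listening for a command...").
Then say, in Vietnamese: "Bật đèn phòng khách lên." ("Turn on the living room light.")
Jarvis processes the request, calls the Smart Home API (if you have one), and answers aloud: "Đã bật đèn phòng khách thưa sếp." ("The living room light is on, boss.")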

Summary

You now have a complete virtual assistant:

Ears: Porcupine + SpeechRecognition.
Mouth: Edge-TTS.
Brain: Ollama/FastAPI (running in Docker).
Eyes: Camera Vision (built in the previous part).

This system can be extended to run inside a DIY smart speaker (for example, one built on a Raspberry Pi) placed in the living room.

However, Jarvis still has one last weakness: it only has short-term memory. If you stop the script, it forgets what your name is.
How can Jarvis remember your important information permanently?
