// Copyright 2020 The Chromium Authors
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.

#ifndef CHROME_BROWSER_ACCESSIBILITY_LIVE_CAPTION_LIVE_CAPTION_SPEECH_RECOGNITION_HOST_H_
#define CHROME_BROWSER_ACCESSIBILITY_LIVE_CAPTION_LIVE_CAPTION_SPEECH_RECOGNITION_HOST_H_

#include <memory>
#include <string>
#include <unordered_map>

#include "base/memory/raw_ptr.h"
#include "base/memory/weak_ptr.h"
#include "build/build_config.h"
#include "components/live_caption/translation_util.h"
#include "content/public/browser/document_service.h"
#include "content/public/browser/web_contents_observer.h"
#include "media/mojo/mojom/speech_recognition.mojom.h"
#include "mojo/public/cpp/bindings/pending_receiver.h"

class PrefService;

namespace content {
class RenderFrameHost;
}

namespace captions {

class CaptionBubbleContextBrowser;
class GreedyTextStabilizer;
class LiveCaptionController;
class LiveTranslateController;

///////////////////////////////////////////////////////////////////////////////
// Live Caption Speech Recognition Host
//
//  A class that implements the Mojo interface
//  SpeechRecognitionRecognizerClient. There exists one
//  LiveCaptionSpeechRecognitionHost per render frame.
//
class LiveCaptionSpeechRecognitionHost
    : public content::DocumentService<
          media::mojom::SpeechRecognitionRecognizerClient>,
      public content::WebContentsObserver {
 public:
  LiveCaptionSpeechRecognitionHost(const LiveCaptionSpeechRecognitionHost&) =
      delete;
  LiveCaptionSpeechRecognitionHost& operator=(
      const LiveCaptionSpeechRecognitionHost&) = delete;

  // static
  static void Create(
      content::RenderFrameHost* frame_host,
      mojo::PendingReceiver<media::mojom::SpeechRecognitionRecognizerClient>
          receiver);

  // media::mojom::SpeechRecognitionRecognizerClient:
  void OnSpeechRecognitionRecognitionEvent(
      const media::SpeechRecognitionResult& result,
      OnSpeechRecognitionRecognitionEventCallback reply) override;
  void OnLanguageIdentificationEvent(
      media::mojom::LanguageIdentificationEventPtr event) override;
  void OnSpeechRecognitionError() override;
  void OnSpeechRecognitionStopped() override;

 protected:
  // Mac and ChromeOS move the fullscreened window into a new workspace. When
  // the WebContents associated with this RenderFrameHost goes fullscreen,
  // ensure that the Live Caption bubble moves to the new workspace.
#if BUILDFLAG(IS_MAC) || BUILDFLAG(IS_CHROMEOS)
  void MediaEffectivelyFullscreenChanged(bool is_fullscreen) override;
#endif

 private:
  LiveCaptionSpeechRecognitionHost(
      content::RenderFrameHost& frame_host,
      mojo::PendingReceiver<media::mojom::SpeechRecognitionRecognizerClient>
          pending_receiver);
  ~LiveCaptionSpeechRecognitionHost() override;
  // Callback invoked with the result of a translation request.
  void OnTranslationCallback(const std::string& cached_translation,
                             const std::string& original_transcription,
                             const std::string& source_language,
                             const std::string& target_language,
                             bool is_final,
                             const captions::TranslateEvent& result);
  // Returns the WebContents if it exists. If it does not exist, sets the
  // RenderFrameHost reference to nullptr and returns nullptr.
  content::WebContents* GetWebContents();

  // Returns the LiveCaptionController for frame_host_. Returns nullptr if it
  // does not exist. Lifetime is tied to the BrowserContext.
  LiveCaptionController* GetLiveCaptionController();

  // Returns the LiveTranslateController for frame_host_. Returns nullptr if it
  // does not exist. Lifetime is tied to the BrowserContext.
  LiveTranslateController* GetLiveTranslateController();

  // Processes and returns the text to be dispatched.
  std::string GetTextForDispatch(const std::string& text, bool is_final);

  std::unique_ptr<CaptionBubbleContextBrowser> context_;

  // A flag used by the Live Translate feature indicating whether transcriptions
  // should stop.
  bool stop_transcriptions_ = false;

  // Used to cache translations to avoid retranslating the same string. Cleared
  // after every Final to manage the size appropriately.
  TranslationCache translation_cache_;

  // The source language code of the audio stream.
  std::string source_language_;

  // The user preferences containing the target and source language codes.
  raw_ptr<PrefService> prefs_;

  // The number of characters sent to the translation service.
  int characters_translated_ = 0;

  // The number of characters omitted from the translation by the text
  // stabilization policy. Used by metrics only.
  int translation_characters_erased_ = 0;

  // The number of requests to the translation service. Used by metrics only.
  int partial_result_count_ = 0;

  // The automatically detected language of the audio stream.
  std::string auto_detected_language_;

  // The number of consecutive highly confident language identification events.
  int language_identification_event_count_ = 0;

  std::unique_ptr<captions::GreedyTextStabilizer> greedy_text_stabilizer_;

  base::WeakPtrFactory<LiveCaptionSpeechRecognitionHost> weak_factory_{this};
};

}  // namespace captions

#endif  // CHROME_BROWSER_ACCESSIBILITY_LIVE_CAPTION_LIVE_CAPTION_SPEECH_RECOGNITION_HOST_H_