appview/ingestor/src/ingestor.rs
Julia Lange eb28549a0f
Global, mono-binary to libraries and binaries
This separates the previous mono-binary setup into separate libraries
and binaries. Specifically, it splits the old single api/ingestor binary
into an Atproto library and a DB library, as well as an api binary and
an ingestor binary.

Atproto Lib
Was mostly untouched. The original URI implementation was changed to use
FromStr, otherwise only imports were changed.

DB Lib
Is mostly unused, so there wasn't much that needed to be changed. Some
new files were added so that future work on it can hit the ground
running.

Api Binary
Is almost entirely the same. Imports were changed and the ingestor code
of main was removed.

Ingestor Binary
Was almost entirely refactored. An interface to make ingestors was
added, and it was modularized. The only shared code is in
Ingestor.start() and the macros in collections.rs, but that is mostly
boilerplate.
2025-06-06 09:39:15 -07:00

53 lines
1.5 KiB
Rust

use crate::collections::Collection;
use rocketman::{
    options::JetstreamOptions,
    ingestion::LexiconIngestor,
    connection::JetstreamConnection,
    handler,
};
use std::{
    collections::HashMap,
    sync::{Arc, Mutex},
};
use tracing::{info, error};

/// Holds the ingestors registered for each collection and runs them
/// against a Jetstream connection.
pub struct Ingestor {
    /// Maps a collection's NSID to the ingestor that handles its records.
    ingestors: HashMap<String, Box<dyn LexiconIngestor + Send + Sync>>,
}

impl Ingestor {
    pub fn new() -> Self {
        Self { ingestors: HashMap::new() }
    }

    /// Registers a collection's ingestor under the collection's NSID.
    pub fn add_collection<C: Collection>(&mut self, collection: C) {
        self.ingestors.insert(collection.get_nsid(), collection.get_ingestor());
    }

    /// Connects to Jetstream and handles messages for the registered collections.
    pub async fn start(self) {
        info!(
            "Starting ingestor with the following collections: {:?}",
            self.ingestors.keys()
        );

        // Subscribe only to the collections that have a registered ingestor.
        let opts = JetstreamOptions::builder()
            .wanted_collections(self.ingestors.keys().cloned().collect())
            .build();
        let jetstream = JetstreamConnection::new(opts);

        // The cursor is shared between the message handler and the connection.
        let cursor: Arc<Mutex<Option<u64>>> = Arc::new(Mutex::new(None));
        let msg_rx = jetstream.get_msg_rx();
        let reconnect_tx = jetstream.get_reconnect_tx();

        // Handle incoming messages on a separate task.
        let cursor_clone = cursor.clone();
        tokio::spawn(async move {
            while let Ok(message) = msg_rx.recv_async().await {
                if let Err(e) = handler::handle_message(
                    message,
                    &self.ingestors,
                    reconnect_tx.clone(),
                    cursor_clone.clone(),
                )
                .await
                {
                    error!("Error processing message: {}", e);
                }
            }
        });

        // A failed connection is fatal for the ingestor binary.
        if let Err(e) = jetstream.connect(cursor.clone()).await {
            error!("Failed to connect to Jetstream: {}", e);
            std::process::exit(1);
        }
    }
}
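
The file above only shows the two calls that `add_collection` makes on a `Collection`, so the following is a minimal sketch of how a collection might plug into that registry, not the actual contents of collections.rs. Everything here is an assumption made for illustration: `SketchIngestor` is a simplified, synchronous stand-in for rocketman's `LexiconIngestor` (the real trait is not reproduced here), the `Collection` trait shape is inferred from `get_nsid()` and `get_ingestor()`, and `ExamplePost`, `Registry`, `dispatch`, and the NSID `com.example.post` are hypothetical names.

```rust
use std::collections::HashMap;

/// Simplified stand-in for rocketman's `LexiconIngestor` (assumption: the
/// real trait differs; this one is synchronous so the sketch needs no crates).
pub trait SketchIngestor {
    fn ingest(&self, record: &str) -> Result<String, String>;
}

/// Hypothetical shape of `Collection`, inferred from the two calls made in
/// `Ingestor::add_collection`.
pub trait Collection {
    fn get_nsid(&self) -> String;
    fn get_ingestor(&self) -> Box<dyn SketchIngestor + Send + Sync>;
}

/// Example collection; the type and its NSID are made up for illustration.
pub struct ExamplePost;

struct ExamplePostIngestor;

impl SketchIngestor for ExamplePostIngestor {
    fn ingest(&self, record: &str) -> Result<String, String> {
        if record.is_empty() {
            return Err("empty record".to_string());
        }
        Ok(format!("stored post: {record}"))
    }
}

impl Collection for ExamplePost {
    fn get_nsid(&self) -> String {
        "com.example.post".to_string()
    }

    fn get_ingestor(&self) -> Box<dyn SketchIngestor + Send + Sync> {
        Box::new(ExamplePostIngestor)
    }
}

/// Mirrors the NSID-to-ingestor map held by `Ingestor`.
#[derive(Default)]
pub struct Registry {
    ingestors: HashMap<String, Box<dyn SketchIngestor + Send + Sync>>,
}

impl Registry {
    /// Same registration step as `Ingestor::add_collection`.
    pub fn add_collection<C: Collection>(&mut self, collection: C) {
        self.ingestors
            .insert(collection.get_nsid(), collection.get_ingestor());
    }

    /// Hands a record to the ingestor registered for `nsid`.
    /// Returns `None` when no collection was registered under that NSID.
    pub fn dispatch(&self, nsid: &str, record: &str) -> Option<Result<String, String>> {
        self.ingestors
            .get(nsid)
            .map(|ingestor| ingestor.ingest(record))
    }
}
```

Keying the map by NSID lets each collection own both its identifier and its handler, so adding a collection is a single `add_collection` call. In the real binary the lookup presumably happens inside `handler::handle_message`, which receives the map; `dispatch` here only imitates that step.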