Organizations are deploying AI infrastructure everywhere, and most security teams have no idea what they're running. A new security scanning tool called Julius has just doubled its detection capabilities, revealing a sprawling landscape of cloud-managed AI services, self-hosted inference engines, and gateway systems that often sit on the open internet with minimal or no authentication.

The problem isn't that these tools are inherently insecure. It's that they're easy to deploy, they solve obvious business problems, and teams spin them up without involving security. By the time anyone notices, the system has been indexing sensitive documents or routing API traffic with no network restrictions and no monitoring. This is shadow IT for the artificial intelligence era.

## What Changed in Julius v0.2.0?

Julius, an open-source security scanner developed by Praetorian, nearly doubled its detection capabilities in its latest release. The tool went from identifying 33 AI services to 63 in a single update, adding 30 new detection probes that cover the full spectrum of how organizations actually deploy AI infrastructure.

The expansion reflects a critical gap in enterprise security: while teams focused on detecting self-hosted basics like Ollama and vLLM, they missed the bigger picture. Organizations are now running AI through cloud providers, deploying high-performance inference engines for production workloads, and using gateway systems to route traffic between applications and models. Julius now covers all three layers.

## Which AI Services Are Now Detectable?

The new probes span three major categories of AI infrastructure. Cloud-managed services represent the first wave, where organizations assume their endpoints are inherently private. They often aren't: misconfigured API gateways, exposed proxy layers, and overly permissive network policies can put them on the open internet.
- AWS Bedrock: detected via foundation models and model conversation endpoints
- Azure OpenAI: Azure-specific OpenAI endpoint detection
- Google Vertex AI: Vertex AI prediction and model endpoint detection
- Databricks Model Serving: model serving endpoint detection
- Managed inference APIs: Fireworks AI, Groq, Modal, Replicate, and Together AI

The second category covers high-performance inference engines that teams deploy for speed, latency, or cost optimization. These tend to run with default configurations and minimal authentication.

- SGLang: detected via unique server information endpoints exposing memory and disaggregation settings
- TensorRT-LLM: NVIDIA's optimized inference runtime
- Triton Inference Server: NVIDIA's multi-framework serving platform
- BentoML: ML model serving framework
- Additional engines: Baseten Truss, DeepSpeed-MII, MLC LLM, Petals, PowerInfer, and Ray Serve

The third category is where things get particularly sensitive. AI gateway systems route, observe, and control traffic between applications and language models. An exposed gateway often means access to every model and API key behind it.

## Why Do Self-Hosted RAG Platforms Present the Biggest Risk?

Retrieval-Augmented Generation (RAG) platforms are purpose-built to ingest and query internal documents. These systems are designed to handle contracts, HR policies, financial data, and source code. An exposed RAG endpoint is, by definition, an exposed document store.

PrivateGPT is a telling example. Its entire value proposition is "keep your documents private by running everything locally." The irony is that PrivateGPT's API defaults to no authentication. Its document list endpoint is a simple web request that returns every ingested document's metadata, including filenames and chunk counts. The model field is hardcoded to "private-gpt," which makes detection trivial and false positives near-zero.

RAGFlow follows a similar pattern.
Its health check endpoint is unauthenticated and returns a JSON response with a field unique to RAGFlow that tracks the status of the Elasticsearch or Infinity backend powering document retrieval. Even when RAGFlow is partially broken, the health endpoint still responds with the same structure, making detection reliable in any state.

- PrivateGPT: detected via an unauthenticated document ingestion list endpoint that returns metadata
- RAGFlow: detected via a health check endpoint reporting Elasticsearch backend status
- Quivr: "second brain" RAG platform for knowledge management
- h2oGPT: H2O.ai's document question-answering platform
- Langflow: visual language model orchestration framework

## How to Discover Hidden AI Infrastructure in Your Organization

- Run Julius as a network scanner: execute `julius probe` to scan your network for all 63 detectable AI services. The tool requires no external configuration, probe downloads, or API keys, since all 63 probes are embedded in the binary.
- Configure for enterprise environments: use the `--ca-cert` flag to specify a custom certificate authority file if your organization uses internal PKI, allowing Julius to work within your security policies.
- Limit response sizes for safety: use the `--max-response-size` flag (default 10 megabytes) to prevent memory exhaustion from large or malicious responses during scanning.
- Test with insecure mode first: use the `--insecure` flag to skip TLS certificate verification in testing environments before deploying to production scanning workflows.

## What Makes This Discovery Tool Different?

Julius is not a model fingerprinting tool, which identifies which language model generated a piece of text. Instead, Julius identifies the server infrastructure itself: what software is running on the endpoint. Think of it as service detection for AI, similar to what network mapping tools like Nmap do for traditional infrastructure.
The v0.2.0 release also hardened the scanner itself, adding response size limiting and TLS configuration options for enterprise environments. It fixed several detection issues as well: an Ollama probe that produced false positives on Ollama-compatible servers like SGLang and KoboldCpp now requires specific fields in API responses, and header detection rules that silently failed on HTTP/2 connections, affecting five cloud probes, were repaired.

Coverage now spans the full AI infrastructure stack, from cloud-managed inference through self-hosted serving to the RAG and orchestration layer. If an organization is running AI infrastructure, Julius should find it. The development team continues to expand probe coverage as new tools emerge, accepting community contributions in the form of simple YAML files that can be tested locally before submission.

For security teams accustomed to discovering shadow IT through traditional infrastructure scanning, this represents a new frontier. The AI infrastructure explosion has outpaced security discovery tooling, leaving organizations vulnerable to exposure of sensitive documents, API keys, and model access through endpoints they didn't know existed.
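The contribution format is described only as "simple YAML files." As a purely hypothetical illustration, a declarative probe might pair a request with a response match, along these lines (none of these field names are taken from the actual Julius schema):

```yaml
# Hypothetical probe declaration -- field names are illustrative only,
# not the real Julius probe schema.
name: privategpt
description: Detect PrivateGPT via its hardcoded model identifier
request:
  method: GET
  path: /v1/models          # assumed endpoint path
match:
  status: 200
  body_contains: "private-gpt"   # the hardcoded model name noted above
```

The appeal of a declarative format like this is that a new probe is a data change, not a code change, which is what makes local testing and community submission straightforward.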