Wednesday, March 18, 2026

Production AI is a data pipeline problem: Insights from 504 enterprises running private AI

For the past two years, the conversation around AI infrastructure has centered on mainstream large language models (LLMs) from providers such as OpenAI, Anthropic, and Meta. That attention has driven massive demand for GPU resources for model training and inference across hyperscaler clouds and a new generation of neocloud providers. But enterprises already running AI in production are discovering that scaling AI depends on solving a different challenge: it is not only a compute problem.

Production AI is a data pipeline problem.

New independent research from Freeform Dynamics, based on a survey of 504 senior IT and data leaders, provides insight into how enterprises already running private AI environments, sometimes referred to as sovereign AI, are actually building infrastructure for production workloads.

Survey respondents come from medium to large enterprises across industries including financial services, manufacturing, healthcare and life sciences, professional services, government, and media and entertainment. Most represent organizations with more than 1,000 employees across the United States, United Kingdom, France, and Germany.

You can explore the full findings in the Freeform Dynamics survey report Storage Infrastructure for Enterprise AI: Lessons from Seasoned Adopters on Building Scalable Sovereign Environments.

What the research shows

Across enterprises already running AI in production, three clear patterns are emerging.

  1. AI infrastructure constraints are shifting beyond GPUs. Storage performance and data movement are becoming just as critical to maintaining throughput in production pipelines.
  2. Object storage is already widely used in AI environments. The survey reinforces a trend the industry has been discussing for the past year, with object storage functioning as a foundational layer within many production architectures.
  3. Tiered infrastructure models are becoming the operational norm, combining fast storage for active workloads with scalable capacity layers for persistent AI data.

Together, these findings point to a broader conclusion: Scaling AI successfully depends on how effectively enterprises manage and operationalize data across the AI lifecycle.

Infrastructure, not just GPUs, becomes the constraint

The public narrative around AI infrastructure still focuses heavily on the scarcity of compute resources, but enterprises already running AI workloads report a broader set of infrastructure pressures.

According to the Freeform Dynamics research:

  • 57% prioritize storage performance to avoid AI bottlenecks
  • 54% cite compute or GPU availability
  • 52% cite network bandwidth

These findings reinforce an emerging reality: as AI workloads scale, infrastructure teams must support continuous data movement across training, inference, and operational pipelines.

GPUs remain essential, but production AI environments also depend on systems that can stage, process, govern, protect, and reuse data across the lifecycle.

For infrastructure teams, this shifts the conversation from compute capacity alone to the architecture of the full data pipeline.

A key finding: Object storage is foundational to AI infrastructure

One of the clearest signals from the Freeform Dynamics research is how consistently object storage appears in production AI environments. As organizations move from experimentation to production AI, infrastructure decisions increasingly revolve around how data is stored, staged, and reused across the pipeline.

The survey results reinforce what many infrastructure teams have already observed in practice.

Among enterprises running private AI in production:

  • 91% report meaningful use of object storage, comprising:
  • 44% who use it extensively
  • 47% who use it quite a bit

Object storage represents the highest overall adoption among storage architectures, slightly ahead of file-based storage and well above block-based storage.

Importantly, the survey findings do not suggest that object storage replaces other storage types. Instead, it functions as a foundational layer within tiered architectures, where different storage technologies support different performance and workload requirements.

File systems and other fast tiers continue to serve active workloads, while object storage provides scalable capacity for persistent and reusable data across the AI lifecycle.


Tiered architectures are the operational reality

The survey also reveals how enterprises are evolving their infrastructure to support AI.

Many organizations are not building entirely new environments from scratch. Instead, they are adapting and extending existing systems while introducing new components where needed.

Survey results show:

  • 44% adapt existing compute infrastructure for AI
  • 42% adapt existing storage infrastructure
  • 40% purpose-build compute environments
  • 39% purpose-build storage infrastructure

This mix reflects how production AI environments typically evolve in practice. Rather than greenfield deployments, many organizations are layering AI workloads onto existing infrastructure while building specialized components for scale.

In practice, this results in tiered architectures that combine:

  • Fast storage tiers for active training and inference workloads
  • Scalable object storage tiers for persistent datasets and reusable model inputs

These architectures prioritize simplicity, operational predictability, and long-term scalability.

Production AI requires new data capabilities

As these environments mature, enterprises are encountering new operational challenges related to data management.

In the research:

  • 40% cite metadata handling at scale as a bottleneck risk
  • 38% report challenges supporting mixed workloads

These findings highlight the operational complexity of modern AI pipelines. Training pipelines, inference workloads, data ingestion, and governance all place different demands on infrastructure.

Successful AI environments must support:

  • high-throughput data ingestion for training
  • low-latency access for inference
  • governance and lifecycle management for reusable data assets

AI systems do not operate as isolated compute jobs. They operate as data pipelines that continuously ingest, process, and reuse data across the lifecycle.
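The pipeline framing above can be made concrete with a toy example. The stage names (ingest, process, reuse) are illustrative assumptions that mirror the three capabilities listed in the survey findings, not any particular framework's API.

```python
# A minimal sketch of an AI workload framed as a data pipeline rather
# than an isolated compute job: records stream through ingestion and
# processing stages, and results are cataloged for reuse by later runs.
# All names here are hypothetical, for illustration only.

def ingest(source):
    # High-throughput ingestion stage: normalize records as they arrive.
    for record in source:
        yield record.strip().lower()

def process(records):
    # Processing/inference stage: derive a simple feature per record.
    for record in records:
        yield {"text": record, "tokens": len(record.split())}

def reuse_store(results):
    # Governance/reuse stage: persist outputs keyed by content, so a
    # later run can look them up instead of recomputing.
    catalog = {}
    for item in results:
        catalog[item["text"]] = item
    return catalog

raw = ["  Tiered Storage ", "Data Pipelines", "tiered storage "]
catalog = reuse_store(process(ingest(raw)))
print(sorted(catalog))                 # deduplicated, normalized keys
print(catalog["data pipelines"]["tokens"])
```

Because each stage is a generator, records flow through continuously rather than in one monolithic batch, which is the essential property of the pipeline model the article describes.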

A data platform view of AI infrastructure

The enterprises represented in this research are already operating AI in real production environments, and their experience points to a clear conclusion: Scaling AI successfully requires infrastructure designed around the data pipeline, not just the compute layer.

As AI becomes increasingly inference-driven and embedded in operational systems, organizations are prioritizing platforms that can keep data close to compute, manage it across the lifecycle, and support predictable performance at scale.

This is why many production AI environments are converging on simple, tiered architectures built around object-based data foundations.

The data defines the problem.
The platform determines who scales.

To explore the full research findings and architectural insights, download the Freeform Dynamics survey report: Storage Infrastructure for Enterprise AI: Lessons from Seasoned Adopters on Building Scalable Sovereign Environments.