Introduction: A New Era of Scientific Cloud and a New Set of Problems
Scientific research is entering a hyper-accelerated era powered by cloud computing. From genomics and bioinformatics pipelines to AI/ML training, microscopy imaging, simulations, and large-scale data engineering, modern discovery now depends on High-Performance Computing (HPC), Graphics Processing Units (GPUs), and elastic cloud infrastructure.
However, the financial and operational models supporting these workloads remain rooted in traditional IT assumptions—centralized budgets, department-based cost centers, predictable usage patterns, and rigid governance. These models are poorly aligned with how research actually operates in the cloud.
Research environments are fundamentally different:
- Funding is tied to grants, not departments
- Principal Investigators (PIs) own budgets and accountability
- Workloads are bursty, unpredictable, and resource-intensive
- Multi-institution collaborations increase complexity
- Researchers are not CloudOps or FinOps specialists
This mismatch creates a growing and often invisible problem: phantom assets. Phantom assets are cloud resources—compute, storage, or GPU capacity—that remain active and incur costs but lack valid project or grant attribution.
Across research institutions, it is common for 10–25% of monthly cloud spend to fall into this category. These assets exist and consume funding, but they lack financial identity. As a result, costs cannot be accurately charged back, audited, or justified to funding agencies—putting grant efficiency and compliance at risk.
WatchDog for Research Gateway, developed by Relevance Lab, introduces a GenAI-driven, autonomous approach to research FinOps—continuously identifying phantom assets, restoring financial context, and enabling proactive cost governance without slowing down scientific discovery.
Why Research FinOps is Different (and Much Harder)
Traditional enterprise FinOps is built on a set of assumptions that work well for predictable, business-driven IT environments, including:
- Predictable workload lifecycles
- Centralized budget ownership
- Department-level cost reporting
- Dedicated and skilled CloudOps teams
- Strict provisioning and governance workflows
Research cloud environments break nearly all of these assumptions.
Unique Challenges in Research Cloud FinOps
- Irregular, grant-based budgets: Funding cycles open and close unpredictably, making fixed budgets and long-term forecasting ineffective.
- PI-owned cloud spending: PIs, not centralized IT or finance teams, own and approve cloud costs.
- Bursty scientific workloads: HPC jobs, GPU-accelerated AI model training, and data pipelines drive sudden and uneven resource usage.
- Ungoverned resource creation: Manual launches, legacy workflows, and ad hoc experimentation often bypass standard governance and tagging controls.
- Multi-institution collaboration: Research projects frequently span multiple accounts, regions, and institutions, complicating ownership, attribution, and chargeback.
- Simplicity over dashboards: Researchers need clear, actionable insights—not complex FinOps dashboards designed for finance teams.
Why Enterprise FinOps Tools Fall Short
As a result of these differences, traditional enterprise FinOps tools fail to address the root causes of cost leakage in research environments:
- They lack awareness of grants, PIs, and project lifecycles
- They cannot map cloud resources to scientific or research context
- They fail to detect missing or invalid project and grant tags
- They rely on manual intervention rather than autonomous remediation
WatchDog is purpose-built to solve these research-specific challenges, bringing grant-aware, GenAI-driven intelligence to cloud FinOps and empowering scientific innovation without slowing it down.
The Phantom Assets Problem: The Silent Budget Drain
Phantom assets are cloud resources running in AWS that are not mapped to any Research Gateway project, cost center, or grant ID. Although these assets continue to operate and incur costs, they lack financial and project ownership.
They commonly include:
- Amazon EC2 instances
- GPU nodes
- Amazon EBS volumes and snapshots
- Amazon S3 buckets
- Amazon FSx file systems and Amazon RDS databases
- Orphaned cluster artifacts
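The definition above can be made concrete with a minimal sketch. It assumes each resource carries a dictionary of tags and that the institution keeps a catalog of valid project IDs. The tag key `project_id`, the `Resource` type, and the function names are hypothetical illustrations, not WatchDog's or Research Gateway's actual schema. For brevity the sketch checks only the project tag, although the definition also covers cost centers and grant IDs.

```python
from dataclasses import dataclass, field

# Hypothetical tag key; the real key used by Research Gateway is not stated here.
PROJECT_TAG = "project_id"


@dataclass
class Resource:
    resource_id: str
    resource_type: str
    monthly_cost: float
    tags: dict = field(default_factory=dict)


def is_phantom(resource, known_projects):
    """Phantom = project tag is missing, empty, or not in the project catalog."""
    project = resource.tags.get(PROJECT_TAG, "").strip()
    return project == "" or project not in known_projects


def find_phantoms(resources, known_projects):
    """Return the resources that have no valid project attribution."""
    return [r for r in resources if is_phantom(r, known_projects)]


def unallocated_share(resources, known_projects):
    """Fraction of total monthly cost that belongs to phantom assets."""
    total = sum(r.monthly_cost for r in resources)
    if total == 0:
        return 0.0
    phantom = sum(r.monthly_cost for r in find_phantoms(resources, known_projects))
    return phantom / total
```

Note that a resource tagged with a project that no longer exists in the catalog is treated the same as an untagged one, which matches the "outdated tags" case as well as the "missing tags" case.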
How Phantom Assets Are Created
Phantom assets typically emerge through everyday research activity:
- Researchers launching cloud resources manually
- Assets left running after experiments or jobs have finished
- Incorrect, missing, or outdated project and grant tags
- Resources created in non-standard AWS regions
- External collaborators operating without tagging automation
- Legacy workflows provisioned outside Research Gateway
Why Phantom Assets Become a Budget Drain
Because these resources are not aligned with the institution’s financial and governance structure, they create:
- Unallocated spend that cannot be charged back to grants or projects
- Budget leakage, resulting in unexpected overruns for PIs
- Audit risk, as funding agencies require cost transparency and justification
- Forecasting errors, making burn-rate and budget planning unreliable
WatchDog addresses this challenge through continuous, automated detection, intelligent classification, and proactive remediation—eliminating phantom assets before they silently erode research budgets.

WatchDog: GenAI for Research FinOps at Scale
WatchDog is an autonomous FinOps agent purpose-built for Research Gateway environments. It continuously monitors all connected AWS accounts and regions across the institution, providing real-time visibility and control over research cloud spend.
What WatchDog Monitors
- Idle or unused compute resources
- Underutilized CPUs and GPUs
- Oversized compute instances
- Budget thresholds at project and account levels
- Security drift (for example, public S3 buckets and open security groups)
- Tagging and metadata compliance
- Project and grant alignment
- Phantom assets
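As an illustration of the first two checks in this list, the sketch below labels a resource from a series of CPU utilization samples. The thresholds and the function name are assumptions chosen for the example; they are not WatchDog's actual settings, and a real check would likely also consider GPU, memory, and network metrics.

```python
from statistics import mean

# Illustrative thresholds only (percent CPU); not taken from WatchDog.
IDLE_PEAK_PCT = 2.0
UNDERUSED_AVG_PCT = 20.0


def classify_utilization(cpu_samples):
    """Label a resource from CPU utilization samples given in percent (0-100)."""
    if not cpu_samples:
        return "no-data"
    if max(cpu_samples) < IDLE_PEAK_PCT:
        # Never rose above the idle threshold during the window.
        return "idle"
    if mean(cpu_samples) < UNDERUSED_AVG_PCT:
        # Did some work, but the average suggests a smaller instance may fit.
        return "underutilized"
    return "active"
```

Because research workloads are bursty, a low average alone is a weak signal. Using the peak for the idle check avoids flagging a node that was briefly busy inside the observation window.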
Key Capabilities

WatchDog goes beyond detection. It explains the root cause of cost and compliance issues, prioritizes actions based on impact, and recommends the least disruptive, most cost-efficient remediation—enabling research teams to stay focused on science, not cloud operations.

Daily FinOps Summary: Visibility for Principal Investigators and Research IT
Research Gateway users receive a daily FinOps summary generated by WatchDog. This summary provides timely, grant-aligned visibility into cloud usage and cost across projects.
What the Daily Summary Includes
- New phantom assets detected in the previous 24 hours
- Estimated unallocated cost associated with phantom assets
- Idle or orphaned compute resources
- Underutilized GPU and HPC nodes
- Accounts approaching or exceeding daily cost thresholds
- Top cost and efficiency optimization opportunities
- High-confidence auto-remediation recommendations
What PIs Gain from the Summary
For the first time, PIs receive a clear, grant-aligned view of:
- Project-level cloud spending
- Active resources currently running
- Assets that may be unnecessary or misaligned
- Whether budgets are on track or trending toward overconsumption
This shared visibility eliminates guesswork, improves accountability, and makes research budgeting more predictable—without adding operational burden to researchers.
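One simple way to express "on track or trending toward overconsumption" is a linear burn-rate projection: divide spend to date by days elapsed, then extrapolate to the end of the grant period. The sketch below assumes this linear model and a hypothetical 90% warning ratio. The source does not describe how WatchDog actually forecasts spend.

```python
def project_spend(spent_to_date, days_elapsed, total_days):
    """Linear projection of end-of-period spend from the average daily burn."""
    if days_elapsed <= 0 or total_days <= 0:
        raise ValueError("days_elapsed and total_days must be positive")
    daily_burn = spent_to_date / days_elapsed
    return daily_burn * total_days


def budget_status(budget, spent_to_date, days_elapsed, total_days, warn_ratio=0.9):
    """Classify a project as 'on-track', 'at-risk', or 'over' its budget."""
    projected = project_spend(spent_to_date, days_elapsed, total_days)
    if projected > budget:
        return "over"
    if projected > warn_ratio * budget:
        return "at-risk"
    return "on-track"
```

For example, a project that has spent 4,000 of a 10,000 budget after 10 days of a 30-day period projects to 12,000 and would be reported as "over". A linear model is deliberately naive for bursty workloads, so the result is best read as an early warning rather than a forecast.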
GenAI-Powered Project Mapping – Solving the Hardest FinOps Challenge
Assigning cloud assets to the correct project or grant is one of the most time-consuming and error-prone tasks in Research FinOps. Manual tagging, inconsistent metadata, and dynamic research workloads make accurate attribution difficult at scale.
WatchDog uses GenAI to infer the most likely project or grant for each asset by analyzing multiple contextual signals, including:
- Resource naming patterns
- VPC, subnet, and security group associations
- IAM roles and instance profiles
- CloudTrail creation and modification logs
- Related or neighboring resources in the same environment
- Research Gateway project catalog and metadata
Each inference produces a likely project recommendation with an associated confidence score ranging from 0.0 to 1.0, enabling safe and transparent decision-making.
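To show how several signals can combine into a single score between 0.0 and 1.0, here is a deliberately simple stand-in: a weighted vote, not a GenAI model. Each signal proposes a project, and the confidence is the summed weight of the signals that agree on the winner. The signal names, the weights, and `infer_project` are all hypothetical and do not describe WatchDog's internal method.

```python
from collections import defaultdict

# Hypothetical weights that sum to 1.0, so the score stays within [0.0, 1.0].
SIGNAL_WEIGHTS = {
    "name_pattern": 0.30,
    "iam_role": 0.25,
    "cloudtrail_creator": 0.25,
    "network": 0.10,
    "neighbors": 0.10,
}


def infer_project(signal_votes):
    """Combine per-signal project guesses into (project, confidence).

    signal_votes maps a signal name to the project ID that signal suggests,
    or None when the signal has no opinion.
    """
    scores = defaultdict(float)
    for signal, project in signal_votes.items():
        if project is not None:
            scores[project] += SIGNAL_WEIGHTS.get(signal, 0.0)
    if not scores:
        return None, 0.0
    best = max(scores, key=scores.get)
    return best, min(round(scores[best], 2), 1.0)
```

If the naming pattern, IAM role, and CloudTrail creator all point to the same project while the network signal points elsewhere, the winning project scores 0.30 + 0.25 + 0.25 = 0.80. Exposing the score this way lets an administrator see how strongly the evidence agrees before accepting a mapping.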
What Administrators Can Do
Administrators can:
- Accept a recommended mapping with a single click
- Bulk-assign project or grant mappings across multiple assets
- Mark assets as intentional or explicitly exclude them from remediation
This capability transforms what was previously a manual, multi-day audit process into an automated, repeatable daily workflow—dramatically reducing operational overhead while improving financial accuracy.
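The administrator actions above suggest a daily triage step: bulk-accept mappings with high confidence, skip assets marked as intentional, and queue the rest for review. The sketch below assumes recommendations arrive as `(resource_id, project_id, confidence)` tuples and uses an illustrative 0.85 threshold. Both are assumptions made for the example.

```python
def triage(recommendations, excluded_ids=frozenset(), accept_threshold=0.85):
    """Split mapping recommendations into bulk-accept, review, and excluded.

    recommendations: iterable of (resource_id, project_id, confidence) tuples.
    excluded_ids: resources an administrator marked as intentional.
    """
    result = {"bulk_accept": [], "needs_review": [], "excluded": []}
    for resource_id, project_id, confidence in recommendations:
        if resource_id in excluded_ids:
            result["excluded"].append(resource_id)
        elif project_id is not None and confidence >= accept_threshold:
            result["bulk_accept"].append((resource_id, project_id))
        else:
            result["needs_review"].append(resource_id)
    return result
```

Keeping exclusions ahead of the confidence check means an asset marked as intentional is never remapped automatically, however confident the recommendation is.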
Business Impact: Better, Faster, More Cost-Efficient Research
WatchDog delivers measurable operational and financial outcomes across research cloud environments:

- 20–35% reduction in cloud spend by eliminating phantom assets, idle compute, and oversized resources
- Near-complete tagging compliance across all connected AWS accounts
- Zero unallocated spend, enabling accurate project- and grant-level chargeback
- Audit-ready cost transparency for funding agencies and compliance reviews
- Simplified cost visibility for PIs and research teams
- Reduced manual CloudOps effort for Research IT teams
- Higher productivity and faster research cycles through proactive optimization
Institutions gain end-to-end control over cost, usage, governance, and optimization—without slowing down scientific innovation.
Conclusion: Research FinOps Reinvented with GenAI
WatchDog represents a significant advancement for Research Gateway customers by bringing together:
- Autonomous monitoring across research cloud environments
- GenAI-driven analysis grounded in research context
- Grant-aware intelligence for accurate financial attribution
- Project-centric cost control aligned to scientific workflows
- Daily, actionable insights for PIs and Research IT
- Automated remediation to prevent cost leakage
- End-to-end visibility across accounts, projects, and grants
Together, these capabilities address the real-world complexity of scientific research, modern funding models, and cloud-native workloads.
With WatchDog and continuous phantom asset detection, research institutions can reduce cost leakage, strengthen governance, improve budget predictability, and accelerate scientific outcomes—without adding operational burden for researchers.
This is Research FinOps, redesigned for the next generation of discovery.

