Newxel Lead Data Platform Engineer (NXJ-181)

Type: Remote
Location: Ukraine

The Role

Cloud billing data is inherently messy — multi-cloud, multi-structure, with billing models that don’t normalize cleanly and usage signals that don’t behave. You’ll own the data infrastructure that untangles this: the pipelines, models, and backend services that turn raw AWS, GCP, and Azure billing exports into reliable product capabilities. This is a production ownership role — architecture, code, monitoring, and stability all land on you.

About the Product

The platform ingests and processes large-scale cloud billing, usage, and operational data across AWS, Azure, and GCP — and turns it into cost visibility, recommendations, forecasting, and anomaly detection for enterprise customers. The core engineering challenge is scale and reliability: cloud billing structures are complex, volumes are high, and the data directly drives product decisions and customer spend outcomes. This is a product company, not a consulting engagement — the infrastructure you build runs in production and affects real customers.

Technology Stack: Python and SQL for processing, Airflow for orchestration, and modern data platforms including Spark, ClickHouse, BigQuery, Databricks, and Snowflake. The stack was chosen for scale and cost-awareness: the same discipline applied to customer cloud spend applies internally. AWS is the primary cloud environment.

What You’ll Be Doing

Design and maintain production ETL/ELT pipelines that ingest, normalize, and model cloud billing and usage data at scale across multiple cloud providers.
Own the performance, reliability, and cost-efficiency of the data platform — query optimization, storage architecture, processing costs, and production stability.
Build backend data services in Python and SQL that power product capabilities: cost recommendations, usage forecasting, anomaly detection, and customer-facing insights.
Work with cloud billing source data such as AWS Cost and Usage Reports (CUR), Azure Cost Management exports, and GCP billing exports, along with complex structures like marketplace billing and partner models.
Architect and improve orchestration flows using Airflow or equivalent, across platforms such as Spark, Databricks, Snowflake, BigQuery, or ClickHouse.
Own data quality, monitoring, and observability — not just the pipeline, but what comes out of it.
Review architecture and code, mentor other engineers, and drive engineering standards within the data domain.
Use AI/LLM tools (Cursor, GitHub Copilot, Claude, ChatGPT, or equivalent) as a daily development accelerator — for coding, debugging, testing, documentation, and technical research — while maintaining full engineering ownership of the output.

What We Expect

Must-have

7+ years in data engineering, data platform engineering, or backend engineering with a heavy data focus.
Production-grade Python and SQL — not notebooks, not scripts. Code that runs reliably in production at scale.
Strong experience building and maintaining ETL/ELT pipelines with real ownership: design, deployment, monitoring, and incident response.
Experience with large-scale data processing using Spark or equivalent frameworks.
Workflow orchestration with Apache Airflow or similar.
Cloud experience, primarily AWS. GCP and/or Azure are a strong advantage.
Hands-on experience with at least one cloud data warehouse or query engine: Redshift, Athena, BigQuery, Snowflake, Databricks, ClickHouse, or equivalent.
Strong understanding of data modeling, data quality, and production monitoring.
Demonstrated experience optimizing query performance, storage usage, or infrastructure costs.
Ability to lead technical discussions, own domains end to end, and mentor other engineers.
Hands-on experience using AI/LLM tools as part of the software development workflow — and the engineering judgment to validate what they produce.

Nice to have

Experience with cloud billing data specifically: AWS CUR, Azure Cost Management, GCP billing exports, marketplace billing, or partner billing models.
Background in FinOps, cloud cost optimization, or usage-based billing data.
Experience building data products: recommendations, forecasting flows, dashboards, or anomaly-detection pipelines.
Experience leading a small team or acting as a technical owner for a data domain.
AWS services experience: S3, Glue, Lambda, Athena, Redshift, EMR, ECS, EKS.
Track record in a startup or product-company environment.

Why This Role Is Worth Your Time

The domain has real technical depth. Cloud billing data is genuinely complex — multi-source, inconsistently structured, high-volume, and directly tied to business outcomes. This isn’t CRUD pipelines over clean data.
You’ll own infrastructure that shapes the product. The data platform isn’t a support function — it’s the core layer that makes cost recommendations, forecasting, and anomaly detection possible. What you build determines what the product can do.
AI tooling is a first-class expectation, not a novelty. The team uses AI tools as daily productivity multipliers. You won’t need to justify using Cursor or Claude; using them is the expectation.
End-to-end ownership. From design to production behavior, you’re accountable. The role suits engineers who want to see their decisions run in the real world rather than hand them off to someone else.