META-CHAIN

Joined in 2022
META-CHAIN is a financial services software company working primarily in the cryptocurrency industry. We aim to deliver products that are easy for end users to use. Our main goal is to develop and improve the DeFi sector.

    Senior DevOps / AWS Cloud Engineer

    Full Remote · Countries of Europe or Ukraine · Product · 3 years of experience · English - B1
    Stack: ASP.NET Core (C#) for most microservices, Go/Java for matching/wallet components (where applicable), PostgreSQL (transactional), ClickHouse (analytics), Kafka (AWS MSK), Redis, AWS, HSM (e.g., Thales Luna 7), CQRS/Event Sourcing where appropriate, gRPC + REST.

    Responsibilities

    • Design, build, and operate production AWS infrastructure for a high-load exchange (security, scalability, reliability).
    • Own and evolve CI/CD pipelines (GitHub Actions / AWS CodeBuild/CodePipeline), release strategies (blue/green, canary), rollbacks, and safe migrations.
    • Run container workloads on ECS Fargate or EKS (based on final architecture decisions), including deployment automation and operational playbooks.
    • Operate and tune Kafka on AWS MSK: topics, partition strategy, retention, ACL/SASL, consumer lag, retry/DLQ patterns, schema/versioning practices.
    • Operate PostgreSQL (preferably Aurora): performance, replication, backup/restore, failover testing.
    • Maintain ClickHouse (cluster/replication/partitions/merges/backup) and Redis (ElastiCache) for caching/rate-limit and low-latency access.
    • Implement observability: OpenTelemetry, metrics/logs/traces, alerting, SLO/SLA, incident response, RCA and postmortems.
    • Security ownership: IAM design, KMS, Secrets Manager/Parameter Store, encryption at rest/in transit, secret rotation, hardening, network segmentation (VPC, SG, NACL).
    • Integrate with the signing/HSM perimeter (Thales Luna / PKCS#11, or CloudHSM/KMS where applicable): secure key workflows, audit trails.
    • Protect public endpoints: WAF/Shield, rate limiting, API exposure via ALB/NLB/API Gateway/CloudFront (per final design).
    • Capacity planning, cost optimization (FinOps), and DR/BCP readiness (multi-AZ now, multi-region roadmap).

    Required qualifications

    • 4+ years in DevOps/SRE with hands-on AWS production experience.
    • Strong knowledge of AWS core services: VPC, IAM, EC2, ECS/EKS, ALB/NLB, Route53, CloudWatch, CloudTrail, KMS, S3, RDS/Aurora, ElastiCache, ECR.
    • Proven experience with CI/CD and release engineering (blue/green, canary, rollback, safe DB migrations).
    • Production experience with Kafka (ideally AWS MSK): partitioning/retention, consumer groups, idempotency, at-least-once processing, monitoring lag.
    • Solid PostgreSQL skills (ops + performance basics).
    • Containers: Docker, orchestration on ECS/EKS, Infrastructure as Code (Terraform preferred).
    • Strong Linux fundamentals, networking (TLS, DNS), and practical security mindset.
    • Incident management experience: on-call, debugging under pressure, clear RCA, runbooks.

    Nice to have

    • Deep ClickHouse operations experience (replication, partitions, performance tuning, backups).
    • Experience with HSM/PKCS#11, secure signing flows, key custody, audit/compliance requirements.
    • FinTech/Crypto background, familiarity with AML/KYC and audit requirements.
    • Observability stacks: Prometheus/Grafana, Loki/ELK/OpenSearch (or equivalents) alongside OpenTelemetry.
    • Multi-region active-active / DR drills and failover automation.
    • Load testing, capacity planning, and performance engineering.

    What we offer

    • A genuinely challenging system: low latency, high throughput, strict security, real-time eventing and analytics.
    • Strong ownership and impact: you’ll shape our CI/CD, MSK strategy, security baseline, ClickHouse operations, and DR plan.
    • Remote/hybrid (by agreement) and competitive compensation.
    • Solid engineering culture: IaC, automation-first, blameless postmortems, and documentation.

     
