Site Reliability Engineer

Ludus


Date: 9 hours ago
City: Grand Rapids, MI
Salary: $115,000 - $125,000 per year
Contract type: Full time
We are looking for a Site Reliability Engineer to help us build and maintain our infrastructure, ensure system reliability, and optimize application performance at scale.

Ludus is a SaaS company that builds digital tools trusted by thousands of organizations of all sizes to power their event ticketing, marketing, fundraising, retail and registration needs.

The Site Reliability Engineer at Ludus is responsible for expanding and managing the web services that power the company’s ecosystem. This role involves designing and implementing scalable, high-performing solutions for servers, databases, and deployment strategies.

The position plays a key part in building a sustainable and reliable future for Ludus’ growing platform and user base. With a strong understanding of business goals, the Site Reliability Engineer helps shape architecture and development operations across the suite of applications.

Beyond modernizing core systems and optimizing performance, this role plays a strategic part in scaling reliability through improved redundancy and better distribution of infrastructure expertise across the team.

Responsibilities include hands-on work with database backups, server deployments, monitoring, and other key infrastructure tasks that help maintain system stability and reliability.

The Site Reliability Engineer will be an important part of a collaborative engineering team and have the opportunity to contribute meaningfully to a fast-growing product while shaping the future of infrastructure at Ludus.

Join us in our mission to bring people together through shared experiences by building digital tools for arts organizations.

Things you should know before applying

At Ludus, our mission is to bring people together through shared experiences. It’s a big goal that allows for limitless expansion to make a difference in the world.

We get shit done, move fast, and are constantly learning and adapting. We embrace low process and high trust to navigate change and figure out what works.

We believe success is never final, and when you think it is, that is when you fail. For us, success is the sum of small efforts, including trial and error, so we move at an unrelenting weekly pace.

To grow as a company and as individuals, we need discomfort. It keeps us exploring new ideas and pushes us to build quality tools that offer the best solutions for our customers.

If you prefer a hand-holding environment where everything is black and white, that’s not us. If you’re a self-starter who can thrive in controlled chaos, Ludus is the place for you. It’s time to find your role.

What You'll Be Doing

  • Infrastructure automation and configuration management
  • Help manage database integrity and reliability through backups, read replicas, monitoring tools, and other necessary database operations
  • Create, architect, monitor, and troubleshoot our system infrastructure
  • Develop and provide operational support for full-stack software applications
  • Capacity planning, testing, and performance optimization
  • Fault tolerance, disaster recovery, incident response and analysis
  • Release management and deployment automation
  • Help ensure network and server security across our various communicating servers
  • Increase system resilience and serve larger customer volumes through code, server configuration, and other system scaling methodologies
  • Optimize and manage our CI/CD pipelines for improved release confidence and fewer regressions
  • Manage cloud and database system maintenance, debugging production issues as they arise
  • Research, development, and leadership on ways to improve our overall systems at Ludus
  • Gain exposure to our overall infrastructure (database backup systems, server deployment, recovery processes, etc.)
  • Monitor server health, security, and application performance, proactively preventing downtime
  • Assist in MySQL performance tuning and optimization when needed
  • Help triage and handle infra-related incidents, and participate in our rotating on-call schedule

Traits we're looking for

(In no particular order)

  • 5+ years of experience with cloud and server technologies
  • Strong understanding of continuous integration and deployment strategies with common DevOps tools, Linux servers, and web application deployment
  • Strong understanding of MySQL relational databases, including the ability to configure read replicas and backups, archive and partition data, and troubleshoot performance issues
  • Good understanding of database migrations using Phinx, Artisan, or similar tools
  • Experience with Docker or similar containerization tools
  • Familiarity with Nginx and PHP-FPM
  • Understanding of fundamental design principles behind a scalable application
  • Proficiency with Git and other code versioning tools
  • Comfortable working in Linux environments and optimizing Nginx, MySQL, and caching strategies
  • Experience with developer-focused cloud providers like DigitalOcean or Linode (we don’t use AWS, so an AWS-heavy background may not translate directly)
  • Some hands-on experience or familiarity with Docker, Ansible, Cloudflare (Workers, Firewall, Load Balancing), and MySQL replication
  • Interest in observability, including monitoring, logging, and alerting tools
  • Experience writing secure, well-tested code and identifying vulnerabilities
  • Strong debugging skills, including profiling, query optimization, and performance tuning

Bonus Qualifications

  • Familiarity with common web technologies and frameworks like PHP/Laravel
  • Experience with edge technologies like Cloudflare
  • Experience with observability and logging technologies
  • Familiarity with WebSockets or event-driven architectures
  • Exposure to infrastructure-as-code tools like Terraform or Pulumi
  • Background in load testing and high-concurrency optimization

Personal Attributes

  • Ability to develop software autonomously and to ask questions when needed
  • Ability to collaborate with humility and curiosity in a team environment
  • Ability to provide thoughtful technical solutions for code and architecture without over-engineering

Perks

Health Insurance (Medical, Vision, Dental) — Provided by Blue Cross Blue Shield and Guardian. Ludus covers 90% of the premium for employees and 50% for dependents.

401(k) matching — Full match on the first 5% you contribute and a 50% match on the next 5% (a 7.5% match from Ludus if you contribute 10%).
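
For candidates who want to check the match at other contribution levels, the tiers above can be sketched as a small calculation. This is only an illustration: the function name is hypothetical, percentages are assumed to be percentages of eligible pay, and contributions above 10% are assumed to earn no further match. The plan documents, not this sketch, are authoritative.

```python
def employer_match_pct(employee_pct: float) -> float:
    """Employer match as a percentage of pay (hypothetical helper).

    Assumed tiers, taken from the description above:
      - 100% match on the first 5% contributed
      - 50% match on the next 5% contributed
      - no additional match beyond 10% (assumption)
    """
    if employee_pct < 0:
        raise ValueError("contribution percentage cannot be negative")
    first_tier = min(employee_pct, 5.0)                  # matched in full
    second_tier = min(max(employee_pct - 5.0, 0.0), 5.0)  # matched at half
    return first_tier * 1.0 + second_tier * 0.5


# Example from the posting: contributing 10% yields a 7.5% match.
print(employer_match_pct(10))  # 7.5
```

For example, under these assumptions a 6% contribution would earn a 5.5% match, and anything at or above 10% would earn the full 7.5%.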

Profit Sharing and Stock Options — We believe in sharing our success and offer annual profit-sharing bonuses during profitable years, along with stock options that give employees a stake in our long-term growth and success.

Personal Wellness — $50 monthly reimbursement that can be used on anything related to personal wellness.

Experience Credit — $100 yearly reimbursement toward concert tickets, theatre tickets, etc. to encourage shared experiences.

Flexible PTO — Take the time you need for vacation or personal days. Simply work with your team to ensure everything runs smoothly while you are away.

Sick Days — If you're under the weather, we expect you to take the time you need, within reason, to recover.

Role Details

  • Salary Range: $115,000 - $125,000 per year
  • Location: West Michigan, Hybrid

Apply for the job

Interested in joining our growing team? Then we'd love to hear from you!
